Segmentation violation

From: Schaper, Soeren (Schaper@media-saturn.com)
Date: Mon Jan 16 2006 - 07:55:25 EST


Hi Group,

all of a sudden my backup server has started misbehaving. I am running IBM's
Tivoli Storage Manager (TSM). After a reboot of our Spectralogic 64K tape
library, TSM did not recognize the library, so we restarted the TSM software
as well. Since then, the TSM startup nearly always ends with the following
message:

ANR7821S Thread 12 (tid 21) terminating on signal 11 (Segmentation violation).

After that, the process hangs and can only be stopped with kill -9.
I tried trussing the process, and while I can see the error, I have no clue
how to find out why it happens. The odd thing is that if I start TSM with
its default configuration, the process comes up half of the time and fails
the other half. Does anyone have an idea how to investigate further?

truss:
13061/44: read(111, " c r w - r w - r w - ".., 5120) = 67
13061/44: read(111, 0x130E9A724, 5120) = 0
13061/44: read(111, 0x130E9A724, 5120) = 0
13061/44: lseek(111, 0, SEEK_CUR) Err#29 ESPIPE
13061/44: close(111) = 0
13061/44: waitid(P_PID, 13074, 0xFFFFFFFF789F8F80, WEXITED|WTRAPPED|_WNOCHLD) = 0
13061/44: ioctl(110, 0x40046410, 0xFFFFFFFF789FB7BC) = 0
13061/44: ioctl(110, (('@'<<24)|('<'<<16)|('d'<<8)|17), 0xFFFFFFFF789FB7A0) = 0
13061/44: open("/var/adm/sun_fc.debug", O_WRONLY|O_APPEND) = 111
13061/44: time() = 1137407627
13061/44: write(111, " 1 1 3 7 4 0 7 6 2 7 : 4".., 54) = 54
13061/44: close(111) = 0
13061/44: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFFFF7DF4ED20
13061/44: siginfo: SIGSEGV SEGV_MAPERR addr=0x200000000
13061/44: Received signal #11, SIGSEGV [caught]
13061/44: siginfo: SIGSEGV SEGV_MAPERR addr=0x200000000
13061/44: sigprocmask(SIG_SETMASK, 0xFFFFFFFF789FAA90, 0x00000000) = 0
13061/44: time() = 1137407627
13061/44: write(2, " A N R", 3) = 3
13061/44: write(2, " 7 8 2 1", 4) = 4
13061/44: write(2, " S ", 2) = 2
13061/44: write(2, " T h r e a d ", 7) = 7
13061/44: write(2, " 3 4", 2) = 2
13061/44: write(2, " ( t i d ", 6) = 6
13061/44: write(2, " 4 4", 2) = 2
13061/9: lseek(10, 0x7134D000, SEEK_SET) = 0x7134D000
13061/44: write(2, " ) t e r m i n a t i n".., 24) = 24
13061/44: write(2, " 1 1 (", 4) = 4
13061/44: write(2, " S e g m e n t a t i o n".., 22) = 22
13061/44: write(2, " ) .\n", 3) = 3
13061/44: open("dsmserv.err", O_WRONLY|O_CREAT, 0644) = 111
13061/44: lseek(111, 0, SEEK_END) = 95750
13061/44: fstat(111, 0xFFFFFFFF789F9530) = 0

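A minimal sketch for narrowing a truss log like the one above down to the
faulting LWP and address. It assumes the truss output was saved to a file
(truss.out is a hypothetical name, for example captured with truss -o), and
the function name fault_lwps is made up for illustration. It uses only split
and substr, so it should also run under older awk variants, though that is
untested here.

```shell
# fault_lwps FILE
# Print "pid/lwp address" once for each LWP that reported SIGSEGV
# in a saved truss log (hypothetical helper, not part of truss).
fault_lwps() {
    awk '/siginfo: SIGSEGV/ {
        split($1, id, ":")                 # "13061/44:" -> id[1] = "13061/44"
        for (i = 2; i <= NF; i++)
            if ($i ~ /^addr=/)             # field looks like addr=0x...
                print id[1], substr($i, 6) # drop the 5-character "addr=" prefix
    }' "$1" | sort -u                      # truss logs the siginfo line twice
}
```

Run as fault_lwps truss.out. The LWP id it reports is the thread whose stack
is worth inspecting in the core file.
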
core dump traceback:

root@rigel [3] 50 server/bin> adb dsmserv /usr/schaper/core/dsmservcoredump.1154
core file = /usr/schaper/core/dsmservcoredump.1154 -- program ``dsmserv'' on platform SUNW,Sun-Fire-280R
$c
libc.so.1`_libc_sigtimedwait+4(101ef5440, 101b61400, 101b61628, 101b61628, 101ef5440, 101b61628)
pkWaitShutdown+0x5c(0, 0, 0, ffffffff7fffdd00, ffffffff7fffdcc1, ffffffffffffffff)
admStartServer+0x1a78(0, 0, 101b2a1bc, 101b2a1bc, 101b2a1dc, 101b2a1dc)
main+0x1acc(1, ffffffff7ffffa18, ffffffff7ffffa28, 0, 0, 100000000)
_start+0x17c(0, 0, 0, 0, 0, 0)
$q

Greetings
Soeren