SUMMARY: Kernel panic

From: Renaud, Andre (andre.renaud@hp.com)
Date: Wed Jun 25 2003 - 16:36:04 EDT


I realised after sending this that I had not included enough information, so for completeness here are a few more details. The crash-data file in /var/adm/crash contained the following (slightly pruned) stack trace:
tset machine_slot[paniccpu].cpu_panic_thread:
Begin Trace for machine_slot[paniccpu].cpu_panic_thread:
thread 0xfffffc00f6d30a80 stopped at [stop_secondary_cpu:1205 ,0xfffffc00004eb7c0] Source not available
> 0 stop_secondary_cpu(do_lwc = (unallocated - symbol optimized away))
   1 panic(s = (unallocated - symbol optimized away))
   2 event_timeout(func = (unallocated - symbol optimized away), arg = (unallocated -
   3 printf(fmt = (unallocated - symbol optimized away))
   4 panic(s = (unallocated - symbol optimized away))
   5 trap(a0 = (...), a1 = (...), a2 = (...), code = (unallocated - symbol optimized
   6 _XentMM(0x4, 0xfffffc00002cd7ec, 0xfffffc00006ecbf0, 0xfffffc0089984548,
   7 enqueue_tail(0x4, 0xfffffc00002cd7ec, 0xfffffc00006ecbf0, 0xfffffc0089984548,
   8 lock_wait(l = (unallocated - symbol optimized away), wait_str = (unallocated -
   9 lock_write(l = (unallocated - symbol optimized away))
  10 solock(0xf84e352, 0x0, 0xfffffc00003cf55c, 0x1, 0x800000)
  11 tcp_handle_timers(0x6da17610, 0x7ca1c, 0xfffffc00f6d23e00, 0xfffffc0000827610,
  12 tcp_rad_slowtimo(0x0, 0x0, 0x0, 0x0, 0xfffffc0000978000)
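
For anyone who wants to reduce a trace like this to the bare call chain, here is a minimal sketch in Python. It assumes the frame layout shown above (an optional ">" marker, a frame number, then the function name followed by "("). The names FRAME_RE and parse_frames are my own and not part of any Tru64 tool, and the truncated argument lists are copied as they appear in the pruned trace.

```python
import re

# Assumed frame layout: optional ">" marker, frame number, function name, "(".
FRAME_RE = re.compile(r"^\s*>?\s*(\d+)\s+(\w+)\(")

def parse_frames(trace_text):
    """Return [(frame_number, function_name), ...] in the order printed."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames

# The pruned trace from the crash-data file; truncated lines are left truncated.
TRACE = """\
thread 0xfffffc00f6d30a80 stopped at [stop_secondary_cpu:1205 ,0xfffffc00004eb7c0] Source not available
> 0 stop_secondary_cpu(do_lwc = (unallocated - symbol optimized away))
   1 panic(s = (unallocated - symbol optimized away))
   2 event_timeout(func = (unallocated - symbol optimized away), arg = (unallocated -
   3 printf(fmt = (unallocated - symbol optimized away))
   4 panic(s = (unallocated - symbol optimized away))
   5 trap(a0 = (...), a1 = (...), a2 = (...), code = (unallocated - symbol optimized
   6 _XentMM(0x4, 0xfffffc00002cd7ec, 0xfffffc00006ecbf0, 0xfffffc0089984548,
   7 enqueue_tail(0x4, 0xfffffc00002cd7ec, 0xfffffc00006ecbf0, 0xfffffc0089984548,
   8 lock_wait(l = (unallocated - symbol optimized away), wait_str = (unallocated -
   9 lock_write(l = (unallocated - symbol optimized away))
  10 solock(0xf84e352, 0x0, 0xfffffc00003cf55c, 0x1, 0x800000)
  11 tcp_handle_timers(0x6da17610, 0x7ca1c, 0xfffffc00f6d23e00, 0xfffffc0000827610,
  12 tcp_rad_slowtimo(0x0, 0x0, 0x0, 0x0, 0xfffffc0000978000)
"""

frames = parse_frames(TRACE)
names = [name for _, name in frames]

# Print the chain from the outermost caller down to the innermost frame.
print(" -> ".join(reversed(names)))
```

Read from the bottom of the trace upwards, the chain starts in tcp_rad_slowtimo and reaches trap() via the socket lock routines, which is what made the TCP timer code worth checking in the patch kit release notes.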

We were running Tru64 5.1 Patch Kit 5. I had a look through the release notes for Patch Kit 6 (which was released in April), and found the following entry:
Patch 1163:
Fixes a kernel memory fault in tcp_rad_slowtimo. This patch also
fixes a kernel memory fault in soclose() before calling soabort for
listener sockets.

So it looks like the panic originated in the tcp_rad_slowtimo function, which is the outermost frame in the trace above. HP Support confirmed this.

Thanks for all the replies I got regarding what to look for in the crash-data.

Andre

-----Original Message-----
From: Renaud, Andre
Sent: Wednesday, 25 June 2003 1:34 PM
To: 'tru64-unix-managers@ornl.gov'
Subject: Kernel panic

We received the following in the kern.log after a kernel panic. Does anyone know what the cause of this might be, or where I should start looking?

Jun 25 12:40:30 alpha vmunix:
Jun 25 12:40:30 alpha vmunix: trap: invalid memory write access from kernel mode
Jun 25 12:40:30 alpha vmunix:
Jun 25 12:40:30 alpha vmunix: faulting virtual address: 0x0000000000000000
Jun 25 12:40:30 alpha vmunix: pc of faulting instruction: 0xfffffc00002cd7ec
Jun 25 12:40:30 alpha vmunix: ra contents at time of fault: 0xfffffc00002bd3c8
Jun 25 12:40:30 alpha vmunix: sp contents at time of fault: 0xfffffe0544e978d0
Jun 25 12:40:30 alpha vmunix:
Jun 25 12:40:30 alpha vmunix: panic (cpu 2): kernel memory fault
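
The register values in these lines can be pulled out mechanically. The following is a rough sketch in Python, assuming the "label: 0x..." layout shown in the log above; the names REG_RE and parse_fault_registers are hypothetical and not part of any Tru64 utility.

```python
import re

# Assumed line layout: "<timestamp> <host> vmunix: <label>: 0x<hex>"
REG_RE = re.compile(r"vmunix:\s+(.+?):\s+(0x[0-9a-fA-F]+)\s*$")

def parse_fault_registers(log_text):
    """Map each label (e.g. 'pc of faulting instruction') to its integer value."""
    values = {}
    for line in log_text.splitlines():
        match = REG_RE.search(line)
        if match:
            values[match.group(1)] = int(match.group(2), 16)
    return values

LOG = """\
Jun 25 12:40:30 alpha vmunix: trap: invalid memory write access from kernel mode
Jun 25 12:40:30 alpha vmunix: faulting virtual address: 0x0000000000000000
Jun 25 12:40:30 alpha vmunix: pc of faulting instruction: 0xfffffc00002cd7ec
Jun 25 12:40:30 alpha vmunix: ra contents at time of fault: 0xfffffc00002bd3c8
Jun 25 12:40:30 alpha vmunix: sp contents at time of fault: 0xfffffe0544e978d0
Jun 25 12:40:30 alpha vmunix: panic (cpu 2): kernel memory fault
"""

regs = parse_fault_registers(LOG)

# A write to address 0 usually points at a null pointer being written through.
if regs.get("faulting virtual address") == 0:
    print("write to address 0: likely a null pointer")

for label, value in regs.items():
    print(f"{label}: {value:#018x}")
```

The pc value here, 0xfffffc00002cd7ec, is the same value that appears in frames 6 and 7 of the stack trace in the follow-up, which is a quick way to confirm that the log and the crash-data file describe the same fault.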

Thanks,
Andre Renaud


