Re: Node crashed 888 102 300 0C0

From: Kumar, Praveen (cahoot) (Praveen.Kumar@CAHOOT.COM)
Date: Tue May 04 2004 - 12:19:33 EDT


Simon,
To start with, why don't you run the strings command to find out which
program caused the core dump? Also, have a look at the basic steps given on
page 85 of the AIX Problem Determination Guide; they might help you proceed
with your analysis. If you can send in the output of those commands, it
would help with further analysis.

The steps in the Problem Determination Guide ask you to invoke the crash
command on the core file that was created.

Hope this helps.

Regards
Praveen.K

-----Original Message-----
From: Green, Simon [mailto:Simon.Green@EU.ALTRIA.COM]
Sent: 04 May 2004 16:45
To: aix-l@Princeton.EDU
Subject: Node crashed 888 102 300 0C0

We just had a node crash with the above LED. The dump is to a dedicated
dump device so it'll be available at least until the next time the node
crashes! The node itself is an SP2 Silver node, with PSSP 3.2 and AIX
4.3.3.0_08.

It's rebooted OK on the second attempt. Initially it hung on 539, after
showing 731. I had it powered off and also re-set the modem attached to it;
it booted up OK when the power was restored.

There's nothing of significance in the error log: not even anything
referring to the Data Storage Interrupt, (which is what the "300" indicates
as the proximate cause of the crash).
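
For reference, the LED sequence in the subject line can be split into its
fields. The following is only a rough sketch: the function name and both
lookup tables are hypothetical, the "300" entry comes from the paragraph
above, and the reading of "0C0" as a successfully completed dump is an
assumption based on the usual 888-102-mmm-ddd layout (crash code followed by
dump status code). Any other code is reported as unknown rather than guessed.

```python
# Hypothetical helper: split an "888 102 mmm ddd" LED sequence into fields.
# Only the codes discussed in this thread are listed; the tables are
# deliberately incomplete.

CRASH_CODES = {
    "300": "Data Storage Interrupt (DSI)",  # as stated in the post
}

DUMP_STATUS = {
    "0C0": "dump completed successfully",  # assumption, not from the post
}


def decode_led(sequence):
    """Return the crash code and dump status code from an 888-102 sequence."""
    parts = sequence.upper().split()
    if len(parts) != 4 or parts[0] != "888" or parts[1] != "102":
        raise ValueError("expected a sequence of the form '888 102 mmm ddd'")
    crash, dump = parts[2], parts[3]
    return {
        "crash_code": crash,
        "crash_meaning": CRASH_CODES.get(crash, "unknown (not in this sketch)"),
        "dump_code": dump,
        "dump_meaning": DUMP_STATUS.get(dump, "unknown (not in this sketch)"),
    }


if __name__ == "__main__":
    for key, value in decode_led("888 102 300 0C0").items():
        print(f"{key}: {value}")
```

The lookup deliberately falls back to "unknown" so that a code missing from
the tables is never mistaken for a decoded one.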

We had some problems with this node last year and never got anywhere with
it. At that time I didn't have a valid dump, because there was a problem
with the AIX level on there: a mismatch between /unix and the actual running
version. At that time I checked that it was properly at ML08, did a bosboot
and updated the microcode.

Now, I've got a valid dump but it's out of support!

Can anybody help me with this? My main interest is in confirming that this
is a software problem and determining what the active process was at the
time of the interrupt - always assuming it WAS actually a DSI. Regrettably
my knowledge of "crash" is very limited. I've got the "Introduction to
Reading Dumps" IBM document, but I don't really understand it.

--
Simon Green
Altria ITSC Europe Ltd
AIX-L Archive at https://new-lists.princeton.edu/listserv/aix-l.html
New to AIX? http://publib-b.boulder.ibm.com/redbooks.nsf/portals/UNIX
N.B. Unsolicited email from vendors will not be appreciated.
Please post all follow-ups to the list.


This archive was generated by hypermail 2.1.7 : Wed Apr 09 2008 - 22:17:53 EDT