SUMMARY: "consvar" causes: panic: tb_shoot ack timeout

From: jreed@appliedtheory.com
Date: Wed Oct 23 2002 - 16:14:33 EDT


Thanks to the many people who replied. The most to-the-point response
comes from the venerable Dr. Blinn, who states:
                ---------------------
You are not supposed to be able to use "consvar" to manipulate the
NIC settings. It should not have panicked the system, that's a bug,
but it isn't meant to work with the NIC settings.
                ---------------------
Shawn Cromer suggested using:

# cat /etc/inet.local
# /usr/sbin/lan_config -i ee0 -s 100 -x 1

to make changes in real time, and modifying /etc/inet.local to make them
permanent.
                ----------------------
Denise McCracken says:
(tb_shoot ack timeout is) not a very common error and can
be tough to duplicate, which makes it equally tough to troubleshoot.
        :
tb_shoot ack timeout is caused when one CPU asks another to invalidate
its translation buffer entries and doesn't get an acknowledgement. It can
be caused by a faulty CPU, memory, and, I think, a few software things.
        :
When I do a consvar -l on the machine I'm on, it doesn't come up with
ew*_mode, so I would not think that it would be a "safe" parameter to
change.
                ----------------------
Darren Browett agrees:

Trying to do a consvar -s on any variable that does NOT show up when you
do a consvar -l is really, really bad.
 
You can do a consvar -g ewa0_mode, which is what we did, reasoning that
"we can see it, therefore we should be able to change it". But when we
issued the consvar -s on ewa0_mode, it crashed the member that we
had issued the command on, plus the other member (of the cluster).

The fix, as far as I can tell, is: "if it is not in consvar -l, do not
attempt to change it."
                -----------------------
Stephen Hagan, from HP Unix support in Australia, says:
I'm told Firmware 6.3 for ES4X systems should fix this.
This is available from:
http://ftp.digital.com/pub/DEC/Alpha/firmware/
                ---------------------
Kevin Jones states:
Whenever doing a consvar to set any parameters, always put a "runon 0"
before the command. There is a bug on multi-processor systems: if
consvar happens to run on a non-primary CPU, it will crash.
                ---------------------
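The two precautions above (only set what "consvar -l" lists, and prefix
the set with "runon 0") can be combined in a small wrapper. This is only
a sketch under stated assumptions, not a tested procedure: the function
names in_listing and guarded_set are made up for illustration, the
"consvar -l" output is assumed to start each line with the variable name
followed by whitespace or "=", and "consvar -s" is assumed to take the
variable name and then the value. Check the consvar and runon reference
pages on your own system before relying on any of it.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of Tru64.

# Succeed if the variable named in $1 appears in the listing on stdin.
# Assumption: each listing line begins with the variable name, followed
# by whitespace or "=".
in_listing() {
    awk -v name="$1" '
        {
            sub(/^[ \t]+/, "")      # drop leading whitespace
            sub(/[ \t=].*$/, "")    # keep only the first token
            if ($0 == name) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Set a console variable only if "consvar -l" lists it, and run the set
# on CPU 0 as suggested above. Assumed usage: guarded_set <variable> <value>
guarded_set() {
    var=$1
    value=$2
    if consvar -l | in_listing "$var"; then
        runon 0 consvar -s "$var" "$value"
    else
        echo "refusing to set $var: not listed by consvar -l" >&2
        return 1
    fi
}
```

With this in place, a call such as "guarded_set ewa0_mode <value>" would
refuse to run on a system where ewa0_mode is not in the "consvar -l"
listing, instead of issuing the consvar -s that caused the panic.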
Lots of interesting info, thanks to all!

Judith Reed


