SUMMARY: Multiple Weird Problems

From: Tru64 User (tru64user@yahoo.com)
Date: Mon May 19 2003 - 12:30:06 EDT


Many thanks to Dr. Tom Blinn and Joe Mario from HP.
Restarting kloadsrv cleared up the weirdness, and netstat is working
once again. I have a theory of how it happened but no proof: a user
pasted a big text file onto the command line and the problems started
immediately, so something in there probably triggered it.
::starting kload server::
/sbin/kloadsrv < /dev/console > /dev/console 2>&1

----------------------------------
Full text of the dialog with the respondents::
If I'm not mistaken, "netstat" itself may depend on "rdsym", and I know
for sure it depends on being able to get kernel symbols through the
kloadsrv service. I think kloadsrv is wedged on your system. I did
find the source reference to "rdsym" and it's in a C library routine:

./usr/ccs/lib/libc/alpha/knlist.c: args[0] = "/usr/lbin/rdsym";

and as it says in the source code,

/*
 * The knlist() library function looks up addresses of kernel symbols.
 *
 * Return code:
 * -1 : unable to connect to kloadsrv, no lookup has been done.
 *  0 : all lookup has been successfully completed.
 *  n : n is a positive integer. Some symbols are undefined.
 *      n is the number of symbols that have failed in lookup.
 */

and as it says in the knlist(3) reference page,

RETURN VALUES

  The knlist() routine returns zero on success. The routine returns -1
  if it was unable to connect to the kloadsrv daemon. In this case, the
  routine was unable to determine any of the requested addresses. The
  routine returns a positive integer if it successfully finds some
  addresses and fails to find others. The integer value indicates the
  number of addresses knlist() was unable to return.
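
The three return-value cases described above can be summarised in a
small shell helper. This is only an illustrative sketch: knlist() is a
C library routine, and the function name describe_knlist_rc is made up
here. It simply maps an integer, as a caller might have captured it, to
the meaning given in the reference page.

```shell
# Hypothetical helper (not part of Tru64): translate a knlist() return
# value into the meaning documented in the RETURN VALUES text above.
describe_knlist_rc() {
    rc=$1
    if [ "$rc" -eq -1 ]; then
        echo "could not connect to kloadsrv; no addresses were looked up"
    elif [ "$rc" -eq 0 ]; then
        echo "all symbols found"
    elif [ "$rc" -gt 0 ]; then
        echo "$rc symbol(s) could not be found"
    else
        echo "unexpected return value: $rc"
    fi
}
```

For example, "describe_knlist_rc 3" reports that three symbols could
not be found.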

If you are root, it talks to kloadsrv directly, I think, but if you
are not root, it tries to run "rdsym" in a subprocess and talk to it
via pipes. If you managed to run enough copies and the subprocesses
did not go away, you'd kill yourself.

Among the commands that invoke knlist() are these:

nfsstat, pfstat, arp, ogated, rarpd, route, sendmail, srconfig,
strsetup, trpt, and xntpd.

Most of these, with the exception perhaps of sendmail under one or more
of its aliases, would not normally be invoked by a non-privileged user.

I'm guessing kloadsrv isn't running correctly on your system.

Also, make sure rdsym is protected correctly; on my V4.0G system, it's
got this setup:

doctor[1682]> ls -l /usr/lbin/rdsym
-rws--x--x 1 root bin 16384 May 14 2000 /usr/lbin/rdsym

In other words, it's setuid root (the library logic depends on it
running as root when it gets invoked).
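
The permission check described above could be scripted roughly as
follows. This is a sketch under assumptions: the function names
mode_is_setuid and check_rdsym are invented for illustration, and the
parsing assumes the usual "ls -l" layout (mode string first, owner
third), as in the listing above.

```shell
# Hypothetical helper: succeed if an "ls -l" mode string such as
# "-rws--x--x" shows setuid with owner execute, i.e. a lowercase "s"
# in the owner-execute position (4th character).
mode_is_setuid() {
    case "$1" in
        ???s*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage sketch: report whether a file (default: the rdsym path named in
# this thread) is setuid and owned by root.
check_rdsym() {
    file=${1:-/usr/lbin/rdsym}
    # Split the "ls -l" line into positional parameters.
    set -- $(ls -l "$file" 2>/dev/null)
    mode=${1:-}
    owner=${3:-}
    if mode_is_setuid "$mode" && [ "$owner" = "root" ]; then
        echo "$file: setuid root"
    else
        echo "$file: NOT setuid root (mode=$mode owner=$owner)"
    fi
}
```

Running "check_rdsym" with no argument checks /usr/lbin/rdsym; a
setgid-only mode such as "-rwxr-sr-x" is deliberately not accepted by
mode_is_setuid.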

-------------------------------------------------------
  The rdsym executable is invoked by the libc knlist routine for
  applications that are trying to get a kernel symbol but are not
  running as root. If the application is running as root, then
  knlist communicates directly with kloadsrv.

  Since you stated that netstat now hangs, you might want to look
  to see if the file permission/ownership of netstat has changed.
  It should be "-rwxr-sr-x 1 bin mem ".

  If that doesn't work, can you type "ipcs -a" and send me the output?

--------------------------------------------------------

  You have reached the limit of outstanding message queue requests
  on your system (note your QNUM length of 40).

  It is very very likely that the numerous "rdsym" processes on your
  system are directly involved here.
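
  A queue length like the QNUM of 40 noted above can be pulled out of
  ipcs output mechanically. The following is a rough sketch, not a
  tested Tru64 tool: the function name queue_lengths is invented, and
  it assumes the traditional System V ipcs layout, with a header line
  whose first field is "T" and which contains a QNUM column, and
  message queue lines whose first field is "q". Other systems print
  different layouts.

```shell
# Hypothetical helper: read "ipcs -qa"-style output on stdin and print
# "<queue id> <QNUM>" for every message queue line.
queue_lengths() {
    awk '
        # Header line: remember which column holds QNUM.
        $1 == "T" {
            for (i = 1; i <= NF; i++)
                if ($i == "QNUM") col = i
            next
        }
        # Message queue line: print its ID and current queue length.
        $1 == "q" && col > 0 { print $2, $col }
    '
}
```

  Typical use would be "ipcs -qa | queue_lengths", then looking for a
  queue whose length sits at the system limit.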

  The rdsym routine is called by knlist when the calling program
  isn't root or isn't being run from a suid root program.

  Stopping and restarting kloadsrv will get your system working again.
  You can kill it with the kill command and then restart it with:

        /sbin/kloadsrv < /dev/console > /dev/console 2>&1

  Rebooting will also work.
  However, that doesn't get to the original question of why the
  problem showed up to begin with.
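
  The kill-and-restart step above could be wrapped up roughly like
  this. It is a sketch only: the helper name pids_of is invented, it
  assumes a ps that accepts "-e -o pid,comm" and prints one
  "PID COMMAND" pair per line, and the commented commands must be run
  as root.

```shell
# Hypothetical helper: read "PID COMMAND" lines on stdin and print the
# PIDs whose command name (last path component) equals $1.
pids_of() {
    awk -v name="$1" '
        {
            n = split($2, parts, "/")
            if (parts[n] == name) print $1
        }
    '
}

# Usage sketch (as root; shown as comments, not executed here):
#   pid=$(ps -e -o pid,comm | pids_of kloadsrv)
#   [ -n "$pid" ] && kill $pid
#   /sbin/kloadsrv < /dev/console > /dev/console 2>&1
```

  The same helper can count stray helpers, for example
  "ps -e -o pid,comm | pids_of rdsym | wc -l".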

  What is the protection on the rdsym command? Ignoring the size and
  date of my rdsym below, yours should be:

        -rws--x--x 1 root bin 16384 May 14 2000 /usr/lbin/rdsym

  There may be nothing wrong with rdsym; rather, it may be failing
  because some other ill-behaved application filled up the kloadsrv
  message queue beforehand. Once that happens, an innocent command
  like rdsym will fail because it can't use the message queue.

-----------------------------------------

--- Tru64 User <tru64user@yahoo.com> wrote:
> Hi,
>
> I got two very informative replies on rdsym from HP personnel,
> Dr. Thomas Blinn and Roberto Romani. Basically both point to ::
> ---------------------------------------
> I can see that it seems to be used to find the address inside of the
> running kernel for a given kernel symbol (hence "read symbol" or to
> be cryptic "rdsym"). It may be used as a helper program for things
> that need to do this, and it may be in the critical path for things
> like logging in, I simply don't know (and it takes time to search
> the system's sources looking for references to it).
>
> "rdsym" wants to talk to the kernel loader service, aka "kloadsrv",
> which should have been started by init and usually is running as
> pid 3 or so from the /sbin/kloadsrv image.
>
> If kloadsrv dies, all kinds of strange things happen. It's not
> supposed to ever just die, and if it does, I don't think "init"
> will re-spawn it, it's supposed to be started even before init
> tries to access the system console.
>
> If "kloadsrv" isn't running, you may have to reboot to clean up
> whatever made it die. Check the console logs and the logs in
> the syslog.dated to see if you can find any error logged, but
> it's unlikely that you will find anything. You might find a
> core file in root (/).
> ------------------------------------
> Since it is a suid program, I have a scenario of how it could have
> happened. The user states that he pasted a big block of text onto the
> command prompt by mistake, so "likely" it happened that way!
> Otherwise, kloadsrv is still running (since March 21st, the last
> reboot). There is no core file in /, and netstat still hangs!
>
> Any other clues are welcome; I am planning a shutdown over the
> weekend.
>
> _Thanks
>
> Richard
>
> --- Tru64 User <tru64user@yahoo.com> wrote:
> > Hi All,
> >
> > Tru64 v4.0G Patch Kit #3
> >
> > First we started getting
> > 1. vmunix: task_create() failed for pid 5002: maxuprc (=128)
> >    exceeded for uid 6232
> >
> > This user was not logged in, and ps did not show
> > anything running under their uid.
> >
> > 2. Then I found about 318 procs called rdsym running on system,
> > owned by root
> >
> > sample:
> > root 23523 1 0.0 12:04:28 ttys9 0:00.01 /usr/lbin/rdsym
> > root 26293 1 0.0 12:04:29 ttys9 0:00.01 /usr/lbin/rdsym
> >
> > There is no manpage for rdsym, and searching the archives did not
> > give me much. I am not sure what triggered it, or what it does, and
> > need some light on this. Killing all the rdsym procs enabled the
> > user to log in. Remember, rdsym was owned by root, and there were
> > 318 procs running (counted using ps -ef|grep rdsym|wc -l).
> > How is that related to the user?? That is the mystery I am trying
> > to solve.
> >
> > 3. Now, netstat hangs!! Running trace on it spits out
> >
> > #trace netstat
> > netstat
> > Tracing process /proc/10466
> > PIOCPSINFO: Operation would block
> >
> > Does anybody have a quick clue as to what is going on here?
> > They might all be related, or not.....going simply by
> > uid/proc id, there is no link!
> >
> > Stories of any similar experiences are welcome.
> >
> > _Thanks
> >
> > Richard
> >




This archive was generated by hypermail 2.1.7 : Sat Apr 12 2008 - 10:49:19 EDT