Server is hanging - Apparent INITTAB problem

From: Theresa Sarver (IFMC.tsarver@sdps.org)
Date: Mon Sep 09 2002 - 16:01:17 EDT


Hi all;

I've got an S80 running AIX 4.3.3 ML 10 that has decided it doesn't want to finish booting.

A little background: due to the specifics of this contract, I am going through a pretty rigorous audit to get this server "certified". In doing so I've had to make TONS of changes from last Wednesday afternoon through Friday (when the auditors arrived). I've had to disable services, modify group/owner/permissions on system files, enable system auditing, enable syslogging, and some other stuff that I'm sure I've forgotten. All the changes I've made are documented, and I do have mksysbs up the wazoo that I can restore from if it comes down to that. But I hope it doesn't.

Anyway, after varying on all VGs and mounting all filesystems, the last thing I see is "Multi User Initialization Complete" (the last line of the /etc/rc script), and then it just sits there. Assuming it is walking down the inittab (and as I have no /etc/firstboot file), it appears to be hanging on srcmstr, though I'm not sure why it would hang there. I can ping the box, but I can't telnet or ftp in because the tcpip daemons aren't loaded.
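
To narrow down where init could be stalling, it may help to list the runlevel 2 entries in file order together with their actions, since init waits for each "wait" entry to exit before moving on to the next one. A minimal sketch in portable shell and awk, assuming the usual id:runlevel:action:command layout and that lines starting with ":" are commented out; the function name list_runlevel_entries and its arguments are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print "id action" for each active inittab entry
# that applies to the given runlevel, in file order.
# Usage: list_runlevel_entries FILE RUNLEVEL
list_runlevel_entries() {
    awk -F: -v rl="$2" '
        /^:/    { next }    # entry commented out with a leading colon
        NF < 4  { next }    # blank, wrapped, or otherwise malformed line
        # An empty runlevel field is treated here as matching every runlevel.
        $2 == "" || index($2, rl) > 0 { print $1, $3 }
    ' "$1"
}
```

Run against a copy of the inittab with runlevel 2, this would show which "wait" entries sit between rc and rctcpip, which is the stretch where the boot appears to stop.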

This first happened Friday night. I finally had to reboot the box into maint mode and remove everything but the "brc", "init", and "cons" lines from the inittab, and then the server booted just fine. I then manually executed everything else in the inittab and encountered NO problems. NOTHING hung...NOTHING errored out. Also, no filesystems were full and there were no errors to speak of in the errpt or the bootlog. I rebooted after I verified everything was up and running, and the server REBOOTED JUST FINE! ?????
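
The maintenance-mode step above amounts to cutting the inittab down to three entries. A rough sketch of that filter, written so that it reads a saved copy and prints the reduced file instead of editing /etc/inittab in place; the function name minimal_inittab is an assumption for illustration, not what was actually typed on the box:

```shell
#!/bin/sh
# Hypothetical helper: print only the init, brc and cons entries of an
# inittab-format file and drop everything else.
# Usage: minimal_inittab FILE
minimal_inittab() {
    awk -F: '$1 == "init" || $1 == "brc" || $1 == "cons"' "$1"
}
```

Printing to stdout keeps the full inittab untouched, so the remaining entries can still be read off it and started by hand one at a time.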

On Saturday I had to disable snmpd as well as the associated daemons dpid2 and muxatmd (we're not using ATM or SNMP). I also had to comment the 2 ATM lines out of the inittab. Oh, and I had to move nfs-mountd onto a reserved port. The server is scheduled to reboot every Monday at 5AM, and this morning I came in and it was hung at the same spot. So I did a repeat of Friday night and it worked just fine, though I haven't rebooted a second time to see if it would come back up.

If anyone has any insight into what might be going on I'd sure appreciate it.
Thanks;
Theresa

INITTAB:

init:2:initdefault:
brc::sysinit:/sbin/rc.boot 3 >/dev/console 2>&1 # Phase 3 of system boot
powerfail::powerfail:/etc/rc.powerfail 2>&1 | alog -tboot > /dev/console # Power Failure Detection
:mkatmpvc:2:once:/usr/sbin/mkatmpvc >/dev/console 2>&1
:atmsvcd:2:once:/usr/sbin/atmsvcd >/dev/console 2>&1
load64bit:2:wait:/etc/methods/cfg64 >/dev/console 2>&1 # Enable 64-bit execs
rc:2:wait:/etc/rc 2>&1 | alog -tboot > /dev/console # Multi-User checks
fbcheck:2:wait:/usr/sbin/fbcheck 2>&1 | alog -tboot > /dev/console # run /etc/firstboot
srcmstr:2:respawn:/usr/sbin/srcmstr # System Resource Controller
rctcpip:2:wait:/etc/rc.tcpip > /dev/console 2>&1 # Start TCP/IP daemons
adsmsmext:2:wait:/etc/rc.adsmhsm > /dev/console 2>&1 # TSM SpaceMan
rcnfs:2:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons
cron:2:respawn:/usr/sbin/cron
piobe:2:wait:/usr/lib/lpd/pio/etc/pioinit >/dev/null 2>&1 # pb cleanup
qdaemon:2:wait:/usr/bin/startsrc -sqdaemon
writesrv:2:wait:/usr/bin/startsrc -swritesrv
uprintfd:2:respawn:/usr/sbin/uprintfd
diagd:2:once:/usr/lpp/diagnostics/bin/diagd >/dev/console 2>&1
pmd:2:wait:/usr/bin/pmd > /dev/console 2>&1 # Start PM daemon
logsymp:2:once:/usr/lib/ras/logsymptom # for system dumps
httpdlite:2:once:/usr/IMNSearch/httpdlite/httpdlite -r /etc/IMNSearch/httpdlite/httpdlite.conf & >/dev/console 2>&1
imnss:2:once:/usr/IMNSearch/bin/imnss -start imnhelp >/dev/console 2>&1
imqss:2:once:/usr/IMNSearch/bin/imq_start >/dev/console 2>&1
sybase:2:wait:su - sybase -c /stars/sybase/11.9.2/install/sybase start 2>&1
dt:2:wait:/etc/rc.dt
autoacs:2:once:/usr/tivoli/tsm/devices/bin/rc.acs_ssi quiet >/dev/console 2>&1 # Start the ssi agent
cons:0123456789:respawn:/usr/sbin/getty /dev/console
autosrvr:2:respawn:/usr/tivoli/tsm/server/bin/rc.adsmserv >/dev/console 2>&1
adsm:2:respawn:/usr/bin/dsmc sched > /dev/null 2>&1 # TSM scheduler
tty0:2:off:/usr/sbin/getty /dev/tty0
tty1:2:off:/usr/sbin/getty /dev/tty1


