Re: Permissions Problem

From: Green, Simon (SGreen@krafteurope.com)
Date: Mon Jul 29 2002 - 06:14:24 EDT


I can sympathise with your situation: I got thrown into the SP2 deep end
myself when the previous administrator left for pastures new. It's not so
bad, though: most of the time it's just the same as any other AIX box. For
the rest, the PSSP manuals are reasonably good. Get yourself on some
courses as well. IBM have restructured them since I did them, but I think
the important ones are one (Installation) and three (Problem
Determination). The performance course (four) isn't all that interesting if
you have a decent knowledge of general AIX performance issues. Course two
(System Administration) isn't necessary if you're used to working on the
SP2. Course five (Migration) might be reassuring when you come to do your
first major operating system or PSSP upgrade, but I don't think it's
actually needed: the Installation and Migration Guide is really very good
and has detailed, step-by-step instructions.

I haven't come across a problem quite like yours, so I think you'll have to
settle for old-fashioned detective work!

The fact that a restore from a mksysb fixes the problem, but only
temporarily, implies that whatever is doing the damage lives outside the
rootvg: either an application script in an external VG (or in a filesystem
excluded from the mksysb), or something coming from another node or the
CWS.

It's unlikely to be the problem, but have a quick look at the File
Collections you have defined. (Look on it as a learning experience. :-))
Start with "/var/sysman/supper where" on NODE5. This will show the file
collections it knows about (if any!) and where the master copy is, which is
probably your CWS.
On that node, have a look in /var/sysman/sup/<FileCollection>/list.
That has the definition of the collection: what file(s) are involved. If
it's a directory it implies everything underneath it. You might also look
for a "scan" file. Logs are /var/adm/SPlogs/filec/[sup|rsup] on the node.
(These two are symbolic links to the most recent log files; there will be
lots of log files, from each time supper was run.)
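
The check above can be sketched as a small shell function. This is only a rough sketch: the function name show_collections is made up for illustration, and it assumes nothing more than the directory layout described above (one directory per collection, each holding a "list" file and possibly a "scan" file).

```shell
#!/bin/sh
# Sketch: print the definition ("list" file) of every file collection
# found under a sup directory. The default path is the one mentioned above;
# show_collections is a hypothetical helper name, not a PSSP command.

show_collections() {
    supdir="${1:-/var/sysman/sup}"
    for list in "$supdir"/*/list; do
        [ -f "$list" ] || continue      # glob matched nothing, or not a file
        coll=$(basename "$(dirname "$list")")
        echo "== collection: $coll"
        cat "$list"
        # Mention a "scan" file if one exists, since it is worth a look too.
        if [ -f "$supdir/$coll/scan" ]; then
            echo "   (scan file present: $supdir/$coll/scan)"
        fi
    done
    return 0
}

# Example: show_collections            # uses /var/sysman/sup
#          show_collections /some/dir  # any directory with the same layout
```

Anything in the output that covers /usr, /usr/bin or /etc would be worth a closer look in the sup and rsup logs.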

Like I said: it's unlikely to be the problem but it'll only take a few
minutes to check.

Moving on, I can only think of two things.
1. Go through every script in /etc/inittab and the root crontab to see if
they're doing anything nasty. Also check root crontab on CWS.
2. Comment out or remove all of the application startup commands from
inittab. Reboot and after making sure that everything is OK, run each
command individually until you find the one that's doing the damage.
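
As a starting point for step 1, something like the following could list the programs started from inittab and flag any scripts that touch permissions. It is a rough sketch, not a complete audit: scan_inittab is a made-up name, it only looks at the first word of each command, and it only catches literal chmod, chown or umask calls in the file itself (not in anything that script calls in turn).

```shell
#!/bin/sh
# Sketch: flag inittab entries whose program is a file containing
# chmod, chown or umask. scan_inittab is a hypothetical helper name.

scan_inittab() {
    inittab="${1:-/etc/inittab}"
    while IFS= read -r line; do
        # Records are id:runlevel:action:command. A leading ":" marks a
        # commented-out entry in the AIX inittab; skip those and blank lines.
        case "$line" in
            :*|"") continue ;;
        esac
        id=$(echo "$line" | cut -d: -f1)
        cmd=$(echo "$line" | cut -d: -f4-)
        prog=$(echo "$cmd" | awk '{print $1}')   # first word of the command
        [ -f "$prog" ] || continue
        # Binaries may produce noise here; scripts are what we care about.
        hits=$(grep -n -E 'chmod|chown|umask' "$prog" 2>/dev/null)
        if [ -n "$hits" ]; then
            echo "$id: $prog"
            echo "$hits" | sed 's/^/    /'
        fi
    done < "$inittab"
    return 0
}

# Example: scan_inittab               # scans /etc/inittab
#          scan_inittab /tmp/inittab  # scans a copy
```

The same grep can be pointed by hand at whatever the root crontab runs, on both the node and the CWS.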

You might also consider activating the audit system. I'm not certain it'll
help in your circumstances, but it should give you an accurate time for when
the permissions get changed.
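
Short of turning on auditing, a much cruder check can be run by hand after each inittab command in step 2, and it prints a timestamp when it finds something wrong. This is not the audit subsystem, just a sketch: check_perms is a made-up name, and the expected mode is the 755 that IBM set on the directories mentioned in the original post.

```shell
#!/bin/sh
# Sketch: report directories whose mode is not rwxr-xr-x (755).
# check_perms is a hypothetical helper name.

check_perms() {
    expected="drwxr-xr-x"
    status=0
    for dir in "$@"; do
        # The first 10 characters of "ls -ld" are the file type and mode bits.
        mode=$(ls -ld "$dir" 2>/dev/null | cut -c1-10)
        if [ "$mode" != "$expected" ]; then
            echo "$(date) $dir is ${mode:-missing}, expected $expected"
            status=1
        fi
    done
    return $status
}

# Example: run after starting each application command by hand.
# check_perms /usr /usr/bin /etc && echo "permissions still OK"
```

Run from cron every minute or so and appended to a log, the same function would give an approximate time for the change, though only to the resolution of the cron interval.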

I hope that's of help,
Simon Green
Philip Morris ITSC Europe

AIX-L Archive at http://marc.theaimsgroup.com/?l=aix-l&r=1&w=2
AIX FAQ at http://www.faqs.org/faqs/aix-faq/

N.B. Unsolicited email from vendors will seldom be appreciated.

> -----Original Message-----
> From: Theresa Sarver [mailto:IFMC.tsarver@SDPS.ORG]
> Sent: 26 July 2002 21:39
> To: aix-l@Princeton.EDU
> Subject: Permissions Problem
>
>
> Hello;
>
> Environment:
> SP Complex (9076) 1 frame, 7 nodes
> AIX 4.3.3 ML 8
> PSSP 3.2
> PTFSET 8
>
> I've been on leave for the past few weeks and have just returned to
> find that our SP Admin has left the firm and I'm now in
> charge of the SP
> complex. More importantly, I have "very limited" experience with the
> SP, so I'll be relying on all you SP experts quite a lot until
> I'm up to speed.
>
> The issue I'm currently having is NOT SP related (well, I
> don't think anyway). On July 11, around 11AM the (now gone)
> SP Admin updated the /etc/inetd.conf file and commented out
> the following on NODE5:
> exec, ntalk, rusersd, sprayd, pcnfsd, time, dtspc, cmsd, ssalld
>
> About an hour later (so I'm told anyway) users started
> calling and saying that they couldn't get into NODE5. All
> users were getting the following error message:
> 3004-009: Failed Running Login Shell
>
> The SP Admin was able to log in as user root from the CWS. At
> which point she called IBM for assistance. IBM immediately
> noticed that the file permissions on /usr, /usr/bin, /etc
> were all 700 - they changed them to 755...still no one could
> log in. So the SP Admin restored from a mksysb image and all
> appeared to be fine.
>
> After ensuring people could log in, the Admin rebooted one
> final time to ensure everything was 'okay'. However, when the
> server came back up all the permissions were screwed up again
> and once again no one (other than root) could log in. She
> restored from mksysb again, and then left the firm shortly
> thereafter - without resolving the issue. This node has not
> been rebooted since, and I'm at a loss as to where to look to
> try to fix this issue.
>
> IBM is saying that an application, with root privileges,
> which is starting at boot-time is changing these permissions.
> The problem is that this is an SP Complex and almost
> everything in NODE5's /etc/inittab is starting on several
> other nodes as well - and they aren't experiencing problems?
> The only difference that I can see between the nodes is that
> NODE5 has the following line, while the other nodes do not:
> orapw:2:wait:/etc/loadext -l /etc/pw-syscall 2>&1
> NODE5 currently is not running Oracle, if that's relevant. -
> Anyone know what this line does? Is it safe to comment it out?
>
> Otherwise, all the remaining nodes are also loading the
> following software applications.
>
> adsm:2:respawn:/usr/bin/dsmc sched > /dev/null 2>&1 # TSM scheduler
> connect:2:respawn:/usr/local/connect/start.cdpmgr /tmp/connect.log 2>&1
> orapw:2:wait:/etc/loadext -l /etc/pw-syscall 2>&1
> orakstat:2:wait:/etc/loadext -l /etc/ora_kstat 2>&1
> :oracle:2:wait:/c2f1n5in/u01/oracle/product/8.1.7/bin/start_oracle > /tmp/oralog
> express:2:wait:/home/oracle/expstart > /tmp/wwwexp.log 2>&1
> apache:2:wait:/scripts/apachestart.sh > /tmp/apache.log 2>&1
> imnss:2:once:/usr/IMNSearch/bin/imnss -start imnhelp >/dev/console 2>&1
> imqss:2:once:/usr/IMNSearch/bin/imq_start >/dev/console 2>&1
>
> Has anyone seen this before? Or does anyone have any ideas
> on where I can start? If this is an application issue - why
> would this start almost entirely "out of the blue"?
>
> Thanks in advance for the help - and I apologize for such a long post.


