UDBL Syndrome and tmp file system full

From: Kostas Magkos (kmag@lab.epmhs.gr)
Date: Wed Apr 05 2006 - 08:19:50 EDT


Hi guys,

I have an Ultra 1 with 512MB RAM acting as a TACACS+ server. Here is the
uname output:

# uname -a
SunOS ul 5.8 Generic_117350-02 sun4u sparc SUNW,Ultra-1

For some time now the machine has occasionally complained about /tmp
being full, and after one or two days it refuses to fork any new process:

Jan 20 11:01:55 xxxx tmpfs: [ID 518458 kern.warning] WARNING: /tmp:
File system full, swap space limit exceeded
Jan 20 11:07:51 xxxx last message repeated 2515 times
..
..
Jan 22 05:02:17 xxxx sshd[277]: [ID 800047 auth.error] error: fork: Not
enough space
Jan 23 05:05:56 xxxx genunix: [ID 470503 kern.warning] WARNING: Sorry,
no swap space to grow stack for pid 26285 (java)
Jan 23 05:05:56 xxxx last message repeated 4 times

The only remedy at this point is a system reboot.

During the most recent occurrence of this situation I noticed some
memory-related log entries, which are followed (not quite immediately,
as there is a two-day gap) by the usual tmp-file-system-full messages:

Mar 7 00:17:00 xxxx SUNW,UltraSPARC: [ID 275936 kern.info] [AFT0]
Corrected Memory Error detected by CPU0, errID 0x0006a42f.264bf2b0
Mar 7 00:17:00 xxxx AFSR 0x00000000.00100000<CE> AFAR
0x00000000.08f97498
Mar 7 00:17:00 xxxx AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0x10026050
Mar 7 00:17:00 xxxx UDBL Syndrome 0x1c Memory Module U0701
Mar 7 00:17:00 xxxx SUNW,UltraSPARC: [ID 500550 kern.info] [AFT0] errID
0x0006a42f.264bf2b0 Corrected Memory Error on U0701 is Persistent
Mar 7 00:17:00 xxxx SUNW,UltraSPARC: [ID 128256 kern.info] [AFT0] errID
0x0006a42f.264bf2b0 ECC Data Bit 40 was in error and corrected
Mar 9 05:20:28 xxxx tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File
system full, swap space limit exceeded
Mar 9 05:21:53 xxxx last message repeated 42 times
Mar 9 05:21:55 xxxx tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File
system full, swap space limit exceeded
Mar 9 05:28:32 xxxx last message repeated 198 times
Mar 9 05:28:34 xxxx tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File
system full, swap space limit exceeded
Mar 9 05:35:12 xxxx last message repeated 198 times
Mar 9 05:35:14 xxxx tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File
system full, swap space limit exceeded
Mar 9 05:41:52 xxxx last message repeated 198 times
Mar 9 05:41:54 xxxx tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File
system full, swap space limit exceeded
Mar 9 05:48:32 xxxx last message repeated 198 times
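
To judge whether the corrected memory errors are a one-off or keep
recurring on the same DIMM, the syslog can be tallied per memory module.
The following is only a minimal sketch: the function name ce_by_module is
made up for illustration, and it assumes every syslog entry sits on a
single line in the file (the excerpts above are wrapped by the mail
client). On Solaris the stock awk is quite old, so nawk or
/usr/xpg4/bin/awk may be the safer choice.

```shell
#!/bin/sh
# Hypothetical helper (not a Solaris tool): count "UDBL Syndrome" reports
# per memory module in a syslog-style file.
# Assumption: one log entry per line, module name right after "Module".
ce_by_module() {
    # $1: path to a messages file, e.g. /var/adm/messages
    awk '
        /UDBL Syndrome/ && /Memory Module/ {
            # find the word "Module" and count the field that follows it
            for (i = 1; i < NF; i++)
                if ($i == "Module") count[$(i + 1)]++
        }
        END { for (m in count) print m, count[m] }
    ' "$1" | sort
}
```

Calling "ce_by_module /var/adm/messages" prints one line per module with
the number of reports seen for it; a count that keeps growing for a
single module such as U0701 would point at that DIMM.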

What is a "UDBL Syndrome 0x1c"? I did google for it but only "UDBL
Syndrome 0x3" came up.
Are the two events related (memory problem and tmp being full)?
If not, how can I trace the tmp issue?
What is the recommended course of action for the memory problem?
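
On the question of tracing the tmp issue, one starting point is to see
which entries under /tmp are taking the space at the moment the warnings
appear. This is a rough sketch only: biggest_in is a hypothetical helper
name, it looks only at non-hidden entries directly under the directory,
and it cannot see files that were deleted while a process still holds
them open.

```shell
#!/bin/sh
# Hypothetical helper: list the largest entries directly under a directory.
# Assumptions: defaults of /tmp and 10 entries; dot-files are not matched
# by the * glob and are therefore skipped.
biggest_in() {
    dir=${1:-/tmp}
    n=${2:-10}
    # du -sk prints the size in kilobytes for each entry;
    # sort numerically with the largest first and keep the top n
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}
```

For example, "biggest_in /tmp 5" shows the five largest entries. Since
/tmp is tmpfs and shares space with swap, it may help to compare the
result with "df -k /tmp" and "swap -s" taken at the same time: if /tmp
itself is nearly empty, the shortage is more likely caused by process
memory than by files.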

Thanks in advance. Any hints will be greatly appreciated.

p.s. Sorry for overloading this post :-)

Kostas Magkos
Network Administrator
Internet Systematics Lab
NCSR Demokritos
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


