Re: help! picld error - is it a hardware issue?

From: ktn (ktn@dodo.com.au)
Date: Fri Jun 11 2004 - 03:17:59 EDT


Dear managers,

An update: it seems that John Benjamins had a similar problem on Solaris 8.
I don't see any equivalent patches for V880s running Solaris 9, though,
other than 113447-17 and 113573, which Sun pointed me to. With these two
patches applied I can see my memory information now, but prtdiag -v still
shows the following unusual environmental status (similar to that mentioned
in http://sunportal.sunmanagers.org/pipermail/summaries/2003-June/004000.html),
and picld still logs the same console errors as before.

Fan Bank :
----------

Bank                      Speed     Status       Fan State
                         ( RPMS )
----                     --------  ---------     ---------
CPU0_PRIM_FAN failed in picl_get_propval_by_name for fan speed
General system failure
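
For anyone who wants to spot rows like this without reading the whole
report, here is a minimal sketch that scans saved prtdiag -v text for fan
rows that look wrong. It only knows the two row shapes that appear in this
thread, and the 20000 RPM ceiling is my own assumption, not a Sun figure.

```python
import re

# The two row shapes seen in the "Fan Bank" section in this thread.
OK_ROW = re.compile(r"^(\w+_FAN)\s+(\d+)\s+\[(ENABLED|DISABLED)\]\s+(\S+)")
FAILED_ROW = re.compile(r"^(\w+_FAN)\s*failed in (\w+)")

# Assumption: no real fan in this chassis spins faster than this.
MAX_PLAUSIBLE_RPM = 20000


def suspect_fans(prtdiag_text):
    """Return (bank, reason) pairs for fan rows that look wrong."""
    problems = []
    for raw in prtdiag_text.splitlines():
        # Tolerate mail-quoted output by dropping leading "> " markers.
        line = raw.lstrip("> ").strip()
        failed = FAILED_ROW.match(line)
        if failed:
            problems.append(
                (failed.group(1), "picl lookup failed: " + failed.group(2))
            )
            continue
        row = OK_ROW.match(line)
        if row and int(row.group(2)) > MAX_PLAUSIBLE_RPM:
            problems.append((row.group(1), "implausible speed " + row.group(2)))
    return problems
```

Fed the fan section from the original question below, this would flag the
two CPU primary fans for their speed readings and IO_BRIDGE_PRIM_FAN for
the failed lookup.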

I guess I'll wait for updated Solaris 9 patches from Sun now. Thanks to
Joe Fletcher for pointing me towards the picld patches. Very flaky indeed.
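
While waiting, it may be useful to see how often picld complains about each
fan. The following is a rough sketch that tallies the "ERROR running ..."
lines by policy and device. It assumes each syslog entry sits on a single
line, as it does in /var/adm/messages itself (not wrapped as in the mail
below); the function name is my own.

```python
import re
from collections import Counter

# Matches entries such as:
#   ... picld[93]: [ID 478985 daemon.error] ERROR running <policy> on <device> (...)
PICLD_POLICY_ERROR = re.compile(
    r"picld\[\d+\]: \[ID \d+ daemon\.error\] ERROR running (\S+) on (\S+)"
)


def count_policy_errors(lines):
    """Count picld policy errors per (policy, device) pair."""
    counts = Counter()
    for line in lines:
        match = PICLD_POLICY_ERROR.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Passing an open file handle for /var/adm/messages as `lines` and printing
`count_policy_errors(...).most_common()` should list the noisiest fans
first.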

Oh, the README for 113573-05 (the latest, I assume) recommends installing
patch 113574-07, but the latter patch has been withdrawn. Oh well.
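
To double-check which revisions of these patches are actually on a box, the
output of showrev -p can be saved to a file and parsed. This is only a
sketch: it assumes the usual "Patch: NNNNNN-RR Obsoletes: ..." line format,
and the function names are made up for illustration.

```python
import re

# One installed patch per line, e.g. "Patch: 113573-05 Obsoletes: ...".
PATCH_LINE = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b", re.MULTILINE)


def installed_revisions(showrev_output):
    """Map each patch base ID to the highest revision listed."""
    highest = {}
    for base, rev in PATCH_LINE.findall(showrev_output):
        highest[base] = max(highest.get(base, -1), int(rev))
    return highest


def is_installed(showrev_output, patch):
    """True if `patch` is present at the given revision or later.

    `patch` may be "113573-05" (minimum revision) or "113573" (any revision).
    """
    base, _, rev = patch.partition("-")
    have = installed_revisions(showrev_output).get(base)
    if have is None:
        return False
    return have >= int(rev) if rev else True
```

For example, `is_installed(text, "113447-17")` and
`is_installed(text, "113573")` cover the two patches mentioned above, where
`text` is the saved showrev -p output.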

Original Q:

>
> Dear managers, need your prompt help!
>
> I've been getting these errors in /var/adm/messages constantly since
> rebooting a machine (due to a power failure, by the way), a Sun Fire V880
> running Solaris 9 Generic_112233-12:
>
> ....
> Jun 11 03:12:12 serv picld[93]: [ID 710302 daemon.error] I/O error
> Jun 11 03:12:13 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (2499992)
> Jun 11 03:12:13 serv picld[93]: [ID 710302 daemon.error] I/O error
> Jun 11 03:12:15 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on IO_BRIDGE_PRIM_FAN (2500216)
> Jun 11 03:12:15 serv picld[93]: [ID 710302 daemon.error] I/O error
> Jun 11 03:12:48 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU0_PRIM_FAN (2499960)
> Jun 11 03:12:48 serv picld[93]: [ID 710302 daemon.error] I/O error
> Jun 11 03:12:49 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (2499992)
> Jun 11 03:12:49 serv picld[93]: [ID 710302 daemon.error] I/O error
> Jun 11 03:12:51 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on IO_BRIDGE_PRIM_FAN (2500216)
> ....
>
>
> In the logs from the reboot, "Device PS2 unplugged" is the last error
> picld gives. Could this be a cause of the problem?
> ....
> May 30 20:37:57 serv eri: [ID 517527 kern.info] SUNW,eri0 : 100 Mbps full duplex link up
> May 30 20:38:00 serv last message repeated 1 time
> May 30 20:38:02 serv pseudo: [ID 129642 kern.info] pseudo-device: devinfo0
> May 30 20:38:02 serv genunix: [ID 936769 kern.info] devinfo0 is /pseudo/devinfo@0
> May 30 20:42:23 serv picld[93]: [ID 293134 daemon.error] Device PS2 unplugged
> May 30 20:42:50 serv fsck[164]: [ID 293258 user.error] libsldap: Status: 2 Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').
> May 30 20:42:50 serv last message repeated 3 times
> May 30 20:42:50 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU0_PRIM_FAN (2499960)
> May 30 20:42:50 serv picld[93]: [ID 875627 daemon.error] No such file or directory
> May 30 20:42:51 serv fsck[164]: [ID 293258 user.error] libsldap: Status: 2 Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').
> May 30 20:42:51 serv last message repeated 5 times
> May 30 20:42:52 serv picld[93]: [ID 478985 daemon.error] ERROR running psvc_fan_fault_check_policy_0 on CPU1_PRIM_FAN (2499992)
> May 30 20:42:52 serv picld[93]: [ID 875627 daemon.error] No such file or directory
> May 30 20:42:53 serv fsck[164]: [ID 293258 user.error] libsldap: Status: 2 Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').
> May 30 20:42:53 serv last message repeated 2 times
> ....
>
> Running prtdiag shows the following, and the "no memory" part is giving
> me a heart attack. Could this just be (from the logs above) an incomplete
> boot? I am thinking of rebooting the machine to see if it stays the same,
> or do you think something is failing for sure? Many thanks in advance for
> reading. Will summarise.
>
> >prtdiag -v
> System Configuration: Sun Microsystems sun4u Sun Fire 880
> System clock frequency: 150 MHz
> Memory size: 8192 Megabytes
>
> ========================= CPUs ===============================================
>
>            Run   E$    CPU      CPU
> Brd  CPU   MHz   MB    Impl.    Mask
> ---  ---   ----  ----  -------  ----
>  A    0    750   8.0   US-III   5.4
>  B    1    750   8.0   US-III   5.4
>  A    2    750   8.0   US-III   5.4
>  B    3    750   8.0   US-III   5.4
>
> ========================= Memory Configuration ===============================
>
>            Logical  Logical  Logical
>       MC   Bank     Bank     Bank         DIMM    Interleave  Interleaved
> Brd   ID   num      size     Status       Size    Factor      with
> ----  ---  ----     ------   -----------  ------  ----------  -----------
> Cannot find any memory bank/segment info.
>
> ========================= IO Cards =========================
>
>
>                          Bus  Max
>      IO   Port Bus       Freq Bus  Dev,
> Brd  Type ID   Side Slot MHz  Freq Func State Name                             Model
> ---- ---- ---- ---- ---- ---- ---- ---- ----- -------------------------------- ----------------------
> I/O  PCI   9    A    8    33   66  1,0  ok    SUNW,m64B                        SUNW,370-4362
>
> No failures found in System
> ===========================
>
>
> ========================= Environmental Status =========================
>
> System Temperatures (Celsius):
> -------------------------------
> Device          Temperature     Status
> ---------------------------------------
> CPU0             68             OK
> CPU1             73             OK
> CPU2             59             OK
> CPU3             61             OK
> MB               31             OK
> IOB              26             OK
> DBP0             28             OK
>
> =================================
>
> Front Status Panel:
> -------------------
> Keyswitch position: NORMAL
>
> System LED Status:
>     GEN FAULT                REMOVE
>      [OFF]                    [OFF]
>
>     DISK FAULT               POWER FAULT
>      [OFF]                    [OFF]
>
>     LEFT THERMAL FAULT       RIGHT THERMAL FAULT
>      [OFF]                    [OFF]
>
>     LEFT DOOR                RIGHT DOOR
>      [OFF]                    [OFF]
>
> =================================
>
> Disk Status:
> Presence Fault LED Remove LED
> DISK 0: [PRESENT] [OFF] [OFF]
> DISK 1: [PRESENT] [OFF] [OFF]
> DISK 2: [PRESENT] [OFF] [OFF]
> DISK 3: [PRESENT] [OFF] [OFF]
> DISK 4: [PRESENT] [OFF] [OFF]
> DISK 5: [PRESENT] [OFF] [OFF]
> DISK 6: [ EMPTY]
> DISK 7: [ EMPTY]
> DISK 8: [ EMPTY]
> DISK 9: [ EMPTY]
> DISK 10: [ EMPTY]
> DISK 11: [ EMPTY]
>
> =================================
>
> Fan Bank :
> ----------
>
> Bank                      Speed     Status       Fan State
>                          ( RPMS )
> ----                     --------  ---------     ---------
> CPU0_PRIM_FAN          1298089537  [ENABLED]     OK
> CPU1_PRIM_FAN          1298089537  [ENABLED]     OK
> CPU0_SEC_FAN                    0  [DISABLED]    OK
> CPU1_SEC_FAN                    0  [DISABLED]    OK
> IO0_PRIM_FAN                 4000  [ENABLED]     OK
> IO1_PRIM_FAN                 3947  [ENABLED]     OK
> IO0_SEC_FAN                     0  [DISABLED]    OK
> IO1_SEC_FAN                     0  [DISABLED]    OK
> IO_BRIDGE_PRIM_FANfailed in picl_get_propval_by_name for fan speed
> General system failure
> Power Supplies:
> ---------------
>
> Supply  Status        Fan Fail  Temp Fail  CS Fail  3.3V  5V  12V  48V
> ------  ------------  --------  ---------  -------  ----  --  ---  ---
> PS0     GOOD                                          9    4    3    5
> PS1     GOOD                                          9    3    3    5
> PS2     UNPLUGGED
>
>
> ========================= HW Revisions =======================================
>
> System PROM revisions:
> ----------------------
> OBP 4.5.6 2002/01/04 12:30
>
> IO ASIC revisions:
> ------------------
>                        Port
> Brd   Model            ID    Status  Version
> ----  ---------------  ----  ------  -------
> IB-1  unknown           8    ok      4
> IB-1  unknown           9    ok      4
>
>
> ________________________________________________
>
> Message sent using Dodo
> Internet Webmail Server
> _______________________________________________
> sunmanagers mailing list
> sunmanagers@sunmanagers.org
> http://www.sunmanagers.org/mailman/listinfo/sunmanagers

This archive was generated by hypermail 2.1.7 : Wed Apr 09 2008 - 23:28:50 EDT