Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1597715.1
Update Date:2017-10-05

Solution Type  Predictive Self-Healing Sure

Solution  1597715.1 :   How to investigate the Auto Service Request "ASR: Probable power supply failure (PS ) or AC input not present" for Sun Fire T1000/T2000 / Sun SPARC Enterprise T1000/T2000  


Related Items
  • Sun Fire T1000 Server
  • Sun SPARC Enterprise T1000 Server
  • Sun SPARC Enterprise T2000 Server
  • SPARC T5-2
  • Sun Fire T2000 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: Tx000




In this Document
Purpose
Scope
Details
 Description of the event
 How to verify if there is a genuine Power Supply issue
 Physical checks
 Software checks
References


Applies to:

Sun SPARC Enterprise T2000 Server - Version All Versions and later
Sun SPARC Enterprise T1000 Server - Version All Versions and later
Sun Fire T1000 Server - Version All Versions and later
Sun Fire T2000 Server - Version All Versions and later
SPARC T5-2 - Version All Versions and later
Information in this document applies to any platform.

Purpose

This article describes the checks a System Administrator should perform to determine whether a reported loss of input power requires action.
SunMC Event 1.3.6.1.4.1.42.2.12.2.2.1.1.10.2.2.1.1.6

Scope

This document is intended for System Administrators and support personnel.

Details

Auto Service Request (ASR) provides automatic processing of failure detection telemetry and SR creation for certain Oracle systems.  See http://www.oracle.com/asr for more information on ASR.

Description of the event

Power supply events can be either transient or persistent. They can be triggered by external changes and actions, most notably the removal of AC input from a power supply.

Additional checks are needed to determine the cause of an event received by ASR.

How to verify if there is a genuine Power Supply issue

Physical checks

Please check the power feeds for the system's PSUs:

  • Are the power cords securely connected to all the system's PSUs?
  • Are the power feeds' breakers switched on?
  • If the state of the power feeds cannot be confirmed, you may need to consult your local electrician.

Software checks

Please run /usr/sbin/prtdiag -v on the Solaris command line.

Please compare the relevant parts of the output to the example of a failed PS0 provided below.

If all Fault LEDs are OFF, and all PSUs show NO_FAULT, then the fault is most likely transient and can be ignored for now.

Otherwise, please collect and upload Explorer data as outlined in "What data is needed in order to troubleshoot my software or hardware problem?" (Doc ID 1019144.1).
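The decision described above can also be applied to saved prtdiag -v output, for example the copy inside an Explorer. The following is a minimal sketch, not an Oracle tool. The function names, the set of LED names treated as fault indicators, and the set of PSU states treated as healthy are assumptions drawn from the example output in this document; adjust them to what your system actually prints.

```python
import re

# Assumption: LED names treated as fault indicators (taken from the example output).
FAULT_LEDS = {"SERVICE", "REAR_FAULT", "TEMP_FAULT", "TOP_FAN_FAULT"}
# Assumption: PSU states treated as healthy ("enabled" appears in the example
# output, "NO_FAULT" in the article text).
HEALTHY = {"enabled", "no_fault"}


def prtdiag_psu_summary(text):
    """Return (lit_fault_leds, psu_states) parsed from `prtdiag -v` text."""
    lit_fault_leds = []
    psu_states = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # LED and FRU rows in the example have exactly three columns
        location, name, state = fields
        if name in FAULT_LEDS:
            # LED row, e.g. "<id>:CH/SYS  SERVICE  steady"
            if state.lower() != "off":
                lit_fault_leds.append((location, name))
        elif name == "PS" and re.search(r"/PS\d+$", location):
            # FRU row, e.g. "<id>:CH/PS0  PS  disabled"
            psu_states[location.rsplit("/", 1)[1]] = state
    return lit_fault_leds, psu_states


def looks_transient(text):
    """True only if no fault LED is lit and every PSU found is in a healthy state."""
    leds, psus = prtdiag_psu_summary(text)
    return not leds and bool(psus) and all(s.lower() in HEALTHY for s in psus.values())


# Shortened sample in the layout of the failed-PS0 example in this article.
SAMPLE = """\
5678GHIJKL:CH/FT0/FM0              SERVICE            off
5678GHIJKL:CH/SYS                  ACT                steady
5678GHIJKL:CH/SYS                  SERVICE            steady
5678GHIJKL:CH/SYS                  REAR_FAULT         steady
5678GHIJKL:CH/PS0                  PS        disabled
5678GHIJKL:CH/PS1                  PS        enabled
"""

if __name__ == "__main__":
    leds, psus = prtdiag_psu_summary(SAMPLE)
    print("Fault LEDs lit:", leds)
    print("PSU states:", psus)
    print("Looks transient:", looks_transient(SAMPLE))
```

The sketch returns False when no PSU rows are found at all, so truncated output is never reported as healthy.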

explorer.84cbd5e0.qstpvhs190-2013.09.09.16.03
Kernel version: SunOS 5.10 Generic 148888-03 NNL082903G 3-7786344841

Output of prtdiag -v:


System Configuration:  Oracle Corporation  sun4v SPARC Enterprise T2000
Memory size: 65408 Megabytes
...
============================ Environmental Status ============================
...
LEDs:
----------------------------------------------------------------
Location                           LED                State   
----------------------------------------------------------------
5678GHIJKL:CH/FT0/FM0              SERVICE            off     
5678GHIJKL:CH/FT0/FM1              SERVICE            off     
5678GHIJKL:CH/FT0/FM2              SERVICE            off     
5678GHIJKL:CH/FT2                  SERVICE            off     
5678GHIJKL:CH/SYS                  ACT                steady  
5678GHIJKL:CH/SYS                  LOCATE             off     
5678GHIJKL:CH/SYS                  SERVICE            steady  
5678GHIJKL:CH/SYS                  REAR_FAULT         steady  
...
============================ FRU Status ============================
Location                           Name      Status  
------------------------------------------------------
...
5678GHIJKL:CH/PS0                  PS        disabled
5678GHIJKL:CH/PS1                  PS        enabled  

<no PS entries found in /var/adm/messages>

Data from the Service Processor:


High frequency of FAILED messages for a single PSU in the output of "showlogs -v":
.../Tx000 $  grep PS showlogs_-v |tr -s " "|cut -d: -f1,5|cut -d" " -f1,2,4-255|sort -M|uniq -c
   1 JAN 01 "Input power unavailable for PSU at PS0."
   2 JAN 01 "Input power unavailable for PSU at PS1."
  99 JAN 01 "PSU at PS0 has FAILED."
   2 JAN 01 "PSU at PS0 has been inserted."
   2 JAN 01 "PSU at PS0 has been removed."
   1 JAN 01 "PSU at PS1 has FAILED."
   2 JAN 01 "PSU at PS1 has been inserted."
   2 JAN 01 "PSU at PS1 has been removed."
  43 JAN 02 "PSU at PS0 has FAILED."

JAN 01 00:03:29: 00044711: "PSU at PS0 has FAILED."
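For systems where the Explorer data is analysed off-host, the same per-day tally can be sketched in Python. This is an illustrative equivalent of the shell pipeline above, assuming every log line follows the layout of the single raw entry shown (month, day, time, numeric ID, quoted message). The function name and the sample lines are made up for the example; only the line layout comes from this document.

```python
import re
from collections import Counter

# Layout assumed from the raw entry shown in this article:
#   JAN 01 00:03:29: 00044711: "PSU at PS0 has FAILED."
LOG_LINE = re.compile(r'^([A-Z]{3}) (\d{2}) \d{2}:\d{2}:\d{2}: (\w+): "(.*)"$')


def tally_psu_events(lines):
    """Count identical PSU-related messages per (month, day, message)."""
    counts = Counter()
    for line in lines:
        # Squeeze repeated blanks, as `tr -s " "` does in the shell pipeline.
        match = LOG_LINE.match(" ".join(line.split()))
        if match and "PS" in match.group(4):
            month, day, _event_id, message = match.groups()
            counts[(month, day, message)] += 1
    return counts


# Illustrative input only; the event IDs are placeholders.
SAMPLE_LOG = [
    'JAN 01 00:03:29: 00044711: "PSU at PS0 has FAILED."',
    'JAN 01 00:04:31: 00044711: "PSU at PS0 has FAILED."',
    'JAN 01 00:05:02: 00044711: "Input power unavailable for PSU at PS1."',
    'JAN 02 06:27:27: 00044711: "PSU at PS0 has FAILED."',
    'JAN 02 06:30:00: 00044711: "Host system has shut down."',
]

if __name__ == "__main__":
    for (month, day, message), n in sorted(tally_psu_events(SAMPLE_LOG).items()):
        print(n, month, day, message)
```

A count that is far higher for one PSU than for the other, as in the output above, points at that single supply rather than at a shared power feed.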

.../Tx000 $ cat showfaults_-v
sc> showfaults -v
  ID Time              FRU               Fault
   0 JAN 02 06:27:27   PS0               PSU at PS0 has FAILED.

.../Tx000 $ cat showenvironment
sc> showenvironment


=============== Environmental Status ===============
...
--------------------------------------------------------
System Indicator Status:
--------------------------------------------------------
SYS/LOCATE           SYS/SERVICE          SYS/ACT             
OFF                  ON                   ON                  
--------------------------------------------------------
SYS/REAR_FAULT       SYS/TEMP_FAULT       SYS/TOP_FAN_FAULT   
ON                   OFF                  OFF                 
--------------------------------------------------------
...
------------------------------------------------------------------------------
Power Supplies:
------------------------------------------------------------------------------
Supply  Status          Underspeed  Overtemp  Overvolt  Undervolt  Overcurrent
------------------------------------------------------------------------------
PS0     FAILED          OFF         OFF       OFF       OFF        OFF
PS1     OK              OFF         OFF       OFF       OFF        OFF
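The Power Supplies table in saved showenvironment output can be read programmatically in the same spirit. The sketch below assumes the table layout shown above (a "Power Supplies:" heading followed by rows that start with PS and a number, with the status in the second column). The function name is hypothetical and the parsing has only been matched against the excerpt in this article.

```python
import re

# A table row starts with the supply name followed by its status column.
PSU_ROW = re.compile(r"^(PS\d+)\s+(\S+)")


def psu_table(text):
    """Return {supply: status} from the Power Supplies table of `showenvironment`."""
    statuses = {}
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith(":"):
            # Section headings end with a colon; only one opens the PSU table.
            in_table = stripped == "Power Supplies:"
            continue
        row = PSU_ROW.match(stripped)
        if in_table and row:
            statuses[row.group(1)] = row.group(2)
    return statuses


# Excerpt in the layout of the showenvironment example in this article.
SAMPLE_ENV = """\
System Indicator Status:
--------------------------------------------------------
SYS/LOCATE           SYS/SERVICE          SYS/ACT
OFF                  ON                   ON
------------------------------------------------------------------------------
Power Supplies:
------------------------------------------------------------------------------
Supply  Status          Underspeed  Overtemp  Overvolt  Undervolt  Overcurrent
------------------------------------------------------------------------------
PS0     FAILED          OFF         OFF       OFF       OFF        OFF
PS1     OK              OFF         OFF       OFF       OFF        OFF
"""

if __name__ == "__main__":
    table = psu_table(SAMPLE_ENV)
    not_ok = [name for name, status in table.items() if status != "OK"]
    print("Supplies not reporting OK:", not_ok)
```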

 

For Sun Fire T2000 / Sun SPARC Enterprise T2000 there is a special failure mode in which both power supplies are reported as failed, as shown in the example below.

Please consult the document "T2000 Shows Both Power Supplies as Failed but System Still Operational" (Doc ID 1484421.1) for detailed instructions on how to handle this situation.

explorer.84e5ee4e.mm1legb-2013.07.13.15.15 0814NNN01C 3-7712281881
Kernel version: SunOS 5.10 Generic 138888-02

Output of prtdiag -v:


System Configuration:  Sun Microsystems  sun4v Sun Fire T200
Memory size: 32640 Megabytes
...
Temperature sensors:
------------------------------------------------------------
Location                           Sensor         Status    
------------------------------------------------------------
1234ABCDEF:CH/IOBD/IOB             T_CORE         ok
1234ABCDEF:CH/IOBD                 T_AMB          ok
1234ABCDEF:CH/MB/CMP0              T_TCORE        ok
1234ABCDEF:CH/MB/CMP0              T_BCORE        ok
1234ABCDEF:CH/MB                   T_AMB          ok
1234ABCDEF:CH/PDB                  T_AMB          failed (0degC )
...
LEDs:
------------------------------------------------------------
Location                           LED            State   
------------------------------------------------------------
1234ABCDEF:CH/FT0/FM0              SERVICE        off     
1234ABCDEF:CH/FT0/FM1              SERVICE        off     
1234ABCDEF:CH/FT0/FM2              SERVICE        off     
1234ABCDEF:CH/FT2                  SERVICE        off     
1234ABCDEF:CH/SYS                  ACT            steady  
1234ABCDEF:CH/SYS                  LOCATE         off     
1234ABCDEF:CH/SYS                  SERVICE        steady  
1234ABCDEF:CH/SYS                  REAR_FAULT     steady  
1234ABCDEF:CH/SYS                  TEMP_FAULT     off     
1234ABCDEF:CH/SYS                  TOP_FAN_FAULT  off     
1234ABCDEF:CH/HDD0                 SERVICE        steady  
...
============================ FRU Status ============================
Location                           Name      Status  
------------------------------------------------------
...
1234ABCDEF:CH/PS0                  PS        disabled
1234ABCDEF:CH/PS1                  PS        disabled

Example entries in /var/adm/messages:
Jul 13 16:15:27 oursystem SC Alert: [ID 474711 daemon.error] PSU at PS1 has FAILED.
Jul 13 16:15:39 oursystem SC Alert: [ID 474711 daemon.error] PSU at PS0 has FAILED.

High frequency of FAILED messages for both PSUs in /var/adm/messages:
messages $ grep PS mess* |tr -s " "|cut -d: -f2,5,6|cut -d" " -f1,2,7-255|sort -M|uniq -c
  83 Jun 10 PSU at PS0 has FAILED.
  83 Jun 10 PSU at PS1 has FAILED.
  95 Jun 11 PSU at PS0 has FAILED.
  96 Jun 11 PSU at PS1 has FAILED.
  96 Jun 12 PSU at PS0 has FAILED.
  95 Jun 12 PSU at PS1 has FAILED.
...
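To spot this both-PSU pattern in a large /var/adm/messages file, the FAILED alerts can be counted per day and per PSU. The following is a rough sketch under the assumption that the alerts look like the example entries above. The function names are hypothetical, and flagging days on which every PSU logged at least one FAILED alert is an illustrative heuristic, not a criterion taken from the referenced document.

```python
import re
from collections import Counter, defaultdict

# Layout assumed from the example entries in this article:
#   Jul 13 16:15:27 oursystem SC Alert: [ID 474711 daemon.error] PSU at PS1 has FAILED.
ALERT = re.compile(
    r"^(\w{3})\s+(\d{1,2}) \d{2}:\d{2}:\d{2} \S+ SC Alert: "
    r"\[ID \d+ \S+\] PSU at (PS\d+) has FAILED\."
)


def failed_per_day(lines):
    """Return {(month, day): Counter({psu: count})} for FAILED alerts."""
    per_day = defaultdict(Counter)
    for line in lines:
        match = ALERT.match(line)
        if match:
            month, day, psu = match.groups()
            per_day[(month, int(day))][psu] += 1
    return per_day


def days_with_all_psus_failing(per_day, psus=("PS0", "PS1")):
    """Heuristic (assumption): days on which every listed PSU logged a FAILED alert."""
    return [day for day, counts in per_day.items() if all(counts[p] > 0 for p in psus)]


# The first two lines are the example entries from this article; the third is made up.
SAMPLE_MESSAGES = [
    "Jul 13 16:15:27 oursystem SC Alert: [ID 474711 daemon.error] PSU at PS1 has FAILED.",
    "Jul 13 16:15:39 oursystem SC Alert: [ID 474711 daemon.error] PSU at PS0 has FAILED.",
    "Jul 14 09:00:00 oursystem SC Alert: [ID 474711 daemon.error] PSU at PS0 has FAILED.",
]

if __name__ == "__main__":
    per_day = failed_per_day(SAMPLE_MESSAGES)
    print("Days with both PSUs failing:", days_with_all_psus_failing(per_day))
```

Near-equal counts for PS0 and PS1 on the same days, as in the tally above, are what distinguishes this case from a single failed supply.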


References

<NOTE:1547088.2> - How to Upload Files to Oracle Support
<NOTE:1010045.1> - Troubleshooting Power Supply failures on V210/V240/V215/V245/V440/V445, T1000/T2000, V480/V490/V880/V890 servers
<NOTE:1484421.1> - T2000 Shows Both Power Supplies as Failed but System Still Operational
<NOTE:1006990.1> - Oracle Explorer Data Collector Implementation Best Practice
<NOTE:1019144.1> - Data Requirements reference: What data is needed in order to troubleshoot my software or hardware problem?
<NOTE:1153444.1> - Oracle Services Tools Bundle (STB) - RDA/Explorer, SNEEP, ACT

  Copyright © 2018 Oracle, Inc.  All rights reserved.