Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Troubleshooting Sure Solution

2377203.1 : Troubleshooting SPT-8002-QD error reported on S7-2 SPARC server
Applies to:
SPARC S7-2
SPARC S7-2L
Information in this document applies to any platform.

Purpose
This document provides guidance on troubleshooting SPT-8002-QD (alert.ilom.chassis.config.fan.capacity-insufficient with probability=100).

Impact: The chassis powers off immediately and subsequent power-on is inhibited.

Symptoms
1. fma/@usr@local@bin@fmadm_faulty.out from the ILOM snapshot:

   2017-07-31/07:56:33 3aa28b88-8848-4f03-b7ad-abcdabcd SPT-8000-3R Major Fan tachometer speed is below its normal operating range.
   2017-07-31/07:56:33 dfeb259e-0e76-6bc0-9147-abcdabcd SPT-8000-3R Major Fan tachometer speed is below its normal operating range.
   2017-07-31/07:56:35 f65047a3-8083-4220-9490-abcdabcd SPT-8002-QD Critical Insufficient cooling capacity due to multiple faulted or missing fans.

2. The output of the show faulty command from the ILOM prompt:

   -> show faulty
   /SP/faultmgmt/1 | fru | /SYS/MB/FM0

   Or display the current system faults from the Fault Management Shell:

   -> start /SP/faultmgmt/shell
   faultmgmtsp> fmadm faulty

3. Check the history of all events logged in the event log:
   ilom/@usr@local@bin@spshexec_show_-script_@X@logs@event@list.out from the ILOM snapshot

   -> show /SP/logs/event/list
   13 Mon Jul 31 07:56:35 2017 Fault Fault critical
      Fault detected at time = Mon Jul 31 07:56:35 2017. The suspect component: /SYS has alert.ilom.chassis.config.fan.capacity-insufficient with probability=100. Refer to http://support.oracle.com/msg/SPT-8002-QD for details.
   12 Mon Jul 31 07:56:35 2017 Power Off major
      Power to /SYS has been turned off by: SP, Reason: Fault
   11 Mon Jul 31 07:56:33 2017 Fault Fault critical
      Fault detected at time = Mon Jul 31 07:56:33 2017. The suspect component: /SYS/MB/FM2 has fault.chassis.device.fan.fail with probability=100. Refer to http://support.oracle.com/msg/SPT-8000-3R for details.
   10 Mon Jul 31 07:56:33 2017 Fault Fault critical
      Fault detected at time = Mon Jul 31 07:56:33 2017. The suspect component: /SYS/MB/FM0 has fault.chassis.device.fan.fail with

4. The output of fmdump -v from the Fault Management Shell:
   fma/@usr@local@bin@fmdump_-v.out from the ILOM snapshot

   -> start /SP/faultmgmt/shell
   2018-03-01/19:47:37 d043fb9c-22ca-ee37-83df-abcdabcd SPT-8000-3R Fan Speed Below Normal Range

5. The output of fmdump -ev from the Fault Management Shell:
   fma/@usr@local@bin@fmdump_-ev.out from the ILOM snapshot

   faultmgmtsp> fmdump -ev
   2018-02-14/11:58:27 ereport.chassis.config.fan.toofew-asserted@/SYS
   2018-02-14/11:59:27 ereport.chassis.config.fan.toofew-deasserted@/SYS

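When a snapshot contains many fault entries, it can help to tally them by message ID to confirm that SPT-8002-QD appears together with the individual SPT-8000-3R fan faults. The following is a minimal sketch only. It assumes each fault record sits on one line in the order timestamp, UUID, message ID, severity, description, as in the excerpt above; the actual layout of fmadm_faulty.out may differ. The names `FAULT_RE`, `parse_faults`, and `count_by_msgid` are illustrative and not part of any Oracle tool.

```python
import re
from collections import Counter

# Sample records copied from the fmadm_faulty.out excerpt in this document.
SAMPLE = """\
2017-07-31/07:56:33 3aa28b88-8848-4f03-b7ad-abcdabcd SPT-8000-3R Major Fan tachometer speed is below its normal operating range.
2017-07-31/07:56:33 dfeb259e-0e76-6bc0-9147-abcdabcd SPT-8000-3R Major Fan tachometer speed is below its normal operating range.
2017-07-31/07:56:35 f65047a3-8083-4220-9490-abcdabcd SPT-8002-QD Critical Insufficient cooling capacity due to multiple faulted or missing fans.
"""

# Assumed one-line layout: timestamp, UUID, message ID, severity, description.
FAULT_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f-]+)\s+"
    r"(?P<msgid>SPT-\d{4}-[0-9A-Z]+)\s+"
    r"(?P<severity>\w+)\s+"
    r"(?P<text>.*)$"
)


def parse_faults(text):
    """Return one dict per line that matches the assumed record layout."""
    faults = []
    for line in text.splitlines():
        match = FAULT_RE.match(line.strip())
        if match:
            faults.append(match.groupdict())
    return faults


def count_by_msgid(faults):
    """Count how many fault records carry each message ID."""
    return Counter(fault["msgid"] for fault in faults)


faults = parse_faults(SAMPLE)
counts = count_by_msgid(faults)
for msgid, n in sorted(counts.items()):
    print(msgid, n)
```

Lines that do not match the assumed layout are skipped silently, so a count of zero may mean the file uses a different format rather than that no faults are present.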
Troubleshooting Steps

1. Gather an ILOM snapshot:
   SRDC - SPARC T3-x, T4-x, T5-x, T7-x, S7-x, T8-x servers: Simple instructions to collect ILOM snapshot (Doc ID 2077387.1)

2. Display the environmental status of the host server:

   -> show -o table -level all /SYS
   Target              | Property                    | Value
   ---------------------------------------------------------------------
   /SYS/MB/FM0         | type                        | Front Fan
   /SYS/MB/FM0         | fault_state                 | Faulted
   /SYS/MB/FM0         | clear_fault_action          | (none)
   /SYS/MB/FM0/F0      | type                        | Fan
   /SYS/MB/FM0/F0/TACH | type                        | Fan
   /SYS/MB/FM0/F0/TACH | ipmi_name                   | FM0/F0/TACH
   /SYS/MB/FM0/F0/TACH | class                       | Threshold Sensor
   /SYS/MB/FM0/F0/TACH | value                       | 11500.000 RPM
   /SYS/MB/FM0/F0/TACH | upper_nonrecov_threshold    | N/A
   /SYS/MB/FM0/F0/TACH | upper_critical_threshold    | N/A
   /SYS/MB/FM0/F0/TACH | upper_noncritical_threshold | N/A
   /SYS/MB/FM0/F0/TACH | lower_noncritical_threshold | N/A
   /SYS/MB/FM0/F0/TACH | lower_critical_threshold    | N/A
   /SYS/MB/FM0/F0/TACH | lower_nonrecov_threshold    | 1000.000 RPM
   /SYS/MB/FM0/F0/TACH | alarm_status                | cleared
   /SYS/MB/FM0/F1      | type                        | Fan
   /SYS/MB/FM0/F1/TACH | type                        | Fan
   /SYS/MB/FM0/F1/TACH | ipmi_name                   | FM0/F1/TACH
   /SYS/MB/FM0/F1/TACH | class                       | Threshold Sensor
   /SYS/MB/FM0/F1/TACH | value                       | 0 RPM
   /SYS/MB/FM0/F1/TACH | upper_nonrecov_threshold    | N/A
   /SYS/MB/FM0/F1/TACH | upper_critical_threshold    | N/A
   /SYS/MB/FM0/F1/TACH | upper_noncritical_threshold | N/A
   /SYS/MB/FM0/F1/TACH | lower_noncritical_threshold | N/A
   /SYS/MB/FM0/F1/TACH | lower_critical_threshold    | N/A
   /SYS/MB/FM0/F1/TACH | lower_nonrecov_threshold    | 1000.000 RPM
   /SYS/MB/FM0/F1/TACH | alarm_status                | cleared
   /SYS/MB/FM0/F2      | type                        | Fan
   /SYS/MB/FM0/F2/TACH | type                        | Fan
   /SYS/MB/FM0/F2/TACH | ipmi_name                   | FM0/F2/TACH
   /SYS/MB/FM0/F2/TACH | class                       | Threshold Sensor
   /SYS/MB/FM0/F2/TACH | value                       | 11400.000 RPM
   /SYS/MB/FM0/F2/TACH | upper_nonrecov_threshold    | N/A
   /SYS/MB/FM0/F2/TACH | upper_critical_threshold    | N/A
   /SYS/MB/FM0/F2/TACH | upper_noncritical_threshold | N/A
   /SYS/MB/FM0/F2/TACH | lower_noncritical_threshold | N/A
   /SYS/MB/FM0/F2/TACH | lower_critical_threshold    | N/A
   /SYS/MB/FM0/F2/TACH | lower_nonrecov_threshold    | 1000.000 RPM
   /SYS/MB/FM0/F2/TACH | alarm_status                | cleared
   /SYS/MB/FM0/F3      | type                        | Fan
   /SYS/MB/FM0/F3/TACH | type                        | Fan
   /SYS/MB/FM0/F3/TACH | ipmi_name                   | FM0/F3/TACH
   /SYS/MB/FM0/F3/TACH | class                       | Threshold Sensor
   /SYS/MB/FM0/F3/TACH | value                       | 9900.000 RPM
   /SYS/MB/FM0/F3/TACH | upper_nonrecov_threshold    | N/A
   /SYS/MB/FM0/F3/TACH | upper_critical_threshold    | N/A
   /SYS/MB/FM0/F3/TACH | upper_noncritical_threshold | N/A
   /SYS/MB/FM0/F3/TACH | lower_noncritical_threshold | N/A
   /SYS/MB/FM0/F3/TACH | lower_critical_threshold    | N/A
   /SYS/MB/FM0/F3/TACH | lower_nonrecov_threshold    | 1000.000 RPM
   /SYS/MB/FM0/F3/TACH | alarm_status                | cleared
   /SYS/MB/FM0/SERVICE | type                        | Indicator
   /SYS/MB/FM0/SERVICE | ipmi_name                   | FM0/SERVICE
   /SYS/MB/FM0/SERVICE | value                       | Off

3. Check the output from IPMI:
   /ipmi/@usr@local@bin@ipmiint_sensor_list.out from the ILOM snapshot

   FM0/F0/TACH | 12100.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM0/F1/TACH | 9100.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM0/F2/TACH | 12300.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM0/F3/TACH | 9000.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM0/PRSNT   | 0x2       | discrete | 0x0200 | na       | na | na | na | na | na
   FM1/F0/TACH | 11900.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM1/F1/TACH | 9300.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM1/F2/TACH | 12300.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM1/F3/TACH | 9300.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM1/PRSNT   | 0x2       | discrete | 0x0200 | na       | na | na | na | na | na
   FM2/F0/TACH | 12500.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM2/F1/TACH | 9300.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM2/F2/TACH | 12400.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM2/F3/TACH | 9400.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM2/PRSNT   | 0x2       | discrete | 0x0200 | na       | na | na | na | na | na
   FM3/F0/TACH | 12300.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM3/F1/TACH | 9200.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM3/F2/TACH | 12300.000 | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM3/F3/TACH | 9100.000  | RPM      | ok     | 1000.000 | na | na | na | na | na
   FM3/PRSNT   | 0x2       | discrete | 0x0200 | na       | na | na | na | na | na

4. Check and replace any fans that are reported as faulty and that show no RPM value.

5. If the rotations per minute (RPM) values shown by -> show -o table -level all /SYS are OK, proceed with the following actions:

   A. Clear the errors from FMA in Solaris and from the service processor (ILOM prompt):
      Commands To Clear FMA faults on the T5-x, T7-x, S7-x Servers (Doc ID 2216293.1)

   B. Reset the service processor:
      -> reset /SP

6. If all fans are reported as working but SPT-8002-QD is still reported, in rare cases the Left Indicator Assembly (FRU) must be replaced. A faulty Left Indicator Assembly can also cause the server to power down unexpectedly, without anyone physically pressing the power button.

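The check for fans that report no RPM can be scripted against saved output of show -o table -level all /SYS. The following is a rough sketch only, not an Oracle-provided tool. It assumes the three-column "Target | Property | Value" layout shown in this document, and it treats a tachometer reading below lower_nonrecov_threshold as "no RPM", which is an interpretation rather than a rule stated in this note. All function names are hypothetical.

```python
# Abbreviated rows taken from the show -o table -level all /SYS output above.
SAMPLE = """\
Target | Property | Value
---------------------------------------------------------------------
/SYS/MB/FM0 | fault_state | Faulted
/SYS/MB/FM0/F0/TACH | value | 11500.000 RPM
/SYS/MB/FM0/F0/TACH | lower_nonrecov_threshold | 1000.000 RPM
/SYS/MB/FM0/F1/TACH | value | 0 RPM
/SYS/MB/FM0/F1/TACH | lower_nonrecov_threshold | 1000.000 RPM
/SYS/MB/FM0/SERVICE | value | Off
"""


def parse_ilom_table(text):
    """Collect {target: {property: value}} from 'Target | Property | Value' rows."""
    props = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3 or not parts[0].startswith("/SYS"):
            continue  # skip the header, the separator, and blank lines
        target, prop, value = parts
        props.setdefault(target, {})[prop] = value
    return props


def rpm(value):
    """Return the number in a string such as '11500.000 RPM', else None."""
    fields = value.split()
    if len(fields) == 2 and fields[1] == "RPM":
        try:
            return float(fields[0])
        except ValueError:
            return None
    return None


def stalled_tachs(props):
    """Tach sensors reading below lower_nonrecov_threshold (assumed criterion)."""
    stalled = []
    for target, p in props.items():
        if not target.endswith("/TACH"):
            continue
        reading = rpm(p.get("value", ""))
        floor = rpm(p.get("lower_nonrecov_threshold", ""))
        if reading is not None and floor is not None and reading < floor:
            stalled.append(target)
    return stalled


def faulted_targets(props):
    """Targets whose fault_state property is 'Faulted'."""
    return [t for t, p in props.items() if p.get("fault_state") == "Faulted"]


props = parse_ilom_table(SAMPLE)
stalled = stalled_tachs(props)
# The fan module is two path levels above the tach sensor,
# e.g. /SYS/MB/FM0/F1/TACH belongs to /SYS/MB/FM0.
modules = sorted({t.rsplit("/", 2)[0] for t in stalled})
print("stalled tachs:", stalled)
print("fan modules to inspect:", modules)
print("faulted targets:", faulted_targets(props))
```

If the script reports no stalled tachometers while faults remain logged, that corresponds to the case where the RPM values are OK and the faults should be cleared and the service processor reset.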
References
<NOTE:2077387.1> - SRDC - SPARC T3-x, T4-x, T5-x, T7-x, S7-x, T8-x servers: Simple instructions to collect ILOM snapshot
<NOTE:2216293.1> - Commands To Clear FMA faults on the T5-x, T7-x, S7-x Servers
<NOTE:2130436.1> - SPT-8002-QD - there are insufficient operational fans present
<NOTE:1120673.1> - SPT-8000-3R - Fan Speed Below Normal Range

Attachments
This solution has no attachment