Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
1567000.1: Multiple Fan Modules failing at the same time on Sun SPARC Enterprise M4000 and Sun SPARC Enterprise M5000 servers
Applies to:
Sun SPARC Enterprise M5000 Server - Version All Versions and later
Sun SPARC Enterprise M4000 Server - Version All Versions and later
Information in this document applies to any platform.

Symptoms
If the fan rotation speed of a Fan Module drops below its predefined threshold, or if its fans stop completely, FMA reports SCF-8005-YH - Fan rotation speed lower than its predefined threshold or fan stopped completely.

If multiple Fan Modules report this error at about the same time, it is very unlikely that several Fan Modules really are failing simultaneously. Further analysis is needed to determine whether the Fan Modules themselves are failing or whether the Fan Module Controller is failing.

Multiple Fan Modules failing at about the same time can be recognized in several data sources on the XSCF.

XSCF> showstatus
    FANBP_C Status:Normal;
*       FAN_A#0 Status:Faulted;
*       FAN_A#2 Status:Faulted;
XSCF>

XSCF> fmdump -e
TIME                 CLASS
.
.
Jun 19 18:09:47.5747 ereport.chassis.device.fan.tooslow
Jun 19 18:10:46.5616 ereport.chassis.device.fan.tooslow
.
.
XSCF>

XSCF> fmdump -V
TIME                 UUID                                 MSG-ID
.
.
Jun 19 18:09:48.6441 5f362716-ad19-46cd-a6ff-24cae9c87a3c SCF-8005-YH
  TIME                 CLASS                                 ENA
  Jun 19 18:09:47.5747 ereport.chassis.device.fan.tooslow    0x59a612a199400001
.
location = /FAN_A#2
.
.
.
TIME                 UUID                                 MSG-ID
Jun 19 18:10:48.3930 54fdba45-e8a5-496e-a975-9d50720ac2a1 SCF-8005-YH
  TIME                 CLASS                                 ENA
  Jun 19 18:10:46.5616 ereport.chassis.device.fan.tooslow    0x5a81d06c09200001
.
location = /FAN_A#0
.
.
XSCF>

XSCF> showlogs error
Date: Jun 19 18:09:48 CEST 2013
    Code: 80002000-ccff0000-0104340600000000
    Status: Alarm
    Occurred: Jun 19 18:09:47.389 CEST 2013
    FRU: /FAN_A#2
    Msg: Abnormal FAN rotation speed. Insufficient rotation
Date: Jun 19 18:10:48 CEST 2013
    Code: 80006000-ccff0000-0104340100000000
    Status: Alarm
    Occurred: Jun 19 18:10:46.550 CEST 2013
    FRU: /FAN_A#0
    Msg: Abnormal FAN rotation speed. Insufficient rotation
XSCF>

XSCF> showlogs monitor
.
.
Jun 19 18:09:53 m5000-sum505-p2 Alarm: /FAN_A#2:SCF:Abnormal FAN rotation speed. Insufficient rotation
Jun 19 18:10:50 m5000-sum505-p2 Alarm: /FAN_A#0:SCF:Abnormal FAN rotation speed. Insufficient rotation
.
.
XSCF>
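When working from saved log text rather than a live XSCF session, the "about the same point in time" check can be scripted. The following Python sketch parses alarm lines of the shape shown in the showlogs monitor output and reports which fans raised alarms close together. The function names, the 5-minute window, and the year argument are assumptions made for illustration (the monitor log lines carry no year); none of them come from Oracle tooling.

```python
import re
from datetime import datetime, timedelta

# Matches monitor-log alarm lines of the shape shown in this document, e.g.
# "Jun 19 18:09:53 m5000-sum505-p2 Alarm: /FAN_A#2:SCF:Abnormal FAN rotation speed. ..."
ALARM_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ Alarm: /(?P<fru>FAN_[AB]#\d+):"
)


def parse_fan_alarms(log_text, year=2013):
    """Return a time-sorted list of (timestamp, fan) tuples found in log_text.

    The year is an assumption supplied by the caller, because the
    monitor log lines do not include one.
    """
    alarms = []
    for line in log_text.splitlines():
        match = ALARM_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(
                f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S"
            )
            alarms.append((stamp, match.group("fru")))
    return sorted(alarms)


def fans_alarmed_together(alarms, window=timedelta(minutes=5)):
    """Return the fans whose alarm lies within `window` of another fan's alarm.

    The default window is a hypothetical choice, not a documented threshold.
    """
    together = set()
    for i, (t1, fan1) in enumerate(alarms):
        for t2, fan2 in alarms[i + 1:]:
            if t2 - t1 > window:
                break  # alarms are sorted, so later ones are even further away
            if fan1 != fan2:
                together.update((fan1, fan2))
    return together


if __name__ == "__main__":
    sample = """
    Jun 19 18:09:53 m5000-sum505-p2 Alarm: /FAN_A#2:SCF:Abnormal FAN rotation speed. Insufficient rotation
    Jun 19 18:10:50 m5000-sum505-p2 Alarm: /FAN_A#0:SCF:Abnormal FAN rotation speed. Insufficient rotation
    """
    print(sorted(fans_alarmed_together(parse_fan_alarms(sample))))
```

A result containing two or more fans is only a hint that the failures are correlated; the controller pairing for the specific platform still has to be checked before concluding anything about the Fan Backplane.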
Cause
The issue is most likely caused by a broken Fan Module Controller; the individual Fan Modules themselves are not failing.

Sun SPARC M4000
The FANBP_B contains the Fan Module Controllers for two defined sets of fans. One Fan Module Controller controls FAN_A#0 and FAN_B#0; the other controls FAN_A#1 and FAN_B#1. If one or both of these pairs of Fan Modules fail at about the same time, it is very likely that one or both Fan Module Controllers are failing, and FANBP_B needs to be replaced.

Sun SPARC M5000
The FANBP_C contains the Fan Module Controllers for two defined sets of fans. One Fan Module Controller controls FAN_A#0 and FAN_A#2; the other controls FAN_A#1 and FAN_A#3. If one or both of these pairs of Fan Modules fail at about the same time, it is very likely that one or both Fan Module Controllers are failing, and FANBP_C needs to be replaced.
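The controller-to-fan pairing described above can be written down as a small lookup. The Python sketch below encodes only the pairs named in this document and reports the suspect Fan Backplane when every fan behind one controller is faulted. The dictionary layout and the function name suspect_backplane are hypothetical, and the result is a triage hint rather than a diagnosis.

```python
# Fan pairs per Fan Module Controller, as stated in this document.
CONTROLLER_GROUPS = {
    "M4000": {
        "backplane": "FANBP_B",
        "groups": [{"FAN_A#0", "FAN_B#0"}, {"FAN_A#1", "FAN_B#1"}],
    },
    "M5000": {
        "backplane": "FANBP_C",
        "groups": [{"FAN_A#0", "FAN_A#2"}, {"FAN_A#1", "FAN_A#3"}],
    },
}


def suspect_backplane(model, faulted_fans):
    """Return the Fan Backplane to suspect, or None.

    A backplane is suspected when all fans behind at least one
    Fan Module Controller appear in faulted_fans.
    """
    info = CONTROLLER_GROUPS[model]
    faulted = set(faulted_fans)
    for group in info["groups"]:
        if group <= faulted:  # every fan of this controller is faulted
            return info["backplane"]
    return None


if __name__ == "__main__":
    # Fans reported as Faulted in the showstatus example of this document.
    print(suspect_backplane("M5000", ["FAN_A#0", "FAN_A#2"]))
```

Faulted fans that sit behind different controllers (for example FAN_A#0 and FAN_A#1 on an M5000) return None here, because they do not match the single-controller pattern this note describes.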
Solution
To get the appropriate Fan Backplane replaced, customers need to create a Service Request in My Oracle Support.

The replacement of the Fan Backplane is urgent and should be scheduled immediately: if the second Fan Module Controller also develops a problem, the platform will go down completely.

The expected service action (replace FANBP_B on Sun SPARC Enterprise M4000 servers, replace FANBP_C on Sun SPARC Enterprise M5000 servers) requires a complete platform outage.

Be aware that, because of the failing Fan Backplane, secondary errors can remain in the BDB for the Fan Modules behind the broken Fan Module Controller. Affected Fan Modules will have their state changed from Faulted to Degraded. Make sure the Field Engineer runs clearfault against those Fan Modules, and provide the service password if the customer is running XCP1114 or earlier.
References
<NOTE:1021830.1> - SCF-8005-YH - Fan rotation speed lower than its predefined threshold or fan stopped completely.
<NOTE:1008229.1> - Gathering diagnostic data for SPARC Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers

Attachments
This solution has no attachment