Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 1637430.1 : Exadata X3-2 : A processor gets an "Intel Thermtrip Failure" after a SP reset
On Exadata X3-2, a CPU thermal trip fault (SPX86-8003-K5) can be observed after an SP/ILOM reset. This document is intended to provide a workaround and fix for customers hitting this issue.
Created from <SR 3-8658214941>

Applies to:
Exadata X3-2 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms
On an Exadata X3-2, a CPU thermal trip fault (fault.cpu.intel.thermtrip, ILOM fault SPX86-8003-K5) is raised a short time after an ILOM reset. This is a known issue that has been seen mostly on /SYS/MB/P1, occasionally on P0, and sometimes on both CPUs at about the same time.

Example:
faultmgmtsp> fmadm faulty
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2014-03-06/07:29:15 220d2d69-6878-e897-97e4-b81ca34ae521 SPX86-8003-K5  Critical

Fault class : fault.cpu.intel.thermtrip
ASRU        : /SYS/MB/P1 faulted
FRU         : /SYS/MB/P1 (Part Number: 060D) (Serial Number: unknown) 100% faulty
Description : A thermtrip signal has occurred on a server component.
Response    : The service-required LEDs for the affected component, TEMP_FAULT, and chassis will be illuminated.
Impact      : The server will be powered down immediately.
Action      : Please refer to the associated reference document at http://www.sun.com/msg/SPX86-8003-K5 for the latest service procedures and policies regarding this diagnosis.
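
When several servers have to be checked after an SP reset, the fault listing can be scanned by script instead of by eye. The following is a minimal Python sketch, not an Oracle-supplied tool: the function names parse_faults and thermtrip_faults are hypothetical, the field layout is assumed to match the sample output in this note, and capturing the text from the ILOM fault management shell is left out.

```python
import re

# First row of a fault entry: timestamp, UUID, message ID, severity.
# Layout assumed from the sample output in this note.
ENTRY_RE = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+(?P<severity>\S+)"
)
FAULT_CLASS_RE = re.compile(r"Fault class\s*:\s*(\S+)")
FRU_RE = re.compile(r"\bFRU\s*:\s*(\S+)")


def parse_faults(text):
    """Return one dict per fault entry found in captured `fmadm faulty` text."""
    entries = list(ENTRY_RE.finditer(text))
    faults = []
    for i, match in enumerate(entries):
        # Detail fields of an entry run until the next entry (or end of text).
        end = entries[i + 1].start() if i + 1 < len(entries) else len(text)
        body = text[match.end():end]
        fault = match.groupdict()
        fault_class = FAULT_CLASS_RE.search(body)
        fru = FRU_RE.search(body)
        fault["fault_class"] = fault_class.group(1) if fault_class else None
        fault["fru"] = fru.group(1) if fru else None
        faults.append(fault)
    return faults


def thermtrip_faults(text):
    """Keep only the entries that match the thermtrip fault described here."""
    return [
        f for f in parse_faults(text)
        if f["msgid"] == "SPX86-8003-K5"
        or f["fault_class"] == "fault.cpu.intel.thermtrip"
    ]


# Sample taken from the output shown in this note (detail lines abridged).
SAMPLE = """\
2014-03-06/07:29:15 220d2d69-6878-e897-97e4-b81ca34ae521 SPX86-8003-K5 Critical
Fault class : fault.cpu.intel.thermtrip
ASRU : /SYS/MB/P1 faulted
FRU : /SYS/MB/P1 (Part Number: 060D) (Serial Number: unknown) 100% faulty
"""

if __name__ == "__main__":
    for fault in thermtrip_faults(SAMPLE):
        print(fault["time"], fault["fru"])
```

The parser relies only on the timestamp/UUID row and the "Fault class" and "FRU" labels, so it should tolerate the output being wrapped differently. Any other change in the output format would require adjusting the patterns.
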

Changes
The customer performed an SP reset.

Cause
During the ILOM reset, fan control does not work properly, so a CPU can temporarily heat up and trip the thermal threshold.

Solution
Workaround: Clear the fault with "set clear_fault_action=true" and reboot the server.

-> set /SYS/MB/P1 clear_fault_action=true
Are you sure you want to clear /SYS/MB/P1 (y/n)? y
Set 'clear_fault_action' to 'true'

REFERENCE
http://docs.oracle.com/cd/E19860-01/E21549/z40013e61440963.html

References
<NOTE:1501450.1> - INTERNAL Exadata Database Machine Hardware Current Product Issues (X3-2, X4-2, X3-8, X4-8 w/X4-2L)
<BUG:16634342> - X3-2: THERMTRIP EVENT ON P1 JUST AFTER SP RESET
<NOTE:1433134.1> - SPX86-8003-K5 - Intel Thermtrip Failure.

Attachments
This solution has no attachment