Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2321249.1 : Exalogic X3-2: "fault.cpu.intel.thermtrip" Fault In Compute Node ILOM After SP Reset
Created from <SR 3-15990793781>

Applies to:
Exalogic Elastic Cloud X3-2 Hardware - Version X3 to X3 [Release X3]
Oracle Exalogic Elastic Cloud Software - Version 2.0.0.0.1 to 2.0.6.2.0
Linux x86-64
Oracle Virtual Server (64-bit)

Symptoms
On Exalogic X3-2 compute nodes, the CPU thermal trip ILOM fault message "fault.cpu.intel.thermtrip" is seen after an ILOM SP reset. At almost the same time as the "Intel Thermtrip Failure" error is reported, the system may reboot to prevent overheating.

In fmdump_-v.out in the ILOM snapshot, the 'fault.cpu.intel.thermtrip' fault appears as follows. It is shown as repaired/resolved several minutes after the system reboots.

    2017-10-24/01:07:15 5e17fe16-7506-6b75-b404-a0f838ebb787 SPX86-8003-K5 fault = fault.cpu.intel.thermtrip@/SYS/MB/P1
    2017-10-24/01:14:33 5e17fe16-7506-6b75-b404-a0f838ebb787 SPX86-8003-K5 Repaired
    2017-10-24/01:14:33 5e17fe16-7506-6b75-b404-a0f838ebb787 SPX86-8003-K5 Resolved

This error has been seen mostly on /SYS/MB/P1, but it has also occasionally been seen on P0, and sometimes on both CPUs at around the same time.

In host_debug_err.log in the ILOM snapshot, the ILOM and system resets can be verified as follows:

    Tue Oct 24 01:01:21 2017 ID ffff
    Tue Oct 24 01:07:14 2017 ID ffff
    Tue Oct 24 01:14:50 2017 ID ffff P0 Fatals GFERRST 0x00000000,GFFERRST 0x00000000,GFNERRST 0x00000000, 0 **** Host Boot ****

Changes
SP reset

Cause
This is a known ILOM issue: Sun Server X3-3 Bug 16634342 - X3-2: Thermtrip event on P1 just after SP reset
During the ILOM reset, fan control does not work properly, so a CPU can temporarily heat up and trip the threshold.

Solution
ILOM/BIOS version 3.1.2.10.C includes the fix. In Exalogic environments, PSU 2.0.6.2.1 and later include the fix for this issue. To resolve the issue, upgrade to PSU 2.0.6.2.1 or later. For more information about Exalogic PSUs, refer to the notes listed under References.

References
<BUG:16634342> - X3-2: THERMTRIP EVENT ON P1 JUST AFTER SP RESET
<NOTE:1314535.1> - Exalogic Patch Set Updates (PSU) Master Note
<NOTE:1530781.1> - Exalogic Infrastructure Physical and Virtual Releases/PSUs – Software and Firmware Version Information
<NOTE:1268557.1> - Exalogic Elastic Cloud Software Known Issues

Attachments
This solution has no attachment
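The PSU guidance in the Solution (fixed in 2.0.6.2.1 or later) can be checked with a small version comparison. The following is a minimal sketch, not an Oracle tool: the function names `parse_version` and `has_thermtrip_fix` are hypothetical, and it assumes the installed level is a purely numeric dotted string. It does not handle ILOM versions with a letter suffix such as 3.1.2.10.C.

```python
def parse_version(text):
    """Split a dotted numeric version such as '2.0.6.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# First PSU level that this note states contains the fix.
FIXED_PSU = parse_version("2.0.6.2.1")


def has_thermtrip_fix(installed):
    """Return True when the installed PSU level is at or above 2.0.6.2.1."""
    # Tuples compare element by element, so 2.0.6.10.0 sorts after 2.0.6.2.1.
    return parse_version(installed) >= FIXED_PSU


if __name__ == "__main__":
    for level in ("2.0.0.0.1", "2.0.6.2.0", "2.0.6.2.1"):
        status = "fix included" if has_thermtrip_fix(level) else "fix not included"
        print(level, status)
```

Comparing integer tuples rather than raw strings avoids misordering levels whose components have different digit counts.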
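To confirm the pattern described under Symptoms (a thermtrip fault that is repaired a few minutes after the reboot), the fmdump_-v.out entries can be paired by fault UUID. The following is a rough sketch under stated assumptions: it assumes each event sits on a single line in the form shown in this note (timestamp, UUID, message ID, then either `fault = ...` or a status word), which may not match every ILOM release. The names `LINE_RE`, `SAMPLE`, and `summarize_thermtrip` are hypothetical.

```python
import re
from datetime import datetime

# One event per line, as in the excerpt from this note: timestamp, fault UUID,
# message ID, then either "fault = <class>@<FRU>" or a status word.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f-]{36})\s+(?P<msgid>\S+)\s+(?P<rest>.+)$"
)

# Sample lines copied from the fmdump_-v.out excerpt in this note.
SAMPLE = [
    "2017-10-24/01:07:15 5e17fe16-7506-6b75-b404-a0f838ebb787 SPX86-8003-K5 "
    "fault = fault.cpu.intel.thermtrip@/SYS/MB/P1",
    "2017-10-24/01:14:33 5e17fe16-7506-6b75-b404-a0f838ebb787 SPX86-8003-K5 Repaired",
    "2017-10-24/01:14:33 5e17fe16-7506-6b75-b404-a0f838ebb787 SPX86-8003-K5 Resolved",
]


def summarize_thermtrip(lines):
    """Map each thermtrip fault UUID to its FRU, raise time, and repair time."""
    events = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not look like event records
        ts = datetime.strptime(match["ts"], "%Y-%m-%d/%H:%M:%S")
        rest = match["rest"].strip()
        if rest.startswith("fault =") and "fault.cpu.intel.thermtrip" in rest:
            fru = rest.split("@", 1)[1] if "@" in rest else None
            events[match["uuid"]] = {"fru": fru, "raised": ts, "repaired": None}
        elif rest == "Repaired" and match["uuid"] in events:
            events[match["uuid"]]["repaired"] = ts
    return events


if __name__ == "__main__":
    for uuid, info in summarize_thermtrip(SAMPLE).items():
        if info["repaired"] is not None:
            seconds = int((info["repaired"] - info["raised"]).total_seconds())
            print(uuid, info["fru"], "repaired after", seconds, "seconds")
        else:
            print(uuid, info["fru"], "not repaired")
```

A fault that clears within minutes of an SP reset recorded in host_debug_err.log is consistent with the issue described in this note. A fault that stays open would need separate investigation.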