Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure

Solution 2316670.1 : All DIMMs reported as failed with fault "SPX86-8001-QX fault.memory.intel.dimm.tempsensor-failed" during Exadata image and/or firmware upgrade.
Created from <SR 3-15915627461>

Applies to:
Exadata Database Machine X2-2 Hardware - Version All Versions to All Versions [Release All Releases]
x86

Symptoms
On certain x86 systems (primarily Exadata V2/X2, Exalogic X2, X4170/X4270, and X4170 M2/X4270 M2), in rare circumstances during a firmware and/or Engineered Systems (Exadata, Exalogic, etc.) image upgrade, you may receive error messages showing some or (typically) all DIMMs as failed due to a faulty temperature sensor, similar to the following:

2017-02-24/21:20:56 89ad837b-d10d-663f-f7d1-8cf391c74018 SPX86-8001-QX Critical
Fault class : fault.memory.intel.dimm.tempsensor-failed
FRU : /SYS/MB/P0/D1
Description : A Memory DIMM's temperature sensor has failed.
Response : None.
Impact : DIMM will be used and enabled, but will no longer be
Action : Please refer to the associated reference document at
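As a rough aid for triage, captured fault output can be checked to see whether every reported fault belongs to the tempsensor-failed class described here. The sketch below is a minimal illustration in Python, assuming the fault text has already been saved to a string and that its field labels ("Fault class :", "FRU :") match the sample record above. The function names are hypothetical and are not part of any Oracle tool.

```python
import re

# Fault class named in this document.
TEMPSENSOR_CLASS = "fault.memory.intel.dimm.tempsensor-failed"

# Assumption: field labels match the sample record in this document.
# Other ILOM versions may format the output differently.
FAULT_CLASS_RE = re.compile(r"Fault class\s*:\s*(\S+)")
FRU_RE = re.compile(r"FRU\s*:\s*(\S+)")


def summarize_faults(text):
    """Return (fault_classes, frus) found in captured fault output."""
    return FAULT_CLASS_RE.findall(text), FRU_RE.findall(text)


def only_tempsensor_faults(text):
    """True if at least one fault is present and all are tempsensor-failed."""
    classes, _ = summarize_faults(text)
    return bool(classes) and all(c == TEMPSENSOR_CLASS for c in classes)


if __name__ == "__main__":
    sample = (
        "Fault class : fault.memory.intel.dimm.tempsensor-failed "
        "FRU : /SYS/MB/P0/D1 Description : A Memory DIMM's temperature "
        "sensor has failed."
    )
    classes, frus = summarize_faults(sample)
    print(frus, only_tempsensor_faults(sample))
```

If the check returns False, other fault classes are present and the faults should not be assumed to be the transient issue covered by this note.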
Changes
Software image and/or firmware upgrade

Cause
There are several known bugs logged for this behavior. In general, it is a transient issue that typically occurs when the ILOM is unable to read DIMM temperature sensor data for a short period during or after a firmware upgrade. This can lead the ILOM Fault Manager to incorrectly diagnose one or more (typically all) DIMMs as having faulty temperature sensors.

Solution
ILOM 3.0 - "Using the Oracle ILOM Fault Management Shell"

1) Log in to the ILOM CLI and type the command:

-> set /SYS/MB/P0/D0 clear_fault_action=true

Repeat this for each faulted DIMM (typically all of them).

Or, use the ILOM Fault Management Shell (either method works; use whichever you prefer):

-> start /SP/faultmgmt/shell
faultmgmtsp> fmadm faulty (lists all open fault events/components)
faultmgmtsp> fmadm repaired [UUID OR COMPONENT]

Repeat for each faulted DIMM. Note that you can use either "repair" or "repaired"; both inform the ILOM Fault Manager that the issue was fixed without replacing hardware.

faultmgmtsp> exit
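Because the clear command has to be repeated for every faulted DIMM, it can help to generate the list of commands from captured fault output instead of typing each one. The following Python sketch only builds the command strings for review and pasting into the ILOM CLI; it does not connect to the ILOM. It assumes the fault text uses the "FRU : /SYS/..." field shown in the Symptoms section, and the function name is hypothetical.

```python
import re

# Assumption: each fault record carries a line or field of the form
# "FRU : /SYS/MB/P0/D1", as in the sample record in this document.
FRU_RE = re.compile(r"FRU\s*:\s*(/SYS/\S+)")


def clear_fault_commands(fault_text):
    """Build one 'set <FRU> clear_fault_action=true' command per unique FRU."""
    seen = []
    for fru in FRU_RE.findall(fault_text):
        if fru not in seen:  # keep first-seen order, skip duplicates
            seen.append(fru)
    return [f"set {fru} clear_fault_action=true" for fru in seen]


if __name__ == "__main__":
    captured = "FRU : /SYS/MB/P0/D0\nFRU : /SYS/MB/P0/D1\nFRU : /SYS/MB/P0/D0\n"
    for command in clear_fault_commands(captured):
        print(command)
```

Review the generated commands before use so that only DIMMs affected by the transient tempsensor-failed fault are cleared.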
The system's Service LED should turn off once all faults are cleared. To confirm, reset the SP:

-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP

NOTE - The ILOM will be temporarily unavailable while it reboots (approximately 1-3 minutes). There is no impact or effect on the host.

References
<NOTE:1966568.1> - Platinum Service Delivery - Password Management for Customers starting OASG version 4.0
<BUG:15758839> - SUNBT7117637 FAULT.MEMORY.INTEL.DIMM.TEMPSENSOR-FAILED SEEN DURING EXADATA FW UP
<BUG:17263114> - ALL DIMMS ARE FAULTED - FAULT.MEMORY.INTEL.DIMM.TEMPSENSOR-FAILED
https://docs.oracle.com/cd/E19860-01/E21549/z400015e1400653.html
<NOTE:1309092.1> - How to use the Oracle ILOM 3.x Fault Management Shell

Attachments
This solution has no attachment