Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1390664.1
Update Date:2016-08-21

Solution Type  Problem Resolution Sure

Solution  1390664.1 :   Exadata Storage/Cell Node Hung and Rebooted Due to Temporary IO Stall Caused by Drive Medium Errors  


Related Items
  • Exadata Database Machine V2
Related Categories
  • PLA-Support>Sun Systems>x86>Engineered Systems HW>SN-x64: EXADATA




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-5117237331>

Applies to:

Exadata Database Machine V2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
***Checked for relevance on 15-Oct-2013***

Symptoms

The Exadata storage (cell) node hangs completely because of a temporary I/O stall caused by drive medium errors, and the system is then rebooted by a forced power cycle.

The command "cellcli -e list alerthistory" lists the following alert:

63 2011-12-29T01:42:10-02:00 info "IO hang detected on CD_09_cell06. Power cycle forced."
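
When the alert history is long, the matching alert can be picked out with a short script. The following is a minimal sketch, not an Oracle-provided tool: it assumes the output of "cellcli -e list alerthistory" has been saved as text and that each row has the layout of the sample above (ID, timestamp, severity, quoted message). The function name find_io_hang_alerts is hypothetical.

```python
import re

# Assumed row layout, inferred only from the sample alert in this note:
#   <id> <ISO-8601 timestamp> <severity> "<message>"
ALERT_ROW = re.compile(
    r'^\s*(?P<id>\S+)\s+(?P<time>\S+)\s+(?P<severity>\S+)\s+"(?P<message>.*)"\s*$'
)


def find_io_hang_alerts(alerthistory_text):
    """Return the parsed rows whose message reports an IO hang."""
    hits = []
    for line in alerthistory_text.splitlines():
        match = ALERT_ROW.match(line)
        if match and "IO hang detected" in match.group("message"):
            hits.append(match.groupdict())
    return hits


# The alert row quoted in this note.
sample = (
    '63 2011-12-29T01:42:10-02:00 info '
    '"IO hang detected on CD_09_cell06. Power cycle forced."'
)
alerts = find_io_hang_alerts(sample)
for alert in alerts:
    print(alert["id"], alert["time"], alert["message"])
```

Rows that do not follow the assumed layout are skipped rather than reported as errors.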


The output of "ipmitool sel list" contains the following event (with the message "OEM record c0") at the time of the failure:

214 | 12/29/2011 | 01:42:10 | OEM record c0 | 004301 | 97cd9c999865
215 | 12/29/2011 | 01:42:11 | System Boot Initiated | System Restart | Asserted
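
The same pattern can be searched for in a saved copy of the SEL output. The sketch below assumes the pipe-separated layout shown in the two rows above and looks for an "OEM record c0" row immediately followed by a "System Boot Initiated" row; the helper names are hypothetical. It also splits the last field into bytes, which in this example are the same six bytes (97 cd 9c 99 98 65) that end the "Detail:" field of the ms-odl.trc entries.

```python
def parse_sel_line(line):
    """Split one SEL row on '|' and strip the whitespace around each field."""
    return [field.strip() for field in line.split("|")]


def find_forced_power_cycles(sel_text):
    """Return (oem_row, boot_row) pairs of consecutive matching SEL rows."""
    rows = [parse_sel_line(l) for l in sel_text.splitlines() if l.strip()]
    pairs = []
    for current, following in zip(rows, rows[1:]):
        if (
            len(current) > 3
            and len(following) > 3
            and current[3] == "OEM record c0"
            and following[3] == "System Boot Initiated"
        ):
            pairs.append((current, following))
    return pairs


def hex_bytes(hex_string):
    """Split a hex string such as '97cd9c999865' into two-character bytes."""
    return [hex_string[i:i + 2] for i in range(0, len(hex_string), 2)]


# The two SEL rows quoted in this note.
sample = """\
214 | 12/29/2011 | 01:42:10 | OEM record c0 | 004301 | 97cd9c999865
215 | 12/29/2011 | 01:42:11 | System Boot Initiated | System Restart | Asserted
"""
pairs = find_forced_power_cycles(sample)
for oem_row, boot_row in pairs:
    print("OEM record", oem_row[0], "at", oem_row[2], "bytes:", hex_bytes(oem_row[5]))
    print("followed by boot record", boot_row[0], "at", boot_row[2])
```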


The file "$CELLTRACE/ms-odl.trc" shows the following entries (timestamped after the cell node reboot):

[2011-12-29T09:50:04.539-02:00] [ossmgmt] [NOTIFICATION] [] [common.hwadapter.HardwareImpl] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] Adding alert: time: 1325130130000 msg: OEM Record :: IO hang detected. Force Power cycle. Detail: 43 0 - 97 cd - 9c 99 - 98 65
[2011-12-29T09:50:04.549-02:00] [ossmgmt] [NOTIFICATION] [] [ms.hwadapter.MSHardwareImpl] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] IO hang detected on CD_09_cell06. Power cycle forced. Detail: 43 0 - 97 cd - 9c 99 - 98 65
[2011-12-29T09:50:04.550-02:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSAlertHistory] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] AlertHistory 63 created. Severity: info. Message: IO hang detected on CD_09_dm02cel13. Power cycle forced.
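Although these entries were written at 09:50, after the reboot, the "time:" value in the first entry appears to carry the original event time. The sketch below assumes that value is a Unix epoch in milliseconds (an inference from this example, not a documented format) and converts it using the cell's UTC offset of -02:00. The function name alert_event_time is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone


def alert_event_time(trace_line, utc_offset_hours):
    """Convert the 'Adding alert: time: <n>' value to a local datetime.

    Assumes <n> is a Unix epoch in milliseconds.
    """
    match = re.search(r"Adding alert: time: (\d+)", trace_line)
    if match is None:
        return None
    epoch_ms = int(match.group(1))
    cell_tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_ms / 1000, tz=cell_tz)


# Abbreviated copy of the first trace entry quoted in this note.
line = (
    "[2011-12-29T09:50:04.539-02:00] [ossmgmt] [NOTIFICATION] "
    "Adding alert: time: 1325130130000 msg: OEM Record :: IO hang detected."
)
event = alert_event_time(line, -2)
print(event.isoformat())  # 2011-12-29T01:42:10-02:00
```

Under that assumption the result matches the timestamp of alert 63 and of SEL record 214, which ties the trace entry to the forced power cycle.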



Cause

This issue is caused by the following unpublished bug:
<Bug: 12626126> STBH:IO HANG DETECTED AND POWER CYCLE FORCED

Solution

This issue has been fixed in the following patch:

   <Patch: 13517481> EXADATA COLLECTION OF ONE OFFS FOR RELEASE OLDER THAN 11.2.2.4.2

 

Install <Patch 13517481> by following the installation instructions in the patch README.

Details about the inclusion of the fix for bug 12626126 are documented in the following alert note:

   <Document: 1386617.1> ALERT - FLASH CARDS OFFLINE AFTER 6 MONTHS OF UPTIME

References

<BUG:12626126> - STBH:IO HANG DETECTED AND POWER CYCLE FORCED
<NOTE:1386617.1> - ALERT - FLASH CARDS OFFLINE AFTER 6 MONTHS OF UPTIME

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.