Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

2238782.1 : Alert: ODA X5-2 Physical Disk(s) Which are Good May Be Dropped from ASM Because of an IO Error
Because of disk firmware bug 25114213, on the ODA X5-2 platform ASM may drop a disk because of an IO error even though the disk itself is good. The same problem can occur for more than one disk. Evidence includes OS messages such as "issue target reset:" and "_scsi_send_scsi_io: timeout". The ASM alert.log may show "Time waited on I/O: 0 usec", along with the disk being offlined and not automatically brought back online.
Applies to:
Oracle Database Appliance X5-2 - Version All Versions to All Versions [Release All Releases]
Linux x86-64

This happens only on the ODAHA X5-2 platform using 8T disks at version 12.1.2.9 or earlier.

Description
Because of disk firmware bug 25114213, on the ODA X5-2 platform ASM may drop a disk because of a transient, non-terminal IO error even though the disk itself is physically good.

Occurrence
This problem only happens on the ODA X5-2 platform at version 12.1.2.9 or earlier.

Symptoms
Disk checks show that the physical disk is good, and the disk can be added back manually to ASM.
Various OS/HW checks show that the disk is good even though it has been dropped from the ASM diskgroups.

OS Messages
sd X:0:X:0: device_blocked, handle(0x000d)
kernel: mpt3sas0: log_info(0x31120101):originator(PL), code(0x12), sub_code(0x0101)
[sdXX] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
mpt3sas0: issue target reset: handle(0x000d)
mpt3sas0: _scsi_send_scsi_io: timeout
mpt3sas0: TEST_UNIT_READY: handle(0x002c), lun(0)
ASM alert log
WARNING: Write Failed. group:3 disk:30 AU:2 offset:602112 size:4096
path:/dev/mapper/HDD_E1_S14_1XXXXp2
incarnation:0xe96894db synchronous result:'I/O error'
subsys:System krq:0x7f7918ff3908 bufp:0x7be8f000 osderr1:0x69b5 osde
IO elapsed time: 0 usec Time waited on I/O: 0 usec
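
As a first-pass triage aid, the signature strings quoted in this alert can be searched for in saved copies of the OS messages and the ASM alert log. The following is a minimal sketch, not an Oracle-supplied tool: the function names and sample lines are hypothetical, the matching is plain substring search, and a match only indicates that the logs are consistent with this alert. It does not replace the OS log and OSW review or the disk checks described in this document.

```python
# Signature strings taken from this alert (OS messages and ASM alert log).
OS_SIGNATURES = ("issue target reset:", "_scsi_send_scsi_io: timeout")
ASM_SIGNATURE = "Time waited on I/O: 0 usec"


def find_signatures(lines, signatures):
    """Return (line_number, signature, text) for every line containing a signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for signature in signatures:
            if signature in line:
                hits.append((number, signature, line.rstrip()))
    return hits


def matches_alert_signatures(os_lines, asm_lines):
    """True only when both the OS log and the ASM alert log show a signature.

    Assumption: requiring both sources reduces false positives, because
    "Write Failed" messages alone can be related to multiple issues.
    """
    os_hits = find_signatures(os_lines, OS_SIGNATURES)
    asm_hits = find_signatures(asm_lines, (ASM_SIGNATURE,))
    return bool(os_hits) and bool(asm_hits)


# Hypothetical sample input, modelled on the excerpts in this alert.
os_sample = [
    "kernel: mpt3sas0: issue target reset: handle(0x000d)",
    "kernel: mpt3sas0: _scsi_send_scsi_io: timeout",
]
asm_sample = [
    "WARNING: Write Failed. group:3 disk:30 AU:2 offset:602112 size:4096",
    "IO elapsed time: 0 usec Time waited on I/O: 0 usec",
]

for hit in find_signatures(os_sample, OS_SIGNATURES):
    print(hit)
print(matches_alert_signatures(os_sample, asm_sample))
```

In practice the line lists would come from reading the relevant log files; their locations vary by installation and are not specified here.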
Note: "Write Failed" messages in the ASM alert log can be related to multiple issues. To confirm that you have hit this problem, review the OS log and OSW (no outstanding IO on any disks at the time of the error).

Workaround
Manually online the disk after confirming that the disk is good using oakcli stordiag e#_pd_<slot#> e.g.

Patches
ODA 12.1.2.11 and higher include the fixed disk firmware.

History
28-Feb-2017 Created.
03-Mar-2017 Reviewed with minor editorial changes and added comments for clarification.
19-Jul-2017 Minor changes to sentence structure; changed the fixed version to 12.1.2.11.0.

References
<BUG:25114213> - MULTIPLE DISK OFFLINE FROM ASM IN SHORT PERIOD

Attachments
This solution has no attachment