Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2332807.1
Update Date:2017-12-28
Keywords:

Solution Type  Problem Resolution Sure

Solution  2332807.1 :   MiniCluster: Disk Failure of HDD1 Flags All Disks Prefixed with "HDD1" as Faulty  


Related Items
  • MiniCluster S7-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>MiniCluster>DB: MiniCluster_EST




Created from <SR 3-16214991351>

Applies to:

MiniCluster S7-2 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

When the disk in slot HDD1 fails, MCMU flags every disk whose name begins with "HDD1" (HDD10, HDD11, and so on) as FAULTY:

% mcmu diskutil -s
[INFO ] Log file path : xxxxx-n1:/var/opt/oracle.minicluster/setup/logs/mcmu_112017_072713.log

[INFO ] Invoked by OS user: root
[INFO ] Find log at: xxxxx-n1:/var/opt/oracle.minicluster/setup/logs/omc_diskutil_112017_072713.log
[INFO ] Set environment variable NOINUSE_CHECK started.
[INFO ] Set environment variable NOINUSE_CHECK succeeded.
DISK PATH LOCAL/JBOD STATE FAULT ERROR
/HDD0 /dev/dsk/c0t5000CCA0803C3110d0 JBOD OK -
/HDD1 /dev/dsk/c0t5000CCA0803C2978d0 JBOD OK -
/HDD2 /dev/dsk/c0t5000CCA0803C14D4d0 JBOD OK -
/HDD3 /dev/dsk/c0t5000CCA0803C5B38d0 JBOD OK -
/HDD4 /dev/dsk/c0t5000CCA0803C5010d0 JBOD OK -
/HDD5 /dev/dsk/c0t5000CCA0803C606Cd0 JBOD OK -
/HDD6 - JBOD OK -
/HDD7 - JBOD OK -
//SYS/MB/EUSB_DISK /dev/dsk/c1t0d0 Local OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD0 /dev/dsk/c0t5000CCA260CFDA54d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD1 /dev/dsk/c0t5000CCA260D02284d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD2 /dev/dsk/c0t5000CCA260D036E4d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD3 /dev/dsk/c0t5000CCA260CFE240d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD4 /dev/dsk/c0t5000CCA260CFE3ACd0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD5 /dev/dsk/c0t5000CCA260D00308d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD6 /dev/dsk/c0t5000CCA0536DD53Cd0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD7 /dev/dsk/c0t5000CCA0536DC8A0d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD8 /dev/dsk/c0t5000CCA0536DC728d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD9 /dev/dsk/c0t5000CCA0536DB858d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD10 /dev/dsk/c0t5000CCA0536DC868d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD11 /dev/dsk/c0t5000CCA0536DB82Cd0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD12 /dev/dsk/c0t5000CCA0536DCB58d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD13 /dev/dsk/c0t5000CCA0536DD190d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD14 /dev/dsk/c0t5000CCA0536DC30Cd0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD15 /dev/dsk/c0t5000CCA0536DD6B4d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD16 /dev/dsk/c0t5000CCA0536DBA34d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD17 /dev/dsk/c0t5000CCA0536DBB04d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD18 /dev/dsk/c0t5000CCA0536DD284d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD19 /dev/dsk/c0t5000CCA0536DC2C0d0 JBOD FAULTY faulty

 

Cause

If a disk named "HDD1" fails and is reported by "fmadm list-fault -fs", then "mcmu diskutil -l" also flags every disk whose name begins with "HDD1" (e.g. HDD10, HDD11) as faulty, because the pattern matching is not strict enough.

 

# fmadm list-fault -fs
"/ORACLE-DE3-24C.1704NMQ03Y/HDD1" (hc://:chassis-mfg=Oracle-Corporation:chassis-name=ORACLE-DE3-24C:chassis-part=7319827:chassis-serial=1704NMQ03Y:fru-mfg=HGST:fru-name=H7280A520SUN8.0T:fru-serial=001653PPGGMV--------VLKPGGMV:fru-part=HGST-H7280A520SUN8.0T:fru-revision=P9E2:devid=id1,sd@n5000cca260d02284/ses-enclosure=0/bay=1/disk=0) faulty
36de1b1b-474c-4423-9f34-f50453669250 1 suspects in this FRU total certainty 100%
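
The over-matching described above can be illustrated with a short Python sketch. This is not the mcmu source, which this note does not show. The function names flag_by_prefix and flag_exact are hypothetical, and the 24-slot disk list is built from the enclosure name seen in the output. The sketch only contrasts a loose prefix comparison with a strict whole-name comparison against the location reported by fmadm.

```python
# Hypothetical reconstruction for illustration; the real mcmu code is not shown in this note.
faulted = ["/ORACLE-DE3-24C.1704NMQ03Y/HDD1"]  # location as reported by fmadm list-fault -fs

# Disk names as they appear in the mcmu diskutil listing (HDD0 .. HDD23).
disks = ["ORACLE-DE3-24C.1704NMQ03Y/HDD%d" % i for i in range(24)]


def flag_by_prefix(disk, faulted_locations):
    """Loose match: flags any disk whose name starts with a faulted location."""
    return any(("/" + disk).startswith(loc) for loc in faulted_locations)


def flag_exact(disk, faulted_locations):
    """Strict match: the whole location string must be equal."""
    return ("/" + disk) in faulted_locations


loose = [d.rsplit("/", 1)[1] for d in disks if flag_by_prefix(d, faulted)]
strict = [d.rsplit("/", 1)[1] for d in disks if flag_exact(d, faulted)]
print("prefix match:", loose)
print("exact match: ", strict)
```

With the prefix comparison, HDD1 and HDD10 through HDD19 are all flagged, which matches the symptom; the strict comparison flags HDD1 only.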


In reality, only one device, HDD1, is reported as faulted:

root@xxxxx-n1:~# fmadm faulty
-
-
  Problem class : fault.io.disk.predictive-failure
  Certainty : 100%
  Affects : dev:///:devid=id1,sd@n5000cca260d02284//scsi_vhci/disk@g5000cca260d02284
  Status : faulted but still in service
  FRU
  Status : faulty
  Location : "/ORACLE-DE3-24C.1704NMQ03Y/HDD1"
  Location Alias : "/JBODARRAY1/HDD1"
  Manufacturer : HGST
  Name : H7280A520SUN8.0T
  Part_Number : HGST-H7280A520SUN8.0T
  Serial_Number : 001653PPGGMV--------VLKPGGMV
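
To separate the genuine fault from the spurious ones, the FAULTY rows of the diskutil listing can be compared with the Location values from "fmadm faulty". The following Python sketch assumes both outputs have already been captured as text; it does not run mcmu or fmadm itself. The helper names faulty_disks and split_by_fmadm are hypothetical, and the sample rows are copied from the listing in the Symptoms section.

```python
def faulty_disks(diskutil_output):
    """Return the disk names whose row contains the FAULTY state."""
    names = []
    for line in diskutil_output.splitlines():
        fields = line.split()
        if "FAULTY" in fields:  # exact token match; lowercase "faulty" is the error column
            names.append(fields[0])
    return names


def split_by_fmadm(flagged, fmadm_locations):
    """Separate disks confirmed by FMA from disks flagged only by mcmu."""
    confirmed_set = {loc.lstrip("/") for loc in fmadm_locations}
    confirmed = [d for d in flagged if d.lstrip("/") in confirmed_set]
    suspect = [d for d in flagged if d.lstrip("/") not in confirmed_set]
    return confirmed, suspect


# Three rows taken from the diskutil listing in this note.
sample = """\
ORACLE-DE3-24C.1704NMQ03Y/HDD0 /dev/dsk/c0t5000CCA260CFDA54d0 JBOD OK -
ORACLE-DE3-24C.1704NMQ03Y/HDD1 /dev/dsk/c0t5000CCA260D02284d0 JBOD FAULTY faulty
ORACLE-DE3-24C.1704NMQ03Y/HDD10 /dev/dsk/c0t5000CCA0536DC868d0 JBOD FAULTY faulty
"""

confirmed, suspect = split_by_fmadm(
    faulty_disks(sample), ["/ORACLE-DE3-24C.1704NMQ03Y/HDD1"]
)
print("confirmed by fmadm:", confirmed)
print("flagged by mcmu only:", suspect)
```

Disks that land in the second list are candidates for the false positive described in this note rather than for replacement.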

 

Solution

This is a bug in the mcmu code. It is fixed by Bug 27159826 in MiniCluster version 1.2.4.6.

Workaround

==============

The spurious FAULTY states cleared (FAULTY to OK) once HDD1 was replaced:

% mcmu diskutil -l
[INFO ] Log file path : xxxxx-n1:/var/opt/oracle.minicluster/setup/logs/mcmu_112717_023855.log

[INFO ] Invoked by OS user: root
[INFO ] Find log at: xxxxx-n1:/var/opt/oracle.minicluster/setup/logs/omc_diskutil_112717_023856.log
[INFO ] Set environment variable NOINUSE_CHECK started.
[INFO ] Set environment variable NOINUSE_CHECK succeeded.
DISK STATE
/HDD0 OK
/HDD1 OK
/HDD2 OK
/HDD3 OK
/HDD4 OK
/HDD5 OK
/HDD6 OK
/HDD7 OK
//SYS/MB/EUSB_DISK OK
ORACLE-DE3-24C.1704NMQ03Y/HDD0 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD1 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD2 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD3 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD4 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD5 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD6 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD7 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD8 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD9 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD10 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD11 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD12 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD13 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD14 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD15 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD16 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD17 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD18 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD19 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD20 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD21 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD22 OK
ORACLE-DE3-24C.1704NMQ03Y/HDD23 OK



References

<BUG:27159826> - MCMU DISKUTIL -L WILL MARK TOO MANY DISKS AS FAILED
<NOTE:2239536.1> - MiniCluster MC-S7 Data to Collect for MCMU Hardware and Software Issues
<NOTE:2182293.1> - Oracle Storage DE3-24C - How to replace a Faulted HDD/SSD [VCAP]

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.