Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type Problem Resolution Sure Solution 1420126.1 : ODA (Oracle Database Appliance) Different Disks Randomly Disappear After a Reboot
Created from <SR 3-5334656327>

Applies to:
Oracle Database Appliance - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X4-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X3-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance Software - Version 2.1.0.2 to 12.1.2.9 [Release 2.1 to 12.1]
Information in this document applies to any platform.
***Checked for relevance on 21-Oct-2013***

Symptoms
FOR SUPPORT ONLY - there are many flavors of this problem across several ODA versions. Please review the INTERNAL information at the bottom of this note for further discussion and information.
NOTE: This problem was originally associated with an issue alerted in 2.1.0.0 and fixed in 2.1.0.3.1.
However, the symptoms are more generic and can occur on almost any ODA version. Seeing them does not mean you are hitting the bug fixed in 2.1.0.3.1. It does mean that the same corrective actions can be used for the same or similar symptoms on almost all versions.
Example: For 2.1.0 the symptoms were as follows:
The problematic node has been rebooted several times and has come back up with different disks missing each time:

DATA dg - missing disks
---------------
/dev/mapper/HDD_E1_S19_993871319p1
/dev/mapper/HDD_E1_S11_1196820151p1
/dev/mapper/HDD_E0_S13_1196881379p1
/dev/mapper/HDD_E0_S04_1196963151p1

RECO dg - missing disks
---------------------
/dev/mapper/HDD_E1_S19_993871319p2
/dev/mapper/HDD_E1_S11_1196820151p2
/dev/mapper/HDD_E0_S13_1196881379p2
/dev/mapper/HDD_E0_S04_1196963151p2

Missing disks before a reboot included pd_04; pd_11; pd_13; pd_19 (RECO and DATA).

However, after rebooting the node you confirm that different disks are missing:

ASMCMD> mount all
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "21" is missing from group number "3"
ORA-15042: ASM disk "20" is missing from group number "3"
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "17" is missing from group number "2"
ORA-15042: ASM disk "16" is missing from group number "2"
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "17" is missing from group number "1"
ORA-15042: ASM disk "16" is missing from group number "1" (DBD ERROR: OCIStmtExecute)

Missing disks after the reboot included pd_21; pd_22 (REDO); and pd_16; pd_17 (RECO and DATA).

ls -l /dev/mapper/HDD* is a quick way to confirm which HDD disks are available.
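When the mount output is long, it can help to reduce it to a list of missing disks per disk group. The following is a minimal sketch, not part of the original note: it assumes the ASMCMD output has been saved to a file, and the function name and log path are hypothetical.

```shell
# Sketch only: list the disks reported missing, ordered by disk group number.
# Reads saved "mount all" output on stdin. The function name is hypothetical.
#
# Steps in the pipeline:
#   1. grep -o extracts each ORA-15042 message, even if several share a line.
#   2. sed rewrites each message as "group <n>: disk <m>".
#   3. sort orders by group number, then disk number, and drops repeats.
missing_asm_disks() {
    grep -o 'ORA-15042: ASM disk "[0-9]*" is missing from group number "[0-9]*"' |
        sed 's/.*disk "\([0-9]*\)".*group number "\([0-9]*\)"/group \2: disk \1/' |
        sort -u -k2,2n -k4,4n
}

# Example (the log path is an assumption):
#   missing_asm_disks < /tmp/asmcmd_mount_all.log
```

Running this before and after a reboot and comparing the two lists makes it easier to see that the set of missing disks has changed.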
Note: Your counts should take the hardware version into account, as V1, X3-2 and X4-2 ODAs have different disk counts.
Also: use SSD in place of HDD for the REDO disks, or *D* to include both the SSD and HDD counts.

Example - this shows two different counts on the same ODA:

[grid@svp-oda1 ~]$ ls -l /dev/mapper/HDD* |wc -l
57
[root@svp-oda2 ~]# ls -l /dev/mapper/HDD* |wc -l
51

Node 1 ------------ 57 disks
Node 2 ---------- 51 disks

As a result of these missing disks, ASM disks and Grid are not coming up.

Commands for determining missing disks:

# oakcli show disk
NAME   PATH       TYPE  STATE   STATE_DETAILS
pd_00  /dev/sdam  HDD   ONLINE  Good
pd_01  /dev/sdaw  HDD   ONLINE  Good
pd_02  /dev/sdaa  HDD   ONLINE  Good
pd_03  /dev/sdak  HDD   ONLINE  Good
pd_04  /dev/sdan  HDD   ONLINE  Good
pd_05  /dev/sdax  HDD   ONLINE  Good
pd_06  /dev/sdab  HDD   ONLINE  Good
pd_07  /dev/sdal  HDD   ONLINE  Good
pd_08  /dev/sdao  HDD   ONLINE  Good
pd_09  /dev/sdau  HDD   ONLINE  Good
pd_10  /dev/sdac  HDD   ONLINE  Good
pd_11  /dev/sdai  HDD   ONLINE  Good
pd_12  /dev/sdap  HDD   ONLINE  Good
pd_13  /dev/sdav  HDD   ONLINE  Good
pd_14  /dev/sdad  HDD   ONLINE  Good
pd_15  /dev/sdaj  HDD   ONLINE  Good
pd_16  /dev/sdaq  HDD   ONLINE  Good
pd_17  /dev/sdas  HDD   ONLINE  Good
pd_18  /dev/sdae  HDD   ONLINE  Good
pd_19  /dev/sdag  HDD   ONLINE  Good
pd_20  /dev/sdar  SSD   ONLINE  Good
pd_21  /dev/sdat  SSD   ONLINE  Good
pd_22  /dev/sdaf  SSD   ONLINE  Good
pd_23  /dev/sdah  SSD   ONLINE  Good

# oakcli show diskgroup data
ASM_DISK  PATH                                DISK   STATE   STATE_DETAILS
data_00   /dev/mapper/HDD_E0_S00_975071251p1  pd_00  ONLINE  Good
data_01   /dev/mapper/HDD_E0_S01_973074223p1  pd_01  ONLINE  Good
data_02   /dev/mapper/HDD_E1_S02_975283211p1  pd_02  ONLINE  Good
data_03   /dev/mapper/HDD_E1_S03_975067947p1  pd_03  ONLINE  Good
data_04   /dev/mapper/HDD_E0_S04_975277007p1  pd_04  ONLINE  Good
data_05   /dev/mapper/HDD_E0_S05_975080611p1  pd_05  ONLINE  Good
data_06   /dev/mapper/HDD_E1_S06_975276063p1  pd_06  ONLINE  Good
data_07   /dev/mapper/HDD_E1_S07_975284323p1  pd_07  ONLINE  Good
data_08   /dev/mapper/HDD_E0_S08_970712075p1  pd_08  ONLINE  Good
data_09   /dev/mapper/HDD_E0_S09_975061523p1  pd_09  ONLINE  Good
data_10   /dev/mapper/HDD_E1_S10_975282083p1  pd_10  ONLINE  Good
data_11   /dev/mapper/HDD_E1_S11_975281571p1  pd_11  ONLINE  Good
data_12   /dev/mapper/HDD_E0_S12_975274931p1  pd_12  ONLINE  Good
data_13   /dev/mapper/HDD_E0_S13_977596619p1  pd_13  ONLINE  Good
data_14   /dev/mapper/HDD_E1_S14_975053527p1  pd_14  ONLINE  Good
data_15   /dev/mapper/HDD_E1_S15_975284719p1  pd_15  ONLINE  Good
data_16   /dev/mapper/HDD_E0_S16_975268647p1  pd_16  ONLINE  Good
data_17   /dev/mapper/HDD_E0_S17_975283679p1  pd_17  ONLINE  Good
data_18   /dev/mapper/HDD_E1_S18_975281159p1  pd_18  ONLINE  Good
data_19   /dev/mapper/HDD_E1_S19_975279427p1  pd_19  ONLINE  Good
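A difference in counts between the two nodes shows that devices are missing, but not which ones. The following is a minimal sketch, not part of the original note, that compares saved listings from each node. It assumes both nodes present the shared disks under the same /dev/mapper names; the function name and file paths are hypothetical.

```shell
# Sketch only: print names found in the first listing but not in the second.
# Each argument is a file with one device name per line, for example saved with
#   ls -1 /dev/mapper/HDD* > /tmp/node1.lst    (file names are hypothetical)
# Assumes both nodes use the same /dev/mapper names for the shared disks.
devices_missing_on_second() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -F -x -v -f "$2" "$1" | sort
}

# Example: devices visible on node 1 but not on node 2
#   devices_missing_on_second /tmp/node1.lst /tmp/node2.lst
```

Using ls -1 rather than ls -l for the saved listings keeps timestamps and link targets out of the comparison, so only the device names are matched.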
Changes
This problem can occur after:
Cause
#1 If you are on ODA 2.1.x, this problem has been identified as a bug:
<Bug: 13728921> - PHYSICAL DISKS DISAPPEAR AFTER REBOOTING NODE - closed as a duplicate of <Bug: 13618428> - AFTER LOSING ONE ASM DISK, MULTIPLE DISKS BECAME UNRESPONSIVE
CR 7132662 - P1 erie/firmware Cluster outage resulted from a single HDD failure - X4370M2 with Erie

Solution
Apply the ODA 2.1.0.3.0 Patch Bundle (Patch 13622348),
then apply the ODA 2.1.0.3.1 Emergency Patch 13817532, a single patch applied on top of 2.1.0.3.0.
Please refer to: ODA (Oracle Database Appliance): The Steps to replace failing disks (Doc ID 1496114.1)
Add the missing entries (example for HDD S13).
You can check whether ASM is running the "rebalance":
References
<NOTE:1438089.1> - ALERT - Urgent Mandatory OAK Patch 2.1.0.3.1 for ODA - (Oracle Database Appliance)
<BUG:13728921> - PHYSICAL DISKS DISAPPEAR AFTER REBOOTING NODE 2
<BUG:13618428> - AFTER LOSING ONE ASM DISK, MULTIPLE DISKS BECAME UNRESPONSIVE
<NOTE:1496114.1> - ODA (Oracle Database Appliance): The Steps to replace multiple disks failing concurrently

Attachments
This solution has no attachment