Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-77-2182832.1
Update Date:2017-04-25
Keywords:

Solution Type  Sun Alert Sure

Solution  2182832.1 :   ALERT - ODA ASM Diskgroup Suddenly Dismounting with Potential Corruption  


Related Items
  • Oracle Database Appliance X4-2
  • Oracle Database Appliance
  • Oracle Database - Enterprise Edition
  • Oracle Database Appliance X5-2
  • Oracle Database Appliance X3-2
  • Oracle Database Appliance Software
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Database Appliance>DB: ODA_EST




In this Document
Description
Occurrence
Symptoms
Workaround
Patches
 ASM rebalance will cause disk offline unexpected (Doc ID 2174728.1)
History
References


Applies to:

Oracle Database - Enterprise Edition - Version 12.1.0.1 to 12.1.0.2 [Release 12.1]
Oracle Database Appliance X5-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X4-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X3-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
ODA, AU:134660, ASM,

Description

When an ASM allocation cannot find two contiguous Allocation Units (AUs) on a disk for a volume file allocation (2 x AU), ASM triggers a disk defragmentation operation to find contiguous space. During that check, the OS may return an I/O error. ASM interprets the AU error as an I/O error and offlines the disk. If too many disks are offlined at the same time, the diskgroup itself can be dismounted.

Occurrence

 Any Oracle Database Appliance hardware platform running ODA version 12.1.2.x, from 12.1.2.0.0 up to and including 12.1.2.8.0. *

NOTE: Once the patch for bug 19818513 has been applied, roll it back before upgrading to 12.1.2.9 (or higher) to avoid a patch conflict, because the fix is already included in the newer versions.

 

  • V1, X3-2, X4-2 and X5-2
  • ODA version 12.1.2.x
  • ASM diskgroup: DATA, RECO, REDO or FLASH with a total ASM disk size showing an attempted AU allocation which cannot be divided by 8.


*  This applies to the 12.x ASM database version and has not been seen on version 11.x.

Symptoms

ASM DATA, RECO, REDO or FLASH diskgroups may dismount unexpectedly. When the diskgroup is mounted again, the same diskgroup dismounts again.
The ASM alert.log will show one or more disks in the ASM diskgroup as offline because of an I/O write error.

ODA systems affected by this issue will see disks go offline, and possibly diskgroups dismount, during a rebalance.


The evidence is in an ASM trace file that is referenced in the ASM alert.log.

e.g. ASM Alert.log
...

...
ORA-15130: diskgroup "" is being dismounted
ORA-15066: offlining disk "HDD_E0_S0#_1565793848P2" in group "RECO" may result in a data loss
...

 

While _most cases_ of this bug will have this signature, the same symptom can also be caused by other issues.

subsys:System krq:0x7efc2ce4a920 bufp:0x7efc2cf2e000 osderr1:0x69b5 osderr2:0x0


To confirm that this bug has been hit, the "AU:" value in the related trace file must be the last AU on the disk.

Example from ASM arb trace file

e.g. ASM1_arb0_1234.trc:

WARNING: Write Failed. group:3 disk:4 AU:134660 offset:7340032 size:1048576   <- The AU:134660 is the Last AU
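
The failing AU number can be pulled out of such a trace line with a short script. The following is a minimal sketch in Python; the pattern is an assumption based only on the single example line shown above, and the function name parse_write_failed is hypothetical, not part of any Oracle tool.

```python
import re

# Pattern modelled on the one example line in this note (an assumption;
# other releases may format the message differently).
WRITE_FAILED = re.compile(
    r"WARNING: Write Failed\. group:(\d+) disk:(\d+) AU:(\d+) "
    r"offset:(\d+) size:(\d+)"
)


def parse_write_failed(line):
    """Return the numeric fields of a 'Write Failed' line, or None."""
    match = WRITE_FAILED.search(line)
    if match is None:
        return None
    keys = ("group", "disk", "au", "offset", "size")
    return dict(zip(keys, (int(v) for v in match.groups())))


sample = ("WARNING: Write Failed. group:3 disk:4 AU:134660 "
          "offset:7340032 size:1048576")
fields = parse_write_failed(sample)
print(fields["au"])  # 134660
```

The extracted "au" value is the number to compare against the last AU of the disk.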

How to confirm it is the last AU?

ASMCMD> lsdsk -k -G REDO    <-- or the related diskgroup: DATA, RECO or FLASH

 

Take the ASM disk size (DISK_SIZE) from the output and calculate DISK_SIZE/4-1 to get the number of the last AU.
In our example the failing AU is 134660.
If the ASM disk size is 538644M, the last AU is 538644/4-1 = 134660. This confirms that the failed write I/O was on the last AU, which means the bug has been hit.
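
The arithmetic above can be sketched as a small Python check. The divisor of 4 is taken from the calculation in this note and is assumed here to be the allocation unit size in MB; the function names last_au and is_last_au are hypothetical.

```python
def last_au(disk_size_mb, au_size_mb=4):
    """Number of the last AU on a disk, counting from 0.

    au_size_mb=4 mirrors the DISK_SIZE/4-1 formula in this note
    (assumed to be a 4 MB allocation unit).
    """
    return disk_size_mb // au_size_mb - 1


def is_last_au(failed_au, disk_size_mb, au_size_mb=4):
    """True if the AU from the trace file is the last AU on the disk."""
    return failed_au == last_au(disk_size_mb, au_size_mb)


print(last_au(538644))             # 134660
print(is_last_au(134660, 538644))  # True
```

If the check returns False, the write failure was not on the last AU and another cause should be investigated.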

Workaround

 None

Patches

This issue is fixed in ODA 12.1.2.9. For older ODA versions, an OOB patch can be applied.

 Download and apply the fix for internal bug 19818513 to prevent the problem from occurring.

If the customer has already hit the issue, refer to the following note for how to mount the diskgroup:

ASM rebalance will cause disk offline unexpected (Doc ID 2174728.1)

 

Also: PSEs are available for all of the following versions

24607812 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.160119 FOR LINUX X86-6
24621795 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.160419 FOR LINUX X86-6
24691970 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.160719 FOR LINUX X86-6
24625494 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.161018 FOR LINUX X86-6
24692063 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.2 FOR LINUX X86-64 [22
24692064 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.3 FOR LINUX X86-64 [22
24711940 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.4 FOR LINUX X86-64 [22
24685039 - (O) - 93 - PSE FOR BASE BUG 19818513 ON TOP OF DATABASE PSU 12.1.0.2.5 FOR LINUX X86-64 [22

 

History

14-Sep-2016     First draft internal review
   - Comment: OOB already filed and created for this specific problem

15-Sep-2016 Reviewed by Jiong, Tammy and Ravi R
16-Sep-2016 Final verbiage completed
    - waiting for the Fix to be made available prior to posting externally

13-Oct-2016  Added PSE list for internal review plus added comments that PSEs are needed for 12.1.x versions at this time

10-Oct-2016  per comment added database version range of 12.x as previously the range was not defined although inferred as 12.x ODA version which requires 12.x ASM database version

 

27-Sep-2016 Published

References

<BUG:22596552> - REPEATED DISMOUNT OF A DISKGROUP REDO WITH A HIGH REDUNDANCY
<BUG:24493735> - ODA : NOT ABLE TO MOUNT DISKGROUP ORA-15130 ORA-15066
<BUG:24568341> - ODA RECO DISKGROUP PRESENTS 5 MISSING FAILGROUPS
<BUG:24602446> - +REDO HAS BEEN DISMOUNTED WHEN PULL OUT ONE A SSD
<BUG:21037224> - ODA: REDO DG BEING DISMOUNTED VIA PST
<BUG:19818513> - ASM ARB0 HIT ORA-700[KFFRELOCATE...]/ORA-600[KFDVAVOPINITAPE_NXTN
<BUG:22294722> - ODA REDO DISKGROUP DISMOUNTED DUE TO WRITE ERRORS ON SSD DISKS
<BUG:23534228> - UNABLE TO MOUNT FLASH DISK GROUP IN ODA

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.