
Asset ID: 1-77-2254892.1
Update Date: 2017-11-15
Keywords:

Solution Type: Sun Alert

Solution 2254892.1: (RA18) ZDLRA X6 storage server flash failure may lead to corruption in primary and/or secondary ASM mirror copies due to flash firmware issue


Related Items
  • Zero Data Loss Recovery Appliance Software
  • Zero Data Loss Recovery Appliance X6 Hardware
  • Oracle Exadata Storage Server Software
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>ZDLRA>DB: ZDLRA_EST




In this Document
Description
Occurrence
 Pre-requisite Conditions for Bug 25595250
Symptoms
Workaround
Patches
History
References


Applies to:

Zero Data Loss Recovery Appliance Software - Version 12.1.1.1.1 and later
Zero Data Loss Recovery Appliance X6 Hardware - Version All Versions and later
Oracle Exadata Storage Server Software - Version 12.1.2.3.1 to 12.1.2.3.3 [Release 12.1]
Linux x86-64

Description

Due to bug 25595250, a flash failure on an Exadata X6 storage server may lead to corruption in the primary and/or secondary ASM mirror copies, and the corruption may propagate to other storage servers during certain ASM rebalance operations.

Occurrence

Pre-requisite Conditions for Bug 25595250

The following conditions must exist for this issue to occur:

  1. The storage servers are ZDLRA X6 hardware. Earlier hardware generations are not affected.
  2. The Exadata software version on the storage servers is lower than 12.1.2.3.4 (a quick version check is shown after this list).
  3. A flash predictive failure occurs. Due to a flash firmware issue, database block corruptions may then occur, which may be encountered by the database and reported in the database alert log.
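
To check condition 2, the current software version of a storage server can be read from CellCLI. releaseVersion is a standard cell attribute; any value lower than 12.1.2.3.4 meets the condition. For example:

CellCLI> LIST CELL ATTRIBUTES releaseVersion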

Confirm that a flash disk predictive failure has occurred.

This information can be obtained from the storage server alert history.  For example:

CellCLI> list alerthistory
4_1 2017-03-18T04:12:38+01:00 critical "Flash disk entered predictive failure status. Status : WARNING - PREDICTIVE FAILURE Manufacturer : Oracle Model Number : Flash Accelerator F320 PCIe Card Size : 2981GB Serial Number : XXXXXXXXXXXXXX Firmware : KPYABR3Q Slot Number : PCI Slot: 5; FDOM: 1 Cell Disk : FD_03_dbm01celadm01 Grid Disk : Not configured Flash Cache : Present Flash Log : Present"

 

This issue applies ONLY to flash drive predictive failures.  A critical alert that reports "Flash disk failed. Status : FAILED" indicates a real flash drive failure, not a predictive failure.
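
To identify which flash disk is in predictive failure status, the physical disks can also be listed directly from CellCLI. The filter below is an illustrative regular-expression match; the exact status text may differ slightly between Exadata software releases:

CellCLI> LIST PHYSICALDISK WHERE diskType=FlashDisk AND status LIKE '.*predictive.*' DETAIL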

 

Symptoms

Symptoms include the following:

  1. After a flash drive enters predictive failure, database block corruption is reported for data on the storage server with the affected flash drive. The following is an example of what the database alert.log will contain:
Corrupt block relative dba: 0x01ebfd41 (file 9, block 32243009)
Fractured block found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x01ebfd41
 last change scn: 0x09c3.361fc6a0 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x8e237e58
 check value in block header: 0x15f5
 computed block checksum: 0x32a6
Reading datafile '+CATALOG/zdlra/datafile/dbfile.123.123456789' for corruption at rdba: 0x01ebfd41 (file 9, block 32243009)
Read datafile mirror 'CATALOG_CD_04_DBM01CELADM01' (file 9, block 32243009) found same corrupt data (no logical check)
Read datafile mirror 'CATALOG_CD_08_DBM01CELADM01' (file 9, block 32243009) found valid data
Hex dump of (file 9, block 32243009) in trace file /u01/app/oracle/diag/rdbms/zdlra/zdlra1/trace/zdlra1_ora_12345.trc
Repaired corruption at (file 9, block 32243009)
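
The extent of the reported corruption can also be cross-checked from the database. The commands below are a minimal sketch; the file and block numbers are taken from the alert.log example above and are illustrative only:

RMAN> VALIDATE DATAFILE 9 BLOCK 32243009;

SQL> SELECT file#, block#, blocks, corruption_type FROM v$database_block_corruption;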

Workaround

Refer to Exadata Critical Issue EX37, "X6 storage server flash failure may lead to corruption in primary and/or secondary ASM mirror copies due to flash firmware issue," for the detailed workaround procedure.

In Step 5 (Repair the corruption), Option 1 - Repair the corruption with ASM disk scrubbing, there is a reference to installing three bug fixes. These fixes are available for customers running 12.1.1.1.7 as part of Cumulative Patch 11, and for customers running 12.1.1.1.8 as part of the April-2017 Cumulative Patch 1, and can be downloaded via the link below. Until the appropriate patch is applied, repairing the corruption with ASM disk scrubbing is not possible.
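
Once the required fixes are installed, ASM disk scrubbing repairs affected blocks from the good mirror copy. The statements below are a minimal sketch; the disk group and disk names are taken from the alert.log example above and are illustrative only:

SQL> ALTER DISKGROUP CATALOG SCRUB DISK CATALOG_CD_04_DBM01CELADM01 REPAIR POWER HIGH;

SQL> ALTER DISKGROUP CATALOG SCRUB REPAIR POWER HIGH;

The second statement scrubs the entire disk group rather than a single disk.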

In Step 5 (Repair the corruption), Option 3 - Restore the affected files from backup is not applicable to ZDLRA.

 

Patches

To prevent this issue, update flash firmware by updating ZDLRA X6 storage servers to Exadata 12.1.2.3.4 or higher. 
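
After the update, the flash firmware level applied to each flash disk can be confirmed from CellCLI. physicalFirmware is a standard physical disk attribute; the command below is illustrative:

CellCLI> LIST PHYSICALDISK ATTRIBUTES name, physicalFirmware WHERE diskType=FlashDisk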

History

14-Apr-2017 - Created

17-Apr-2017 - Patch 25896906: PLACEHOLDER FOR 12.1.1.1.7 CUMULATIVE PATCH 11

17-Apr-2017 - Patch 25896731: ZDLRA Patch FOR 12.1.1.1.8.201704 CUMULATIVE PATCH 1


Attachments
This solution has no attachment