
Asset ID: 1-79-2059780.1
Update Date: 2017-10-27
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  2059780.1 :   Configuration Prerequisites and Operational Steps for Higher Availability for a RECO disk group or Fast Recovery Area Failure  


Related Items
  • SPARC SuperCluster T4-4
  • Oracle Exadata Hardware
  • Oracle Exadata Storage Server Software
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Exadata>DB: Exadata_EST
  • _Old GCS Categories>ST>Server>Storage>ASM>Administration and Configuration
  • _Old GCS Categories>ST>Server>Engineered Systems>Exadata>Administration and Configuration



In this Document
Purpose
Scope
Details
 Section A: Configuration Prerequisites
 Section B: Operational Steps after losing RECO
 Section C: Operational Steps after losing DBFS_DG
 Section D: Data Guard or GoldenGate implications
References


Applies to:

Oracle Exadata Storage Server Software - Version 11.1.0.3.0 and later
SPARC SuperCluster T4-4 - Version All Versions to All Versions [Release All Releases]
Oracle Exadata Hardware - Version 11.1.0.7 and later
Information in this document applies to any platform.

Purpose

Oracle recommends using high redundancy disk groups to provide the highest level of storage data protection and availability for mission critical applications. On Exadata, the default deployment uses HIGH redundancy for DATA and NORMAL redundancy for RECO and DBFS_DG, where DATA contains all the data files, log files, control files, spfiles, OCR and possibly the voting files (if there are five or more cells; otherwise the OCR and voting files reside in DBFS_DG, with optional quorum disks). With this deployment, RECO, which serves as the Fast Recovery Area (FRA), normally contains archive logs, flashback logs and possibly incremental backups. On non-Exadata systems, customers normally use ASM with external redundancy and provide additional redundancy for the DATA disk group relative to the RECO disk group.

This document describes the configuration prerequisites and operational steps required to keep the database running if you lose your RECO disk group or FRA. It uses the Exadata best practice case in which a customer chooses the default Exadata deployment of high redundancy DATA and normal redundancy RECO as the initial configuration, but it applies to any configuration where RECO is deployed with normal redundancy. It also describes the operational best practices after subsequently losing the RECO (the default FRA on Exadata) and DBFS_DG disk groups. This description is generally applicable to non-Exadata environments as well.

Scope

The information in this document applies to Exadata Storage Server Software 11g Release 2 (11.2) and higher, Oracle Grid Infrastructure 11g Release 2 (11.2) and higher and Oracle RDBMS 11g Release 2 (11.2) and higher.  Although it is recommended that earlier releases follow the same guidelines, behavior and error messages can be different in those earlier releases.

Details

After losing RECO or the FRA, the database may either crash or eventually hang. In all cases, the database is vulnerable to an outage because a subsequent failure may result in very long recovery times and much higher than expected data loss. Section A describes configuration prerequisites that may prevent the crash and prevent the database hang. Section B describes the minimal steps to take after a RECO/FRA failure to prevent a future hang and to mitigate recovery time and data loss if database files residing in the DATA disk group are subsequently lost. Section C describes the steps required to restore operation after loss of DBFS_DG when it contains the voting disks and the OCR. Section D summarizes the Data Guard and GoldenGate implications.

Section A: Configuration Prerequisites

These configuration prerequisites are best suited to a configuration in which DATA is much better protected than RECO and you need to handle the potential loss of RECO or the designated Fast Recovery Area; however, they also apply to any configuration with RECO deployed in normal redundancy.  The default Exadata and database deployment on Exadata already satisfies many of these prerequisites.  They are the following:

  1. DATA should contain all data files, log files, control files, OCR, voting files and spfiles.  If the voting files cannot reside in DATA because high redundancy voting files require a minimum of five failure groups or five Exadata storage cells, create a script to move the voting files or to restart the RAC cluster in case the DBFS_DG disk group (where the voting files are stored by default) is dismounted or, if your Exadata software version supports it, use quorum disks.  You should also create periodic backups of the ASM spfile to ensure it can be restored, for example:
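    A minimal sketch of such a backup, run as SYSASM against the ASM instance (the backup file path matches the restore example in Section C and should be adjusted for your environment):
    SQL> create pfile='/home/oracle/asm/asm_pfile_bkup.ora' from spfile;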
  2. Set up an alternate archive log destination pointing to DATA; this is part of the Exadata database parameter template.  Use commands similar to the following to configure the alternate archive log destination.
    SQL> REM Define the local destination to use the FRA and be a MANDATORY destination
    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_1='LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) MANDATORY' SCOPE=BOTH;
    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_1='ENABLE' SCOPE=BOTH;
    SQL> REM Define a second destination, also as MANDATORY
    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='LOCATION=+DATA VALID_FOR=(ALL_LOGFILES,ALL_ROLES) MANDATORY' SCOPE=BOTH;
    SQL> REM Set the state of this second destination to ALTERNATE
    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2='ALTERNATE' SCOPE=BOTH;
    SQL> REM Modify the original local destination to include the alternate destination
    SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_1='LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) MANDATORY MAX_FAILURE=1 ALTERNATE=LOG_ARCHIVE_DEST_2' SCOPE=BOTH;
      

    Ensure you have allocated sufficient space for this alternate archive log destination to contain at least 24 hours of archives.  You can use asmcmd and the lsdg command to verify available space.  In the following example, note the Usable_file_MB column to determine free space.
    $ asmcmd -p
    ASMCMD [+] > lsdg
    State   Type Rebal Sector Logical_Sector Block AU  Total_MB  Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
    MOUNTED HIGH N        512     512   4096  4194304  72769536 66414996          866304       21849564             0  N           DATA/
    MOUNTED HIGH N        512     512   4096  4194304   4175360  4148456           59648        1362936             0  Y           DBFS_DG/
      
  3. Monitor and alert when the alternate archive destination is being used, because archive log backups and purges are not performed automatically once the archive logs reside outside RECO and the Fast Recovery Area.  For example, you can query V$ARCHIVE_DEST as shown below.
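    A minimal sketch of a check you could schedule (the destination IDs assume the configuration from item 2 above):
    SQL> select dest_id, dest_name, status, destination
         from v$archive_dest
         where dest_id in (1, 2);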

 

Section B:  Operational Steps after losing RECO  

  1. This section applies to any configuration with RECO deployed with normal redundancy.
  2. If the RAC cluster fails because the DBFS_DG disk group where the voting files reside is also dismounted, execute the steps in Section C.  It provides an example of the operational steps to restart the cluster and the databases.
  3. If an alternate archive log destination has not been defined and the database is hung because it cannot archive, create an alternate archive destination outside RECO (see the commands in Section A) and then follow the steps below.
    1. If the database is running and you are now archiving to your alternate destination (e.g. DATA), then you need to change your archive backup operations to back up from this new destination and to manually purge the backed-up archives.  Since the alternate destination is outside the FRA, it is not actively managed by Oracle and RMAN.  You can use an RMAN command similar to the following to back up and purge the archive logs:
      RMAN> backup archivelog all not backed up delete input;
        
      Monitoring free space in DATA also becomes very critical.
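      One way to check free space from SQL, as a minimal sketch using the V$ASM_DISKGROUP view (asmcmd lsdg, shown in Section A, reports the same information):
      SQL> select name, total_mb, free_mb, usable_file_mb
           from v$asm_diskgroup
           where name = 'DATA';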

    2. To mitigate data loss after losing RECO, create a new RMAN backup as soon as possible and change your backup procedures since RECO is not accessible.  For example, Oracle recommends:
      1. Issue RMAN crosscheck so the RMAN catalog is aware of any missing or inaccessible backups and archive files

        The RMAN catalog and/or database control file will be unaware of the loss of any backups and archive logs that existed in the damaged FRA.  Run the RMAN crosscheck command to compare the catalog with the files that actually exist.  For any file not found, RMAN marks the object as EXPIRED; you can then delete these expired objects from the catalog.
        1. Run crosscheck backupset to verify backup accessibility
          RMAN> crosscheck backupset;
            

          List expired will show the missing backups that were marked expired by the crosscheck operation
          RMAN> list expired backupset;
            

          Remove the expired backups from the catalog
          RMAN> delete expired backupset;
            
        2. Run crosscheck copy to verify archive log, data file copy and control file copy accessibility
          RMAN> crosscheck copy;
            

          List expired will show the missing archive logs, data file copies and control file copies that were marked expired by the crosscheck operation
          RMAN> list expired copy;
            

          Remove the expired archive logs, data file copies and control file copies from the catalog
          RMAN> delete expired copy;
            
      2. Issue new RMAN backup (cumulative or incremental)

        Depending on your configuration, you will have at a minimum archive logs and flashback logs stored in the FRA.  It is also possible you may have backups stored locally as well.  Loss of the archive logs and backups can leave a potential hole in your recovery capability so you should take a database backup as soon as possible after restoring operation.
        1. If backups are stored on offline storage, take a level 1 incremental backup to back up all of the changes that have occurred since the last level 0 or level 1 incremental backup was taken
          RMAN> backup incremental level 1 database plus archivelog;
            
        2. If backups are stored on the FRA, take a level 0 backup to ensure the entire database is backed up
          RMAN> backup incremental level 0 database plus archivelog;
            
      3. Change RMAN procedures to address the lack of RECO.  For example, you should modify any scripts that perform backups to external storage so they access the correct disk group.  Also ensure that the control file autobackup is sent to the correct disk group and file name.
        RMAN> show CONTROLFILE AUTOBACKUP FORMAT;
        RMAN configuration parameters for database with db_unique_name FREDSTBY are:
        CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '+RECO';

        RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '+DATA';
        old RMAN configuration parameters:
        CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '+RECO';
        new RMAN configuration parameters:
        CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '+DATA';
        new RMAN configuration parameters are successfully stored

        RMAN> show SNAPSHOT CONTROLFILE NAME;
        RMAN configuration parameters for database with db_unique_name FREDSTBY are:
        CONFIGURE SNAPSHOT CONTROLFILE NAME TO '+reco/fredstby/snapcf_fredstby.f';

        RMAN> configure SNAPSHOT CONTROLFILE NAME to '+data/fredstby/snapcf_fredstby.f';
        old RMAN configuration parameters:
        CONFIGURE SNAPSHOT CONTROLFILE NAME TO '+reco/fredstby/snapcf_fredstby.f';
        new RMAN configuration parameters:
        CONFIGURE SNAPSHOT CONTROLFILE NAME TO '+data/fredstby/snapcf_fredstby.f';
        new RMAN configuration parameters are successfully stored
          
  4. Loss of the FRA impacts flashback operation in the following ways.
    1. If flashback database is enabled and no Guaranteed Restore Points have been created, the database will continue processing but it will disable flashback logging.  The alert log will display a message similar to the following:
      Errors in file /u01/app/oracle/diag/rdbms/dbm/dbm1/trace/dbm1_rvwr_16261.trc:
      ORA-38701: Flashback database log 13 seq 6 thread 1: "+RECO/dbm/flashback/log_13.355.757093499"
      ORA-15081: failed to submit an I/O operation to a disk
      *************************************
      RVWR encountered an error when writing flashback database logs.
      See error stack in alert log. To avoid crashing the instance,
      this instance has turned off flashback database.
      *************************************
        
      As the message states, the database does not crash; it continues operating, but it no longer generates flashback logs, removing the capability to perform a flashback operation.  If the issue is resolved at a later point or a new FRA is defined, flashback database can be re-enabled, which establishes a new starting point for flashback retention.  As with any time flashback database is enabled, flashing back the database to a point in time prior to this starting point is not allowed.  For example:
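      A minimal sketch of re-enabling flashback logging once a usable FRA is available (the destination value shown is illustrative; depending on the release, the database may need to be mounted rather than open):
      SQL> alter system set db_recovery_file_dest='+DATA' scope=both;
      SQL> alter database flashback on;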
    2. If you have defined Guaranteed Restore Points (GRP), the database will crash.  By definition, a GRP means Oracle guarantees the ability to flash back to that point, so the database must stop processing if it is unable to write to the DB_RECOVERY_FILE_DEST.  You will see messages similar to the following in the alert log:
      Errors in file /u01/app/oracle/diag/rdbms/dbm/dbm1/trace/dbm1_rvwr_11492.trc:
      ORA-38701: Flashback database log 293 seq 160 thread 1: "+RECO"
      ORA-17502: ksfdcre:4 Failed to create file +RECO
      RVWR (ospid: 11492): terminating the instance due to error 38701
      Tue Jun 11 11:08:27 2015
      System state dump requested by (instance=1, osid=11492 (RVWR)), summary=[abnormal instance termination].
      System State dumped to trace file /u01/app/oracle/diag/rdbms/dbm/dbm1/trace/dbm1_diag_9212.trc
      Tue Jun 11 11:08:27 2015
      ORA-1092 : opitsk aborting process
      Tue Jun 11 11:08:28 2015
      ORA-1092 : opitsk aborting process
      Tue Jun 11 11:08:29 2015
      ORA-1092 : opitsk aborting process
      Tue Jun 11 11:08:29 2015
      License high water mark = 140
      Instance terminated by RVWR, pid = 11492
      USER (ospid: 11599): terminating the instance
      Instance terminated by USER, pid = 11599
        

      If access to the Fast Recovery Area is restored without data loss, the database can be opened and can resume processing without issue.  However, if access to the FRA cannot be restored without data loss, the GRP is lost and the database cannot be flashed back.  Before the database can be opened in this situation, it must be mounted, the GRP dropped and flashback database disabled, as shown below.
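      A minimal sketch of that sequence (list the actual guaranteed restore points first; the restore point name used here is hypothetical):
      SQL> startup mount
      SQL> select name from v$restore_point where guarantee_flashback_database = 'YES';
      SQL> REM BEFORE_CHANGE below is a hypothetical GRP name; use your own
      SQL> drop restore point BEFORE_CHANGE;
      SQL> alter database flashback off;
      SQL> alter database open;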

Section C: Operational Steps after losing DBFS_DG

  

NOTE: The DBFS_DG group may also contain the GI Management Repository.  This repository should be recreated if DBFS_DG is lost.  Please see <Document 1589394.1> and <Document 2065175.1> for more information on this process.

 

  NOTE: The ASM spfile is also stored in the DBFS_DG disk group and will need to be restored prior to restarting the GI stack.  Please ensure you backup your ASM spfile periodically to allow for easy restoration if required.

  

Once ASM detects the loss of access to the VOTE disks, ASM will crash, taking down all of the databases using that ASM environment.  Depending on the type of error experienced, ASM may attempt to restart; if the issue persists, ASM will crash again.

  1. Manually stop CRS processes with force option
    After ASM has crashed, many of the Grid Infrastructure processes will remain running and need to be stopped manually.  To do this, issue the crsctl stop crs command on each node with the -f (force) option as the root user.
    # dcli -g ~/dbs_group -l root /u01/app/12.1.0/grid/bin/crsctl stop crs -f
      

    This command cleans up all resources in use by Grid Infrastructure and stops any running CRS processes.  Because of the cleanup that must be performed, the stop may take a few minutes to complete.
  2. Start Grid Infrastructure on one node with exclusive option

    Without access to the VOTE disks, Grid Infrastructure can be started on only one node, and only in a mode that does not expect shared access.  This is accomplished by using the -excl (exclusive) option of the crsctl start crs command as the root user.
    # /u01/app/12.1.0/grid/bin/crsctl start crs -excl

    # /u01/app/12.1.0/grid/bin/crsctl query css votedisk
    Located 0 voting disk(s).
      

    The shared resources such as the VIP and SCAN will not and cannot be started when Grid Infrastructure is started using the -excl option.  Database instances on this node can be started using SQL*Plus, but only local connections can be made as the SCAN listeners will not be started.
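    A minimal sketch of starting an instance locally in this state (the ORACLE_SID value is hypothetical; use the instance name for this node):
    $ export ORACLE_SID=dbm1
    $ sqlplus / as sysdba
    SQL> startup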
  3. Relocate VOTE and OCR to a new disk group

    Starting Grid Infrastructure in exclusive mode will start ASM and allow you to relocate the VOTE to another diskgroup.
    1. As root, relocate to an existing diskgroup

      If running a 1/4 rack or 1/8 rack Exadata or a 1/2 rack SPARC SuperCluster environment, this diskgroup must be NORMAL redundancy.
      # /u01/app/12.1.0/grid/bin/crsctl replace votedisk +VOTEDG

      # /u01/app/12.1.0/grid/bin/crsctl query css votedisk
      ## STATE File Universal Id File Name Disk group
      -- ----- ----------------- --------- ---------
      1. ONLINE a1759782b1a84f5fbf52191d66fc63d9 (o/192.168.227.252/TEMPDG_CD_05_test05cel02) [VOTEDG]
      2. ONLINE 74520e6e3c204f28bfea448ce5f3f42f (o/192.168.227.253/TEMPDG_CD_05_test05cel03) [VOTEDG]
      3. ONLINE 18fd862f1e514f9cbfed7283bb6b919a (o/192.168.227.251/TEMPDG_CD_05_test05cel01) [VOTEDG]
      Located 3 voting disk(s).
        
    2. As root, relocate to another diskgroup
      If running on a 1/4 rack or 1/8 rack Exadata or 1/2 rack SPARC SuperCluster environment and all remaining diskgroups are HIGH redundancy, to get back to full operations without the original diskgroup, a new diskgroup with NORMAL redundancy must be created.  Do the following:
    3. As oracle, connect to the ASM instance and drop 1 disk from each cell from one of the HIGH redundancy diskgroups
      $ sqlplus / as sysasm
      SQL> select name from v$asm_disk where name like 'RECO%';
      SQL> alter diskgroup RECO drop disk reco_cd_05_<cellname> rebalance power 32 wait;
        
    4. Using cellcli inactivate the griddisk from each corresponding cell
      CELLCLI> alter griddisk reco_cd_05_<cellname> inactive;
        
    5. On each cell, use cellcli to drop the griddisk from each cell
      CELLCLI> drop griddisk reco_cd_05_<cellname>;
        
    6.  As root, recreate the griddisk with a name to match the new diskgroup being created
      # dcli -g ~/cell_group -l root cellcli -e 'create griddisk all harddisk prefix=VOTEDG'
        
    7. As oracle in ASM, create the new diskgroup with NORMAL redundancy
      $ sqlplus / as sysasm
      SQL> create diskgroup VOTEDG normal redundancy disk 'o/*/VOTEDG*'
      attribute 'cell.smart_scan_capable'='true',
      'au_size'='4M',
      'compatible.asm'='12.1.0.2.0',
      'compatible.rdbms'='12.1.0.2.0';
        
    8. As root, recreate the VOTE disks in the new diskgroup
      # /u01/app/12.1.0/grid/bin/crsctl replace votedisk +VOTEDG

      # /u01/app/12.1.0/grid/bin/crsctl query css votedisk
      ## STATE File Universal Id File Name Disk group
      -- ----- ----------------- --------- ---------
      1. ONLINE a1759782b1a84f5fbf52191d66fc63d9 (o/192.168.227.252/TEMPDG_CD_05_test05cel02) [VOTEDG]
      2. ONLINE 74520e6e3c204f28bfea448ce5f3f42f (o/192.168.227.253/TEMPDG_CD_05_test05cel03) [VOTEDG]
      3. ONLINE 18fd862f1e514f9cbfed7283bb6b919a (o/192.168.227.251/TEMPDG_CD_05_test05cel01) [VOTEDG]
      Located 3 voting disk(s).
        
    9. As root, relocate the OCR to the new diskgroup
      1. Check the current location with the ocrcheck command
        # <GI_HOME>/bin/ocrcheck
        Status of Oracle Cluster Registry is as follows :
        Version : 4
        Total space (kbytes) : 409568
        Used space (kbytes) : 1912
        Available space (kbytes) : 407656
        ID : 1257099246
        Device/File Name : +DBFS_DG
        Device/File integrity check succeeded
        Device/File not configured
        Device/File not configured
        Device/File not configured
        Device/File not configured
        Cluster registry integrity check succeeded
        Logical corruption check succeeded

          

      2. Add the new disk group as an OCR location.  This command will create a copy of the OCR in the new disk group
        # <GI_HOME>/bin/ocrconfig -add +VOTEDG
          

      3. Check the locations to see the new OCR
        # <GI_HOME>/bin/ocrcheck
        Status of Oracle Cluster Registry is as follows :
        Version : 4
        Total space (kbytes) : 409568
        Used space (kbytes) : 1912
        Available space (kbytes) : 407656
        ID : 1257099246
        Device/File Name : +DBFS_DG
        Device/File integrity check succeeded
        Device/File Name : +VOTEDG
        Device/File integrity check succeeded
        Device/File not configured
        Device/File not configured
        Device/File not configured
        Cluster registry integrity check succeeded
        Logical corruption check succeeded

          

      4. Remove the non-existent DBFS_DG from the OCR locations
        # <GI_HOME>/bin/ocrconfig -delete +DBFS_DG

        # <GI_HOME>/bin/ocrcheck
        Status of Oracle Cluster Registry is as follows :
        Version : 4
        Total space (kbytes) : 409568
        Used space (kbytes) : 1912
        Available space (kbytes) : 407656
        ID : 1257099246
        Device/File Name : +VOTEDG
        Device/File integrity check succeeded
        Device/File not configured
        Device/File not configured
        Device/File not configured
        Device/File not configured
        Cluster registry integrity check succeeded
        Logical corruption check succeeded
          
    10. If you have taken a backup of the ASM spfile, use that backup to restore the spfile to the new disk group.  If a backup for the impacted system is not available, you can create a backup from an ASM instance on another running environment, modifying the cluster_interconnects parameter accordingly, and use it as the basis for a new spfile.

      Connect to the ASM instance as sysasm and restore the backup ASM spfile

      SQL> create spfile='+VOTEDG' from pfile='/home/oracle/asm/asm_pfile_bkup.ora';
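      As a quick verification, a sketch using asmcmd spget, which reports the ASM spfile location registered in the GPnP profile (it should now point at the new disk group):
      $ asmcmd spget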
        


    11. As root, stop and restart Grid Infrastructure  

      # /u01/app/12.1.0/grid/bin/crsctl stop crs -f
      # dcli -g ~/dbs_group -l root /u01/app/12.1.0/grid/bin/crsctl start crs 

        

 

Section D: Data Guard or GoldenGate implications

  1. To maintain the best availability and protection with the lowest Recovery Time Objective (RTO) and Recovery Point Objective (RPO), set up a Data Guard or GoldenGate replica on a separate system and cluster that does not share the same storage.
  2. You can also set up Data Guard Fast-Start Failover (FSFO) and initiate an automatic failover on the "Stuck Archiver" condition; however, this condition may not fire if the RECO disk group or FRA is not available (refer to Bug 21970498 - STUCK ARCHIVER IS NOT DECLARED WHEN INSUFFICIENT DESTINATIONS AVAILABLE).  This condition may also not fire if a flashback error occurs that disables Data Guard FSFO.  See the example after this list.
  3. A manual Data Guard or GoldenGate failover is always an option if the primary database fails.
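
  A minimal sketch of enabling this condition with the Data Guard broker (it assumes a broker configuration and observer are already in place; as noted in item 2, the condition may not fire when no archive destinations are available):
  DGMGRL> show fast_start failover;
  DGMGRL> enable fast_start failover condition "Stuck Archiver";
  DGMGRL> enable fast_start failover;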

References

<NOTE:1339373.1> - Operational Steps for Recovery after Losing a Disk Group in an Exadata Environment
<BUG:14115041> - RMAN RECOVERY FAILS WITH ORA-1124 WHEN CREATING FILE FROM REDO

Attachments
This solution has no attachment