Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

1991445.1 : Bug 19695225 - Running Many Create or Alter Griddisk Commands Over Time Causes Cell Disk Metadata Corruption (ORA-600 [addNewSegmentsToGDisk_2]) and Loss of Cell Disk Content
Applies to:
Oracle Exadata Storage Server Software - Version 11.2.1.2.0 to 12.1.1.1.1 [Release 11.2 to 12.1]
Oracle SuperCluster T5-8 Half Rack
Information in this document applies to any platform.

Description
On Exadata Storage Server version 12.1.1.1.1 and earlier, cell disk metadata corruption and loss of cell disk content (i.e. grid disk, ASM disk) will occur if many CREATE GRIDDISK or ALTER GRIDDISK commands that modify cell disk space configuration are run over time for the same cell disk.

If CellCLI griddisk commands are run in parallel on all storage servers simultaneously, which is a common maintenance practice, the issue can occur on multiple storage servers at the same time. When that causes all redundant disk extents to be lost for files in an ASM disk group, the disk group will dismount, the database will crash, and the affected files must be restored from backup.

Rolling cell maintenance commands that change grid disk state, such as ALTER GRIDDISK INACTIVE and ALTER GRIDDISK ACTIVE, do not contribute to this issue.

This problem is filed as bug 19695225.

Occurrence
If, since initial system deployment, you have recreated or reconfigured grid disks using the CellCLI commands CREATE GRIDDISK or ALTER GRIDDISK more than 31 times, the likelihood of occurrence is high.

Risk and Detection
The risk to test and development systems is expected to be higher than to production systems because of the dynamic manner in which they may be reconfigured.

To determine whether your system is exposed to this issue, and how close it is to cell disk metadata corruption, download the script attached to this document and run it on all storage servers as the root user:

    # ./check_bug19695225.sh

or via dcli:

    # dcli -l root -g cell_group -x check_bug19695225.sh
The script writes additional details about each cell disk to /tmp/check_bug19695225.log on each storage server. For each cell disk it reports the number of records in the last segmap sector, which increases when CREATE GRIDDISK or ALTER GRIDDISK commands that modify cell disk space configuration are run. A command that causes the number of records to exceed 31 will trigger bug 19695225. The script reports ALERT when it detects 25 or more records and the fix is not yet applied, and WARNING when it detects fewer than 25 records and the fix is not yet applied. The number of records can be reset only by recreating cell disks, which requires dropping grid disks first. This is not a recommended course of action.
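The thresholds described above can be sketched as a small shell helper. This is only an illustration of the stated rules (ALERT at 25 or more records, WARNING below 25, corruption once the count exceeds 31). It is not the attached check_bug19695225.sh and does not read cell disk metadata. The function names, the yes/no fix flag, and the FIXED label are hypothetical and were chosen for this sketch.

```shell
#!/bin/sh
# Illustrative only: mirrors the thresholds described in this document.
# It is NOT the attached check script and does not inspect any cell disk.

MAX_RECORDS=31      # a command that pushes the count past this triggers the bug
ALERT_THRESHOLD=25  # the check script reports ALERT at or above this count

# classify_records RECORDS FIX_APPLIED
#   RECORDS      records in the last segmap sector (non-negative integer)
#   FIX_APPLIED  "yes" or "no" (hypothetical flag for this sketch)
classify_records() {
    records=$1
    fix_applied=$2
    if [ "$fix_applied" = "yes" ]; then
        # Hypothetical label: the real script's output for fixed systems
        # is not described in this document.
        echo "FIXED"
    elif [ "$records" -ge "$ALERT_THRESHOLD" ]; then
        echo "ALERT"
    else
        echo "WARNING"
    fi
}

# records_headroom RECORDS
#   Prints how many more records fit before the count would exceed MAX_RECORDS.
records_headroom() {
    records=$1
    if [ "$records" -ge "$MAX_RECORDS" ]; then
        echo 0
    else
        echo $((MAX_RECORDS - records))
    fi
}
```

For example, `classify_records 27 no` prints ALERT and `records_headroom 27` prints 4. The headroom is counted in records, not commands, because this document does not state how many records a single CREATE GRIDDISK or ALTER GRIDDISK command adds.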

Symptoms
Possible symptoms that cell disk metadata corruption has occurred as a result of this bug include the following:

Workaround
The cell disk corruption cannot be repaired once it occurs. Recovery requires recreating cell disks, grid disks, and ASM disk groups, then restoring affected databases from backup.

Patches
Perform one of the following actions to prevent bug 19695225:

History
13-May-2015 - Add additional detail about the number of records in the last segmap sector

References
<BUG:19695225> - SPECIFIC ORDER OF CREATE/ALTER GRIDDISK CAUSES ORA-600 [ADDNEWSEGMENTSTOGDISK_2]

Attachments
This solution has no attachment