Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

2195194.1 : Alert : ODA V1: SSD Disks Used for the ODA ASM +REDO Diskgroup Are Turning Into "Write Protected" Mode Preventing I/O Write Operations

Applies to:
Oracle Database Appliance - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance Software - Version 2.10.0.0 to 12.1.2.7
Oracle Database - Enterprise Edition - Version 12.1.0.1 to 12.1.0.2 [Release 12.1]
Information in this document applies to any platform.

Description
SSD disks used for the ODA ASM +REDO diskgroup are turning into "Write Protected" mode, which prevents I/O write operations.

Occurrence
At present, the problem has been reported only on ODA V1 configurations. For example:

    [root@asmcloud1 ~]# oakcli show env_hw
    BM ODA V1

Symptoms
1) The ODA ASM +REDO diskgroup cannot be mounted because of the following "WARNING: Write Failed" errors:

    SQL> ALTER DISKGROUP REDO MOUNT /* asm agent *//* {1:21142:425} */
    NOTE: cache registered group REDO number=3 incarn=0xcf4880b4
    NOTE: cache began mount (first) of group REDO number=3 incarn=0xcf4880b4
    NOTE: Assigning number (3,23) to disk (/dev/mapper/SSD_E1_S23_805696743p1)
    NOTE: Assigning number (3,22) to disk (/dev/mapper/SSD_E1_S22_805699136p1)
    NOTE: Assigning number (3,21) to disk (/dev/mapper/SSD_E0_S21_805699139p1)
    NOTE: Assigning number (3,20) to disk (/dev/mapper/SSD_E0_S20_805699133p1)
    Sat Aug 27 17:01:09 2016
    NOTE: cache closing disk 20 of grp 3: (not open) SSD_E0_S20_805699133P1
    NOTE: cache closing disk 22 of grp 3: (not open) SSD_E1_S22_805699136P1
    WARNING: Write Failed. group:3 disk:23 AU:1 offset:4190208 size:4096
    WARNING: Hbeat write to PST disk 23.3915935947 in group 3 failed. [4]
    ERROR: GMON could not set any hearbeat (grp 3)
    NOTE: cache dismounting (clean) group 3/0xCF4880B4 (REDO)
    NOTE: messaging CKPT to quiesce pins Unix process pid: 52344, image: oracle@asmcloud1
    NOTE: dbwr not being msg'd to dismount
    NOTE: lgwr not being msg'd to dismount
    NOTE: cache dismounted group 3/0xCF4880B4 (REDO)
    NOTE: cache ending mount (fail) of group REDO number=3 incarn=0xcf4880b4
    NOTE: cache deleting context for group REDO 3/0xcf4880b4
    GMON dismounting group 3 at 19 for pid 31, osid 52344
    NOTE: Disk SSD_E0_S20_805699133P1 in mode 0x1 marked for de-assignment
    NOTE: Disk SSD_E0_S21_805699139P1 in mode 0x0 marked for de-assignment
    NOTE: Disk SSD_E1_S22_805699136P1 in mode 0x1 marked for de-assignment
    NOTE: Disk SSD_E1_S23_805696743P1 in mode 0x7f marked for de-assignment
    ERROR: diskgroup REDO was not mounted
    ORA-15032: not all alterations performed
    ORA-15017: diskgroup "REDO" cannot be mounted

2) Recreating the ODA ASM REDO diskgroup on the original SSD disks also fails, with the following "Input/output" write errors:

    SQL> create diskgroup REDO HIGH REDUNDANCY DISK
      2  '/dev/mapper/SSD_E1_S23_805696743p1' NAME SSD_E1_S23_805696743p1 FORCE,
      3  '/dev/mapper/SSD_E1_S22_805699136p1' NAME SSD_E1_S22_805699136p1 FORCE,
      4  '/dev/mapper/SSD_E0_S21_805699139p1' NAME SSD_E0_S21_805699139p1 FORCE,
      5  '/dev/mapper/SSD_E0_S20_805699133p1' NAME SSD_E0_S20_805699133p1 FORCE
      6  attribute 'compatible.asm'='11.2.0.4', 'compatible.rdbms'='11.2.0.2','sector_size'='512','AU_SIZE'='4M','content.type'='redo';
    create diskgroup REDO HIGH REDUNDANCY DISK
    *
    ERROR at line 1:
    ORA-15018: diskgroup cannot be created
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-15080: synchronous I/O operation to a disk failed
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-15080: synchronous I/O operation to a disk failed
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-15080: synchronous I/O operation to a disk failed
    ORA-27061: waiting for async I/Os failed
    Linux-x86_64 Error: 5: Input/output error
    Additional information: -1
    Additional information: 65536
    ORA-15080: synchronous I/O operation to a disk failed
    ORA-27061:

3) The "oakcli show disk" ODA command reports the SSD disks in "Good" state:

    NAME   PATH      TYPE  STATE   STATE_DETAILS
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    pd_00  /dev/sdc  HDD   ONLINE  Good
    pd_01  /dev/sdm  HDD   ONLINE  Good
    pd_02  /dev/sdo  HDD   ONLINE  Good
    pd_03  /dev/sdy  HDD   ONLINE  Good
    pd_04  /dev/sdd  HDD   ONLINE  Good
    pd_05  /dev/sdn  HDD   ONLINE  Good
    pd_06  /dev/sdp  HDD   ONLINE  Good
    pd_07  /dev/sdz  HDD   ONLINE  Good
    pd_08  /dev/sde  HDD   ONLINE  Good
    pd_09  /dev/sdk  HDD   ONLINE  Good
    pd_10  /dev/sdq  HDD   ONLINE  Good
    pd_11  /dev/sdw  HDD   ONLINE  Good
    pd_12  /dev/sdf  HDD   ONLINE  Good
    pd_13  /dev/sdl  HDD   ONLINE  Good
    pd_14  /dev/sdr  HDD   ONLINE  Good
    pd_15  /dev/sdx  HDD   ONLINE  Good
    pd_16  /dev/sdg  HDD   ONLINE  Good
    pd_17  /dev/sdi  HDD   ONLINE  Good
    pd_18  /dev/sds  HDD   ONLINE  Good
    pd_19  /dev/sdu  HDD   ONLINE  Good
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    pd_20  /dev/sdh  SSD   ONLINE  Good   <(===
    pd_21  /dev/sdj  SSD   ONLINE  Good   <(===
    pd_22  /dev/sdt  SSD   ONLINE  Good   <(===
    pd_23  /dev/sdv  SSD   ONLINE  Good   <(===

4) Nevertheless, the SSD disks report "Write protected" errors in the OS log ("/var/log/messages"), which confirms the physical disk failure:

a) pd_23 = /dev/sdv:

    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] Unhandled sense code
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] Sense Key : Data Protect [current]
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] Add. Sense: Write protected <(====
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] CDB: Write(10): 2a 00 00 00 17 00 00 00 80 00

b) pd_22 = /dev/sdt:

    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:18:0: [sdt] Unhandled sense code
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:18:0: [sdt] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:18:0: [sdt] Sense Key : Data Protect [current]
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:18:0: [sdt] Add. Sense: Write protected <(====
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:18:0: [sdt] CDB: Write(10): 2a 00 00 00 17 00 00 00 80 00

c) pd_21 = /dev/sdj:

    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:7:0: [sdj] Unhandled sense code
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:7:0: [sdj] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:7:0: [sdj] Sense Key : Data Protect [current]
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:7:0: [sdj] Add. Sense: Write protected <(====
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:7:0: [sdj] CDB: Write(10): 2a 00 00 00 17 00 00 00 80 00

d) pd_20 = /dev/sdh:

    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:5:0: [sdh] Unhandled sense code
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:5:0: [sdh] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:5:0: [sdh] Sense Key : Data Protect [current]
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:5:0: [sdh] Add. Sense: Write protected <(====
    Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:5:0: [sdh] CDB: Write(10): 2a 00 00 00 17 00 00 00 80 00

5) The problem occurs because the SSD disks are faulty (they have reached end of life) and need to be replaced.
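
Symptoms 3 and 4 together describe a mismatch: "oakcli show disk" reports a disk as "Good" while the kernel log shows "Write protected" sense data for the same device. The following is a minimal Python sketch of that cross-check, not an Oracle-supplied tool. The function names are hypothetical, and the regular expression and column layout are assumptions taken only from the sample output shown in this document; other kernel or OAK versions may format these lines differently.

```python
import re

# Assumption: kernel lines look like the samples in this document, e.g.
#   Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] Add. Sense: Write protected
WRITE_PROTECTED = re.compile(
    r"kernel: sd [\d:]+ \[(sd[a-z]+)\] Add\. Sense: Write protected"
)


def write_protected_devices(log_lines):
    """Return the set of /dev paths that logged a 'Write protected' sense."""
    devices = set()
    for line in log_lines:
        match = WRITE_PROTECTED.search(line)
        if match:
            devices.add("/dev/" + match.group(1))
    return devices


def parse_show_disk(output):
    """Map device path -> (name, type, state, details) from 'oakcli show disk' rows.

    Assumption: data rows start with 'pd_' and carry at least five
    whitespace-separated columns, as in the sample output above.
    """
    disks = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0].startswith("pd_"):
            name, path, disk_type, state, details = fields[:5]
            disks[path] = (name, disk_type, state, details)
    return disks


def misreported_disks(show_disk_output, log_lines):
    """List (name, path) of disks reported 'Good' but write protected in the log."""
    disks = parse_show_disk(show_disk_output)
    return sorted(
        (disks[path][0], path)
        for path in write_protected_devices(log_lines)
        if path in disks and disks[path][3] == "Good"
    )


if __name__ == "__main__":
    sample_table = (
        "pd_19 /dev/sdu HDD ONLINE Good\n"
        "pd_23 /dev/sdv SSD ONLINE Good\n"
    )
    sample_log = [
        "Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] Sense Key : Data Protect [current]",
        "Oct 14 14:53:07 asmcloud1 kernel: sd 6:0:20:0: [sdv] Add. Sense: Write protected",
    ]
    print(misreported_disks(sample_table, sample_log))
```

On a real system the log lines would come from "/var/log/messages" (typically readable only by root) and the table from the output of "oakcli show disk". Any disk the sketch lists should be treated as a candidate for replacement and confirmed with Oracle Support.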

Workaround
1) Open a Service Request with Oracle Support to replace the faulty SSD disk(s) right away.
2) If all the SSD disks are affected at the same time, you will need to recreate the +REDO diskgroup on brand new SSD disks and recreate the associated ACFS filesystems.

History
[16-OCT-2016] - [Alert: 2195194.1 was created]

Attachments
This solution has no attachment