
Asset ID: 1-72-1637898.1
Update Date: 2016-08-17

Solution Type: Problem Resolution Sure

Solution 1637898.1: ODA: Old Disk Path Information Still Exists in ASM


Related Items
  • Oracle Database Appliance
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Database Appliance>DB: ODA_EST




Applies to:

Oracle Database Appliance - All Versions
Information in this document applies to any platform.

Symptoms

After a disk replacement on an ODA machine, the old disk information still exists in the environment.

You may see the following symptoms for the old disk.

 

1. The output of v$asm_disk still shows the old disk path with group_number=0 (group 0 means the disk is unowned).

2. The vgs output still shows I/O errors for the old disk.

3. /etc/multipath.conf still contains the old disk information.

4. /dev/mapper still shows entries for the old disk.

5. /dev still contains the old device nodes.

6. /dev/mpath still has symbolic links for the old disk.

7. /var/log/messages records I/O errors for the old disk.

8. The ASM alert log also reports I/O errors.

Along with the entries for the old disk, the new disk information is also available, and the new disk can be added back to the disk group successfully.

 

*****These symptoms appear on both nodes.*****
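The example below demonstrates the /dev/mapper and vgs symptoms. The configuration and log symptoms (items 3, 6 and 7) can be checked directly; a minimal sketch, using the old-disk serial number from the example below (adjust the serial for your environment):

[root@NODE1 ~]# grep 1211732875 /etc/multipath.conf              ------>> symptom 3: stale multipath alias still configured
[root@NODE1 ~]# ls -l /dev/mpath | grep 1211732875               ------>> symptom 6: stale symbolic links
[root@NODE1 ~]# grep -i "i/o error" /var/log/messages | tail     ------>> symptom 7: logged I/O errors (these typically reference the dm-N device)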

 

Example:

------------

Check /dev/mapper for the old disk entries:


[root@NODE1 mapper]# ls -altr *S03*
brw-rw---- 1 grid asmadmin 253, 24 Mar 27  2013 HDD_E1_S03_1211732875
brw-rw---- 1 grid asmadmin 253, 30 Mar  7 13:23 HDD_E1_S03_1211732875p1
brw-rw---- 1 grid asmadmin 253, 31 Mar  7 13:30 HDD_E1_S03_1211732875p2
brw-rw---- 1 grid asmadmin 253, 72 Mar 10 14:19 HDD_E1_S03_1797259843
brw-rw---- 1 grid asmadmin 253, 74 Mar 10 16:29 HDD_E1_S03_1797259843p2
brw-rw---- 1 grid asmadmin 253, 73 Mar 10 16:29 HDD_E1_S03_1797259843p1

 

Disk pd_03 was replaced in this environment, but the ls output from /dev/mapper still shows both the old and the new disk entries.

Path HDD_E1_S03_1211732875 belongs to the old disk, and path HDD_E1_S03_1797259843 represents the new disk.

 

The output of "vgs" shows I/O errors like the ones below:

[root@NODE1 mapper]# vgs
  /dev/mpath/HDD_E1_S03_1211732875: read failed after 0 of 4096 at 600127176704: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875: read failed after 0 of 4096 at 600127258624: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875: read failed after 0 of 4096 at 0: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875: read failed after 0 of 4096 at 4096: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p1: read failed after 0 of 4096 at 515396009984: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p1: read failed after 0 of 4096 at 515396067328: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p1: read failed after 0 of 4096 at 0: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p1: read failed after 0 of 4096 at 4096: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p2: read failed after 0 of 512 at 84726382592: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p2: read failed after 0 of 512 at 84726472704: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p2: read failed after 0 of 512 at 0: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p2: read failed after 0 of 512 at 4096: Input/output error
  /dev/mpath/HDD_E1_S03_1211732875p2: read failed after 0 of 2048 at 0: Input/output error
  VG          #PV #LV #SN Attr   VSize   VFree
  VolGroupSys   1   4   0 wz--n- 465.66G 251.66G

 

These error messages relate to the old disk.

 

ASM alert log error messages:

 

Mon Mar 10 13:59:39 2014
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_pz99_29322.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 4096
WARNING: Read Failed. group:0 disk:51 AU:0 offset:0 size:4096
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_pz99_29322.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 4096
ORA-15080: synchronous I/O operation to a disk failed

 

V$ASM_DISK output:

SQL> select path, name, header_status, mode_status, mount_status, state, failgroup, group_number from v$asm_disk order by path;

PATH                       NAME                      HEADER_STATUS  MODE_ST  MOUNT_S  STATE   FAILGROUP                 GROUP_NUMBER
-------------------------  ------------------------  -------------  -------  -------  ------  ------------------------  ------------
/HDD_E0_S19_1130281880p1   HDD_E0_S19_1130281880P1   MEMBER         ONLINE   CACHED   NORMAL  HDD_E0_S19_1130281880P1              2   <<--- new disk, already added back to disk group 2
/HDD_E0_S19_1230211442p1   HDD_E0_S19_1230211442p1   MEMBER         ONLINE   CLOSED   NORMAL                                       0   <<--- group 0 (old disk)
 

***Group 0 means the disk does not belong to any of the expected disk groups.***
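To list only the stale entries, the same view can be filtered on the group number; a minimal query sketch:

SQL> select path, header_status, mount_status from v$asm_disk where group_number = 0;

Any path returned here that belongs to an already-replaced disk is a leftover entry of the kind described in this note.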

Changes

The failed disk was replaced.

Cause

The old disk information is not removed.

It should be removed automatically, but due to the following bugs the old disk information remains:

   Bug 16964646 : DISK REPLACEMENT/FCO ISSUES WITH BOTH OLD AND NEW DISK SEEN (closed as a duplicate of Bug 14223113)
   Bug 14223113 : ASM DISK NOT RELEASED BY CRSD.BIN PROCESS AFTER DROPPING DISK

Solution

To remove the old disk entries, follow the steps in any one of the options below.

  

Option 1:

Rebooting both nodes can resolve this issue automatically.
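A minimal sketch of a rolling reboot, assuming the standard Grid Infrastructure crsctl utility is used to stop the clusterware stack cleanly first (these commands are not part of this note, so treat them as an assumption):

[root@NODE1 ~]# crsctl stop crs          ------>> stop the clusterware stack cleanly (run as root from the Grid Infrastructure home)
[root@NODE1 ~]# reboot
   ------>> wait for NODE1 to rejoin the cluster, then repeat the same steps on NODE2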

If a reboot is not possible, follow the steps given in Option 2.


Option 2:

  Remove the old disk information manually.

*****These steps must be run on both nodes.*****


STEP 1.

 

--- Remove multipath information from "/dev/mapper" ---


[root@NODE1 mapper]# ls -altr *S03*
brw-rw---- 1 grid asmadmin 253, 24 Mar 27  2013 HDD_E1_S03_1211732875    ------>> OLD DISK
brw-rw---- 1 grid asmadmin 253, 30 Mar  7 13:23 HDD_E1_S03_1211732875p1 ------>> OLD DISK
brw-rw---- 1 grid asmadmin 253, 31 Mar  7 13:30 HDD_E1_S03_1211732875p2 ------>> OLD DISK

brw-rw---- 1 grid asmadmin 253, 72 Mar 10 14:19 HDD_E1_S03_1797259843    ------>> NEW DISK
brw-rw---- 1 grid asmadmin 253, 74 Mar 10 16:29 HDD_E1_S03_1797259843p2 ------>> NEW DISK
brw-rw---- 1 grid asmadmin 253, 73 Mar 10 16:29 HDD_E1_S03_1797259843p1 ------>> NEW DISK
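
Before deleting the files, the stale map can be confirmed with the standard multipath tool (an optional check, not part of this note's steps):

[root@NODE1 mapper]# multipath -ll | grep -A 3 1211732875        ------>> the old map typically shows failed/faulty paths, or none at all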

Remove the old disk files:


[root@NODE1 mapper]# rm HDD_E1_S03_1211732875p2 HDD_E1_S03_1211732875p1 HDD_E1_S03_1211732875

 

STEP 2.

--- Remove entries from "/dev/mpath" ---


[root@NODE1 mpath]# ls -altr *S03*
lrwxrwxrwx 1 root root 8 Mar 27  2013 HDD_E1_S03_1211732875 -> ../dm-24       ------>> OLD DISK
lrwxrwxrwx 1 root root 8 Mar 27  2013 HDD_E1_S03_1211732875p2 -> ../dm-31   ------>> OLD DISK
lrwxrwxrwx 1 root root 8 Mar 27  2013 HDD_E1_S03_1211732875p1 -> ../dm-30   ------>> OLD DISK
lrwxrwxrwx 1 root root 8 Mar 10 13:15 HDD_E1_S03_1797259843p1 -> ../dm-73  ------>> NEW DISK
lrwxrwxrwx 1 root root 8 Mar 10 13:16 HDD_E1_S03_1797259843 -> ../dm-72     ------>> NEW DISK
lrwxrwxrwx 1 root root 8 Mar 10 13:16 HDD_E1_S03_1797259843p2 -> ../dm-74  ------>> NEW DISK

[root@NODE1 mpath]# rm HDD_E1_S03_1211732875 HDD_E1_S03_1211732875p1 HDD_E1_S03_1211732875p2

rm: remove symbolic link `HDD_E1_S03_1211732875'? y
rm: remove symbolic link `HDD_E1_S03_1211732875p1'? y
rm: remove symbolic link `HDD_E1_S03_1211732875p2'? y

 
STEP 3.
 

--- Remove device information from "/dev" ---


[root@NODE1 dev]# rm dm-24 dm-30 dm-31
rm: remove block special file `dm-24'? y
rm: remove block special file `dm-30'? y
rm: remove block special file `dm-31'? y
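
A quick check that only the new disk's entries remain (a sketch based on the slot name used in this example):

[root@NODE1 dev]# ls /dev/mapper/*S03* /dev/mpath/*S03*          ------>> only the HDD_E1_S03_1797259843 entries should be listed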

 

  

OR

 

Option 3:

  Apply the solution, which is not specific to ODA, from Document 1485163.1, "Disks of Dismounted Diskgroup Are Still Hold / Lock By Oracle Process on 11.2.0.3".
Note that this workaround can also be used on ODA versions higher than 11.2.0.3.
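The detailed steps are in the referenced note. As background, Bug 14223113 describes the crsd.bin process still holding the dropped disk open; which process holds a stale device can be checked with standard tools (a sketch, not reproduced from the note):

[root@NODE1 ~]# lsof /dev/mapper/HDD_E1_S03_1211732875           ------>> lists any process (for example crsd.bin) still holding the stale device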

  

Steps to verify:

Verify the following to confirm that the old disk information has been removed.

1. There are no more I/O errors in the vgs output:

[root@NODE1 dev]# vgs
  VG          #PV #LV #SN Attr   VSize   VFree
  VolGroupSys   1   4   0 wz--n- 465.66G 251.66G

 

2. Check the v$asm_disk output; the old disk path should no longer be listed.

3. Check /var/log/messages; the I/O errors should have stopped.

4. Check the ASM alert log; there should be no new I/O errors.
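
A sketch of commands for checks 2 through 4 (the ASM alert log path assumes the standard ADR layout shown in the symptoms above; adjust the instance name as needed):

SQL> select path from v$asm_disk where group_number = 0;         ------>> check 2: the old disk path should no longer appear

[root@NODE1 ~]# grep -i "i/o error" /var/log/messages | tail     ------>> check 3: no new entries for the old device
[root@NODE1 ~]# tail -100 /u01/app/grid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log | grep ORA-
                                                                 ------>> check 4: no new ORA-27061 / ORA-15080 errors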

 

References

<NOTE:1485163.1> - Disks of Dismounted Diskgroup Are Still Hold / Lock By Oracle Process on 11.2.0.3
<BUG:13869294> - DISMOUNTING DISKGROUP IN ASM BUT DEVICE STILL IN USE BY AN ASM PROCESS
<BUG:14223113> - ASM DISK NOT RELEASED BY CRSD.BIN PROCESS AFTER DROPPING DISK
<NOTE:1644043.1> - ODA : 24 Extra Disk paths exists in ASM with group_number 0 and without suffix p1 or p2
<NOTE:1981125.1> - Oracle Database Appliance (ODA) Reference to Disk and Storage Issue Notes

Attachments
This solution has no attachment