Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2065338.1
Update Date:2015-10-12

Solution Type  Problem Resolution Sure

Solution  2065338.1 :   ODA: ACFS Volumes Are Not Mounting After Node Reboot Due to ORA-15477: Cannot Communicate with the Volume Driver  


Related Items
  • Oracle Database - Enterprise Edition
  • Oracle Database Appliance X5-2
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Database Appliance>DB: ODA_EST




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-11463923061>

Applies to:

Oracle Database Appliance X5-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database - Enterprise Edition - Version 11.2.0.1 to 12.1.0.2 [Release 11.2 to 12.1]
Information in this document applies to any platform.

Symptoms

 
1) After the ODA nodes are rebooted, the ADVM volume resources report a state of UNKNOWN and the ACFS file systems cannot be mounted:

[root@asmcloud1 ~]# crsctl stat res -t  
ora.DATA.DATASTORE.advm
              ONLINE  UNKNOWN      asmcloud1               STABLE
              ONLINE  UNKNOWN      asmcloud2               STABLE
--
ora.FLASH.FLASHDATA.advm
              ONLINE  UNKNOWN      asmcloud1               STABLE
              ONLINE  UNKNOWN      asmcloud2               STABLE
--
ora.RECO.ACFSVOL.advm
              ONLINE  UNKNOWN      asmcloud1               STABLE
              ONLINE  UNKNOWN      asmcloud2               STABLE
ora.RECO.DATAFSVOL.advm
              ONLINE  UNKNOWN      asmcloud1               STABLE
              ONLINE  UNKNOWN      asmcloud2               STABLE
ora.RECO.DATASTORE.advm
              ONLINE  UNKNOWN      asmcloud1               STABLE
              ONLINE  UNKNOWN      asmcloud2               STABLE
--
ora.REDO.ACLDATSTORE.advm
              ONLINE  UNKNOWN      asmcloud1               STABLE
              ONLINE  UNKNOWN      asmcloud2               STABLE
ora.REDO.DATASTORE.advm
              ONLINE  UNKNOWN      asmcloud1               STABLE
              ONLINE  UNKNOWN      asmcloud2               STABLE
--
ora.proxy_advm
              ONLINE  ONLINE       asmcloud1               STABLE
              ONLINE  ONLINE       asmcloud2               STABLE


[root@asmcloud1 ~]# crsctl stat res -t  
ora.data.datastore.acfs
              ONLINE  OFFLINE      asmcloud1               volume /u02/app/orac
                                                            le/oradata/datastore
                                                            offline,STABLE
              ONLINE  OFFLINE      asmcloud2               volume /u02/app/orac
                                                            le/oradata/datastore
                                                            offline,STABLE
ora.flash.flashdata.acfs
              ONLINE  OFFLINE      asmcloud1               volume /u02/app/orac
                                                            le/oradata/flashdata
                                                            offline,STABLE
              ONLINE  OFFLINE      asmcloud2               volume /u02/app/orac
                                                            le/oradata/flashdata
                                                            offline,STABLE
--
ora.reco.acfsvol.acfs
              ONLINE  OFFLINE      asmcloud1               volume /cloudfs offl
                                                            ine,STABLE
              ONLINE  OFFLINE      asmcloud2               volume /cloudfs offl
                                                            ine,STABLE
ora.reco.datafsvol.acfs
              ONLINE  OFFLINE      asmcloud1               volume /odadatafs of
                                                            fline,STABLE
              ONLINE  OFFLINE      asmcloud2               volume /odadatafs of
                                                            fline,STABLE
ora.reco.datastore.acfs
              ONLINE  OFFLINE      asmcloud1               volume /u01/app/orac
                                                            le/fast_recovery_are
                                                            a/datastore offline,
                                                            STABLE
              ONLINE  OFFLINE      asmcloud2               volume /u01/app/orac
                                                            le/fast_recovery_are
--
ora.redo.datastore.acfs
              ONLINE  OFFLINE      asmcloud1               volume /u01/app/orac
                                                            le/oradata/datastore
                                                            offline,STABLE
              ONLINE  OFFLINE      asmcloud2               volume /u01/app/orac
                                                            le/oradata/datastore
                                                            offline,STABLE

 


2) The ASM alert.log reports the following errors:

.
.
.

Fri Oct 09 10:13:38 2015
ERROR: /* asm agent */
Fri Oct 09 10:19:20 2015
SQL> /* asm agent */
Fri Oct 09 10:19:20 2015
kfvxVolOnOff: Cannot open device file
ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver

Fri Oct 09 10:19:20 2015
ERROR: /* asm agent */
.
.
.

 

3) Starting an ACFS file system manually also fails:

[root@asmcloud1 ~]# srvctl start filesystem -d /dev/asm/datastore-511
PRCR-1079 : Failed to start resource ora.data.datastore.acfs
CRS-2680: Clean of 'ora.DATA.DATASTORE.advm' on 'asmcloud1' failed
CRS-2680: Clean of 'ora.DATA.DATASTORE.advm' on 'asmcloud2' failed



4) The crsd_orarootagent_root.trc trace file reports the "SetGroupAsmAdmin: ERROR: Unable to get GID (34)" and "ORA-15477: cannot communicate with the volume driver" errors:

.
.
.
2015-10-08 09:39:44.270518 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] (:CLSN00010:)asmadmin
2015-10-08 09:39:44.270529 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] (:CLSN00010:)
2015-10-08 09:39:44.270545 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] execCmd ret = 0
2015-10-08 09:39:44.270607 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] SetGroupAsmAdmin: ERROR: Unable to get GID (34).
2015-10-08 09:39:44.270852 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] invalid OCI handle
2015-10-08 09:39:44.270947 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] VolumeAgent::queryVolStatus: stmtExecute connection exception. retrycount=0, excp=invalid OCI handle
.
.
.
2015-10-08 09:39:44.502593 :CLSDYNAM:2392803648: [ora.RECO.ACFSVOL.advm]{1:3841:2} [start] UsmUtils::execCmd: cmd=osdbagrp -a
2015-10-08 09:39:44.502617 :CLSDYNAM:2392803648: [ora.RECO.ACFSVOL.advm]{1:3841:2} [start] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/12.1.0.2/grid
2015-10-08 09:39:44.502624 :CLSDYNAM:2392803648: [ora.RECO.ACFSVOL.advm]{1:3841:2} [start] Utils::getOracleHomeAttrib oracle_home:/u01/app/12.1.0.2/grid
2015-10-08 09:39:44.502639 :CLSDYNAM:2392803648: [ora.RECO.ACFSVOL.advm]{1:3841:2} [start] Utils:execCmd action = 1 flags = 38 ohome = /u01/app/12.1.0.2/grid cmdname = osdbagrp.
2015-10-08 09:39:44.549068 :CLSDYNAM:2386098496: [ora.FLASH.FLASHDATA.advm]{1:3841:2} [start] (:CLSN00010:)asmadmin
2015-10-08 09:39:44.549084 :CLSDYNAM:2386098496: [ora.FLASH.FLASHDATA.advm]{1:3841:2} [start] (:CLSN00010:)
2015-10-08 09:39:44.549097 :CLSDYNAM:2386098496: [ora.FLASH.FLASHDATA.advm]{1:3841:2} [start] execCmd ret = 0
2015-10-08 09:39:44.549168 :CLSDYNAM:2386098496: [ora.FLASH.FLASHDATA.advm]{1:3841:2} [start] SetGroupAsmAdmin: ERROR: Unable to get GID (34).
2015-10-08 09:39:44.549277 :CLSDYNAM:2386098496: [ora.FLASH.FLASHDATA.advm]{1:3841:2} [start] invalid OCI handle
2015-10-08 09:39:44.549316 :CLSDYNAM:2386098496: [ora.FLASH.FLASHDATA.advm]{1:3841:2} [start] VolumeAgent::queryVolStatus: stmtExecute connection exception. retrycount=0, excp=invalid OCI handle
.
.
.
2015-10-08 09:39:54.385404 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] VolumeAgent::start: Enabling volume. enableStmt=ALTER DISKGROUP DATA ENABLE VOLUME DATASTORE  /* asm agent *//* {1:3841:2} */
2015-10-08 09:39:54.385465 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] InstConnectionRoot:connectInt connected
2015-10-08 09:39:54.403292 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver

2015-10-08 09:39:54.403336 :CLSDYNAM:2375592256: [ora.DATA.DATASTORE.advm]{1:3841:2} [start] VolumeAgent::start excp ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver
.
.
.



5) However, the ACFS/ADVM drivers are loaded, installed, and supported on both nodes:

[grid@asmcloud1 trace]$ acfsdriverstate loaded
ACFS-9203: true
[grid@asmcloud1 trace]$ acfsdriverstate installed
ACFS-9203: true
[grid@asmcloud1 trace]$ acfsdriverstate supported
ACFS-9200: Supported
[grid@asmcloud1 trace]$ acfsdriverstate version
ACFS-9325:     Driver OS kernel version = 2.6.39-400.3.0.el5uek(x86_64).
ACFS-9326:     Driver Oracle version = 150721.
[grid@asmcloud1 trace]$ acfsroot version_check
ACFS-9316: Valid ADVM/ACFS distribution media detected at: '/u01/app/12.1.0.2/grid/usm/install/Oracle/EL5UEK/x86_64/2.6.39-400/2.6.39-400-x86_64/bin'

 
 

[grid@asmcloud2 trace]$ acfsdriverstate loaded
ACFS-9203: true
[grid@asmcloud1 trace]$ acfsdriverstate installed
ACFS-9203: true
[grid@asmcloud1 trace]$ acfsdriverstate supported
ACFS-9200: Supported
[grid@asmcloud1 trace]$ acfsdriverstate version
ACFS-9325:     Driver OS kernel version = 2.6.39-400.3.0.el5uek(x86_64).
ACFS-9326:     Driver Oracle version = 150721.
[grid@asmcloud1 trace]$ acfsroot version_check
ACFS-9316: Valid ADVM/ACFS distribution media detected at: '/u01/app/12.1.0.2/grid/usm/install/Oracle/EL5UEK/x86_64/2.6.39-400/2.6.39-400-x86_64/bin'
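
A minimal sketch for spotting the state shown in symptom 1 without reading the full listing. The function below filters "crsctl stat res -t" output for ".advm" resources whose state is UNKNOWN. It assumes the TARGET / STATE / SERVER column layout shown in the listing above; the function name list_unknown_advm is illustrative and is not part of any Oracle tool.

```shell
# Print "resource node" for every ADVM volume resource in UNKNOWN state.
# Reads `crsctl stat res -t` style output on stdin.
list_unknown_advm() {
    awk '
        # A resource name starts in column 1, e.g. ora.DATA.DATASTORE.advm
        /^ora\./ { res = $1; next }
        # Per-node lines are indented: TARGET STATE SERVER [details]
        res ~ /\.advm$/ && $2 == "UNKNOWN" { print res, $3 }
    '
}

# Example (on a node with the Grid Infrastructure bin directory in PATH):
#   crsctl stat res -t | list_unknown_advm
```

An empty result on both nodes after a reboot would suggest the volumes were enabled normally; any line printed matches the symptom described in this document.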

 

Cause

 

1) This problem is caused by the following bug:

  • Bug 21294273 - ASM VOLUMES GETS DISABLED AFTER EVERY NODE RESTART

 


2) Whenever an OS group with many members precedes the target ASM admin group (e.g. “SS_ASM_GRP = asmadmin”) in “/etc/group”, the function 'SetGroupAsmAdmin' fails with “ERROR: Unable to get GID (34)”.

Example:

[root@asmcloud1 lib]# cat /etc/group
.
.
.
dba1:x:1001:oracle,applmgr,svutukuru,traja,srangasube,pdan,erudolph,dreddy,bka
 ndra,balluri,rkanumury,dsisola,smeka,cakkinepally,ebonovich,root,backupexec,bs
 haw,fakkawi,reiuser,sganapathy,pnalamada,npati,srangasube,vsiricilla,sgopinath
 ,skadiyala,salluri,aharris,cjagarlamudi,pbatraju,prayen,schava,sdarsi,spoosa,s
 kokku,rbodempudi,rthadigoppula,pbodempudi,sbommadi,hksura,rkaranam,pvangala,pp
 alyam,mbatchu,smadala,moreilly,apenumetsa,ssyed,pkothapalli,jsutraye,rkatta,vp
 alivela,ppurra,amote,srakshit,pmaram,saarakatla,rmadoori,rnalabolu,ckarupakula
 ,rajnalla,sgogineni,sballa,smallavarapu,nmamindlapalli,rpasha,creckamp,kmaduri
 ,mraffiuddin,celangovan,serigineni,stalasila,vpilli,smohanty,renuganti
 
asmadmin:x:1100:grid
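
To check whether a node matches the pattern above, the following sketch lists every group that appears before a given target group in a group file, longest entry first. The function name groups_before is hypothetical. As an aside, 34 is the Linux errno value ERANGE, which getgrnam_r() returns when the caller's buffer is too small for a group entry; whether that is what the "(34)" in the trace refers to is an assumption, since this document does not say so.

```shell
# List the groups that precede TARGET in a group file, with the length of
# each entry, longest first. Usage: groups_before FILE TARGET
groups_before() {
    awk -F: -v target="$2" '
        $1 == target { exit }        # stop once the target group is reached
        { print length($0), $1 }     # entry length, then group name
    ' "$1" | sort -rn
}

# Example:
#   groups_before /etc/group asmadmin | head
```

If an unusually long entry such as the dba1 group in the example above shows up at the top of the list, the node is a candidate for the workaround in the Solution section. If the target group is not present in the file, every group is listed.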
 

 

Solution

 

1) As a workaround, move the OS group with many members to the very end of “/etc/group”, or at least after the "asmadmin", "asmoper", and "asmdba" OS groups.


2) As a permanent solution, please apply the ODA patch containing the fix for the following bug:

  • Bug 21294273 - ASM VOLUMES GETS DISABLED AFTER EVERY NODE RESTART

 

Note: The solution provided in this document also applies to non-ODA ACFS configurations.
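
The reordering in step 1 can be sketched as follows. The function writes a reordered copy to standard output and never modifies the input file. The function name move_group_to_end and the /root/group.bak and /root/group.new paths are hypothetical, and the sketch assumes groups are defined in the local /etc/group file rather than in a directory service.

```shell
# Write FILE to stdout with the entry for GROUP moved to the last line.
# Usage: move_group_to_end FILE GROUP
move_group_to_end() {
    awk -F: -v grp="$2" '
        $1 == grp { held = $0; found = 1; next }   # hold back the matching entry
        { print }                                  # pass every other entry through
        END { if (found) print held }              # emit the held entry last
    ' "$1"
}

# Example, working on copies rather than the live file:
#   cp -p /etc/group /root/group.bak
#   move_group_to_end /etc/group dba1 > /root/group.new
#   diff /root/group.bak /root/group.new   # review before replacing /etc/group
```

Review the difference on each node before installing the new file, and keep the backup until the ADVM volumes have come up cleanly after the next restart.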

 


 

References

<BUG:21294273> - ASM VOLUMES GETS DISABLED AFTER EVERY NODE RESTART
<BUG:18768597> - LNX64-121-CMT: ADVM PROXY UNAVAILABLE DURING FIRST VOLUME CREATE

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.