1665754.1 : ODA (Oracle Database Appliance): GI Patching Troubleshooting
Solution Type: Predictive Self-Healing Sure Solution
Applies to:
Oracle Database Appliance - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance Software - Version 2.2.0.0 to 2.10.0.0
Information in this document applies to any platform.
***Checked for relevance on 03-AUG-2016***

Purpose
There are cases in which the GI patching fails and the customer needs to reapply the GI patch.

Scope
Basic Grid Infrastructure and oakcli knowledge is required.

Details

GI Patching Troubleshooting

Acronyms, Terms and Procedures Used in This Note
Refer to Note 1374275.1 for abbreviations, acronyms, terms and procedures used in this note.

Relevant Log and Trace Locations

Location of the OAK GI patching logs/traces
Relevant OAK GI patching phase logs/traces are the following:
- Output of the following commands (from both nodes):
oakcli show version -detail
oakcli show disk
- Files from /opt/oracle/oak/log/<nodename>/patch/<patch_version>/*, i.e.:
/opt/oracle/oak/log/odanode1/patch/2.9.0.0.0/*
- Files from /opt/oracle/oak/pkgrepos/System/<patch_version>/bin/tmp/*, i.e.:
/opt/oracle/oak/pkgrepos/System/2.9.0.0.0/bin/tmp/*
- Files from /opt/oracle/oak/onecmd/tmp whose timestamp matches the patch date:
ls -lt /opt/oracle/oak/onecmd/tmp | grep '<timestamp path date>'
i.e.: ls -lt /opt/oracle/oak/onecmd/tmp | grep 'Apr 24'

Location of the runInstaller (OUI) logs/traces
- Collect the following file: /u01/app/oraInventory/ContentsXML/inventory.xml
- From both nodes collect the output of the command: opatch lsinventory -detail
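For example, one way to capture this from the 11.2.0.3 GI home used elsewhere in this note (a sketch only — the home path and the output file name are illustrative, adjust them to your environment):
$ su - grid
$ export ORACLE_HOME=/u01/app/11.2.0.3/grid
$ $ORACLE_HOME/OPatch/opatch lsinventory -detail > /tmp/opatch_lsinv_$(hostname).txt
Repeat from the new GI home and on both nodes.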
Location of the GI logs/traces
- Files from /u01/app/<GI version>/grid/cfgtoollogs/* from both old and new GI homes, i.e.:
/u01/app/11.2.0.3/grid/cfgtoollogs/*
/u01/app/11.2.0.4/grid/cfgtoollogs/*
- Files from /u01/app/<GI version>/grid/log/<nodename>/* from both old and new GI homes, i.e.:
/u01/app/11.2.0.3/grid/log/odanode1/*
/u01/app/11.2.0.4/grid/log/odanode1/*
Note: starting with GI 12.1 the logs are under:
/u01/app/grid/crsdata/<nodename>/*
/u01/app/grid/diag/crs/<nodename>/crs/trace/*
- Files from /u01/app/<GI version>/grid/install/* from the new GI home, i.e.:
/u01/app/11.2.0.4/grid/install/*
- Files from /u01/app/<GI version>/grid/inventory/* from both old and new GI homes, i.e.:
/u01/app/11.2.0.3/grid/inventory/*
/u01/app/11.2.0.4/grid/inventory/*
- Output of the following command: cat /etc/oracle/olr.loc
- The ASM alert.log from both nodes, i.e.:
(node1) /u01/app/grid/diag/asm/+asm/+ASM1/trace/
(node2) /u01/app/grid/diag/asm/+asm/+ASM2/trace/
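If the GIupdiag.sh script mentioned below is not at hand, a minimal manual collection along the same lines is sketched here (run as root on each node; the 11.2.0.3/11.2.0.4 GI homes and the archive name are only examples — adjust them to your homes and patch version):
tar czf /tmp/gi_patch_logs_$(hostname)_$(date +%Y%m%d%H%M).tar.gz \
  /opt/oracle/oak/log/$(hostname)/patch \
  /opt/oracle/oak/onecmd/tmp \
  /u01/app/oraInventory/ContentsXML/inventory.xml \
  /u01/app/11.2.0.3/grid/cfgtoollogs /u01/app/11.2.0.4/grid/cfgtoollogs \
  /u01/app/11.2.0.3/grid/log /u01/app/11.2.0.4/grid/log \
  /etc/oracle/olr.loc \
  /u01/app/grid/diag/asm 2>/dev/null
Directories that do not exist on your release are simply left out of the archive; upload the archive together with the command outputs listed above.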
Note: to help collect all of the above logs/traces with a single command, you can run the attached bash script GIupdiag.sh on both nodes:
# ./GIupdiag.sh -h
Usage: GIupDiag.sh <Patching Date, format YYYYMMDD>
i.e.: GIupDiag.sh 20120928   # it will collect the logs/traces newer than that day
Note: the default <Patching Date> is 20110101 (a big dump is expected).
GIupdiag.sh creates a compressed file under /tmp/GIupDiag_<hostname>_<timestamp>.tar.gz

Case Studies
1. GI update is failing due to invalid response file
During the GI update you are getting an error message like the following:
(...)
and in /u01/app/<GI version>/grid/cfgtoollogs/:
INFO: Createing properties map - in ExtendedPropertyFileFormat.loadPropertiesMap()
Jun 12, 2012 8:08:34 AM oracle.install.commons.util.exception.DefaultErrorAdvisor$AbstractErrorAdvisor getDetailedMessage
SEVERE: [FATAL] [INS-10105] The given response file /opt/oracle/oak/pkgrepos/System/2.2.0.0.0/bin/tmp/grid.rsp is not valid.
CAUSE: Syntactically incorrect response file. Either unexpected variables are specified or expected variables are not specified in the response file.
ACTION: Refer the latest product specific response file template
SUMMARY:
- cvc-datatype-valid.1.2.1: '1521,1522' is not a valid value for 'integer'.
cvc-type.3.1.3: The value '1521,1522' of element 'oracle.install.crs.config.gpnp.scanPort' is not valid.
oracle.install.commons.base.driver.common.InstallerException: [INS-10105] The given response file /opt/oracle/oak/pkgrepos/System/2.2.0.0.0/bin/tmp/grid.rsp is not valid.
at oracle.install.commons.base.driver.common.Installer.validateResponseFile(Installer.java:375)
at oracle.install.commons.base.driver.common.Installer.run(Installer.java:327)
at oracle.install.ivw.common.util.OracleInstaller.run(OracleInstaller.java:106)
at oracle.install.commons.util.Application.startup(Application.java:891)
at oracle.install.commons.flow.FlowApplication.startup(FlowApplication.java:165)
at oracle.install.commons.flow.FlowApplication.startup(FlowApplication.java:182)
at oracle.install.commons.base.driver.common.Installer.startup(Installer.java:348)
at oracle.install.ivw.crs.driver.CRSConfigWizard.startup(CRSConfigWizard.java:84)
at oracle.install.ivw.crs.driver.CRSConfigWizard.main(CRSConfigWizard.java:91)
Caused by: java.lang.Exception: cvc-datatype-valid.1.2.1: '1521,1522' is not a valid value for 'integer'.
cvc-type.3.1.3: The value '1521,1522' of element 'oracle.install.crs.config.gpnp.scanPort' is not valid.
at oracle.install.commons.util.XmlSupport.validate(XmlSupport.java:110)
at oracle.install.commons.bean.xml.XmlBeanStoreFormat.validate(XmlBeanStoreFormat.java:201)
at oracle.install.commons.bean.xml.PropertyFileFormat.validate(PropertyFileFormat.java:144)
at oracle.install.commons.base.driver.common.Installer.validateResponseFile(Installer.java:373)
... 8 more
Jun 12, 2012 8:08:34 AM oracle.install.commons.util.exception.DefaultErrorAdvisor$AbstractErrorAdvisor advise
INFO: Advice is ABORT
Jun 12, 2012 8:08:34 AM oracle.install.commons.util.exception.DefaultExceptionHandler handleException
SEVERE: Unconditional Exit
Jun 12, 2012 8:08:34 AM oracle.install.commons.util.ExitStatusSet add
INFO: Adding ExitStatus FAILURE to the exit status set

In this case you should follow the note ODA (Oracle Database Appliance): GI update is failing with oraInventory corruption (Doc ID 1466664.1).
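Before applying that fix, a quick sanity check (a sketch, using the response file path from the trace above) confirms whether the generated grid.rsp really carries the malformed scanPort value — a single integer such as 1521 is expected there:
grep 'oracle.install.crs.config.gpnp.scanPort' /opt/oracle/oak/pkgrepos/System/2.2.0.0.0/bin/tmp/grid.rsp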
2. Out of place upgrade failing
The "out of place" upgrade has the following procedure:
Step 1) Create a new GI home.
Step 2) Run clone.pl from the new GI home.
Step 3) Stop the CRS stack.
Step 4) Run gi_home/crs/config/config.sh from the new GI home.
Step 5) Run rootupgrade.sh from the new GI home.
Step 6) Update the inventory using:
runInstaller -updateNodeList ORACLE_HOME=new_gi_home CRS=\"false\" -local
runInstaller -updateNodeList ORACLE_HOME=new_gi_home CRS=\"true\" -local
Step 7) Detach the old gi_home:
detachHome.sh -silent -local
In some cases, if something goes wrong during step 2, 4 or 5, the inventory has already been updated with the new GI home, detaching the old one. In this case you should check:
- the active GI version, issuing the command (from both nodes):
crsctl query crs activeversion
- the GI software version, issuing the command (from both nodes):
crsctl query crs softwareversion
- whether the inventory entry is pointing to the right GI home (on both nodes): see /u01/app/oraInventory/ContentsXML/inventory.xml
Example: the active CRS is 11.2.0.3.6 but the inventory is pointing to the new GI home, which is 11.2.0.4.0:
# crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.3.6]

# cat /u01/app/oraInventory/ContentsXML/inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2011, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>11.2.0.3.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraGrid11gR4" LOC="/u01/app/11.2.0.4/grid" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="rwsoda309c1n1"/>
      <NODE NAME="rwsoda309c1n2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb11204_home1" LOC="/u01/app/oracle/product/11.2.0.4/dbhome_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="rwsoda309c1n1"/>
      <NODE NAME="rwsoda309c1n2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb11203_home1" LOC="/u01/app/oracle/product/11.2.0.3/dbhome_1" TYPE="O" IDX="3">
   <NODE_LIST>
      <NODE NAME="rwsoda309c1n1"/>
      <NODE NAME="rwsoda309c1n2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

Note also that in the inventory the OLD home can be marked as "REMOVED":
<HOME NAME="OraGrid11gR3" LOC="/u01/app/11.2.0.3/grid" TYPE="O" IDX="1" REMOVED="T"/>
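A quick way to put the two pieces of information side by side on each node (a sketch; the inventory path is the standard one used throughout this note):
crsctl query crs activeversion
grep 'CRS="true"' /u01/app/oraInventory/ContentsXML/inventory.xml
If the home flagged CRS="true" does not match the active clusterware version, you are in the situation described above.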
If the failure is at step 5 (rootupgrade.sh failing) because the OCR is inaccessible:
crsctl query crs activeversion failed with PROC-26 Error while accessing the physical storage
ORA-29701: unable to connect to Cluster Synchronization Service
You could try a reboot of both nodes. If this does not help, investigate whether any disk has mode_status missing, header_status unknown, or mount_status offline/closed. In this case you should collect the output of the following commands:
oakcli show disk
and a SQL query against the ASM instance, executed as the grid user connected as SYSASM:
set pages 40000
set lines 300
col PATH for a40
SELECT GROUP_NUMBER,DISK_NUMBER,MOUNT_STATUS,HEADER_STATUS,MODE_STATUS,STATE,OS_MB,TOTAL_MB,FREE_MB,NAME,FAILGROUP,PATH FROM V$ASM_DISK order by path;
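For reference, one way to run that query (a sketch assuming the 11.2.0.3 GI home and the +ASM1 instance of node 1 — adjust ORACLE_HOME and ORACLE_SID to your node):
$ su - grid
$ export ORACLE_HOME=/u01/app/11.2.0.3/grid
$ export ORACLE_SID=+ASM1
$ $ORACLE_HOME/bin/sqlplus / as sysasm
SQL> -- paste the V$ASM_DISK query above at this prompt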
Moreover, you should provide the logs/traces described in the "Location of the OAK GI patching logs/traces" section.

3. During GI update the rootupgrade.sh did not complete successfully (OC4J resource failed to start)
During the GI update process you are getting an error message like:
(...)
INFO: 2014-08-14 06:31:22: Running root scripts
ERROR : Ran '/usr/bin/ssh -l root oda2 /u01/app/12.1.0.2/grid/rootupgrade.sh' and it returned code(25) and output is:
Check /u01/app/12.1.0.2/grid/install/root_oda2_2014-08-14_06-43-45.log for the output of root script
error at<Command = /usr/bin/ssh -l root oda2 /u01/app/12.1.0.2/grid/rootupgrade.sh> and errnum=<25>
ERROR : Command = /usr/bin/ssh -l root oda2 /u01/app/12.1.0.2/grid/rootupgrade.sh did not complete successfully. Exit code 25 #Step -1#
Exiting...
..........done
INFO: GI patching summary on node: zaoda1
INFO: GI patching summary on node: zaoda2
INFO: Running post-install scripts
..........done
INFO: Started Oakd
INFO: Setting up the SSH
..........done

Checking the above log, in this example "/u01/app/12.1.0.2/grid/install/root_oda2_2014-08-14_06-43-45.log", you observe the following:
(...)
Started to upgrade the Oracle Clusterware. This operation may take a few minutes.
Started to upgrade the OCR.
Started to upgrade the CSS.
The CSS was successfully upgraded.
Started to upgrade Oracle ASM.
Started to upgrade the CRS.
The CRS was successfully upgraded.
Successfully upgraded the Oracle Clusterware.
Oracle Clusterware operating version was successfully set to 12.1.0.2.0
2014/08/14 06:52:35 CLSRSC-479: Successfully set Oracle Clusterware active version
2014/08/14 06:52:38 CLSRSC-476: Finishing upgrade of resource types
2014/08/14 06:53:05 CLSRSC-482: Running command: 'upgrade model -s 11.2.0.4.0 -d 12.1.0.2.0 -p last'
2014/08/14 06:53:05 CLSRSC-477: Successfully completed upgrade of resource types
2014/08/14 07:03:19 CLSRSC-1003: Failed to start resource OC4J
2014/08/14 07:03:20 CLSRSC-1007: Failed to start OC4J resource
Died at /u01/app/12.1.0.2/grid/crs/install/crsupgrade.pm line 4214.
The command '/u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib -I/u01/app/12.1.0.2/grid/crs/install
However, checking the CRS versions shows that the clusterware upgrade actually completed:
# /u01/app/12.1.0.2/grid/bin/crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [12.1.0.2.0]
# /u01/app/12.1.0.2/grid/bin/crsctl query crs softwareversion
Oracle Clusterware version on node [zaoda1] is [12.1.0.2.0]
and "oakcli show version -detail" is also showing the right entry:
# oakcli show version -detail | grep GI
GI_HOME 12.1.0.2.0 Up-to-date

In this case you should modify the inventory '/u01/app/oraInventory/ContentsXML/inventory.xml':
# cat /u01/app/oraInventory/ContentsXML/inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2014, Oracle and/or its affiliates. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>12.1.0.2.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraGrid11gR4" LOC="/u01/app/11.2.0.4/grid" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb11204_home1" LOC="/u01/app/oracle/product/11.2.0.4/dbhome_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraGrid12102" LOC="/u01/app/12.1.0.2/grid" TYPE="O" IDX="3">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

The CRS active home should be the new one, in this case "OraGrid12102", and the old one should be marked as removed.
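As an alternative to hand-editing the XML, the CRS="true" flag can normally be moved with runInstaller -updateNodeList, the same mechanism used in step 6 of the out-of-place procedure and in Doc ID 1053393.1. The commands below are only a sketch for this example (run on both nodes as the grid software owner, after backing up the inventory):
$ cp /u01/app/oraInventory/ContentsXML/inventory.xml /tmp/inventory.xml.bak
$ /u01/app/12.1.0.2/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/12.1.0.2/grid CRS=true -local
$ /u01/app/11.2.0.4/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0.4/grid CRS=false -local
Marking the old home REMOVED="T" may still require a detach or the manual edit shown below; when in doubt, follow the hand-edited example.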
The right inventory should look as follows:
# cat /u01/app/oraInventory/ContentsXML/inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2014, Oracle and/or its affiliates. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>12.1.0.2.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraGrid11gR4" LOC="/u01/app/11.2.0.4/grid" TYPE="O" IDX="1" REMOVED="T">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb11204_home1" LOC="/u01/app/oracle/product/11.2.0.4/dbhome_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraGrid12102" LOC="/u01/app/12.1.0.2/grid" TYPE="O" IDX="3" CRS="true">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

4. During GI update the rootupgrade.sh did not complete successfully (OC4J resource failed to stop)
You were performing the GI upgrade from version 11.2.0.3.6 (ODA v. 2.6) to 12.1.0.2 and you get the following error messages (/opt/oracle/oak/log/<hostname>/patch/12.1.2.0.0/gidbupdate_xxxx.log):
(...)
2014-10-28 09:55:14: Running config.sh
2014-10-28 09:55:14: INFO : Building up the config.sh response file...
2014-10-28 09:55:15: INFO : This is root, will become grid and run: /bin/su grid -c /usr/bin/ssh -l grid srv-odabenn1 /opt/oracle/oak/onecmd/tmp/gridconfig.sh
2014-10-28 09:55:15: INFO : Running on the local node: /bin/su grid -c /opt/oracle/oak/onecmd/tmp/gridconfig.sh
2014-10-28 09:56:25: INFO: Running root scripts
2014-10-28 09:56:25: INFO : Running /u01/app/12.1.0.2/grid/rootupgrade.sh on srv-odabenn1
2014-10-28 09:56:26: INFO : Running as root: /usr/bin/ssh -l root srv-odabenn1 /u01/app/12.1.0.2/grid/rootupgrade.sh
2014-10-28 10:11:21: ERROR : Ran '/usr/bin/ssh -l root srv-odabenn1 /u01/app/12.1.0.2/grid/rootupgrade.sh' and it returned code(25) and output is: Check /u01/app/12.1.0.2/grid/install/root_srv-odabenn1_2014-10-28_09-56-26.log for the output of root script
2014-10-28 10:11:21: ERROR : Command = /usr/bin/ssh -l root srv-odabenn1 /u01/app/12.1.0.2/grid/rootupgrade.sh did not complete successfully. Exit code 25 #Step -1#

Checking the above log, in this example "/u01/app/12.1.0.2/grid/install/root_srv-odabenn1_2014-10-28_09-56-26.log", you observe the following:
(...)
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully taken the backup of node specific configuration in OCR.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
2014/10/28 10:11:17 CLSRSC-1009: failed to stop resource OC4J
2014/10/28 10:11:18 CLSRSC-1006: Failed to create the wallet APPQOSADMIN or associated users during upgrade.
Died at /u01/app/12.1.0.2/grid/crs/install/crsupgrade.pm line 4094.
The command '/u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib -I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -upgrade' execution failed at this point on node 0

Checking the CRS versions on node 0:
$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.3.0]
$ crsctl query crs softwareversion
Oracle Clusterware version on node [srv-odabenn1] is [12.1.0.2.0]
and on node 1:
$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.3.0]
$ crsctl query crs softwareversion
Oracle Clusterware version on node [srv-odabenn1] is [11.2.0.3.0]

Then you need to manually upgrade the GI:
1. Stop the running databases (you may need to use sqlplus and not srvctl).
2. Complete the GI upgrade on node 0, as root:
# cd <12.1.0.2 GRID_HOME>
# ./rootupgrade.sh
3. Check the CRS active version on node 0:
$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [12.1.0.2.0]
$ crsctl query crs softwareversion
Oracle Clusterware version on node [srv-odabenn1] is [12.1.0.2.0]
4. Perform the GI upgrade on node 1, as root:
# cd <12.1.0.2 GRID_HOME>
# ./rootupgrade.sh
5. Check the CRS active version on node 1:
$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [12.1.0.2.0]
$ crsctl query crs softwareversion
Oracle Clusterware version on node [srv-odabenn1] is [12.1.0.2.0]
6. Change inventory.xml on both nodes so that the 12.1.0.2 grid home is the one with CRS="true":
# cat /u01/app/oraInventory/ContentsXML/inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2014, Oracle and/or its affiliates. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>12.1.0.2.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraGrid11gR3" LOC="/u01/app/11.2.0.3/grid" TYPE="O" IDX="1" REMOVED="T">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb11204_home1" LOC="/u01/app/oracle/product/11.2.0.4/dbhome_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraGrid12102" LOC="/u01/app/12.1.0.2/grid" TYPE="O" IDX="3" CRS="true">
   <NODE_LIST>
      <NODE NAME="zaoda1"/>
      <NODE NAME="zaoda2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

7. Check "/opt/oracle/oak/install/oakdrun" on both nodes; you should have the following entry:
# cat /opt/oracle/oak/install/oakdrun
start

5. During GI update the rootupgrade.sh did not complete successfully (ASM not able to start up)
During the GI update process you are getting an error message like:
(...)
SUCCESS: All nodes in /opt/oracle/oak/onecmd/tmp/db_nodes are pingable and alive.
INFO: 2014-07-24 01:39:40: Installing GI clone
INFO: 2014-07-24 01:51:49: Running root scripts
ERROR : Ran '/usr/bin/ssh -l root jupiter1 /u01/app/11.2.0.4/grid/rootupgrade.sh' and it returned code(25) and output is: Check /u01/app/11.2.0.4/grid/install/root_oda1_2014-07-24_01-51-50.log for the output of root script
error at<Command = /usr/bin/ssh -l root oda1 /u01/app/11.2.0.4/grid/rootupgrade.sh> and errnum=<25>
ERROR : Command = /usr/bin/ssh -l root oda1 /u01/app/11.2.0.4/grid/rootupgrade.sh did not complete successfully. Exit code 25 #Step -1#
Exiting...
............done

Checking the related log, in this example "/u01/app/11.2.0.4/grid/install/root_oda1_2014-07-24_01-51-50.log", you observe the following:
(...)
ASM upgrade has initialized on first node.
OLR initialization - successful
Replacing Clusterware entries in inittab
Start of resource "ora.asm" failed
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'oda1'
CRS-2676: Start of 'ora.drivers.acfs' on 'oda1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'oda1'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 14459
Session ID: 143 Serial number: 1
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0.4/grid/log/oda1/agent/ohasd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.asm' on 'oda1' failed
CRS-2679: Attempting to clean 'ora.asm' on 'oda1'
CRS-2681: Clean of 'ora.asm' on 'oda1' succeeded
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'oda1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'oda1' succeeded
CRS-4000: Command Start failed, or completed with errors.
Failed to start Oracle Grid Infrastructure stack
Failed to start ASM at /u01/app/11.2.0.4/grid/crs/install/crsconfig_lib.pm line 1340.
/u01/app/11.2.0.4/grid/perl/bin/perl -I/u01/app/11.2.0.4/grid/perl/lib -I/u01/app/11.2.0.4/grid/crs/install /u01/app/11.2.0.4/grid/crs/install/rootcrs.pl execution failed

At this point the rootupgrade.sh script could not be completed, and the GI cannot be started from the old (i.e. 11.2.0.3) GI home either. In order to restore the Grid Infrastructure you need to issue the following command from the new GI home (11.2.0.4 in this particular case):
<new GI home>/crs/install/rootcrs.pl -downgrade -force -oldcrshome <old gi home path> -version <old gi version>
i.e.: /u01/app/11.2.0.4/grid/crs/install/rootcrs.pl -downgrade -force -oldcrshome /u01/app/11.2.0.3/grid -version 11.2.0.3.0

Check the OCR and inspect the ocrdump key [SYSTEM.version.hostnames] to make sure the software version for the existing GI doesn't change:
ocrdump -stdout -keyname SYSTEM.version.hostnames | grep ORATEXT
ORATEXT : 11.2.0.4.0
ORATEXT : 11.2.0.4.0
If the OCR got changed, it can be restored from the OCR backup files located under <old GI home>/cdata/<cluster_name>:
ocrconfig -restore <old GI home>/cdata/<cluster_name>/backup00.ocr
i.e.: ocrconfig -restore /u01/app/11.2.0.3/grid/cdata/oda1-c/backup00.ocr
(available backups are backup00.ocr, day.ocr, week.ocr)

After the "-downgrade" command issued above, the Grid Infrastructure on the failing node is left unconfigured. You should execute the rmnode/addnode procedure from the working node.
- From the working node (the one you are not deleting), run the following command from the GI home:
crsctl delete node -n <node_to_be_deleted>
- To add the node back:
$ ./addNode.sh "CLUSTER_NEW_NODES={<node_name>}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={<node VIP hostname>}"
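A minimal invocation sketch for the add-node step (assuming the new 11.2.0.4 GI home and the grid software owner; "odanode2" and "odanode2-vip" are placeholder names to replace with your own):
$ su - grid
$ cd /u01/app/11.2.0.4/grid/oui/bin
$ ./addNode.sh "CLUSTER_NEW_NODES={odanode2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={odanode2-vip}"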
Once you have fixed the reason why ASM was not able to start up, you can proceed with the Grid Infrastructure update one more time.
6. Successful GI upgrade but ASM is crashing with ORA-600 [kfdJoin3]
In this case study the GI upgrade went fine but ASM is crashing with ORA-600 [kfdJoin3]. As reported in Note 888888.1 this is a known issue and the solution is quite simple. You should update the following three files:
1.) "/opt/oracle/oak/onecmd/asmapplconf_header.txt"
2.) "/opt/oracle/oak/onecmd/asmapplconf_header_V2_J2.txt"
3.) "/opt/oracle/extapi/asmappl.config"
In all the above files, set the value of the attribute max_disk_count from 500 to 100. Then ASM will be able to start up.
7. GI upgrade failure after a previous failure
There was a failure while updating the Grid Infrastructure ("--gi"). Trying to run the upgrade one more time, it fails with the following error:
ERROR : Ran '/usr/bin/ssh -l root odanode1 /opt/oracle/oak/onecmd/tmp/giclonepl.sh' and it returned code(255) and output is:
rm -f oracle dbv tstshm maxmem orapwd dbfsize cursize genoci extproc extproc32 hsalloci hsots hsdepxa dgmgrl dumpsga mapsga osh sbttest expdp impdp imp exp sqlldr rman /u01/app/11.2.0.4/grid/rdbms/lib/dg4odbc mkpatch /u01/app/11.2.0.4/grid/rdbms/lib/dg4adbs /u01/app/11.2.0.4/grid/rdbms/lib/dg4db2 /u01/app/11.2.0.4/grid/rdbms/lib/dg4ifmx /u01/app/11.2.0.4/grid/rdbms/lib/dg4ims /u01/app/11.2.0.4/grid/rdbms/lib/dg4msql /u01/app/11.2.0.4/grid/rdbms/lib/dg4sybs /u01/app/11.2.0.4/grid/rdbms/lib/dg4tera /u01/app/11.2.0.4/grid/rdbms/lib/dg4vsam nid adrci wrc extjob extjobo jssu genezi kfod amdu kfed grdcscan uidrvci diskmon setasmgid renamedg orion skgxpinfo /u01/app/11.2.0.4/grid/rdbms/lib/ksms.s /u01/app/11.2.0.4/grid/rdbms/lib/ksms.o
(if /u01/app/11.2.0.4/grid/bin/skgxpinfo | grep rds;\
then \
  make -f /u01/app/11.2.0.4/grid/rdbms/lib/ins_rdbms.mk ipc_rds; \
else \
  make -f /u01/app/11.2.0.4/grid/rdbms/lib/ins_rdbms.mk ipc_g; \
fi)
make[1]: Entering directory `/u01/app/11.2.0.4/grid/rdbms/lib'
rm -f /u01/app/11.2.0.4/grid/lib/libskgxp11.so
cp /u01/app/11.2.0.4/grid/lib//libskgxpg.so /u01/app/11.2.0.4/grid/lib/libskgxp11.so
make[1]: Leaving directory `/u01/app/11.2.0.4/grid/rdbms/lib'
 - Use stub SKGXN library
cp /u01/app/11.2.0.4/grid/lib/libskgxns.so /u01/app/11.2.0.4/grid/lib/libskgxn2.so
/usr/bin/ar cr /u01/app/11.2.0.4/grid/rdbms/lib/libknlopt.a /u01/app/11.2.0.4/grid/rdbms/lib/kcsm.o
Background process 3868 (node: odanode2) gets done with the exit code 255
Background process 3845 (node: odanode1) gets done with the exit code 255
ERROR : Failure in copying /opt/oracle/oak/onecmd/tmp/giclonepl.sh to DB nodes and executing it as root in parallel
Exiting..

Due to the previous failure, the new grid home has already been created, so the new upgrade process (cloning the GI home) fails. In giclonepl.sh-xxxx.log you can see an error similar to the one below after the failure:
You can find the log of this install session at:
/u01/app/oraInventory/logs/cloneActions2014-01-11_12-31-24PM.log
OUI-10197:Unable to create a new Oracle Home at /u01/app/11.2.0.4/grid. Oracle Home already exists at this location. Select another location.
SEVERE:OUI-10197:Unable to create a new Oracle Home at /u01/app/11.2.0.4/grid. Oracle Home already exists at this location. Select another location.
(...)

To resolve the issue follow the steps given below.
1. Remove the new Grid Home on both nodes:
rm -fr /u01/app/<new GI home>
i.e.: rm -fr /u01/app/11.2.0.4
2. Replace the central inventory ("/u01/app/oraInventory/ContentsXML/inventory.xml") on both nodes from the last backup, or execute the following commands on both nodes (replace nodename1,nodename2 with your ODA host node names):
export OLD_GI_HOME=/u01/app/11.2.0.3/grid
export NEW_GI_HOME=/u01/app/11.2.0.4/grid

export ORACLE_HOME=$NEW_GI_HOME
$OLD_GI_HOME/oui/bin/runInstaller -detachHome -silent -local ORACLE_HOME=$NEW_GI_HOME

export ORACLE_HOME=$OLD_GI_HOME
$OLD_GI_HOME/oui/bin/runInstaller -attachHome -silent -local ORACLE_HOME=$OLD_GI_HOME ORACLE_HOME_NAME=OraGrid11gR3 "CLUSTER_NODES=nodename1,nodename2" CRS=true
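Before rerunning the patch, a quick sanity check (a sketch) that the old 11.2.0.3 home is now the one flagged CRS="true" on both nodes — the matching line should reference LOC="/u01/app/11.2.0.3/grid":
grep 'CRS="true"' /u01/app/oraInventory/ContentsXML/inventory.xml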
3. Rerun the GI update command:
oakcli update -patch <patch_number> --gi

References
NOTE:1374275.1 - Oracle Clusterware (GI or CRS) Related Abbreviations, Acronyms and Procedures
NOTE:888888.1 - Oracle Database Appliance - 12.1.2 and 2.X Supported ODA Versions & Known Issues
NOTE:1557502.1 - ODA (Oracle Database Appliance) troubleshooting and solutions for ORA-600 [kfdjoin3] causing ASM startup failure after patching to 2.5 or 2.6
NOTE:1053393.1 - How to Update Inventory to Set/Unset "CRS=true" Flag for Oracle Clusterware Home
NOTE:1466664.1 - ODA (Oracle Database Appliance): GI update is failing with oraInventory corruption
NOTE:1056322.1 - Troubleshoot Grid Infrastructure/RAC Database installer/runInstaller Issues
NOTE:1364947.1 - How to Proceed When Upgrade to 11.2 Grid Infrastructure Cluster Fails
NOTE:1513912.1 - TFA Collector - Tool for Enhanced Diagnostic Gathering
BUG:18292186 - LNX64-112-CMT: OUT-OF-PLACE PATCHING GI, LOG AND ERROR CHECKING IMPROVEMENT
BUG:18276205 - ODA: FAILED TO UPGRADE GI HOME
BUG:19444164 - LNX64-112-CMT: GI UPGRADE FAILURE DUE TO OC4J
BUG:18149174 - 2.2.0.0.0 NOT UPGRADED GI_HOME 11.2.0.2.5(13343424, 11.2.0.3.2(13696216,1334344
BUG:18456643 - AMDU ERRORS OUT IF A DISK WHICH WAS NOT PART OF A DISKGROUP WAS NOT READABLE
BUG:19280211 - ODA: OAKCLI UPDATE -PATCH 2.10.0.0.0 --GI FAILED WITH ASM ERRORS
BUG:14151562 - LNX64-112-CMT: NEED TO CLEANUP THE INVENTORY ENTRIES IF GI/DB UPGRADE IS FAILED

Attachments
This solution has no attachment.