Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2043654.1 : SuperCluster - Infiniband switch reboot may cause database evictions if there are a large number of RDS connections
When a SuperCluster Infiniband switch is rebooted, database nodes may evict on domains with a large number of RDS connections. The problem is due to a timeout threshold being reached before all of the RDS connections can be re-established. A number of factors affect the number of RDS connections. This document describes how to determine the current number of RDS connections on a system, and provides prediction formulae to estimate the likely number of RDS connections after applying this QFSDP. If the prediction formulae indicate that applying this QFSDP would exceed the current safe threshold of RDS connections, file a Service Request and await further advice. The prediction formulae should also be used before deploying additional databases on a system.
Applies to:
SPARC SuperCluster T4-4 Full Rack - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 Half Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster T5-8 Full Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster T5-8 Half Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster M6-32 Hardware - Version All Versions to All Versions [Release All Releases]
Oracle Solaris on SPARC (64-bit)

Symptoms
One or more database nodes may evict due to a diskmon split brain for RAC, or diskmon may fence off cells for non-RAC. If this has occurred, indications similar to the following can be found in the diskmon.trc file after the rebooted switch comes back online:

ossnet_connect_to_box: Giving up on box <IPADDR> as retry limit (7) reached.
ossnet_connect_to_box: Giving up on box <IPADDR> as retry limit (7) reached.
ossnet_connect_to_box: Giving up on box <IPADDR> as retry limit (7) reached.
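As a quick check (not part of the original note), something like the following can confirm whether these retry-limit messages are present. The location of the diskmon trace files varies with the Grid Infrastructure version, and /u01/app is assumed here as the Oracle installation base:

#!/bin/sh
# Sketch only: locate diskmon trace files under an assumed /u01/app
# installation base and report any retry-limit messages logged after
# an Infiniband switch reboot.
find /u01/app -name "diskmon*.trc" 2>/dev/null | while read f
do
    echo "== $f"
    grep "Giving up on box" "$f"
done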
If the impacted database is in a zone, the zone may or may not transition to the state "shutting_down" and possibly hang. This can be observed using the zoneadm command:

# zoneadm list -civ
  ID NAME               STATUS         PATH                          BRAND    IP
   0 global             running        /                             solaris  shared
   5 etc3-exa4dbadm01   shutting_down  /zoneHome/etc3-exa4dbadm01    solaris  excl
  15 etc3-exa5dbadm01   shutting_down  /zoneHome/etc3-exa5dbadm01    solaris  excl
  16 etc3-exa1dbadm01   shutting_down  /zoneHome/etc3-exa1dbadm01    solaris  excl
  17 etc3-exa2dbadm01   running        /zoneHome/etc3-exa2dbadm01    solaris  excl
  18 etc3-exa3dbadm01   running        /zoneHome/etc3-exa3dbadm01    solaris  excl
  19 etc3-exa7dbadm01   running        /zoneHome/etc3-exa7dbadm01    solaris  excl

To fully recover from a hung zone, the domain containing the database zones needs to be rebooted.

Changes
When a SuperCluster Infiniband switch is rebooted, database nodes may evict on domains with a large number of RDS connections. The numbers stated are on a per-LDom (global zone) basis, not cumulative across all LDoms (global zones). The prediction formulae should also be used before deploying additional databases on a system.

Cause
The problem is due to a timeout threshold being reached before all of the RDS connections can be re-established on SuperCluster domains with a large number of RDS connections.

Solution
Incremental improvements have been made in Solaris 11.3 SRU7 (in the July 2016 QFSDP) and SRU11 (in the Oct 2016 QFSDP). The safe RDS connection limit depends on which SRU level is installed.
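Since the behaviour depends on the Solaris SRU level installed in each database domain, one quick way to check (not part of the original note) is to query the 'entire' incorporation package in the domain's global zone; the SRU number is embedded in the branch portion of the version string (for example, a version ending in 0.175.3.11.x.y.z corresponds to Solaris 11.3 SRU 11):

#!/bin/sh
# Sketch: report the Solaris release and the version of the 'entire'
# incorporation, whose branch string indicates the installed SRU level.
uname -v
pkg list entire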
1. To determine the number of RDS connections currently on a system, run the shell script below (conn-cnt.sh) in the global zone of all database domains. If the reported number of RDS connections is below the safe RDS connection limit in each domain, then your system is not currently at risk. Proceed to step 2.

$ cat conn-cnt.sh
#!/bin/sh
#
# gather rds connections from the global zone
rds-info -n > conn_rds.txt
#
# gather rds connections from all non-global zones
zones=`zoneadm list | grep -v global`
for i in $zones
do
    zlogin $i "rds-info -n" >> conn_rds.txt
done
#
# remove headers, loopback entries and blank lines
grep -v "RDS Connections" conn_rds.txt > totalcount1
grep -v "LocalAddr" totalcount1 > totalcount
grep -v "127.0.0.1" totalcount > totalcount1
sed '/^$/d' totalcount1 > totalcount
#
# record the total count (the number only, without the file name)
tc=`wc -l totalcount | awk '{print $1}'`
rm totalcount1 totalcount conn_rds.txt
#
# create a log for offline reference
uname -a > conn_rds.txt
date >> conn_rds.txt
rds-info -n >> conn_rds.txt
zones=`zoneadm list | grep -v global`
for i in $zones
do
    echo $i >> conn_rds.txt
    zlogin $i "rds-info -n" >> conn_rds.txt
    echo " " >> conn_rds.txt
done
#
echo "Total number of rds connections detected : $tc" >> conn_rds.txt
echo "Total number of rds connections detected : $tc"

2) Before applying the July 2015 QFSDP or later, creating more database zones, adding storage expansion or migrating database versions - for example, to 12.1.0.2 - use the following formulae to predict the likely resultant number of RDS connections. If the prediction is above the safe RDS connection limit, log a Service Request for further assistance. On a database domain in every RAC cluster:

B is the number of IP addresses specified in the file /etc/oracle/cell/network-config/cellinit.ora, which can be retrieved as follows:

# grep "^ipaddress" /etc/oracle/cell/network-config/cellinit.ora | wc -l

C is the number of IP addresses specified in the file /etc/oracle/cell/network-config/cellip.ora. Note: there may be more than one IP address per line and each should be counted.

N is the node count in the cluster. Run 'olsnodes' (as the root or grid user) and count the number of nodes:

# $GRID_HOME/bin/olsnodes | wc -l

For 12.1.0.2 databases, the formula is: B(B*N*2 + C*7)
For 11g and 12.1.0.1 database versions, the formula is: B(B*N*2 + C*2)

Sum the above for all database zones under a single domain. If there are no zones, it is simply the sum over all databases running in the domain itself. For example:

rds_count = Zone_1_12.1.0.2[B(B*N*2 + C*7)] + Zone_2_11.2.0.4[B(B*N*2 + C*2)] + Zone_3_12.1.0.1[B(B*N*2 + C*2)]
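As an illustration of the arithmetic (not part of the original note), the following minimal sketch applies the formula for a single database, given values of B, C and N gathered as described above; the script name and parameter handling are hypothetical:

#!/bin/sh
# predict-rds.sh - hypothetical helper, not part of the original note.
# Usage: predict-rds.sh B C N DBVERSION      e.g.  predict-rds.sh 2 14 4 12.1.0.2
# Prints the predicted RDS connection count for a single database, using the
# formulae above.  Sum the results for every database zone (or database, if
# there are no zones) in a domain and compare the total with the safe limit.
B=$1; C=$2; N=$3; VER=$4
case "$VER" in
    12.1.0.2) M=7 ;;    # 12.1.0.2 uses 7 connections per cell IP address
    *)        M=2 ;;    # 11g and 12.1.0.1 use 2 connections per cell IP address
esac
echo "Predicted RDS connections: $(( B * (B * N * 2 + C * M) ))"

With B=2, C=14, N=4 and a 12.1.0.2 database, for example, this gives 2*(2*4*2 + 14*7) = 228 predicted RDS connections for that database.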
This bug was introduced by the putback for BugID 16024464 - QoS - Segregate RDS traffic based on SL (PSARC/2013/237 - IB QOS for RDSv3). The fix, <Bug 22380320> - Multiple worker threads are required for connection scaling (formerly tracked as <Bug 21417505> - rebooting of ib switch causing local zones going to shutting down state), was delivered in Solaris 11.3 SRU7 (in the July 2016 QFSDP). The safe RDS connection limit has been further raised in Solaris 11.3 SRU11, with further enhancements due in Solaris 11.3 SRU14.

Safe RDS Connection Limits:

NOTE: If a customer system has already exceeded the safe RDS connection limit, there are two choices (in the following order of preference):

1) Avoid Infiniband switch reboots
2) Disable the NRM / QOS feature on all 12c RAC nodes

Option 1 is preferable since it does not involve any disruptive changes that would have to be re-enabled once the bug is fixed (which might be forgotten or overlooked). However, it still leaves the customer vulnerable should an Infiniband switch fail. If the customer has experienced leaf switch failures or node evictions in the recent past, then consider implementing option 2.

Disabling NRM / QOS is the last choice and is not generally recommended, since it requires setting a hidden parameter. It requires both the compute nodes and the Exadata storage cells to be rebooted at the same time to get rid of existing RDS QOS connections, otherwise they will persist. 20 minutes must be left between IB switch reboots.

1) Disable NRM / QOS on all 12c RAC nodes as follows:
- As root on one RAC node, stop the cluster: # crsctl stop cluster -all
- As root on all RAC nodes, disable restart and stop OHAS: # crsctl disable crs
  * All CRS on all nodes that talk to the cells should be stopped.
- As root on all RAC nodes, edit /etc/oracle/*/*/cellinit.ora, adding: _skgxp_ctx_flags1=8388608
- As root on all RAC nodes, shut down Solaris: # init 0

2) Reboot all cells to purge previously established NRM / QOS connections.
  * Take proper precautions to ensure any other RAC clusters sharing the cells are also stopped.
- As root on all cells, reboot Linux: # reboot
- As root, verify the celld service after the reboot: # service celld status

3) Boot the RAC nodes and then restart CRS as follows:
- On all RAC nodes, boot Solaris: > boot
  * For a zone RAC node, use 'zoneadm -z <zonename> boot' from the global zone.
- As root on all RAC nodes, start the cluster: # crsctl start crs (this will take some time)
- As root on all RAC nodes, enable restart: # crsctl enable crs
- As root, verify all resources are up: # crsctl stat res -t

4) Verify RDS connections:
- As root on all RAC nodes, verify that only Tos/SL values 0 and 4 are in use in the "RDS IB Connections" section of the output: # rds-info -n
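To make that last check easier, the sketch below (not part of the original note) assumes the "RDS IB Connections" section of the rds-info -n output includes a "Tos" column, and simply summarises which Tos values are currently in use; after NRM / QOS has been disabled and the nodes and cells rebooted, only values 0 and 4 should appear:

#!/bin/sh
# Sketch only: summarise the Tos values reported by rds-info -n, assuming
# the output contains a header line with a "Tos" column.
rds-info -n | nawk '
    /Tos/ { for (i = 1; i <= NF; i++) if ($i == "Tos") col = i; next }
    col > 0 && NF >= col && $col ~ /^[0-9]+$/ { seen[$col]++ }
    END { for (v in seen) printf "Tos %s : %d connections\n", v, seen[v] }'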
References
<NOTE:1452277.1> - SuperCluster Critical Issues

Attachments
This solution has no attachment