Asset ID: 1-72-1520330.1
Update Date: 2016-04-07
Solution Type
Problem Resolution Sure
Solution 1520330.1: "smpartition list active" Shows Inconsistent Partition Information Between Exalogic Switches
Related Items
- Exalogic Elastic Cloud X4-2 Eighth Rack
- Oracle Exalogic Elastic Cloud Software
Related Categories
- PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Exalogic>MW: Exalogic Core
Created from <SR 3-6507832511>
Applies to:
Oracle Exalogic Elastic Cloud Software - Version 2.0.0.0.0 and later
Exalogic Elastic Cloud X4-2 Eighth Rack - Version X4 to X4 [Release X4]
Linux x86-64
Oracle Solaris on x86-64 (64-bit)
Symptoms
You have an Exalogic system in which the information about nodes added to Subnet Manager partitions appears to be inconsistent between some switches. This information is expected to be identical across all the switches in the Exalogic rack (all NM2-GW gateway switches and any NM2-36P spine switch present in the rack configuration), but it has not been propagated to every switch. As a result, changes to partitions are missing from some switches, which could give rise to unexpected behavior if a switch whose partition list is incomplete or out of sync assumes the role of the MASTER Subnet Manager (as identified by the "getmaster" command available on the switches).
To determine whether or not the partition information is consistent across all switches, run the "smpartition list active" command on each switch and compare the output. Because the output is long and cannot easily be redirected to a file (it is paged in a "more"-like fashion and requires the space or return key to be pressed), the following script can be created as checkPartitionConsistency.sh and run instead:
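#!/bin/sh
# Print the md5sum of /conf/partitions.current for each switch.
PROGNAME="${0##*/}"
# Replace these host names with those of the switches in your rack.
SWITCHES="elorl01gw01-adm"
SWITCHES="${SWITCHES} elorl01gw02-adm"
SWITCHES="${SWITCHES} elorl01sw-ib-ilom"
TMPFILE=`mktemp /tmp/${PROGNAME}.XXXXXXXXXX` || exit 1
MD5SUM='md5sum /conf/partitions.current'
for SWITCH in ${SWITCHES}
do
    CMD="ssh root@${SWITCH} ${MD5SUM}"
    # Keep only the md5sum line, then strip the file name from it.
    RESULT=`${CMD} 2> /dev/null | grep 'partitions.current$'`
    RESULT="${RESULT%% *}"
    echo "${RESULT} (${SWITCH})" >> "${TMPFILE}"
done
# Use the portable "[ ]" test: "[[ ]]" is not available in every /bin/sh.
if [ -s "${TMPFILE}" ]
then
    cat "${TMPFILE}"
fi
rm -f "${TMPFILE}"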
NOTE:
- The above script can be run from any switch or compute node that can access the IP address of each of the switches in the rack.
- Unless ssh user equivalence has been configured, you will be prompted for the root password of each switch when the script runs.
On completion, the script prints an md5sum value for each switch, computed from the content of that switch's active partition list. The value is expected to be the same across all switches:
# ./checkPartitionConsistency.sh
root@el_gw01-adm's password: ******
root@el_gw02-adm's password: ******
root@el_sp-adm's password: ******
0099be63b4c14f6b3eda02d288ae2db3 (el_gw01-adm)
0099be63b4c14f6b3eda02d288ae2db3 (el_gw02-adm)
0099be63b4c14f6b3eda02d288ae2db3 (el_sp-adm)
If the partition information is already found to be consistent across all switches, the remainder of this note does not apply and further analysis of the problem will be necessary.
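The comparison of the md5sum values can itself be scripted. The following is a minimal sketch and not part of the original procedure. It assumes the output of checkPartitionConsistency.sh has been saved to a file in the "<md5sum> (<switch>)" format shown above. The function name partitions_consistent and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of the form "<md5sum> (<switch>)"
# on stdin. Returns 0 if every line carries the same md5sum, and 1 if
# the values differ or no lines were read.
partitions_consistent() {
    awk '
        NF == 0 { next }               # skip blank lines
        { count[$1]++; lines++ }       # first field is the md5sum
        END {
            distinct = 0
            for (h in count) distinct++
            if (lines == 0 || distinct != 1) exit 1
        }
    '
}

# Usage (file name is an assumption):
#   ./checkPartitionConsistency.sh > /tmp/partition_sums.txt
#   if partitions_consistent < /tmp/partition_sums.txt
#   then echo "partition lists match"
#   else echo "partition lists DIFFER"
#   fi
```

A switch that could not be reached produces a line with an empty md5sum, so its first field differs from the others and the check reports an inconsistency rather than silently passing.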
Cause
If the md5sum value returned for any switch is inconsistent with that of the other switches, this suggests:
- There is an inconsistency in the IP address list reflected in the "smnodes list" output for one or more of the switches
  - this can occur if the same set of "smnodes add" commands has not yet been run on every one of the switches
- The difference is the result of an unexpected error encountered when an "smnodes add" command was run on one of the switches, such that the switch executing the command was unable to validate that a Subnet Manager could be reached at the IP address provided.
  This can occur when:
  - the Subnet Manager (opensm) and Infiniband Partition Daemon (partitiond) services are not running on the switch being added
  - the switch to be added does not have a valid configuration, for example /conf/configvalid does not contain the value 1
  - the switch to be added is part of an extended Infiniband Fabric that includes multiple Engineered Systems (e.g. Exalogic/Exadata/SuperCluster) and the IP address provided could not be reached from the network environment of the switch executing the "smnodes add" command
    e.g. the IP address belongs to a different subnet that may not be connected to, or routable from, the switch on which the "smnodes add" command is being run.
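The service and configuration conditions listed above can be checked for each switch from a single host. The following is a minimal sketch under stated assumptions: the status text matches the "is running" output shown elsewhere in this note, and the function names remote and check_switch are hypothetical. Without ssh user equivalence, each check prompts for the root password.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command on a switch as root. Kept as a
# separate function so it can be replaced for a dry run without ssh.
remote() {
    switch="$1"; shift
    ssh "root@${switch}" "$@" 2> /dev/null
}

# Hypothetical helper: print one line per check for the given switch.
# Assumes a running service reports "is running" in its status output.
check_switch() {
    sw="$1"
    if remote "${sw}" service opensmd status | grep -q 'is running'; then
        echo "${sw}: opensm running"
    else
        echo "${sw}: opensm NOT running"
    fi
    if remote "${sw}" service partconfigd status | grep -q 'is running'; then
        echo "${sw}: partitiond running"
    else
        echo "${sw}: partitiond NOT running"
    fi
    valid=`remote "${sw}" cat /conf/configvalid`
    if [ "${valid}" = "1" ]; then
        echo "${sw}: configvalid is 1"
    else
        echo "${sw}: configvalid is NOT 1"
    fi
}

# Usage (host names are placeholders):
#   for SWITCH in elorl01gw01-adm elorl01gw02-adm elorl01sw-ib-ilom
#   do
#       check_switch "${SWITCH}"
#   done
```

The sketch only reports state and changes nothing on the switches. It also does not cover the third condition, which is reachability between switches on different subnets.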
Solution
For every switch of each type present in your IB fabric, whether a gateway (NM2-GW) or spine (NM2-36P) switch:
- Verify whether or not the Subnet Manager is running:
# service opensmd status
opensm (pid 32296) is running...
# service partconfigd status
partitiond-daemon is running
- For any switch where the Subnet Manager is found not to be running, utilize the following steps to start it:
- Check that /conf/configvalid contains the value 1 across all switches
- Start/re-start the Subnet Manager (opensm) and Infiniband Partition Daemon (partitiond) services, via the "enablesm" command:
# enablesm
Starting IB Subnet Manager. [ OK ]
Starting partitiond daemon. [ OK ]
- Confirm that the output from "smnodes list" on every switch shows the IP address of each switch in the system:
# smnodes list
10.141.135.37
10.141.135.38
10.141.135.40
- In the case that "smnodes list" on a switch does not show the IP address of each switch in your Exalogic rack, please run additional "smnodes add" commands to add the IP address of each of the missing switches to the list originally returned by that switch:
# smnodes add 10.141.135.39
- In the event that you receive an error, such as:
# smnodes add 10.141.135.39
Could not communicate with 10.141.135.39. Node not added
Then:
- Ensure the IP address is reachable from the current switch (# ping <Switch IP>)
- Connect to the targeted switch via ssh and:
- ensure the Subnet Manager is running (enablesm)
- confirm that /conf/configvalid is 1 (echo "1" > /conf/configvalid)
Note that problems beyond those covered above may prevent the "smnodes add" command from completing as expected for a given switch. The resolution of such problems varies and falls outside the scope of this note. However, to make the "smnodes list" output consistent between switches as required, these problems must be investigated and resolved until the same set of "smnodes add" commands has been run successfully on each of the switches expected to appear in the "smnodes list" output.
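Working out which "smnodes add" commands a given switch still needs can be sketched as follows. This is an illustration, not part of the original procedure: the function name missing_smnodes is hypothetical, and it assumes "smnodes list" prints one IP address per line as in the example above. It only prints the commands and does not run them.

```shell
#!/bin/sh
# Hypothetical helper: reads "smnodes list" output on stdin and takes
# the expected IP addresses as arguments. Prints an "smnodes add"
# command for every expected address that is absent from the list.
missing_smnodes() {
    # Strip spaces, tabs and carriage returns so whole-line matching works.
    listed=`tr -d ' \t\r'`
    for ip in "$@"
    do
        # -F: fixed string, -x: match the whole line, -q: no output
        if ! printf '%s\n' "${listed}" | grep -Fqx "${ip}"; then
            echo "smnodes add ${ip}"
        fi
    done
}

# Usage on a switch, with the addresses from the example above:
#   smnodes list | missing_smnodes 10.141.135.37 10.141.135.38 \
#       10.141.135.39 10.141.135.40
```

With the "smnodes list" output shown earlier (addresses ending in .37, .38 and .40), the usage example prints "smnodes add 10.141.135.39". Whole-line matching is used so that an address such as 10.141.135.3 is not mistaken for 10.141.135.37.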
NOTE:
- If your IB fabric contains an interconnected Engineered System, such as Exadata or SuperCluster, please review the following documentation to understand which switches in the Infiniband Fabric should be running the Subnet Manager and which should not:
Oracle Exalogic Elastic Cloud Machine Owner's Guide, Release EL X2-2 and EL X3-2, E18478-14
http://docs.oracle.com/cd/E18476_01/doc.220/e18478.pdf
Chapter 12: Using the InfiniBand Gateway Switches and Managing the InfiniBand Network Using Subnet Manager
12.3 Managing InfiniBand Network Using Subnet Manager
"12.3.2 Running the Subnet Manager in Different Rack Configurations", page 12-9
http://docs.oracle.com/cd/E18476_01/doc.220/e18478/leafswitch.htm#CBHFCCBA
- In essence, only those switches running the highest version of the switch firmware in the extended Infiniband Fabric are expected to have the Subnet Manager actively running, and only the IP addresses of these switches are expected to be found in the "smnodes list" output across the rack. In addition, it is beneficial to assign a lower priority to the Subnet Manager on switches running a lower version of the switch firmware, so that they will not readily assume the role of MASTER Subnet Manager should the Subnet Manager be accidentally started on them.
Attachments
This solution has no attachment