Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-2140928.1
Update Date:2018-04-08

Solution Type  Predictive Self-Healing Sure

Solution  2140928.1 :   How to Prepare an Infiniband (IB) Fabric for Planned Outage of an IB Switch  


Related Items
  • Sun Datacenter InfiniBand Switch 36
  • Oracle SuperCluster Specific Software
  • Sun Network QDR InfiniBand Gateway Switch
  • Exadata Database Machine V2
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband




In this Document
Purpose
Scope
Details
 1. Checks for IB fabric with multiple IB Switches
  1.1. Confirm Hosts bonding/IPMP/IO-path redundancy
  1.2. For a CRS Cluster, confirm fix is in place for node reboot on IB Switch reboot issue
  1.3. Check firmware version on all the IB switches within the rack
  1.4. Check the opensm status and smpriorities on all switches
  1.5. Check IB Fabric using “ibswitches” and “getmaster”
  1.6.  Check that all IB Switches can ping each other through management interfaces
  1.7. Check IB partitions and secret M-Key policy
 2.  Confirm type/extent of downtime required
 3. Complete the check-list template – IB Fabric preparation for IB Switch planned outage.
 4. Data Collection and Upload
 5. Proceed to next steps
 Notes / Addendum
References


Applies to:

Sun Datacenter InfiniBand Switch 36 - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster Specific Software
Sun Network QDR InfiniBand Gateway Switch - Version All Versions to All Versions [Release All Releases]
Exadata Database Machine V2 - Version All Versions and later
Information in this document applies to any platform.

Purpose

This document describes how to prepare an Infiniband (IB) Fabric for any planned outage of an Infiniband Switch within that IB Fabric. It also contains a checklist to help the Customer-admin determine whether a full Fabric outage will be required, based on the results of the checks performed.

Scope

Note: For IB switches within an Exalogic system or a multirack containing Exalogic, use Doc ID 2211261.1 instead of this document.

A planned outage can be a reboot (or a boot after a previous shutdown), patching (firmware upgrade), or replacement of an IB Switch in the IB Fabric.

The checks and actions in this document are critical to ensuring that production traffic in the Infiniband (IB) Fabric remains resilient to the restart of the IB Switch that any of the above operations requires.

Based on the results of these checks, a checklist provides guidance on whether a full downtime of the IB Fabric will be required (full outage of all switches and nodes actively participating in the fabric). Customers should take an IB Switch outage within a production IB Fabric only when all checks are cleared in the affirmative.

This document is referenced by several other Oracle Support knowledge articles, including:

   - How to Prepare an Infiniband Switch for Replacement (Doc ID 1636229.1)

The document distribution is EXTERNAL since it needs to be shared with and used by the Customer-admin, as well as referenced by Partners, Field Engineers, and Oracle Support.

 

Details

 

1. Checks for IB fabric with multiple IB Switches

 1.1. Confirm Hosts bonding/IPMP/IO-path redundancy


Ensure that the configurations of the interfaces in all the Hosts connected to the IB Fabric have appropriate multipathing/redundancy/fail-over, such that replacement/reboot/patching of this switch will not result in an outage of your network.

 1.2. For a CRS Cluster, confirm fix is in place for node reboot on IB Switch reboot issue


If a CRS Cluster is running on the Host-nodes in this IB Fabric (for example, on Exadata or SuperCluster), check the fix for the following bug and make sure that all Cluster nodes have the required patches:
      Bug 18199185 - see MOS Notes 18199185.8, 1645783.1 (Exadata-related), and 1567979.1 (Oracle SuperCluster Supported Software Versions - All Hardware Types).
Contact Oracle Support to confirm whether your systems/Hosts/nodes are susceptible to this bug. If they are, have them patched before proceeding with the Suggested Action.

 

 1.3. Check firmware version on all the IB switches within the rack

All switches must be running the same firmware version. The output of the following command gives the firmware version number:

        #version
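
The comparison across switches can be sketched as follows. This is only an illustration: it assumes the admin has already noted one line per switch in the form "<switch-name> <version-string>" (a hypothetical layout, with no spaces inside the version string), and the helper name check_uniform is not part of the switch software.

```shell
# Hypothetical helper: reads "name value" lines on stdin and succeeds
# only if every line carries the same value in the second field.
check_uniform() {
    awk '
        { seen[$2]++; names[$2] = names[$2] " " $1 }
        END {
            n = 0
            for (v in seen) n++
            if (n == 1) {
                print "OK: all switches report the same value"
                exit 0
            }
            # Zero or several distinct values: list them and fail.
            for (v in seen) print "MISMATCH: " v " ->" names[v]
            exit 1
        }'
}
```

For example, piping a file of such lines into check_uniform prints either a single OK line or one MISMATCH line per distinct version, naming the switches that report it.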

 1.4. Check the opensm status and smpriorities on all switches

Check the following outputs on all IB switches to determine which switches are running opensm and what their priorities are.
        #service opensmd status   <<<< This will tell you whether or not opensm is running on this switch
        #setsmpriority list             <<<< This will tell you the current priority.  This will also tell you whether or not Controlledhandover is set TRUE.


Make sure that the smpriorities and controlledhandover of the switches running opensm are as per the standard configuration of your engineered system, and that opensm is running on the switches as per the standard configurations:


    - If this is a rack or multirack consisting of Exadata and/or SuperCluster only, refer to "Understanding the Network Subnet Manager Master" in Oracle Exadata Database Machine Owner's Guide.

   - If this is a custom multi-IB-switch configuration, check your ISV's install documentation.

 

 1.5. Check IB Fabric using “ibswitches” and “getmaster”


First, list all Infiniband switches in the network by running the following command on any one of the IB leaf switches (leaf switch being any switch that has Hosts connected):

#ibswitches

Ensure that all IB switches are seen in its output. If any of the expected IB switches are missing in the output, IB cable connectivity to each missing switch needs to be checked and fixed.

Secondly, run the following command on all the IB switches in the network and make sure that all of them report the same master subnet manager in the IB network and that the master is not moving around from switch to switch:

#getmaster

Note: If this is a multirack system consisting of several racks, make sure that the above command is run on all IB switches in all the racks. Any anomaly here could be the result of problems in Infiniband cabling, for example one Switch being isolated incorrectly.
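
For the "ibswitches" check in this step, the following is a minimal sketch of comparing the listing against the switches you expect to see. It assumes only that each switch's name appears somewhere in the "ibswitches" output; the helper name missing_switches and the switch names are hypothetical.

```shell
# Usage: ibswitches | missing_switches name1 name2 ...
# Prints each expected name that does not appear in the listing on stdin.
# Matching is by plain substring, so "sw-ib1" would also match "sw-ib10".
missing_switches() {
    listing=$(cat)
    status=0
    for name in "$@"; do
        if ! printf '%s\n' "$listing" | grep -qF -- "$name"; then
            echo "MISSING: $name"
            status=1
        fi
    done
    return $status
}
```

A non-zero return status means at least one expected switch was not found, in which case IB cabling to that switch should be checked as described above.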

 

 1.6.  Check that all IB Switches can ping each other through management interfaces

    Make sure that you can (Ethernet) ping every IB switch from every other IB switch through its management interface.  If any Switch is not reachable from any other Switch over their respective management Ethernet interfaces, then you need to get that fixed first.  Ensure that there are no firewalls between management networks of individual racks within a multirack system.
Discuss with SR owner (Oracle Support Engineer) if help is needed on this point.
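
Because every switch must reach every other switch, the number of ping checks grows quickly in a multirack system. The sketch below only enumerates the (source, destination) pairs to test; the helper name mesh_pairs and the switch names are hypothetical, and the commented loop assumes root ssh access to each switch and a Linux-style ping.

```shell
# Print every ordered (source, destination) pair of distinct switches.
mesh_pairs() {
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] || echo "$src $dst"
        done
    done
}

# Possible use (assumption: root ssh access and "ping -c" on the switch):
#   mesh_pairs sw-a sw-b sw-c | while read -r src dst; do
#       ssh -n "root@$src" ping -c 2 "$dst" >/dev/null || echo "FAIL: $src -> $dst"
#   done
```

For N switches this yields N x (N - 1) pairs, so three switches require six ping checks.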

 

 1.7. Check IB partitions and secret M-Key policy

 

     If IB partitions are configured in this IB fabric, or a secret M-Key policy is in use, perform the following steps.


Note: If the switch being replaced is running and is the current Master Subnet Manager, but is not accessible through its management port, confirm from your data-centre change-logs that IB partitions were propagated to every other IB switch running opensm each time partitions were created or modified using smpartition commands. If propagation did not follow every previous smpartition change, there can be issues when this switch is shut down and the Master fails over. If you cannot confirm that propagation occurred after all previous partition changes, it is recommended to replace this switch during a downtime of the whole IB Fabric, in other words a full outage of all switches and nodes actively participating in the fabric.

     a) Run the following command on all switches running opensm

         #smnodes list

          Make sure that this is identical on all switches and that the output has the management IP addresses of all the switches running opensm.

 

    b) If there are IB Gateway switches in this IB fabric, check and make sure that the port GUIDs of all the IB Gateway switches are in all IB partitions.

        The following command run on an IB Gateway switch will show GUIDs of the four bridges of this switch

               #showgwports

        Run the above command on all IB Gateway switches in the IB fabric.

        The following command run on the switch running as the Master will show all the IB partitions (first identify the Master either by # getmaster or # sminfo command)

               #smpartition list active

        Check if all four GUIDs of all IB Gateway switches are in all IB partitions.  If not, add the missing GUIDs as follows:

               Run the following on the switch running as Master

                     #smpartition start

                     # smpartition add -pkey <PKey> -port <port GUID> <port GUID> <port GUID> <port GUID> -m full
                                (Repeat this command for all pkeys seen in the "smpartition list active" output)

                     #smpartition commit

              Note:  You can skip the next step (c) if step (b) is completed.

    c) Propagate IB partitions to all IB switches running opensm by running the following command on the IB switch running as Master (first identify the Master either by # getmaster or # sminfo command)

        #smpartition start
        #smpartition commit

        The above two commands will make sure that all IB partitions are propagated to all IB switches running opensm.


    d) Check if secret M-key policy is in use with the following command on this Master switch:

           #smsubnetprotection list active

               If the output shows secret M-keys,  run the following commands on this Master switch:

                    #smsubnetprotection start

                    #smsubnetprotection commit

                   This will make sure that secret M-Keys policy is propagated to other switches.  Normally, this is done at the time of creating secret M-keys.  This step here is to make sure that M-key policy is propagated.
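
For step (b) above, repeating the "smpartition add" command for every pkey is error-prone when typed by hand. The sketch below only prints the command sequence from step (b) for review; it does not run anything on the switch. The helper name gen_add_cmds is hypothetical, and the pkeys and port GUIDs must be taken from your own "smpartition list active" and "showgwports" output.

```shell
# Usage: printf '%s\n' PKEY... | gen_add_cmds GUID1 GUID2 GUID3 GUID4
# Prints (does not execute) the step (b) commands, one add per pkey.
gen_add_cmds() {
    echo "smpartition start"
    while read -r pkey; do
        # Skip blank lines in the pkey list.
        [ -n "$pkey" ] || continue
        echo "smpartition add -pkey $pkey -port $* -m full"
    done
    echo "smpartition commit"
}
```

Review the printed commands before running them on the switch that is the current Master.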

 

2.  Confirm type/extent of downtime required

      If this is a standalone switch (only Switch in this IB Fabric), then you will need a downtime of the whole IB Fabric to replace, reboot or patch the standalone switch.

      If there are multiple IB Switches in this IB Fabric and the requirements in any part of step 1 are not met, then you will need a downtime of the whole IB Fabric in order to reboot, patch or replace this switch.
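
The two rules above can be expressed as a small decision helper. This is a sketch only: the function name outage_scope and the "yes"/"no"/"n/a" answer encoding are assumptions made for illustration, with one answer per step 1 check.

```shell
# Usage: outage_scope SWITCH_COUNT ANSWER...
# Each ANSWER is "yes", "no" or "n/a" for one step 1 check.
outage_scope() {
    count=$1; shift
    if [ "$count" -le 1 ]; then
        echo "full fabric downtime (standalone switch)"
        return
    fi
    for answer in "$@"; do
        if [ "$answer" = "no" ]; then
            echo "full fabric downtime (a step 1 check is not met)"
            return
        fi
    done
    echo "switch outage without full fabric downtime"
}
```

For example, "outage_scope 3 yes yes no" reports that a full fabric downtime is needed because one check failed.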

 

3. Complete the check-list template – IB Fabric preparation for IB Switch planned outage.


Fabric Checklist item:   ______Answer - yes/no_______

Single-Switch IB Fabric: Full IB Fabric downtime has been planned?    ___yes / no_____

Multiple-Switch IB Fabric: Hosts bonding/IPMP/IO-path redundancy confirmed?   ___yes / no_____

CRS Cluster - e.g. Exadata or SuperCluster – fix for 18199185 in place?  ___yes / no_____

All IB switches within the rack are running the same firmware version?   ___yes / no_____

OpenSM is running on the switches as per the standard configurations?  ___yes / no_____

Smpriorities and controlledhandover of the switches running opensm are as per the standard configuration?  ___yes / no_____

All expected IB switches are seen in “ibswitches”?   ___yes / no_____

“getmaster” output consistent when run on each IB Switch in the Fabric?   ___yes / no_____

All IB Switches can Ethernet-ping all other IB Switches on Mgmt Interfaces? No firewalls between management networks of individual racks within a Multi-rack system?   ___yes / no_____

“smnodes list” output identical on all IB Switches running OpenSM?    ___yes / no_____

“smpartition start && smpartition commit” has been run from the current running Master?   ___yes / no_____

Secret M-key policy, if used, has been checked and re-propagated?    ___yes / no  / N/A_____


If “No” to any of the questions above, then a full IB Fabric downtime (full outage of all switches and nodes actively participating in the IB fabric) has been planned?   ___yes / no /  N/A ______

 

4. Data Collection and Upload

     Upon the completion of all the steps above, collect the following set of data.  This set of data will become useful for investigating root cause of any problem that may occur as a result of any planned outage.

             a). Collect the following data from all IB switches in this IB fabric (if multirack, all switches in the entire multirack)

                     #version
                     #listlinkup
                     #service opensmd status
                     #setsmpriority list
                     #smnodes list
                     #ifconfig eth0
                     #md5sum /conf/partitions.current

                     #spsh
                        -> ls /SP/network
                     #exit

             b). Copy the following file from all IB switches running opensm

                     /conf/partitions.current

             c). Collect the following data from the switch currently running the Master Subnet Manager

                     #smpartition list active

             d). Collect the following data from any one of the IB leaf switches

                     #ibnetdiscover
                     #sminfo
                     #ibdiagnet -skip dup_guids -pm -P all=1

                          After running this command, collect all the /tmp/ibdiagnet* files that it creates
                                  Example:
                                  cd /tmp
                                  tar cvf ibdiagnet.tar ibdiagnet*

             e). If there are IB-Gateway switches in this IB fabric, collect the following data from all IB-Gateway switches.

                     #showgwports
                     #showvlan
                     #showvnics
                     #showioadapters
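
The per-switch collection in step (a) can be scripted from an admin host. The sketch below is one possible approach under stated assumptions: root ssh access to each switch, hypothetical switch names, and a hypothetical helper collect_switch_data. The interactive spsh session and the file copy in step (b) are left out and still need to be done separately.

```shell
# RUNNER is the command used to reach a switch; ssh is an assumption.
RUNNER=${RUNNER:-ssh}

# Non-interactive commands from step (a).
SWITCH_CMDS='version
listlinkup
service opensmd status
setsmpriority list
smnodes list
ifconfig eth0
md5sum /conf/partitions.current'

# Usage: collect_switch_data OUTDIR switch1 switch2 ...
# Writes one file per switch, with a "### <command>" header per section.
collect_switch_data() {
    outdir=$1; shift
    mkdir -p "$outdir" || return 1
    for sw in "$@"; do
        printf '%s\n' "$SWITCH_CMDS" | while IFS= read -r cmd; do
            echo "### $cmd"
            # </dev/null keeps the runner from consuming the command list.
            "$RUNNER" "root@$sw" "$cmd" </dev/null 2>&1
        done > "$outdir/$sw.txt"
    done
}
```

Running "collect_switch_data ./ibdata sw-a sw-b" would leave ./ibdata/sw-a.txt and ./ibdata/sw-b.txt ready for upload.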

 

5. Proceed to next steps


- If you are replacing this Switch, please return to the article: How to Prepare an Infiniband Switch for Replacement (Doc ID 1636229.1). Be prepared to provide the full check-list with answers to each question to Oracle Support.

- If you are only rebooting or patching this IB Switch, then you are ready to go ahead, with the appropriate downtime as confirmed in the steps above. Please refer to the appropriate sections of the Product Guides, for reboot (restart) or firmware upgrade.

 

Notes / Addendum


1. Note for Engineered Systems, Multiracking: When adding an additional rack to an Engineered System as per the Multirack cabling guide, a full shutdown of the Exalogic and/or Exadata is required (Reference: Exalogic Elastic Cloud Multirack Cabling Guide, section Multirack Cabling Tasks, 4.1 Shutting Down Affected Exalogic Machines and Exadata Database Machines). This is, in effect, a full shutdown of the IB Fabric. It follows that adding a new rack to an Engineered System should not be attempted with any part of the IB Fabric in production.

2. Note for Patching in multiple-switch environment: It is recommended that, before patching starts, there is a master switch and at least one standby switch. Patch the standby switches first and the master last. This reduces the number of SM failovers: there will be only one failover, from the master switch to one of the standby switches.
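
The ordering in note 2 can be sketched as follows. The helper name patch_order and the switch names are hypothetical; the current master must first be identified with "getmaster" as described in step 1.5.

```shell
# Usage: patch_order MASTER switch1 switch2 ...
# Prints the switches in patching order: standbys first, master last.
patch_order() {
    master=$1; shift
    found=0
    for sw in "$@"; do
        if [ "$sw" = "$master" ]; then
            found=1
        else
            echo "$sw"
        fi
    done
    if [ "$found" -eq 1 ]; then
        echo "$master"
    else
        echo "master $master is not in the switch list" >&2
        return 1
    fi
}
```

For example, "patch_order sw-b sw-a sw-b sw-c" lists sw-a and sw-c first and sw-b last.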

 

 

References

<NOTE:1383773.1> - How to Replace a Failed Sun Network QDR InfiniBand Gateway Switch
<NOTE:2125242.1> - Infiniband Switch Replacement – Overview and guide to key articles
<NOTE:2125203.1> - Infiniband Switch Replacement - Follow-up Actions
<NOTE:1636229.1> - How to Prepare an Infiniband Switch for Replacement
<NOTE:1341658.1> - How to Replace a Failed Sun Datacenter InfiniBand Switch 36

  Copyright © 2018 Oracle, Inc.  All rights reserved.