Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1682501.1
Update Date: 2018-03-22
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1682501.1 :   Setting up the Subnet Manager in a multi-rack cabling configuration containing Exalogic/Big Data Appliance and Exadata/SuperCluster  


Related Items
  • Oracle Exadata Hardware
  • Exalogic Elastic Cloud X4-2 Hardware
  • Oracle Exalogic Elastic Cloud Software
  • Oracle Exalogic Infrastructure
  • Exalogic Elastic Cloud X4-2 Eighth Rack
  • Big Data Appliance Hardware
  • Oracle SuperCluster T5-8 Hardware
  • Exadata X4-2 Quarter Rack
  • Exalogic Elastic Cloud X3-2 Hardware
  • Oracle SuperCluster Specific Software

Related Categories
  • PLA-Support>Sun Systems>x86>Engineered Systems HW>SN-x64: EXALOGIC

In this Document
Purpose
Scope
Details
References


Applies to:

Exalogic Elastic Cloud X3-2 Hardware - Version X3 to X3 [Release X3]
Oracle Exadata Hardware
Oracle Exalogic Elastic Cloud Software - Version 2.0.6.2.4 to 2.0.6.2.4
Exadata X4-2 Quarter Rack
Oracle Exalogic Infrastructure
Information in this document applies to any platform.

Purpose

This document aligns the Subnet Manager configuration for multi-rack Engineered Systems that contain Exalogic (EL)/Big Data Appliance (BDA) and Exadata (ED)/SuperCluster (SC). The new requirements allow the Subnet Manager to fail over and automatically recover in the case of an EL/BDA rack-level failure.

Scope

This document covers all multi-rack configurations that contain both Exalogic (EL)/Big Data Appliance (BDA) and Exadata (ED)/SuperCluster (SC). New systems should be configured per these requirements. Existing configurations should be changed to meet these requirements.

Details

The basic guidance is to run the Master Subnet Manager on the Sun Network QDR InfiniBand Gateway Switches (aka NM2-GW switches) in multi-rack configurations containing both Exalogic (EL)/Big Data Appliance (BDA) and Exadata (ED)/SuperCluster (SC). Specifically, if an Exalogic rack in a virtual configuration is included in the multi-rack cabling, the Master Subnet Manager must run on one of the Exalogic NM2-GW switches.
To cover a rack-level failure that renders the NM2-GW switches unreachable, a pair of Sun Datacenter InfiniBand Switch 36 switches (aka NM2-36P switches) must be set up on an ED/SC rack so that the Subnet Manager can fail over to them. This allows the other members of the fabric to continue functioning until the EL/BDA rack can be brought back into service. The Master Subnet Manager will then automatically recover back to one of the NM2-GW switches located on the EL/BDA rack(s).
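The failover and recovery behavior described above can be illustrated with a small model. This is a minimal sketch, not the switch firmware's logic: it assumes the standard InfiniBand election rule (highest priority wins, lowest GUID breaks ties), it does not model controlled handover, and the switch names and GUIDs are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SubnetManager:
    name: str               # hypothetical switch label
    guid: int               # hypothetical GUID, used only to break ties
    priority: int           # value set with setsmpriority
    enabled: bool = True    # state set with enablesm / disablesm
    reachable: bool = True  # False models a rack-level failure


def elect_master(sms):
    """Return the enabled, reachable SM with the highest priority.

    Ties go to the lowest GUID. Returns None if no SM is eligible.
    """
    candidates = [sm for sm in sms if sm.enabled and sm.reachable]
    if not candidates:
        return None
    return min(candidates, key=lambda sm: (-sm.priority, sm.guid))


def with_rack_down(sms, prefix):
    """Return a copy of the fabric with every switch named prefix* unreachable."""
    return [
        replace(sm, reachable=False) if sm.name.startswith(prefix) else sm
        for sm in sms
    ]


# One EL rack (gateway switches at priority 5) and one ED rack (leaves at 2).
fabric = [
    SubnetManager("el-gw1", guid=0x10, priority=5),
    SubnetManager("el-gw2", guid=0x11, priority=5),
    SubnetManager("ed-leaf-a", guid=0x20, priority=2),
    SubnetManager("ed-leaf-b", guid=0x21, priority=2),
    SubnetManager("ed-spine", guid=0x30, priority=1, enabled=False),
]

print(elect_master(fabric).name)                         # normal operation
print(elect_master(with_rack_down(fabric, "el-")).name)  # EL rack failed
print(elect_master(fabric).name)                         # EL rack restored
```

Because the ED/SC leaf switches carry a lower priority than the gateway switches, they only become master while no gateway switch is reachable, and the master role returns to a gateway switch once the EL rack is back.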

If the multi-rack cabling does not include an Exalogic rack in a virtual configuration, then additional Subnet Manager configurations might be possible. For specific use cases, please check with the Oracle InfiniBand Support team.

Below is a chart showing these recommendations:

Each entry below lists the rack configuration, where the SM should run, the SM priority, and the controlled handover setting.

Rack Configuration: Two or more EL racks (all EL physical)
EL & … & EL
SM Should Run On:
- All leaf switches on two different EL racks
- Spines: disabled if any
SM Priority:
- EL leaf switches: 5
- Spines: 1
Controlled Handover:
- EL leaf switches: TRUE
- Spines: FALSE

Rack Configuration: Two or more BDA racks
BDA & … & BDA
SM Should Run On:
- All leaf switches on two different BDA racks
- Spines: disabled if any
SM Priority:
- BDA leaf switches: 5
- Spines: 1
Controlled Handover:
- BDA leaf switches: TRUE
- Spines: FALSE

Rack Configuration: One or more EL racks connected to one or more BDA racks
EL & BDA
EL & … & EL & BDA
EL & BDA & … & BDA
EL & … & EL & BDA & … & BDA
SM Should Run On:
- All leaf switches on one EL rack
- All leaf switches on one other EL/BDA rack
- Spines: disabled if any
SM Priority:
- EL leaf switches: 5
- BDA leaf switches: 4
- Spines: 1
Controlled Handover:
- EL leaf switches: TRUE
- BDA leaf switches: FALSE
- Spines: FALSE

Rack Configuration: Single EL/BDA rack connected to one or more ED/SC racks
EL & ED
EL & ED & … & ED
EL & SC
EL & SC & … & SC
BDA & ED
BDA & ED & … & ED
BDA & SC
BDA & SC & … & SC
EL & ED & SC
EL & ED & … & ED & SC
EL & ED & SC & … & SC
EL & ED & … & ED & SC & … & SC
BDA & ED & SC
BDA & ED & … & ED & SC
BDA & ED & SC & … & SC
BDA & ED & … & ED & SC & … & SC
SM Should Run On:
- All EL/BDA leaf switches (on the one rack)
- All leaf switches on one ED/SC rack
- Spines: disabled if any
SM Priority:
- EL/BDA leaf switches: 5
- ED/SC leaf switches: 2
- Spines: 1
Controlled Handover:
- EL/BDA leaf switches: TRUE
- ED/SC leaf switches: FALSE
- Spines: FALSE

Rack Configuration: Two or more EL racks connected to one or more ED/SC racks
EL & … & EL & ED
EL & … & EL & ED & … & ED
EL & … & EL & SC
EL & … & EL & SC & … & SC
EL & … & EL & ED & SC
EL & … & EL & ED & … & ED & SC
EL & … & EL & ED & SC & … & SC
EL & … & EL & ED & … & ED & SC & … & SC
SM Should Run On:
- All leaf switches on two different EL racks
- All leaf switches on one ED/SC rack
- Spines: disabled if any
SM Priority:
- EL leaf switches: 5
- ED/SC leaf switches: 2
- Spines: 1
Controlled Handover:
- EL leaf switches: TRUE
- ED/SC leaf switches: FALSE
- Spines: FALSE

Rack Configuration: Two or more BDA racks connected to one or more ED/SC racks
BDA & … & BDA & ED
BDA & … & BDA & ED & … & ED
BDA & … & BDA & SC
BDA & … & BDA & SC & … & SC
BDA & … & BDA & ED & SC
BDA & … & BDA & ED & … & ED & SC
BDA & … & BDA & ED & SC & … & SC
BDA & … & BDA & ED & … & ED & SC & … & SC
SM Should Run On:
- All leaf switches on two different BDA racks
- All leaf switches on one ED/SC rack
- Spines: disabled if any
SM Priority:
- BDA leaf switches: 5
- ED/SC leaf switches: 2
- Spines: 1
Controlled Handover:
- BDA leaf switches: TRUE
- ED/SC leaf switches: FALSE
- Spines: FALSE

Rack Configuration: One or more EL racks connected to one or more BDA racks connected to one or more ED/SC racks
EL & BDA & ED
EL & BDA & SC
EL & BDA & ED & … & ED
EL & BDA & SC & … & SC
EL & BDA & ED & SC
EL & BDA & ED & … & ED & SC
EL & BDA & ED & SC & … & SC
EL & BDA & ED & … & ED & SC & … & SC
EL & … & EL & BDA & ED
EL & … & EL & BDA & SC
EL & … & EL & BDA & ED & … & ED
EL & … & EL & BDA & SC & … & SC
EL & … & EL & BDA & ED & SC
EL & … & EL & BDA & ED & … & ED & SC
EL & … & EL & BDA & ED & SC & … & SC
EL & … & EL & BDA & ED & … & ED & SC & … & SC
EL & BDA & … & BDA & ED
EL & BDA & … & BDA & SC
EL & BDA & … & BDA & ED & … & ED
EL & BDA & … & BDA & SC & … & SC
SM Should Run On:
- All leaf switches on one EL rack
- All leaf switches on one other EL/BDA rack
- All leaf switches on one ED/SC rack
- Spines: disabled if any
SM Priority:
- EL leaf switches: 5
- BDA leaf switches: 4
- ED/SC leaf switches: 2
- Spines: 1
Controlled Handover:
- EL leaf switches: TRUE
- BDA leaf switches: FALSE
- ED/SC leaf switches: FALSE
- Spines: FALSE
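The chart above appears to reduce to a few rules, which the following sketch encodes as a lookup. It is an illustration derived only from the rows in this chart; the function name and the role labels are hypothetical, and rack mixes the chart does not list are rejected rather than guessed.

```python
def recommended_settings(el_racks, bda_racks, edsc_racks):
    """Map each switch role to (SM priority, controlled handover).

    Derived from the chart: spines are 1/FALSE (with the SM disabled),
    EL leaves are 5/TRUE, ED/SC leaves are 2/FALSE, and BDA leaves are
    5/TRUE unless an EL rack is present, in which case they are 4/FALSE.
    """
    if min(el_racks, bda_racks, edsc_racks) < 0:
        raise ValueError("rack counts cannot be negative")
    if el_racks + bda_racks == 0:
        raise ValueError("the chart only covers fabrics with an EL or BDA rack")
    if el_racks + bda_racks + edsc_racks < 2:
        raise ValueError("the chart only covers multi-rack configurations")

    settings = {"spine": (1, False)}
    if el_racks:
        settings["EL leaf"] = (5, True)
    if bda_racks:
        settings["BDA leaf"] = (4, False) if el_racks else (5, True)
    if edsc_racks:
        settings["ED/SC leaf"] = (2, False)
    return settings


# Example: one EL rack, one BDA rack, and two ED/SC racks.
for role, (priority, handover) in recommended_settings(1, 1, 2).items():
    print(f"{role}: priority={priority}, controlled handover={str(handover).upper()}")
```

The lookup only returns priority and handover values. Which racks actually host the Subnet Manager (for example, the leaf switches of just one ED/SC rack) still has to be taken from the "SM Should Run On" entries.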

 

The process for changing the configuration on the NM2-36P switches is below.

On a pair of NM2-36P switches located in one of the Exadata (ED)/SuperCluster (SC) racks attached to the Exalogic (EL)/Big Data Appliance (BDA) rack(s):

1) Disable the Subnet Manager: disablesm

2) Set the Subnet Manager priority to 2: setsmpriority 2

3) Set the controlled handover to false: setcontrolledhandover FALSE

4) Configure the smnodes list

The smnodes list needs to contain the IP addresses of all switches which have Subnet Manager enabled so that partition configuration can be synchronized across all these switches.
From the Sun Network QDR InfiniBand Gateway Switch Command Reference for Firmware Version 2.1:
"[...]
The Subnet Manager nodes file must exist in every management controller file system. The file contains a list of IP addresses of all active management controllers hosting a Subnet Manager in your fabric. The file should have an entry for every Sun Datacenter InfiniBand Switch 36 and Sun Network QDR InfiniBand Gateway Switch that runs a Subnet Manager in your InfiniBand fabric.
[...]"

Ensure that all switches with the Subnet Manager enabled appear in the smnodes list output by running the following command on all the Exalogic NM2-GW switches and Exadata NM2-36P leaf switches with the Subnet Manager enabled: smnodes list

If a switch needs to be removed, run the command: smnodes delete x.x.x.x (where x.x.x.x is the IP address of the switch you want to remove from the smnodes file)

If a switch needs to be added, run the command: smnodes add x.x.x.x (where x.x.x.x is the IP address of the switch you want to add to the smnodes file)

Output should be the same on all switches eligible to run the Subnet Manager.

Note:
If custom (non-default) InfiniBand partitions are used in two or more racks, the partition files of all racks to be cabled together need to be merged into a single partition file.

For additional guidance refer to the Multi-Rack Cabling EIS Checklist and to MOS Notes 1598479.1 and 2177177.1.

5) Enable the Subnet Manager: enablesm

This will allow the Subnet Manager to fail over to an ED/SC rack if the EL/BDA rack experiences a rack-level failure, and to migrate the partition keys (pkeys) to the new rack.

When the configuration from this MOS Note is properly implemented, the Master Subnet Manager relocates back to the EL/BDA rack once that rack becomes operational again; no manual action is needed.

6) Check the configuration: setsmpriority list

This shows the current priority and whether controlled handover is set to TRUE.
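Step 4 requires the smnodes list to be identical on every switch that runs a Subnet Manager. The sketch below shows one way to work out which smnodes add and smnodes delete commands each switch would need. It assumes the output of smnodes list has already been collected per switch by some other means; the switch names and the 192.0.2.x addresses are hypothetical placeholders.

```python
import ipaddress


def plan_smnodes_changes(current, desired):
    """Return the smnodes commands that turn one switch's list into `desired`."""
    current, desired = set(current), set(desired)
    stale = sorted(current - desired, key=ipaddress.ip_address)
    missing = sorted(desired - current, key=ipaddress.ip_address)
    return ([f"smnodes delete {ip}" for ip in stale]
            + [f"smnodes add {ip}" for ip in missing])


def plan_for_fabric(sm_enabled_ips, smnodes_by_switch):
    """Return {switch: commands} for every switch whose list needs changes."""
    plans = {}
    for switch, current in smnodes_by_switch.items():
        commands = plan_smnodes_changes(current, sm_enabled_ips)
        if commands:
            plans[switch] = commands
    return plans


# IP addresses of all switches that have the Subnet Manager enabled.
SM_ENABLED = ["192.0.2.11", "192.0.2.12", "192.0.2.21", "192.0.2.22"]

# What `smnodes list` reported on each switch (collected beforehand).
REPORTED = {
    "el-gw1": ["192.0.2.11", "192.0.2.12"],
    "el-gw2": ["192.0.2.11", "192.0.2.12", "192.0.2.21", "192.0.2.22"],
    "ed-leaf-a": ["192.0.2.21", "192.0.2.99"],
}

for switch, commands in plan_for_fabric(SM_ENABLED, REPORTED).items():
    print(switch)
    for command in commands:
        print("  " + command)
```

A switch whose list already matches produces no commands, so an empty result means the lists are consistent across the fabric.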

Here is a sample configuration of a 4 rack setup containing 2 Exalogic Full Racks and 2 Exadata Full Racks:

Exalogic 1
  spine: smpriority = 1, controlledhandover = FALSE, subnet manager disabled
  GW1: smpriority = 5, controlledhandover = TRUE, subnet manager enabled
  GW2: smpriority = 5, controlledhandover = TRUE, subnet manager enabled
  GW3: smpriority = 5, controlledhandover = TRUE, subnet manager enabled
  GW4: smpriority = 5, controlledhandover = TRUE, subnet manager enabled

Exalogic 2
  spine: smpriority = 1, controlledhandover = FALSE, subnet manager disabled
  GW1: smpriority = 5, controlledhandover = TRUE, subnet manager enabled
  GW2: smpriority = 5, controlledhandover = TRUE, subnet manager enabled
  GW3: smpriority = 5, controlledhandover = TRUE, subnet manager enabled
  GW4: smpriority = 5, controlledhandover = TRUE, subnet manager enabled

Exadata 1
  spine: smpriority = 1, controlledhandover = FALSE, subnet manager disabled
  IBA: smpriority = 2, controlledhandover = FALSE, subnet manager enabled
  IBB: smpriority = 2, controlledhandover = FALSE, subnet manager enabled

Exadata 2
  spine: subnet manager disabled
  IBA: smpriority = 2, controlledhandover = FALSE, subnet manager disabled
  IBB: smpriority = 2, controlledhandover = FALSE, subnet manager disabled

NOTE: For the smnodes list to take effect on all four IB switches (2 Exadata, 2 Exalogic), 'smpartition start; smpartition commit' had to be run on the master Exalogic GW switch.
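The sample configuration can also be written down as data and checked against the recommendations in this note. The following is a rough sketch of such a check, not an Oracle tool; the record layout, role labels, and function names are hypothetical, and the values are copied from the sample above.

```python
from collections import namedtuple

Switch = namedtuple("Switch", "rack name role priority handover sm_enabled")

# Recommended (priority, controlled handover) for SM-enabled leaf switches.
EXPECTED = {"EL leaf": (5, True), "ED/SC leaf": (2, False)}


def build_sample():
    """Return the four-rack sample configuration as a list of Switch records."""
    switches = []
    for rack in ("Exalogic 1", "Exalogic 2"):
        switches.append(Switch(rack, "spine", "spine", 1, False, False))
        for gw in ("GW1", "GW2", "GW3", "GW4"):
            switches.append(Switch(rack, gw, "EL leaf", 5, True, True))
    switches.append(Switch("Exadata 1", "spine", "spine", 1, False, False))
    for leaf in ("IBA", "IBB"):
        switches.append(Switch("Exadata 1", leaf, "ED/SC leaf", 2, False, True))
    # The sample lists no priority or handover value for this spine.
    switches.append(Switch("Exadata 2", "spine", "spine", None, None, False))
    for leaf in ("IBA", "IBB"):
        switches.append(Switch("Exadata 2", leaf, "ED/SC leaf", 2, False, False))
    return switches


def check(switches):
    """Return a list of human-readable problems; empty means no findings."""
    problems = []
    for sw in switches:
        label = f"{sw.rack}/{sw.name}"
        if sw.role == "spine":
            if sw.sm_enabled:
                problems.append(f"{label}: spine should have the SM disabled")
        elif sw.sm_enabled and (sw.priority, sw.handover) != EXPECTED[sw.role]:
            problems.append(f"{label}: expected priority/handover {EXPECTED[sw.role]}")
    edsc_racks = {sw.rack for sw in switches
                  if sw.role == "ED/SC leaf" and sw.sm_enabled}
    if len(edsc_racks) != 1:
        problems.append("the SM should be enabled on the leaves of exactly one ED/SC rack")
    return problems


print(check(build_sample()) or "sample configuration matches the recommendations")
```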

References

<BUG:17482244> - CANNOT ESTABLISH NEW CONNECTIONS UNTIL SM IS MANUALLY RESTARTED

Copyright © 2018 Oracle, Inc.  All rights reserved.