This document provides the procedure to change the Service Level values of the EoIB Data and EoIB Control configuration parameters on NM2-GW switches.
Previously, NM2-GW switches were configured with EoIB Data traffic at service level (SL) 1 and EoIB Control traffic at SL 2.
However, under heavy data traffic the above configuration can starve control traffic, because control traffic runs at a “normal” priority service level (SL 2) while data traffic runs at a higher priority service level (SL 1). This condition can occur if several nodes (on bare metal) or guest vServers (on virtual) constantly generate traffic at full bandwidth through the gateway ports on the InfiniBand gateway switches. Under these conditions, some operations may be slow. To prevent control traffic starvation, the recommended values are now EoIB Data SL 2 and EoIB Control SL 1.
This procedure assumes that all the virtual NICs (vNICs) in your Exalogic environment are configured using bonding. Bonding ensures that the vNICs fail over to the standby switch when the gateway ports are disabled during the course of this procedure.
Perform the following procedure to implement the configuration change, one NM2-GW switch at a time:
- Log in to the switch (e.g. el01gw01) as the ilom-admin user.
- Launch a restricted Linux shell to run the required commands.
-> start /SYS/Fabric_Mgmt/
Are you sure you want to start /SYS/Fabric_Mgmt (y/n)? y
NOTE: start /SYS/Fabric_Mgmt will launch a restricted Linux shell.
User can execute switch diagnosis, SM Configuration and IB
monitoring commands in the shell. To view the list of commands,
use "help" at rsh prompt.
Use exit command at rsh prompt to revert back to
ILOM shell.
FabMan@el01gw01->
- Run the following command to check the existing configuration:
FabMan@el01gw01-> showgwconfig
BXM (pid 13532) is running
BXM versions: bxm_user 2.0.0898-0, BXM-API 1.6.0, bxm_libs 2.0.0898-0, bxm_main 1.31 mlx_bx_core 1.31
Parameter Configured Value Running Value
-----------------------------------------------------------
GWInstance 30 30
SystemName None el01gw01
EoIB Data SL 1 1
EoIB Control SL 2 2
Allow host VNIC config None no
LAG mode None no
Default discover P_key None 0xffff
System MAC Not applicable 00:21:28:56:8d:02
If the EoIB Data SL and EoIB Control SL values returned are swapped relative to the example above (that is, Data SL is 2 and Control SL is 1), the correct settings are already in place and no further change is required.
If the EoIB Data SL and EoIB Control SL values returned match those in the example above, follow the remaining steps to implement the required configuration change on each of the NM2-GW switches in your Exalogic environment.
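The check described above can be sketched in Python, assuming the showgwconfig output has been captured as text (for example, copied from the terminal session). The function names and the result labels are hypothetical and only illustrate the decision; they are not part of the switch software.

```python
import re

# Service levels named in this document: the old pair and the recommended pair.
OLD_LEVELS = {"EoIB Data SL": 1, "EoIB Control SL": 2}
NEW_LEVELS = {"EoIB Data SL": 2, "EoIB Control SL": 1}

# Matches rows such as "EoIB Data SL 1 1" (configured value, running value).
SL_ROW = re.compile(r"\s*(EoIB (?:Data|Control) SL)\s+(\d+)\s+(\d+)\s*$")


def parse_service_levels(output):
    """Return {parameter: (configured, running)} for the two EoIB SL rows."""
    levels = {}
    for line in output.splitlines():
        match = SL_ROW.match(line)
        if match:
            levels[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return levels


def classify_gateway_config(output):
    """Classify captured showgwconfig text.

    Returns "already-updated", "needs-change", or "unknown" (hypothetical labels).
    """
    levels = parse_service_levels(output)
    if set(levels) != set(OLD_LEVELS):
        return "unknown"
    if all(levels[name] == (sl, sl) for name, sl in NEW_LEVELS.items()):
        return "already-updated"
    if all(levels[name] == (sl, sl) for name, sl in OLD_LEVELS.items()):
        return "needs-change"
    return "unknown"
```

Any result other than the two expected cases is reported as "unknown" so that an unexpected configuration is reviewed by hand rather than changed automatically.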
- Identify the gateway ports that are enabled and connected. Use the output of the showgwports command to identify these ports.
FabMan@el01gw01->showgwports
INTERNAL PORTS:
---------------
Device Port Portname PeerPort PortGUID LID IBState GWState
---------------------------------------------------------------------------
Bridge-0 1 Bridge-0-1 4 0x002128568d02c001 0x0002 Active Up
Bridge-0 2 Bridge-0-2 3 0x002128568d02c002 0x0004 Active Up
Bridge-1 1 Bridge-1-1 2 0x002128568d02c041 0x0010 Active Up
Bridge-1 2 Bridge-1-2 1 0x002128568d02c042 0x0012 Active Up
CONNECTOR 0A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
0A-ETH-1 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-2 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-3 Bridge-0-1 Enabled Up Up 9600 Global Global
0A-ETH-4 Bridge-0-1 Enabled Up Up 9600 Global Global
CONNECTOR 1A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
1A-ETH-1 Bridge-1-2 Enabled Down Reset 9600 Global Global
1A-ETH-2 Bridge-1-2 Enabled Down Reset 9600 Global Global
1A-ETH-3 Bridge-1-1 Enabled Down Reset 9600 Global Global
1A-ETH-4 Bridge-1-1 Enabled Down Reset 9600 Global Global
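As an illustration of how the enabled and connected ports can be picked out of this output, the following Python sketch parses captured showgwports text. It assumes the connector rows have the column order shown above (Port, Bridge, Adminstate, Link, State, ...); the function names are hypothetical.

```python
import re

# Connector rows start with a port name such as "0A-ETH-1".
PORT_NAME = re.compile(r"^\d+[A-Z]-ETH-\d+$")


def parse_gateway_ports(output):
    """Return one dict per connector row of captured showgwports text."""
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        # Skip headers, separators, and the INTERNAL PORTS rows.
        if len(tokens) >= 5 and PORT_NAME.match(tokens[0]):
            rows.append({
                "port": tokens[0],
                "bridge": tokens[1],
                "adminstate": tokens[2],
                "link": tokens[3],
                "state": tokens[4],
            })
    return rows


def enabled_and_connected(output):
    """List ports whose Adminstate is Enabled and whose Link is Up."""
    return [
        row["port"]
        for row in parse_gateway_ports(output)
        if row["adminstate"] == "Enabled" and row["link"] == "Up"
    ]
```

For the example output above, this would select the four 0A-ETH ports and skip the 1A-ETH ports, whose link is Down.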
- Identify the gateway ports that are currently being used. Run the showvnics command to list all the vNICs configured on the switch. From this list, identify the vNICs that are currently UP and note the gateway port of each. In the example below, the vNICs with IDs 39 and 41 are UP and are configured on gateway ports 0A-ETH-1 and 0A-ETH-4 respectively.
FabMan@el01gw01->showvnics
ID STATE FLG IOA_GUID NODE IID MAC VLN PKEY GW
--- -------- --- ----------------------- -------------------------------- ---- ----------------- --- ------ --------
39 UP N 47C53E16E5FE9012 el01cn01 EL-C 192.168.194.188 0000 00:21:F6:1E:C3:CB 214 0x810b 0A-ETH-1
34 WAIT-IOA N 6A218FE67AE0CA39 0000 00:21:F6:50:98:AF 214 0x810b 0A-ETH-1
63 WAIT-IOA N BAC0489E5C7ED740 0000 00:14:4F:FA:4F:11 214 0x810b 0A-ETH-1
69 WAIT-IOA N 1CA9480C54307541 0000 00:14:4F:FA:9A:AC 214 0x810b 0A-ETH-1
61 WAIT-IOA N 11CF5265C69A7142 0000 00:14:4F:F8:82:79 214 0x810b 0A-ETH-1
41 UP N 0021280001A0EE46 adce01cn07 EL-C 192.168.194.183 0000 02:C0:A0:A8:01:01 NO 0xffff 0A-ETH-4
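The selection above can be sketched in Python against captured showvnics text. This is a minimal sketch that assumes the state is always the second column and the gateway port is always the last column, as in the example output; the function name is hypothetical.

```python
def active_gateway_ports(output):
    """Return the gateway ports that carry at least one vNIC in state UP.

    Ports are returned in first-seen order, without duplicates.
    """
    ports = []
    for line in output.splitlines():
        tokens = line.split()
        # Data rows start with a numeric vNIC ID; headers and rules do not.
        if len(tokens) < 3 or not tokens[0].isdigit():
            continue
        state, gateway_port = tokens[1], tokens[-1]
        if state == "UP" and gateway_port not in ports:
            ports.append(gateway_port)
    return ports
```

Reading the gateway port from the end of the row avoids depending on the NODE and IID columns, which are empty for vNICs in the WAIT-IOA state.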
- Disable the gateway ports identified in step 5 as shown below:
FabMan@el01gw01->disablegwport 0A-ETH-1
FabMan@el01gw01->disablegwport 0A-ETH-4
NOTE:
When the gateway ports are disabled, the vNICs connected to the switch will fail over to the standby switch. During this failover there will be a very brief loss of connectivity; however, this does not affect the availability of the vServer.
- Validate that the gateway ports were disabled:
FabMan@el01gw01->showgwports
INTERNAL PORTS:
---------------
Device Port Portname PeerPort PortGUID LID IBState GWState
---------------------------------------------------------------------------
Bridge-0 1 Bridge-0-1 4 0x002128568d02c001 0x0002 Active Up
Bridge-0 2 Bridge-0-2 3 0x002128568d02c002 0x0004 Active Up
Bridge-1 1 Bridge-1-1 2 0x002128568d02c041 0x0010 Active Up
Bridge-1 2 Bridge-1-2 1 0x002128568d02c042 0x0012 Active Up
CONNECTOR 0A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
0A-ETH-1 Bridge-0-2 Disabled Down Reset 9600 Global Global
0A-ETH-2 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-3 Bridge-0-1 Enabled Up Up 9600 Global Global
0A-ETH-4 Bridge-0-1 Disabled Down Reset 9600 Global Global
CONNECTOR 1A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
1A-ETH-1 Bridge-1-2 Enabled Down Reset 9600 Global Global
1A-ETH-2 Bridge-1-2 Enabled Down Reset 9600 Global Global
1A-ETH-3 Bridge-1-1 Enabled Down Reset 9600 Global Global
1A-ETH-4 Bridge-1-1 Enabled Down Reset 9600 Global Global
- Reconfigure the values of the EoIB Data and EoIB Control service levels as shown below:
FabMan@el01gw01->setgwsl eoib 2
Stopping Bridge Manager... [ OK ]
Starting Bridge Manager. [ OK ]
FabMan@el01gw01->setgwsl ctrl 1
Stopping Bridge Manager... [ OK ]
Starting Bridge Manager. [ OK ]
- Validate that the value of the EoIB Data SL is set to 2 and the value of the EoIB Control SL is set to 1. Run the showgwconfig command as shown below:
FabMan@el01gw01->showgwconfig
BXM (pid 13532) is running
BXM versions: bxm_user 2.0.0898-0, BXM-API 1.6.0, bxm_libs 2.0.0898-0, bxm_main 1.31 mlx_bx_core 1.31
Parameter Configured Value Running Value
-----------------------------------------------------------
GWInstance 30 30
SystemName None el01gw01
EoIB Data SL 2 2
EoIB Control SL 1 1
Allow host VNIC config None no
LAG mode None no
Default discover P_key None 0xffff
System MAC Not applicable 00:21:28:56:8d:02
- Enable the disabled gateway ports as shown below:
FabMan@el01gw01->enablegwport 0A-ETH-1 -discoverpkey default
FabMan@el01gw01->enablegwport 0A-ETH-4 -discoverpkey default
- Validate that the gateway ports have been enabled:
FabMan@el01gw01->showgwports
INTERNAL PORTS:
---------------
Device Port Portname PeerPort PortGUID LID IBState GWState
---------------------------------------------------------------------------
Bridge-0 1 Bridge-0-1 4 0x002128568d02c001 0x0002 Active Up
Bridge-0 2 Bridge-0-2 3 0x002128568d02c002 0x0004 Active Up
Bridge-1 1 Bridge-1-1 2 0x002128568d02c041 0x0010 Active Up
Bridge-1 2 Bridge-1-2 1 0x002128568d02c042 0x0012 Active Up
CONNECTOR 0A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
0A-ETH-1 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-2 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-3 Bridge-0-1 Enabled Up Up 9600 Global Global
0A-ETH-4 Bridge-0-1 Enabled Up Up 9600 Global Global
CONNECTOR 1A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
1A-ETH-1 Bridge-1-2 Enabled Down Reset 9600 Global Global
1A-ETH-2 Bridge-1-2 Enabled Down Reset 9600 Global Global
1A-ETH-3 Bridge-1-1 Enabled Down Reset 9600 Global Global
1A-ETH-4 Bridge-1-1 Enabled Down Reset 9600 Global Global
- Exit the restricted Linux shell and then log out of the switch ILOM.
FabMan@el01gw01->exit
exit
-> exit
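The per-switch command sequence from the preceding steps can be summarized with a short Python sketch that builds the ordered list of restricted-shell commands for a given set of gateway ports. It only assembles text from the commands shown in this document and does not connect to a switch; the function name is hypothetical, and the use of "-discoverpkey default" for every port is an assumption carried over from the example.

```python
def switch_command_plan(ports):
    """Return the ordered restricted-shell commands for one NM2-GW switch.

    `ports` is the list of in-use gateway ports identified with showvnics.
    """
    plan = [f"disablegwport {port}" for port in ports]
    plan.append("showgwports")       # confirm the ports are disabled
    plan.append("setgwsl eoib 2")    # EoIB Data SL -> 2
    plan.append("setgwsl ctrl 1")    # EoIB Control SL -> 1
    plan.append("showgwconfig")      # confirm the new service levels
    plan.extend(f"enablegwport {port} -discoverpkey default" for port in ports)
    plan.append("showgwports")       # confirm the ports are enabled again
    return plan


if __name__ == "__main__":
    for command in switch_command_plan(["0A-ETH-1", "0A-ETH-4"]):
        print(command)
```

Printing the plan before starting work on a switch gives a checklist to follow; each validation step still needs its output to be reviewed before continuing.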
- Repeat steps 1 to 12 on the remaining switches in your environment.
- Verify that all EoIB interfaces (including EoIB-management, e.g. access to EMOC BUI) continue to be accessible. Log in to the EMOC BUI to validate that the interface is up and running.
- Follow the steps below to validate that the EoIB interfaces for the vServers are accessible. From a location outside of the Exalogic rack from which EoIB access was previously possible, run a ping command similar to the following against each of the IP addresses you need to validate:
ping -c 1 -W 3 <TARGET_IP_ADDRESS>
For example:
[root@myExternalLocation ~]$ ping -c 1 -W 3 10.141.135.208
PING 10.141.135.208 (10.141.135.208) 56(84) bytes of data.
64 bytes from 10.141.135.208: icmp_seq=1 ttl=63 time=0.333 ms
--- 10.141.135.208 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 1ms
rtt min/avg/max/mdev = 0.333/0.333/0.333/0.000 ms
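When many addresses need to be validated, the same ping command can be run in a loop. The Python sketch below assumes a Linux host with the ping utility shown above and treats exit status 0 as reachable; the function names and the `run` parameter (included so the check can be exercised without sending traffic) are hypothetical.

```python
import subprocess


def ping_command(address):
    """Build the ping invocation used in this document for one address."""
    return ["ping", "-c", "1", "-W", "3", address]


def check_reachability(addresses, run=subprocess.run):
    """Ping each address once and return {address: True/False}.

    An address is reported reachable when ping exits with status 0.
    """
    results = {}
    for address in addresses:
        completed = run(
            ping_command(address),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[address] = completed.returncode == 0
    return results


if __name__ == "__main__":
    # Replace with the EoIB addresses you need to validate.
    for address, reachable in check_reachability(["10.141.135.208"]).items():
        print(address, "reachable" if reachable else "NOT reachable")
```

Any address reported as not reachable should be rechecked manually with the ping command above before drawing conclusions.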