Asset ID: 1-72-1921766.1
Update Date: 2017-09-11
Solution Type: Problem Resolution Sure Solution
1921766.1: showvnic command run on an IB gateway switch outputs "WAIT-VHUB"
Related Items
- Exalogic Elastic Cloud X3-2 Quarter Rack
- Sun Network QDR InfiniBand Gateway Switch
Related Categories
- PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband
Created from <SR 3-9527794808>
Applies to:
Exalogic Elastic Cloud X3-2 Quarter Rack - Version X3 and later
Sun Network QDR InfiniBand Gateway Switch - All Versions
Information in this document applies to any platform.
Symptoms
The 'showvnics' output shows vnics with STATE = "WAIT-VHUB":
gw01# showvnics
ID STATE FLG IOA_GUID NODE IID MAC VLN PKEY GW
--- -------- --- ----------------------- -------------------------------- ---- ----------------- --- ------ --------
19 WAIT-VHUB N 7B2C9A60FF9DD4A2 cn08 EL-C 192.168.10.10 0000 00:14:4F:F9:FC:22 301 0x8006 0A-ETH-3
23 WAIT-VHUB N 97F8375BFFCE65AA cn06 EL-C 192.168.10.11 0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-3
23 WAIT-VHUB N 97F8375BFFCE65AA cn06 EL-C 192.168.10.12 0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-4
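When many vnics are listed, the affected ones can be picked out with a small script. The following is only a sketch: it assumes the 'showvnics' output has been saved as text, that every data row has the ten columns shown above, and that NODE is the only column containing spaces. The function names are illustrative and are not part of the switch software.

```python
SAMPLE = """\
gw01# showvnics
ID STATE FLG IOA_GUID NODE IID MAC VLN PKEY GW
--- -------- --- ----------------------- -------------------------------- ---- ----------------- --- ------ --------
19 WAIT-VHUB N 7B2C9A60FF9DD4A2 cn08 EL-C 192.168.10.10 0000 00:14:4F:F9:FC:22 301 0x8006 0A-ETH-3
23 WAIT-VHUB N 97F8375BFFCE65AA cn06 EL-C 192.168.10.11 0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-3
23 WAIT-VHUB N 97F8375BFFCE65AA cn06 EL-C 192.168.10.12 0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-4
"""


def parse_showvnics(text):
    """Return one dict per vnic row of saved 'showvnics' output."""
    vnics = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with a numeric ID; this skips the prompt,
        # the column header and the dashed rule.
        if len(tokens) < 10 or not tokens[0].isdigit():
            continue
        vnics.append({
            "id": tokens[0],
            "state": tokens[1],
            "flg": tokens[2],
            "ioa_guid": tokens[3],
            # Assumption: NODE is everything between IOA_GUID and the
            # last five columns, because it may contain spaces.
            "node": " ".join(tokens[4:-5]),
            "iid": tokens[-5],
            "mac": tokens[-4],
            "vln": tokens[-3],
            "pkey": tokens[-2],
            "gw": tokens[-1],
        })
    return vnics


def stuck_vnics(text):
    """Return the vnics whose STATE is WAIT-VHUB."""
    return [v for v in parse_showvnics(text) if v["state"] == "WAIT-VHUB"]


stuck = stuck_vnics(SAMPLE)
for v in stuck:
    print(v["id"], v["node"], v["pkey"], v["gw"])
```

The PKEY and GW columns of the rows returned identify which partition and which gateway ports need to be checked in the steps that follow.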
Cause
In this scenario, BridgeX Gateway 0A-ETH-3 and 0A-ETH-4 Port GUIDs are not in the partition table for 0x8006.
As a general rule, when an IB gateway switch is replaced with another, the old switch's BX port GUIDs in every IB partition must be replaced with the new switch's BX port GUIDs, because the BX port GUIDs belong to the physical IB gateway switch.
The BX port GUIDs are determined by running:
# showgwports
The IB partitions are determined on the SMINFO_MASTER by checking:
# smpartition list active
# cat /conf/partitions.conf.current
In addition, once the new IB gateway switch is in place, its GWInstance must NOT match that of the other working IB gateway switch. Check the value with:
# showgwconfig
Solution
Find the bridge Port GUIDs for 0A-ETH-3 and 0A-ETH-4:
gw01# showgwports
INTERNAL PORTS:
---------------
Device Port Portname PeerPort PortGUID LID IBState GWState
---------------------------------------------------------------------------
Bridge-0 1 Bridge-0-1 4 0x0010e0FF44dcc001 0x0003 Active Up<<<<<<<<<<<<<<Bridge-0-1 GUID
Bridge-0 2 Bridge-0-2 3 0x0010e0FF44dcc002 0x0005 Active Up
Bridge-1 1 Bridge-1-1 2 0x0010e0FF44dcc041 0x0007 Active Up
Bridge-1 2 Bridge-1-2 1 0x0010e0FF44dcc042 0x0009 Active Up
CONNECTOR 0A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
0A-ETH-1 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-2 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-3 Bridge-0-1 Enabled Up Up 9600 Global Global<<<<<<<<<<<<<Bridge-0-1
0A-ETH-4 Bridge-0-1 Enabled Up Up 9600 Global Global<<<<<<<<<<<<<Bridge-0-1
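The lookup from connector port to bridge port GUID shown above can also be done mechanically. The sketch below assumes the 'showgwports' output has been saved as text without the "<<<<" annotations added in this document, and that the column order matches the listing above. The function name is illustrative only.

```python
SAMPLE = """\
INTERNAL PORTS:
---------------
Device Port Portname PeerPort PortGUID LID IBState GWState
---------------------------------------------------------------------------
Bridge-0 1 Bridge-0-1 4 0x0010e0FF44dcc001 0x0003 Active Up
Bridge-0 2 Bridge-0-2 3 0x0010e0FF44dcc002 0x0005 Active Up
Bridge-1 1 Bridge-1-1 2 0x0010e0FF44dcc041 0x0007 Active Up
Bridge-1 2 Bridge-1-2 1 0x0010e0FF44dcc042 0x0009 Active Up
CONNECTOR 0A-ETH:
-----------------
Port Bridge Adminstate Link State MTU TxPause RxPause
-------------------------------------------------------------------------
0A-ETH-1 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-2 Bridge-0-2 Enabled Up Up 9600 Global Global
0A-ETH-3 Bridge-0-1 Enabled Up Up 9600 Global Global
0A-ETH-4 Bridge-0-1 Enabled Up Up 9600 Global Global
"""


def connector_guids(text):
    """Map each connector port (e.g. 0A-ETH-3) to its bridge port GUID."""
    guid_by_portname = {}
    bridge_by_connector = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) >= 5 and tokens[0].startswith("Bridge-"):
            # INTERNAL PORTS row: Device Port Portname PeerPort PortGUID ...
            guid_by_portname[tokens[2]] = tokens[4].lower()
        elif len(tokens) >= 2 and "-ETH-" in tokens[0]:
            # CONNECTOR row: Port Bridge Adminstate ...
            bridge_by_connector[tokens[0]] = tokens[1]
    return {
        connector: guid_by_portname[bridge]
        for connector, bridge in bridge_by_connector.items()
        if bridge in guid_by_portname
    }


guids = connector_guids(SAMPLE)
print(guids["0A-ETH-3"], guids["0A-ETH-4"])
```

GUIDs are lower-cased here only so that they can be compared with other listings regardless of how the hex digits are printed.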
Follow the same steps on the other GW switch (e.g. gw02) in the rack if WAIT-VHUB is also seen in the 'showvnics' output on gw02.
Confirm whether the 0A-ETH-3 and 0A-ETH-4 bridge port GUIDs are missing from the partition table:
gw01# smpartition list active
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 76
Default=0x7fff, ipoib :
ALL_CAS=both,
ALL_SWITCHES=full,
SELF=full;
SUN_DCS=0x0001, ipoib :
ALL_SWITCHES=full;
= 0x8006,ipoib: <<<<<<<<<<<<<<<< Confirmed that the 0A-ETH-3 and 0A-ETH-4 bridge port GUIDs are missing from partition 0x8006
0x0010e000013fc7ec=full,
0x0010e000013fc7ed=full,
0x0010e000013fc8ef=full;
gw01#
!!! If the 0A-ETH-3 and 0A-ETH-4 bridge Port GUIDs are already in the partition table, STOP here and contact Oracle Support !!!
If the 0A-ETH-3 and 0A-ETH-4 bridge Port GUIDs are confirmed NOT to be in the partition table, continue.
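The membership check above can be scripted when a partition has many members. This is a minimal sketch, assuming the 'smpartition list active' output has been saved as text in the layout shown above (stanzas ending in ";", a header before ":"). It drops any line containing "#", which covers both the generated comments and the shell prompt. The function name is illustrative only.

```python
import re

SAMPLE = """\
gw01# smpartition list active
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 76
Default=0x7fff, ipoib :
ALL_CAS=both,
ALL_SWITCHES=full,
SELF=full;
SUN_DCS=0x0001, ipoib :
ALL_SWITCHES=full;
= 0x8006,ipoib:
0x0010e000013fc7ec=full,
0x0010e000013fc7ed=full,
0x0010e000013fc8ef=full;
gw01#
"""


def partition_members(text):
    """Return {pkey (int): set of member names or GUIDs, lower-cased}."""
    # Assumption: comment lines and prompt lines are the only lines with '#'.
    body = "\n".join(l for l in text.splitlines() if "#" not in l)
    partitions = {}
    for stanza in body.split(";"):
        header, sep, members = stanza.partition(":")
        match = re.search(r"=\s*(0x[0-9a-fA-F]+)", header)
        if not sep or not match:
            continue
        pkey = int(match.group(1), 16)
        partitions[pkey] = {
            m.split("=")[0].strip().lower()
            for m in members.split(",")
            if "=" in m
        }
    return partitions


# Bridge port GUID found with showgwports on gw01 in this example.
wanted = {"0x0010e0ff44dcc001"}
missing = wanted - partition_members(SAMPLE).get(0x8006, set())
print("missing from 0x8006:", sorted(missing))
```

An empty result would mean the GUIDs are already present, which is the case where this document says to stop and contact Oracle Support.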
Now do the following to fix WAIT-VHUB for vnics in the 0x8006 partition:
Verify that each GW switch shows both switches listed by 'smnodes':
gw01# smnodes list
<IP_address_gw01>
<IP_address_gw02>
gw01#
gw02# smnodes list
<IP_address_gw01>
<IP_address_gw02>
gw02#
Find out which switch is running the Subnet Manager (SM) Master by running 'getmaster' on any switch. In the following example, gw01 is the SM Master:
gw01# getmaster
Local SM enabled and running, state MASTER
20140828 10:11:28 Master SubnetManager on sm lid 11 sm guid 0x2128557ff2c0b0 : SUN IB QDR GW switch gw01 10.152.228.232
gw01#
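If the check is being automated, the local SM state can be read from the first line of the 'getmaster' output. The sketch below relies only on the "state MASTER" wording shown above; the text printed for other states is not shown in this document, so the function simply returns whatever word follows "state". The function name is illustrative only.

```python
import re

SAMPLE = """\
Local SM enabled and running, state MASTER
20140828 10:11:28 Master SubnetManager on sm lid 11 sm guid 0x2128557ff2c0b0 : SUN IB QDR GW switch gw01 10.152.228.232
"""


def local_sm_state(getmaster_output):
    """Return the local SM state word, or None if the line is absent."""
    match = re.search(r"Local SM enabled and running, state (\w+)", getmaster_output)
    return match.group(1) if match else None


state = local_sm_state(SAMPLE)
print("run smpartition here" if state == "MASTER" else "use the SM Master instead")
```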
Run all of the following commands on the SM Master (gw01 in this example):
gw01# smpartition start
gw01# smpartition add -pkey 0x8006 -port 0x0010e0FF44dcc001 -m full (note: bridge Port GUID of gw01)
gw01# smpartition add -pkey 0x8006 -port 0x0010e0FF385cc001 -m full (note: bridge Port GUID of gw02)
gw01# smpartition list modified (check that GUIDs are now under 0x8006 partition before committing)
gw01# smpartition commit <-----This will commit changes and propagate to the other standby GW switch (gw02, in this example).
NOTE: see also: smpartition add outputs "Wrong value for pkey. Legal value is 1 to 0x7fff" (Doc ID 1921832.1)
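When several bridge port GUIDs have to be added, the command sequence above can be generated rather than typed by hand. The sketch below only builds the command strings for review, using the same 'smpartition' subcommands and options shown in this document. It does not connect to the switch, and the function name is illustrative only.

```python
def smpartition_plan(pkey, guids):
    """Build the smpartition command sequence for adding GUIDs to one partition."""
    commands = ["smpartition start"]
    for guid in guids:
        commands.append(f"smpartition add -pkey {pkey} -port {guid} -m full")
    # Review the pending configuration before committing it.
    commands.append("smpartition list modified")
    commands.append("smpartition commit")
    return commands


plan = smpartition_plan(
    "0x8006",
    ["0x0010e0FF44dcc001", "0x0010e0FF385cc001"],  # gw01 and gw02 bridge port GUIDs
)
for command in plan:
    print(command)
```

The generated list keeps 'smpartition list modified' before the commit so the pending changes can still be inspected on the SM Master.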
Verify the bridge Port GUIDs have been added to the active table:
gw01# smpartition list active
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 76
Default=0x7fff, ipoib :
ALL_CAS=both,
ALL_SWITCHES=full,
SELF=full;
SUN_DCS=0x0001, ipoib :
ALL_SWITCHES=full;
= 0x8006,ipoib:
0x0010e000013fc7ec=full,
0x0010e000013fc7ed=full,
0x0010e000013fc8ef=full,
0x0010e0FF44dcc001=full, <<<<<<<<<<<<<<<< Confirmed that the gw01 bridge port GUID for 0A-ETH-3 and 0A-ETH-4 is now in the active partition table
0x0010e0FF385cc001=full; <<<<<<<<<<<<<<<< Confirmed that the gw02 bridge port GUID for 0A-ETH-3 and 0A-ETH-4 is now in the active partition table
gw01#
Confirm that the vnics are now in the UP state:
gw01# showvnics
ID STATE FLG IOA_GUID NODE IID MAC VLN PKEY GW
--- -------- --- ----------------------- -------------------------------- ---- ----------------- --- ------ --------
19 UP N 7B2C9A60FF9DD4A2 cn08 EL-C 192.168.10.10 0000 00:14:4F:F9:FC:22 301 0x8006 0A-ETH-3
23 UP N 97F8375BFFCE65AA cn06 EL-C 192.168.10.11 0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-3
23 UP N 97F8375BFFCE65AA cn06 EL-C 192.168.10.12 0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-4