Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1921766.1
Update Date: 2017-09-11
Keywords:

Solution Type  Problem Resolution Sure

Solution  1921766.1 :   showvnic command run on an IB gateway switch outputs "WAIT-VHUB"  


Related Items
  • Exalogic Elastic Cloud X3-2 Quarter Rack
  • Sun Network QDR InfiniBand Gateway Switch

Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband




In this Document
Symptoms
Cause
Solution


Created from <SR 3-9527794808>

Applies to:

Exalogic Elastic Cloud X3-2 Quarter Rack - Version X3 and later
Sun Network QDR InfiniBand Gateway Switch - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The 'showvnics' output shows vnics in STATE = "WAIT-VHUB":

gw01# showvnics

ID  STATE      FLG  IOA_GUID          NODE                     IID   MAC                VLN  PKEY    GW
--  ---------  ---  ----------------  -----------------------  ----  -----------------  ---  ------  --------
19  WAIT-VHUB  N    7B2C9A60FF9DD4A2  cn08 EL-C 192.168.10.10  0000  00:14:4F:F9:FC:22  301  0x8006  0A-ETH-3
23  WAIT-VHUB  N    97F8375BFFCE65AA  cn06 EL-C 192.168.10.11  0000  00:14:4F:F9:FC:4F  301  0x8006  0A-ETH-3
23  WAIT-VHUB  N    97F8375BFFCE65AA  cn06 EL-C 192.168.10.12  0000  00:14:4F:F9:FC:4F  301  0x8006  0A-ETH-4
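
When many vnics are listed, it can help to summarize which partitions and gateway ports are affected before going further. The following Python sketch is not part of the switch software. It assumes the 'showvnics' output has been saved as text on an administrative host, that every data row begins with a numeric ID, and that the last five columns are always IID, MAC, VLN, PKEY and GW. The function names are illustrative only.

```python
# Sketch: summarize WAIT-VHUB vnics from saved 'showvnics' output.
# Assumption: data rows start with a numeric ID and end with the five
# columns IID, MAC, VLN, PKEY, GW; the NODE column may contain spaces.

SAMPLE = """\
ID  STATE     FLG IOA_GUID          NODE                     IID  MAC               VLN PKEY   GW
--- --------  --- ----------------  -----------------------  ---- ----------------- --- ------ --------
19 WAIT-VHUB   N 7B2C9A60FF9DD4A2       cn08 EL-C 192.168.10.10      0000 00:14:4F:F9:FC:22 301 0x8006 0A-ETH-3
23 WAIT-VHUB   N 97F8375BFFCE65AA        cn06 EL-C 192.168.10.11      0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-3
23 WAIT-VHUB   N 97F8375BFFCE65AA        cn06 EL-C 192.168.10.12      0000 00:14:4F:F9:FC:4F 301 0x8006 0A-ETH-4
"""


def parse_showvnics(text):
    """Return one dict per vnic row; header, ruler and prompt lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 9 or not fields[0].isdigit():
            continue
        rows.append({
            "id": fields[0],
            "state": fields[1],
            "flg": fields[2],
            "ioa_guid": fields[3],
            "node": " ".join(fields[4:-5]),  # everything between IOA_GUID and IID
            "iid": fields[-5],
            "mac": fields[-4],
            "vlan": fields[-3],
            "pkey": fields[-2],
            "gw": fields[-1],
        })
    return rows


def stuck_by_pkey(rows):
    """Map each PKEY to the set of gateway ports that have WAIT-VHUB vnics."""
    result = {}
    for row in rows:
        if row["state"] == "WAIT-VHUB":
            result.setdefault(row["pkey"], set()).add(row["gw"])
    return result


if __name__ == "__main__":
    for pkey, ports in stuck_by_pkey(parse_showvnics(SAMPLE)).items():
        print(pkey, sorted(ports))
```

For the output shown in this document, the summary points at partition 0x8006 and gateway ports 0A-ETH-3 and 0A-ETH-4, which are the ports examined in the rest of this procedure.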

 

Cause

In this scenario, BridgeX Gateway 0A-ETH-3 and 0A-ETH-4 Port GUIDs are not in the partition table for 0x8006.

In general, when an IB gateway switch is replaced with another, the old switch's BridgeX (BX) port GUIDs in every IB partition must be replaced with the new switch's BX port GUIDs, because BX port GUIDs belong to the physical IB gateway switch.

To determine the BX port GUIDs, run:
# showgwports

To determine the IB partitions, check either of the following on the SM Master (SMINFO_MASTER):
# smpartition list active

# cat /conf/partitions.conf.current

In addition, once the new IB gateway switch is in place, its GWInstance must NOT match that of the other working IB gateway switch:

# showgwconfig

Solution

Find the bridge Port GUIDs for 0A-ETH-3 and 0A-ETH-4:

gw01# showgwports

INTERNAL PORTS:
---------------

Device   Port Portname  PeerPort PortGUID           LID    IBState  GWState
---------------------------------------------------------------------------
Bridge-0  1   Bridge-0-1    4    0x0010e0FF44dcc001 0x0003 Active   Up       <<< Bridge-0-1 GUID
Bridge-0  2   Bridge-0-2    3    0x0010e0FF44dcc002 0x0005 Active   Up
Bridge-1  1   Bridge-1-1    2    0x0010e0FF44dcc041 0x0007 Active   Up
Bridge-1  2   Bridge-1-2    1    0x0010e0FF44dcc042 0x0009 Active   Up

CONNECTOR 0A-ETH:
-----------------

Port          Bridge      Adminstate Link  State       MTU  TxPause  RxPause
-------------------------------------------------------------------------
0A-ETH-1  Bridge-0-2  Enabled    Up    Up          9600 Global   Global
0A-ETH-2  Bridge-0-2  Enabled    Up    Up          9600 Global   Global
0A-ETH-3  Bridge-0-1  Enabled    Up    Up          9600 Global   Global   <<< Bridge-0-1
0A-ETH-4  Bridge-0-1  Enabled    Up    Up          9600 Global   Global   <<< Bridge-0-1
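
The lookup above is a two-step join: connector port to bridge port name, then bridge port name to PortGUID. As a rough illustration only, the Python sketch below performs that join on saved 'showgwports' output. It assumes the column order shown in this document, that the "<<<" annotations have been removed, and that connector names follow the pattern digit(s), letter, "-ETH-", digit(s). The function names are hypothetical.

```python
# Sketch: resolve a connector port (e.g. 0A-ETH-3) to its bridge PortGUID
# from saved 'showgwports' output. Assumes the column layout shown in
# this document; not a tool shipped with the switch.
import re

SAMPLE = """\
INTERNAL PORTS:
---------------

Device   Port Portname  PeerPort PortGUID           LID    IBState  GWState
---------------------------------------------------------------------------
Bridge-0  1   Bridge-0-1    4    0x0010e0FF44dcc001 0x0003 Active   Up
Bridge-0  2   Bridge-0-2    3    0x0010e0FF44dcc002 0x0005 Active   Up
Bridge-1  1   Bridge-1-1    2    0x0010e0FF44dcc041 0x0007 Active   Up
Bridge-1  2   Bridge-1-2    1    0x0010e0FF44dcc042 0x0009 Active   Up

CONNECTOR 0A-ETH:
-----------------

Port          Bridge      Adminstate Link  State       MTU  TxPause  RxPause
-------------------------------------------------------------------------
0A-ETH-1  Bridge-0-2  Enabled    Up    Up          9600 Global   Global
0A-ETH-2  Bridge-0-2  Enabled    Up    Up          9600 Global   Global
0A-ETH-3  Bridge-0-1  Enabled    Up    Up          9600 Global   Global
0A-ETH-4  Bridge-0-1  Enabled    Up    Up          9600 Global   Global
"""


def bridge_guids(text):
    """Map bridge port name (third column) to PortGUID (fifth column), lowercased."""
    guids = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) >= 5 and f[0].startswith("Bridge-") and f[4].lower().startswith("0x"):
            guids[f[2]] = f[4].lower()
    return guids


def connector_bridges(text):
    """Map connector port name (first column) to bridge port name (second column)."""
    mapping = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) >= 2 and re.fullmatch(r"\d+[A-Z]-ETH-\d+", f[0]):
            mapping[f[0]] = f[1]
    return mapping


def guid_for_connector(text, port):
    """Return the PortGUID of the bridge port behind the given connector port."""
    return bridge_guids(text)[connector_bridges(text)[port]]


if __name__ == "__main__":
    for port in ("0A-ETH-3", "0A-ETH-4"):
        print(port, guid_for_connector(SAMPLE, port))
```

GUIDs are lowercased so that they can be compared with partition table entries regardless of case.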

Follow the same steps on the other GW switch in the rack (e.g. gw02) if WAIT-VHUB is also seen in the 'showvnics' output on gw02.

Confirm whether the 0A-ETH-3 and 0A-ETH-4 port GUIDs are missing from the partition table:

gw01# smpartition list active
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 76
Default=0x7fff, ipoib :
ALL_CAS=both,
ALL_SWITCHES=full,
SELF=full;
SUN_DCS=0x0001, ipoib :
ALL_SWITCHES=full;
  = 0x8006,ipoib:   <<< Confirmed that the 0A-ETH-3 and 0A-ETH-4 port GUIDs are missing from partition 0x8006
0x0010e000013fc7ec=full,
0x0010e000013fc7ed=full,
0x0010e000013fc8ef=full;
gw01#
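
Checking membership by eye is error-prone when a partition has many ports. The Python sketch below parses a saved copy of the 'smpartition list active' output and reports which GUIDs are absent from a given partition. It assumes the stanza layout shown in this document (a head of the form name=pkey,flags followed by ":" and comma-separated port=membership entries, each stanza ending in ";"), with prompt lines and "<<<" annotations removed. The function names are illustrative.

```python
# Sketch: report which GUIDs are missing from a partition, given saved
# 'smpartition list active' output. Assumes the stanza layout shown in
# this document; prompt lines and annotations must be removed first.

SAMPLE = """\
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 76
Default=0x7fff, ipoib :
ALL_CAS=both,
ALL_SWITCHES=full,
SELF=full;
SUN_DCS=0x0001, ipoib :
ALL_SWITCHES=full;
  = 0x8006,ipoib:
0x0010e000013fc7ec=full,
0x0010e000013fc7ed=full,
0x0010e000013fc8ef=full;
"""


def parse_partitions(text):
    """Return {pkey (int): {port (lowercased): membership}}."""
    # Drop comment lines, then treat ';' as the stanza terminator.
    body = "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("#")
    )
    partitions = {}
    for stanza in body.split(";"):
        if ":" not in stanza:
            continue
        head, members = stanza.split(":", 1)
        if "=" not in head:
            continue
        _name, rest = head.split("=", 1)
        pkey = int(rest.split(",")[0].strip(), 16)
        ports = {}
        for entry in members.split(","):
            entry = entry.strip()
            if "=" in entry:
                port, membership = entry.split("=", 1)
                ports[port.strip().lower()] = membership.strip()
        partitions[pkey] = ports
    return partitions


def missing_guids(text, pkey, guids):
    """Return the GUIDs from 'guids' that are not members of partition 'pkey'."""
    members = parse_partitions(text).get(pkey, {})
    return [g for g in guids if g.lower() not in members]


if __name__ == "__main__":
    print(missing_guids(SAMPLE, 0x8006, ["0x0010e0FF44dcc001"]))
```

A non-empty result for the bridge Port GUIDs corresponds to the "confirmed NOT in the partition table" case. An empty result corresponds to the case where the procedure stops.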

!!! If the 0A-ETH-3 and 0A-ETH-4 bridge Port GUIDs are in the partition table, STOP here and contact Oracle Support !!!

If the 0A-ETH-3 and 0A-ETH-4 bridge Port GUIDs are confirmed NOT in the partition table, continue...

Now do the following to fix WAIT-VHUB for vnics in the 0x8006 partition:

Verify that each GW switch shows both switches listed by 'smnodes':

gw01# smnodes list
<IP_address_gw01>
<IP_address_gw02>
gw01#

gw02# smnodes list
<IP_address_gw01>
<IP_address_gw02>
gw02#
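
The comparison above can also be done mechanically. The Python sketch below is a minimal illustration that assumes the 'smnodes list' output of each switch has been saved as text and that any line containing "#" is a prompt line. The IP addresses are hypothetical documentation addresses, not values from this case.

```python
# Sketch: compare saved 'smnodes list' output from two switches.
# The addresses below are hypothetical placeholders.

GW01 = """\
gw01# smnodes list
192.0.2.11
192.0.2.12
gw01#
"""

GW02 = """\
gw02# smnodes list
192.0.2.11
192.0.2.12
gw02#
"""


def smnodes_entries(text):
    """Return the set of node entries, skipping blank and prompt lines."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or "#" in line:
            continue
        entries.add(line)
    return entries


def smnodes_mismatch(output_a, output_b):
    """Return entries present on one switch but not on the other."""
    return smnodes_entries(output_a) ^ smnodes_entries(output_b)


if __name__ == "__main__":
    diff = smnodes_mismatch(GW01, GW02)
    print("consistent" if not diff else "mismatch: %s" % sorted(diff))
```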

Find out which switch is running the Subnet Manager (SM) Master by running the 'getmaster' command from any switch. In the following example, gw01 is the SM Master:

gw01# getmaster

Local SM enabled and running, state MASTER
20140828 10:11:28 Master SubnetManager on sm lid 11 sm guid 0x2128557ff2c0b0 : SUN IB QDR GW switch gw01 10.152.228.232
gw01#
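
If this check is scripted, the local SM state and the master's name can be pulled out of the 'getmaster' text. The Python sketch below assumes the two-line format shown in this document and that the switch name contains no spaces. The wording printed on a standby switch is not shown here, so the patterns may need adjusting. The function names are illustrative.

```python
# Sketch: extract the local SM state and the master switch from saved
# 'getmaster' output. Assumes the format shown in this document.
import re

SAMPLE = """\
Local SM enabled and running, state MASTER
20140828 10:11:28 Master SubnetManager on sm lid 11 sm guid 0x2128557ff2c0b0 : SUN IB QDR GW switch gw01 10.152.228.232
"""


def local_sm_state(text):
    """Return the state word after 'state' on the 'Local SM' line, or None."""
    match = re.search(r"Local SM .*?state\s+(\w+)", text)
    return match.group(1) if match else None


def master_switch(text):
    """Return (name, ip) taken from the end of the 'Master SubnetManager' line, or None."""
    match = re.search(
        r"Master SubnetManager.*\bswitch\s+(\S+)\s+(\S+)\s*$", text, re.MULTILINE
    )
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    print(local_sm_state(SAMPLE), master_switch(SAMPLE))
```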

Run all of the following commands on the SM Master (gw01 in this example):
  gw01# smpartition start
  gw01# smpartition add -pkey 0x8006 -port 0x0010e0FF44dcc001 -m full   (note: bridge Port GUID of gw01)
  gw01# smpartition add -pkey 0x8006 -port 0x0010e0FF385cc001 -m full   (note: bridge Port GUID of gw02)
  gw01# smpartition list modified  (check that the GUIDs are now under the 0x8006 partition before committing)
  gw01# smpartition commit     <----- Commits the changes and propagates them to the standby GW switch (gw02 in this example).

NOTE: see also:  smpartition add outputs "Wrong value for pkey. Legal value is 1 to 0x7fff" (Doc ID 1921832.1)
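
To reduce typing errors when several GUIDs must be added, the command sequence above can be generated rather than typed by hand. The Python sketch below only builds the command strings for review and does not run anything on the switch. It uses only the smpartition subcommands and options shown in this document. The function name and the 16-hex-digit GUID check are assumptions made for illustration.

```python
# Sketch: build the smpartition command sequence used in this document
# for a given pkey and list of bridge Port GUIDs. Nothing is executed;
# the commands are returned as strings for review before being run on
# the SM Master.
import re


def smpartition_plan(pkey, guids):
    """Return the ordered list of commands: start, one add per GUID, list, commit."""
    commands = ["smpartition start"]
    for guid in guids:
        # Assumption: port GUIDs are written as 0x followed by 16 hex digits.
        if not re.fullmatch(r"0x[0-9a-fA-F]{16}", guid):
            raise ValueError("unexpected GUID format: %r" % guid)
        commands.append("smpartition add -pkey %s -port %s -m full" % (pkey, guid))
    commands.append("smpartition list modified")
    commands.append("smpartition commit")
    return commands


if __name__ == "__main__":
    for command in smpartition_plan(
        "0x8006", ["0x0010e0FF44dcc001", "0x0010e0FF385cc001"]
    ):
        print(command)
```

Review the generated 'smpartition list modified' output before running the final commit, as in the manual procedure.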

Verify the bridge Port GUIDs have been added to the active table:

gw01# smpartition list active
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 76
Default=0x7fff, ipoib :
ALL_CAS=both,
ALL_SWITCHES=full,
SELF=full;
SUN_DCS=0x0001, ipoib :
ALL_SWITCHES=full;
  = 0x8006,ipoib:
0x0010e000013fc7ec=full,
0x0010e000013fc7ed=full,
0x0010e000013fc8ef=full,
0x0010e0FF44dcc001=full,   <<< Confirmed that the gw01 0A-ETH-3 and 0A-ETH-4 port GUID is now in the active partition table
0x0010e0FF385cc001=full;   <<< Confirmed that the gw02 0A-ETH-3 and 0A-ETH-4 port GUID is now in the active partition table
gw01#

Confirm that the vnics are now in the UP state:

gw01# showvnics

ID  STATE      FLG  IOA_GUID          NODE                     IID   MAC                VLN  PKEY    GW
--  ---------  ---  ----------------  -----------------------  ----  -----------------  ---  ------  --------
19  UP         N    7B2C9A60FF9DD4A2  cn08 EL-C 192.168.10.10  0000  00:14:4F:F9:FC:22  301  0x8006  0A-ETH-3
23  UP         N    97F8375BFFCE65AA  cn06 EL-C 192.168.10.11  0000  00:14:4F:F9:FC:4F  301  0x8006  0A-ETH-3
23  UP         N    97F8375BFFCE65AA  cn06 EL-C 192.168.10.12  0000  00:14:4F:F9:FC:4F  301  0x8006  0A-ETH-4

 


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.