Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1399265.1
Update Date:2018-01-03
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1399265.1 :   Sun Network QDR Infiniband Gateway Switch (NM2-GW) Product Page  


Related Items
  • Sun Network QDR InfiniBand Gateway Switch
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband
  • _Old GCS Categories>Sun Microsystems>Operating Systems>Solaris Network




In this Document
Purpose
Details
 Product Support Team
 Alerts
 Description
 Versions
 Hardware Versions
 Software Versions
 Compatibility/Patches
 Configuration
 FAQ
 Information Gathering
  Example output of information gathering commands
 Installation
 De-installation
 Troubleshooting
  Internal Port Mapping
  Physical Connector Numbering (on front panel)
 Performance
 Lab
 Contacts
 Proactive
 Further Reading
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Solaris and Network Domain (SaND) Product Page, internal content

Applies to:

Sun Network QDR InfiniBand Gateway Switch - Version Not Applicable and later
Information in this document applies to any platform.

Purpose

This document is the product page for the Sun Network QDR InfiniBand Gateway Switch (NM2-GW).

Details

Product Support Team


PLA: SN-SND: Sun Network Infiniband

 

Alerts


 

 

Description


The Sun Network QDR Infiniband Gateway Switch (also known as SUN IB QDR GW switch, DCS GW, or NM2-GW) comprises an Infiniband switch and an Ethernet gateway, as used in the Exalogic system.

Infiniband QDR Switch

It has 32 IB ports and 8 1G/10G Ethernet ports housed in a 19" 1RU chassis similar to the Sun Datacenter Switch 36.


The connector panel has 32 QSFP connectors for the IB ports, and 2 QSFP connectors for the 8 Ethernet ports (4 ports per connector via splitter cable QSFP -> 4 x SFP+).

There are two Mellanox BridgeX ASICs in the NM2-GW switch that provide the IB to Ethernet functionality.

Each BridgeX ASIC has two internal IB connections to the Mellanox I4 Infiniband switch ASIC and presents four Ethernet ports to the external Ethernet network via a single QSFP port.

Each node in the IB fabric that requires an EoIB interface is assigned a vNIC on one of the BridgeX Ethernet ports; the vNIC maps to a virtual Ethernet interface on the node.



Each BridgeX ASIC is split into two identical slices, Slice 0 and Slice 1. Each slice has an internal and an external side: the internal side has an IB port linking it to the I4 IB switch, and the external side provides ports to an Ethernet network via the *A-ETH connectors.

The IB connections to the 36-port I4 switch ASIC are:

Internal I4 Switch port 1 is connected to Bridge-1 port Bridge-1-2
Internal I4 Switch port 2 is connected to Bridge-1 port Bridge-1-1
Internal I4 Switch port 3 is connected to Bridge-0 port Bridge-0-2
Internal I4 Switch port 4 is connected to Bridge-0 port Bridge-0-1



The Ethernet port mapping is :

Connector 0A-ETH maps to:

 0A-ETH-1 Bridge-0 port Bridge-0-2

 0A-ETH-2 Bridge-0 port Bridge-0-2

 0A-ETH-3 Bridge-0 port Bridge-0-1

 0A-ETH-4 Bridge-0 port Bridge-0-1



Connector 1A-ETH maps to:

 1A-ETH-1 Bridge-1 port Bridge-1-2

 1A-ETH-2 Bridge-1 port Bridge-1-2

 1A-ETH-3 Bridge-1 port Bridge-1-1

 1A-ETH-4 Bridge-1 port Bridge-1-1
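
The internal wiring listed above can be captured as a pair of lookup tables, which may help when correlating an external Ethernet port with the I4 switch port that carries its traffic. The Python sketch below is illustrative only: the dictionary and function names are hypothetical (they are not part of the switch firmware), and the data is simply transcribed from the two mappings above.

```python
# Illustrative lookup tables transcribed from the mappings listed above.
# All names here are hypothetical; nothing in this block runs on the switch.

# BridgeX internal IB port -> internal I4 switch port
BRIDGE_TO_I4_PORT = {
    "Bridge-1-2": 1,
    "Bridge-1-1": 2,
    "Bridge-0-2": 3,
    "Bridge-0-1": 4,
}

# External Ethernet port -> BridgeX internal IB port
ETH_TO_BRIDGE = {
    "0A-ETH-1": "Bridge-0-2",
    "0A-ETH-2": "Bridge-0-2",
    "0A-ETH-3": "Bridge-0-1",
    "0A-ETH-4": "Bridge-0-1",
    "1A-ETH-1": "Bridge-1-2",
    "1A-ETH-2": "Bridge-1-2",
    "1A-ETH-3": "Bridge-1-1",
    "1A-ETH-4": "Bridge-1-1",
}


def i4_port_for_eth(eth_port):
    """Return the internal I4 switch port behind an external Ethernet port."""
    return BRIDGE_TO_I4_PORT[ETH_TO_BRIDGE[eth_port]]


if __name__ == "__main__":
    for eth in sorted(ETH_TO_BRIDGE):
        print(eth, "->", ETH_TO_BRIDGE[eth], "-> I4 port", i4_port_for_eth(eth))
```

Keeping the two hops in separate tables mirrors the hardware layout: two Ethernet ports share each BridgeX IB port, and each BridgeX IB port has its own I4 switch port.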



 

 

Versions



Firmware
To display the firmware version :

# version
SUN DCS gw version: 2.1.6-2           <<<<<<<
Build time: Dec 8 2014 10:48:59
FPGA version: 0x33
SP board info:
Manufacturing Date: 2010.01.22
Serial Number: "NCD4J0150"
Hardware Revision: 0x0006
Firmware Revision: 0x0102
BIOS version: NOW1R112
BIOS date: 04/24/2009

 

Hardware Versions

 

541-4269 Data Center InfiniBand Switch Gateway
541-4188 Data Center InfiniBand Switch Gateway Subassembly
350-1566 Fan Module
300-2233 760W Power Supply
371-2210 CR2032 Battery


System Handbook

Software Versions

The DCS GW has QSFP ports so the cables to connect Infiniband QDR HCAs are :

Part#     Option#     Connector    Length    Type
530-4402  X2886-1M    QSFP - QSFP  1 meter   4X - 4X Infiniband Cable
530-4403  X2886-2M    QSFP - QSFP  2 meter   4X - 4X Infiniband Cable
530-4404  X2886-3M    QSFP - QSFP  3 meter   4X - 4X Infiniband Cable
530-4415  X2886-5M    QSFP - QSFP  5 meter   4X - 4X Infiniband Cable
530-4444  X2121A-1M   QSFP - QSFP  1 meter   QSFP - QSFP Cable (1)
530-4567  X2121A-2M   QSFP - QSFP  2 meter   QSFP - QSFP Cable (1)
530-4445  X2121A-3M   QSFP - QSFP  3 meter   QSFP - QSFP Cable (1)
530-4446  X2121A-5M   QSFP - QSFP  5 meter   QSFP - QSFP Cable (1)
530-4448  X2121A-10M  QSFP - QSFP  10 meter  QSFP - QSFP Cable (1)

(1) X2121A can be used for both Infiniband and 10GbE (X2886 are no longer orderable)

System Handbook

For the Ethernet BridgeX Ports

Transceiver Type  Connector  Remote Transceiver            Part Number
QSFP (4 x 10GbE)  MTP MMF    QSFP MTP or 4 SFP+ 10Gbps SR  X2124A

System Handbook

Cable Type                  Length    Part Number
QSFP MTP to 4 x LC Optical  10 Meter  X2127A-10M
QSFP MTP to 4 x LC Optical  20 Meter  X2127A-20M

System Handbook

 

 

Compatibility/Patches


FW updates available via MOS

Date      Version  Patch Number
FCS       1.0.1-1
Nov 2010  1.1.2-2  11732400
Jun 2011  1.3.2-1  12353972

 

 

Configuration


 

 

FAQ


 

 

Information Gathering


 

On the SUN IB QDR GW switch the following general log files are available:

 

  • /var/log/messages (there may be messages.1 also)
  • /var/log/opensm.log
  • /var/log/opensm-subnet.lst

(The opensm files may not contain much information if the SUN IB QDR GW switch is not the Subnet Manager master.)

If an OpenSM issue is suspected, or its log file is needed for further investigation, find the SM master (getmaster) and retrieve its /var/log/opensm.log. Retrieve the messages files as well if relevant, and consider asking for a tarball of /var/log.

The following utilities can be used to collect data from the SUN IB QDR GW switch (the full path is shown, but normally just the utility name can be typed):

 

  • /usr/local/bin/version (used to be nm2version)
  • /usr/local/bin/env_test
  • /usr/local/bin/listlinkup
  • /usr/local/bin/showvnics
  • /usr/local/bin/showioadapters

 

To collect generic Infiniband fabric data, refer to "Gathering Troubleshooting Information for the Infiniband Network in Engineered Systems (Doc ID 1538237.1)".

Example output of information gathering commands

listlinkup shows every connector on the NM2-GW and whether a cable is present. For cabled connectors it also shows the mapping to the Infiniband I4 switch port or the BridgeX port, and the logical state of the link.

 

# listlinkup
Connector  0A Not present
Connector  1A Not present
Connector  2A Not present
Connector  3A Not present
Connector  4A Present <-> Switch Port 28 up (Enabled)
Connector  5A Present <-> Switch Port 30 up (Enabled)
Connector  6A Present <-> Switch Port 35 up (Enabled)
Connector  7A Present <-> Switch Port 33 up (Enabled)
Connector  8A Present <-> Switch Port 31 up (Enabled)
Connector  9A Present <-> Switch Port 14 up (Enabled)
Connector 10A Present <-> Switch Port 16 up (Enabled)
Connector 11A Present <-> Switch Port 12 up (Enabled)
Connector 12A Present <-> Switch Port 18 up (Enabled)
Connector 13A Present <-> Switch Port 9  up (Enabled)
Connector 14A Present <-> Switch Port 7  up (Enabled)
Connector 15A Present <-> Switch Port 5  up (Enabled)
Connector 0A-ETH Present
  Bridge-0 Port 0A-ETH-1 (Bridge-0-2) up   (Enabled)
  Bridge-0 Port 0A-ETH-2 (Bridge-0-2) down (Enabled)
  Bridge-0 Port 0A-ETH-3 (Bridge-0-1) down (Enabled)
  Bridge-0 Port 0A-ETH-4 (Bridge-0-1) down (Enabled)
Connector 1A-ETH Present
  Bridge-1 Port 1A-ETH-1 (Bridge-1-2) down (Enabled)
  Bridge-1 Port 1A-ETH-2 (Bridge-1-2) down (Enabled)
  Bridge-1 Port 1A-ETH-3 (Bridge-1-1) down (Enabled)
  Bridge-1 Port 1A-ETH-4 (Bridge-1-1) down (Enabled)
Connector  0B Not present
Connector  1B Not present
Connector  2B Not present
Connector  3B Not present
Connector  4B Present <-> Switch Port 27 up (Enabled)
Connector  5B Present <-> Switch Port 29 up (Enabled)
Connector  6B Present <-> Switch Port 36 up (Enabled)
Connector  7B Present <-> Switch Port 34 up (Enabled)
Connector  8B Present <-> Switch Port 32 up (Enabled)
Connector  9B Present <-> Switch Port 13 up (Enabled)
Connector 10B Present <-> Switch Port 15 up (Enabled)
Connector 11B Present <-> Switch Port 17 up (Enabled)
Connector 12B Present <-> Switch Port 11 up (Enabled)
Connector 13B Present <-> Switch Port 10 up (Enabled)
Connector 14B Present <-> Switch Port 8  up (Enabled)
Connector 15B Present <-> Switch Port 6  up (Enabled)
Connector 0B-FC Not present
Connector 1B-FC Not present
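
When the listlinkup output has been saved to a file, it can be screened for cabled links that are not up. The following Python sketch is one possible way to do that off-switch. The regular expressions are written against the sample output above, so the line layout is an assumption that may not hold on other firmware releases, and the function names are hypothetical.

```python
import re

# Patterns written against the sample listlinkup output shown above; other
# firmware releases may format these lines differently (assumption).
IB_LINE = re.compile(
    r"^Connector\s+(\S+)\s+Present <-> Switch Port\s+(\d+)\s+(\w+)\s+\((\w+)\)"
)
ETH_LINE = re.compile(
    r"^\s*Bridge-\d+ Port\s+(\S+)\s+\((\S+)\)\s+(\w+)\s+\((\w+)\)"
)


def parse_listlinkup(text):
    """Return one dict per cabled IB connector or Ethernet bridge port."""
    links = []
    for line in text.splitlines():
        m = IB_LINE.match(line)
        if m:
            links.append({"name": m.group(1),
                          "maps_to": "Switch Port " + m.group(2),
                          "state": m.group(3), "admin": m.group(4)})
            continue
        m = ETH_LINE.match(line)
        if m:
            links.append({"name": m.group(1), "maps_to": m.group(2),
                          "state": m.group(3), "admin": m.group(4)})
    return links


def links_not_up(links):
    """Return the names of links whose logical state is anything but 'up'."""
    return [link["name"] for link in links if link["state"] != "up"]


# A few lines copied from the sample output above.
SAMPLE = """\
Connector  3A Not present
Connector  4A Present <-> Switch Port 28 up (Enabled)
Connector 13A Present <-> Switch Port 9  up (Enabled)
Connector 0A-ETH Present
  Bridge-0 Port 0A-ETH-1 (Bridge-0-2) up   (Enabled)
  Bridge-0 Port 0A-ETH-2 (Bridge-0-2) down (Enabled)
"""

if __name__ == "__main__":
    print(links_not_up(parse_listlinkup(SAMPLE)))
```

Connectors reported as "Not present" are deliberately skipped, since an empty connector is not a fault in itself.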

 

showvnics displays the GUIDs of the servers that will have a virtual EoIB interface, and the MAC address that will be used to reach each server via the Gateway 10GbE ports from the external Ethernet network.

 

NOTE: If a system running EoIB has its IB HCA changed, be sure to update the VNICs on the IB GW switches as detailed in:

How to Replace a Failed InfiniBand (HCA) Card on a Exalogic Compute Node (Doc ID 1390273.1)

 

# showvnics

ID  STATE     FLG IOA_GUID                NODE                        IID  MAC               VLN PKEY   GW
--- --------  --- ----------------------- ----------                  ---- ----------------- --- ----   --------
 15 UP          N 00:21:28:00:01:A1:0C:02 el01cn09 EL-C 192.168.10.9  0000 A0:0C:02:10:01:09 NO  ffff   0A-ETH-1
 14 UP          N 00:21:28:00:01:A1:0C:05 el01cn06 EL-C 192.168.10.6  0000 A0:0C:05:10:01:06 NO  ffff   0A-ETH-1
  2 UP          N 00:21:28:00:01:A0:F6:0D el01cn03 EL-C 192.168.10.3  0000 A0:F6:0D:10:01:03 NO  ffff   0A-ETH-1
 10 UP          N 00:21:28:00:01:A0:FC:16 el01cn13 EL-C 192.168.10.13 0000 A0:FC:16:10:01:13 NO  ffff   0A-ETH-1
  6 UP          N 00:21:28:00:01:A0:FB:19 el01cn08 EL-C 192.168.10.8  0000 A0:FB:19:10:01:08 NO  ffff   0A-ETH-1
 13 UP          N 00:21:28:00:01:A0:F9:2E el01cn16 EL-C 192.168.10.18 0000 A0:F9:2E:10:01:18 NO  ffff   0A-ETH-1
  1 UP          N 00:21:28:00:01:A0:F9:35 el01cn02 EL-C 192.168.10.2  0000 A0:F9:35:10:01:02 NO  ffff   0A-ETH-1
  9 UP          N 00:21:28:00:01:A0:F7:36 el01cn12 EL-C 192.168.10.12 0000 A0:F7:36:10:01:12 NO  ffff   0A-ETH-1
  7 UP          N 00:21:28:00:01:A0:FB:46 el01cn10 EL-C 192.168.10.10 0000 A0:FB:46:10:01:10 NO  ffff   0A-ETH-1
  8 UP          N 00:21:28:00:01:A0:F9:4E el01cn11 EL-C 192.168.10.11 0000 A0:F9:4E:10:01:11 NO  ffff   0A-ETH-1
 12 UP          N 00:21:28:00:01:A0:FB:5A el01cn15 EL-C 192.168.10.17 0000 A0:FB:5A:10:01:17 NO  ffff   0A-ETH-1
  3 UP          N 00:21:28:00:01:A0:F9:85 el01cn04 EL-C 192.168.10.4  0000 A0:F9:85:10:01:04 NO  ffff   0A-ETH-1
  4 UP          N 00:21:28:00:01:A0:F9:89 el01cn05 EL-C 192.168.10.5  0000 A0:F9:89:10:01:05 NO  ffff   0A-ETH-1
  5 UP          N 00:21:28:00:01:A0:FB:8D el01cn07 EL-C 192.168.10.7  0000 A0:FB:8D:10:01:07 NO  ffff   0A-ETH-1
  0 UP          N 00:21:28:00:01:A0:F6:B5 el01cn01 EL-C 192.168.10.1  0000 A0:F6:B5:10:01:01 NO  ffff   0A-ETH-1
 11 UP          N 00:21:28:00:01:A0:F6:BE el01cn14 EL-C 192.168.10.14 0000 A0:F6:BE:10:01:14 NO  ffff   0A-ETH-1
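
Saved showvnics output can be summarised to see how vNICs are spread across the gateway Ethernet ports. The Python sketch below is an illustration under stated assumptions: it relies on the column layout of the sample above, it assumes the FLG column is never empty, and it treats everything between the GUID and the last five columns as the NODE description. The function names are hypothetical.

```python
from collections import Counter


def parse_showvnics(text):
    """Parse showvnics output into a list of dicts.

    Assumes the column layout of the sample above: four leading columns
    (ID, STATE, FLG, IOA_GUID), five trailing columns (IID, MAC, VLN, PKEY,
    GW), and a free-form NODE description in between.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with a numeric vNIC ID; skip header, rule and blanks.
        if len(tokens) < 10 or not tokens[0].isdigit():
            continue
        vnic_id, state, flag, guid = tokens[:4]
        iid, mac, vlan, pkey, gateway = tokens[-5:]
        rows.append({
            "id": int(vnic_id), "state": state, "flag": flag, "guid": guid,
            "node": " ".join(tokens[4:-5]),
            "iid": iid, "mac": mac, "vlan": vlan, "pkey": pkey, "gw": gateway,
        })
    return rows


def vnics_per_gateway(rows):
    """Count vNICs in state UP for each gateway Ethernet port."""
    return Counter(row["gw"] for row in rows if row["state"] == "UP")


# Header and two rows copied from the sample output above.
SAMPLE = """\
ID  STATE     FLG IOA_GUID                NODE                        IID  MAC               VLN PKEY   GW
--- --------  --- ----------------------- ----------                  ---- ----------------- --- ----   --------
 15 UP          N 00:21:28:00:01:A1:0C:02 el01cn09 EL-C 192.168.10.9  0000 A0:0C:02:10:01:09 NO  ffff   0A-ETH-1
 14 UP          N 00:21:28:00:01:A1:0C:05 el01cn06 EL-C 192.168.10.6  0000 A0:0C:05:10:01:06 NO  ffff   0A-ETH-1
"""

if __name__ == "__main__":
    print(vnics_per_gateway(parse_showvnics(SAMPLE)))
```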

 

showioadapters displays the GUIDs of the servers that are configured for access via the Gateway 10GbE ports.

 

# showioadapters
IOA_GUID                NODE                      LID  #vADPT FLAGS   GW
----------------------- ----------                ---- ------ -----   --------
00:21:28:00:01:A1:0C:01 el01cn09 EL-C 192.168.10.9  24    0     HD     0A-ETH-1
00:21:28:00:01:A1:0C:02 el01cn09 EL-C 192.168.10.9  25    1     ND     0A-ETH-1
00:21:28:00:01:A1:0C:05 el01cn06 EL-C 192.168.10.6  10    1     ND     0A-ETH-1
00:21:28:00:01:A1:0C:06 el01cn06 EL-C 192.168.10.6  11    0     HD     0A-ETH-1
00:21:28:00:01:A0:F6:0D el01cn03 EL-C 192.168.10.3  22    1     ND     0A-ETH-1
00:21:28:00:01:A0:F6:0E el01cn03 EL-C 192.168.10.3  23    0     HD     0A-ETH-1
00:21:28:00:01:A0:FC:15 el01cn13 EL-C 192.168.10.13 1c    0     HD     0A-ETH-1
00:21:28:00:01:A0:FC:16 el01cn13 EL-C 192.168.10.13 1d    1     ND     0A-ETH-1
00:21:28:00:01:A0:FB:19 el01cn08 EL-C 192.168.10.8  12    1     ND     0A-ETH-1
00:21:28:00:01:A0:FB:1A el01cn08 EL-C 192.168.10.8  13    0     HD     0A-ETH-1
00:21:28:00:01:A0:F9:2D el01cn16 EL-C 192.168.10.18 2c    0     HD     0A-ETH-1
00:21:28:00:01:A0:F9:2E el01cn16 EL-C 192.168.10.18 2d    1     ND     0A-ETH-1
00:21:28:00:01:A0:F9:35 el01cn02 EL-C 192.168.10.2  18    1     ND     0A-ETH-1
00:21:28:00:01:A0:F9:36 el01cn02 EL-C 192.168.10.2  19    0     HD     0A-ETH-1
00:21:28:00:01:A0:F7:35 el01cn12 EL-C 192.168.10.12 1e    0     HD     0A-ETH-1
00:21:28:00:01:A0:F7:36 el01cn12 EL-C 192.168.10.12 1f    1     ND     0A-ETH-1
00:21:28:00:01:A0:FB:45 el01cn10 EL-C 192.168.10.10 14    0     HD     0A-ETH-1
00:21:28:00:01:A0:FB:46 el01cn10 EL-C 192.168.10.10 15    1     ND     0A-ETH-1
00:21:28:00:01:A0:F9:4D el01cn11 EL-C 192.168.10.11 26    0     HD     0A-ETH-1
00:21:28:00:01:A0:F9:4E el01cn11 EL-C 192.168.10.11 27    1     ND     0A-ETH-1
00:21:28:00:01:A0:FB:59 el01cn15 EL-C 192.168.10.17 2a    0     HD     0A-ETH-1
00:21:28:00:01:A0:FB:5A el01cn15 EL-C 192.168.10.17 2b    1     ND     0A-ETH-1
00:21:28:00:01:A0:F9:85 el01cn04 EL-C 192.168.10.4  16    1     ND     0A-ETH-1
00:21:28:00:01:A0:F9:86 el01cn04 EL-C 192.168.10.4  17    0     HD     0A-ETH-1
00:21:28:00:01:A0:F9:89 el01cn05 EL-C 192.168.10.5  20    1     ND     0A-ETH-1
00:21:28:00:01:A0:F9:8A el01cn05 EL-C 192.168.10.5  21    0     HD     0A-ETH-1
00:21:28:00:01:A0:FB:8D el01cn07 EL-C 192.168.10.7  30    1     ND     0A-ETH-1
00:21:28:00:01:A0:FB:8E el01cn07 EL-C 192.168.10.7  31    0     HD     0A-ETH-1
00:21:28:00:01:A0:F6:B5 el01cn01 EL-C 192.168.10.1  1a    1     ND     0A-ETH-1
00:21:28:00:01:A0:F6:B6 el01cn01 EL-C 192.168.10.1  1b    0     HD     0A-ETH-1
00:21:28:00:01:A0:F6:BD el01cn14 EL-C 192.168.10.14 2e    0     HD     0A-ETH-1
00:21:28:00:01:A0:F6:BE el01cn14 EL-C 192.168.10.14 2f    1     ND     0A-ETH-1

getportstatus can be used to display the port state and speed of an ethernet port.

 

# getportstatus 0A-ETH-1
Port status for connector 0A-ETH-1 Bridge-0 Port Bridge-0-2
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global

or for all 4 ports on a connector :

 

# getportstatus 0A-ETH

Port status for connector 0A-ETH-1 Bridge-0 Port Bridge-0-2
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global
Port status for connector 0A-ETH-2 Bridge-0 Port Bridge-0-2
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global
Port status for connector 0A-ETH-3 Bridge-0 Port Bridge-0-1
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global
Port status for connector 0A-ETH-4 Bridge-0 Port Bridge-0-1
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global

or for the 1A-ETH connector :

 

# getportstatus 1A-ETH

Port status for connector 1A-ETH-1 Bridge-1 Port Bridge-1-2
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global
Port status for connector 1A-ETH-2 Bridge-1 Port Bridge-1-2
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global
Port status for connector 1A-ETH-3 Bridge-1 Port Bridge-1-1
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global
Port status for connector 1A-ETH-4 Bridge-1 Port Bridge-1-1
Adminstate.......................Enabled
State............................Reset
Link state.......................Down
Protocol.........................Ethernet
Link Mode........................XFI
Speed............................10Gb/s
MTU..............................9600
Tx pause.........................Global
Rx pause.........................Global

getportstatus can also be used for IB ports on the switch

 

# getportstatus 6A
Port status for connector 6A Switch Port 35
Adminstate:......................Enabled
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkSpeedActive:.................10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps

getportcounters will display the data and error counters for a port

 

# getportcounters 0A-ETH-1
ETH Port 0A-ETH-1
----------------------------------
RX bytes:........................0x6a495
RX packets:......................0xd680
RX Jumbo packets:................0x0
RX unicast packets:..............0x37f9
RX multicast packets:............0xc593
RX broadcast packets:............0x196a
RX no buffer:....................0x0
RX CRC:..........................0x0
RX runt:.........................0x0
RX errors:.......................0x0
TX bytes:........................0x5f398
TX packets:......................0xd4f3
TX Jumbo packets:................0x0
TX unicast packets:..............0x38da
TX multicast packets:............0xc47a
TX broadcast packets:............0x18b3
TX errors:.......................0x0

and for an IB port

 

# getportcounters  4A
# Port counters: Lid 15 port 28
PortSelect:......................28
CounterSelect:...................0x1b01
SymbolErrors:....................0
LinkRecovers:....................0
LinkDowned:......................0
RcvErrors:.......................0
RcvRemotePhysErrors:.............0
RcvSwRelayErrors:................0
XmtDiscards:.....................0
XmtConstraintErrors:.............0
RcvConstraintErrors:.............0
LinkIntegrityErrors:.............0
ExcBufOverrunErrors:.............0
VL15Dropped:.....................0
XmtData:.........................661422672
RcvData:.........................661477000
XmtPkts:.........................9186426
RcvPkts:.........................9186797
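
The two samples above report values differently: the Ethernet counters are printed in hexadecimal, while the IB counters are mostly decimal. The following Python sketch shows one way to normalise saved getportcounters output into integers. It assumes the "Name:....value" line layout seen in the samples; the function name is hypothetical.

```python
import re

# Matches lines such as "RX bytes:........0x6a495" or "SymbolErrors:.....0".
# The layout is taken from the samples above and is an assumption.
COUNTER_LINE = re.compile(r"^([A-Za-z][\w ]*):\.+(0[xX][0-9a-fA-F]+|\d+)\s*$")


def parse_port_counters(text):
    """Return {counter name: integer value} from getportcounters output."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER_LINE.match(line.strip())
        if not m:
            continue  # banner, rule, or comment line
        name, raw = m.group(1), m.group(2)
        base = 16 if raw.lower().startswith("0x") else 10
        counters[name] = int(raw, base)
    return counters


# Lines copied from the Ethernet and IB samples above.
ETH_SAMPLE = """\
ETH Port 0A-ETH-1
----------------------------------
RX bytes:........................0x6a495
RX packets:......................0xd680
RX errors:.......................0x0
"""

IB_SAMPLE = """\
# Port counters: Lid 15 port 28
PortSelect:......................28
CounterSelect:...................0x1b01
SymbolErrors:....................0
XmtData:.........................661422672
"""

if __name__ == "__main__":
    print(parse_port_counters(ETH_SAMPLE))
    print(parse_port_counters(IB_SAMPLE))
```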

 

 

 

Installation


General rules when working with Infiniband cables

During installation it is very important to ensure that any copper core InfiniBand cable is not subjected to a bend tighter than a 5 inch (127 mm) radius.

Do not allow any optical InfiniBand cable to bend tighter than a 3.4 inch (85 mm) radius. Tight bends can damage the cable internally.

Do not use zip ties to bundle or support InfiniBand cables. The sharp edges of the ties can damage the cables internally. Use soft hook and loop straps to keep cables organized.

Do not allow any InfiniBand cable to experience extreme tension. Do not pull on an InfiniBand cable or allow it to drag. Pulling on an InfiniBand cable can damage the cable internally.

Do not twist an InfiniBand cable more than 1 revolution for its entire length. Twisting an InfiniBand cable can damage the cable internally.

Do not route InfiniBand cables where they might be stepped upon or experience rolling loads. Such a crushing effect can damage the cable internally.

 

De-installation


 

 

Troubleshooting


Many "infiniband" issues reported turn out to be problems with the cables.

Infiniband cables are very easily damaged either at installation time or when other maintenance work has been carried out adjacent to them.

It is very important to ensure that any copper core InfiniBand cable is not subjected to a bend tighter than a 5 inch (127 mm) radius.

The QSFP connectors are also very easily damaged. Just dropping a cable-end on the floor can damage the connector such that it can cause damage to the port when it is plugged in.

Also a connector that isn't fully engaged can cause errors to occur when a link is in use.

Use the getportcounters utility to display the error counters for the port. These are accumulated counts, so take repeated snapshots to see whether they are still increasing or are just historical.

The most common errors caused by poor connections or damaged cables are symbol errors, which in severe cases can also be accompanied by link recovery errors.

If these errors are being reported the first thing to check is that the cable associated with the port reporting the errors is securely plugged in at both ends. Unplug the cable and visually check the connector ends for any distortion or damage before re-insertion, ensuring the connector is fully home.

If the errors are still being reported then try moving the cable to a free port and check if the errors move with the cable.

If possible try a replacement cable.

If the errors are confined to one or two links then it is more likely for the problem to be with the host system's HCA and/or the cable than it is to be an issue on the switch.
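
The snapshot comparison described above (checking whether accumulated error counters are still increasing) can be sketched in Python as follows. This is a minimal illustration: it assumes each snapshot has already been converted into a dictionary of integers, the default watch list uses counter names from the IB getportcounters output, the function name is hypothetical, and the numbers in the usage example are invented purely for demonstration.

```python
# Counter names taken from the IB getportcounters output; adjust as needed.
DEFAULT_WATCH = ("SymbolErrors", "LinkRecovers", "LinkDowned", "RcvErrors")


def increasing_counters(before, after, watch=DEFAULT_WATCH):
    """Return {counter: delta} for watched counters that grew between snapshots.

    'before' and 'after' are dicts of counter name -> integer, taken from the
    same port at two different times. Counters missing from either snapshot
    are ignored.
    """
    deltas = {}
    for name in watch:
        if name in before and name in after:
            delta = after[name] - before[name]
            if delta > 0:
                deltas[name] = delta
    return deltas


if __name__ == "__main__":
    # Invented values for demonstration only, not real measurements.
    earlier = {"SymbolErrors": 10, "LinkRecovers": 2, "LinkDowned": 0}
    later = {"SymbolErrors": 25, "LinkRecovers": 2, "LinkDowned": 0}
    print(increasing_counters(earlier, later))
```

An empty result suggests the non-zero counts are historical. A non-empty result points at a link that is still taking errors and is worth the cable checks described above. Counters reset with -R between snapshots would show up as negative deltas, which this sketch ignores.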

 

Check the internal temperatures of the switch components :

# showtemps

Back temperature 25
Front temperature 24
SP temperature 40
Switch temperature 51, maxtemperature 55
Bridge-0 temperature 39, maxtemperature 40
Bridge-1 temperature 43, maxtemperature 44
All temperatures OK

To check the overall health of the switch :

# showunhealthy
OK - No unhealthy sensors
#

Check the power supplies

# checkpower
PSU 0 present status: OK
PSU 1 present status: OK
#

Check the fans (only three are installed in the NM2-GW)

# getfanspeed
Fan 0 not present
Fan 1 rpm 12426
Fan 2 rpm 12426
Fan 3 rpm 12317
Fan 4 not present

Note: If there are fewer than two operational fans, the NM2-GW will shut down to prevent thermal overload.
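
The fan check can be automated against saved getfanspeed output. The Python sketch below counts the fans that report a speed and compares the count with the two-fan minimum from the note above. Treating any fan with a non-zero rpm as operational is an assumption of this sketch, as are the function names; the pattern also accepts the "running at rpm" wording used by env_test.

```python
import re

# Matches "Fan 1 rpm 12426" (getfanspeed) and
# "Fan 1 running at rpm 12535" (env_test), as seen in this document.
FAN_LINE = re.compile(r"^Fan\s+(\d+)\s+(?:running at\s+)?rpm\s+(\d+)")

# Minimum taken from the note above: below this the switch shuts down.
MIN_OPERATIONAL_FANS = 2


def operational_fans(text):
    """Return {fan index: rpm} for fans reporting a non-zero speed.

    Treating non-zero rpm as 'operational' is an assumption of this sketch.
    """
    fans = {}
    for line in text.splitlines():
        m = FAN_LINE.match(line.strip())
        if m and int(m.group(2)) > 0:
            fans[int(m.group(1))] = int(m.group(2))
    return fans


def below_fan_minimum(fans):
    """True if fewer than the minimum number of fans are operational."""
    return len(fans) < MIN_OPERATIONAL_FANS


# Copied from the getfanspeed sample above.
SAMPLE = """\
Fan 0 not present
Fan 1 rpm 12426
Fan 2 rpm 12426
Fan 3 rpm 12317
Fan 4 not present
"""

if __name__ == "__main__":
    fans = operational_fans(SAMPLE)
    print(fans, "below minimum:", below_fan_minimum(fans))
```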

Check the Infiniband ASIC

# checkboot 0
Switch OK
#

Or, to run all tests

# env_test

Environment test started:
Starting Environment Daemon test:
Environment daemon running
Environment Daemon test returned OK
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.37 V
Measured 12V = 11.97 V
Measured 5V = 5.04 V
Measured VBAT = 3.09 V
Measured 1.0V = 1.01 V
Measured I4 1.2V = 1.22 V
Measured 2.5V = 2.50 V
Measured V1P2 DIG = 1.18 V
Measured V1P2 ANG = 1.17 V
Measured 1.2V BridgeX = 1.22 V
Measured 1.8V = 1.78 V
Measured 1.2V Standby = 1.19 V
Voltage test returned OK
Starting PSU test:
PSU 0 present OK
PSU 1 present OK
PSU test returned OK
Starting Temperature test:
Back temperature 25
Front temperature 24
SP temperature 39
Switch temperature 51, maxtemperature 55
Bridge-0 temperature 39, maxtemperature 40
Bridge-1 temperature 43, maxtemperature 44
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 12535
Fan 2 running at rpm 12535
Fan 3 running at rpm 12317
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting Onboard ibdevice test:
Switch OK
Bridge-0 OK
Bridge-1 OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Environment test PASSED
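
For a quick triage of saved env_test output, the per-test results and the overall verdict can be extracted as shown in the Python sketch below. The patterns are based only on the passing run shown above. The wording the switch uses for a failing test is not shown in this document, so the sketch simply reports any result other than "OK" and makes no assumption about the exact failure text. The function names are hypothetical.

```python
import re

# "<name> test returned <result>", for example "Voltage test returned OK".
TEST_RESULT = re.compile(r"^(.+?) test returned (\S+)\s*$")
# Final verdict line, for example "Environment test PASSED".
OVERALL = re.compile(r"^Environment test (\w+)\s*$")


def summarize_env_test(text):
    """Return ({test name: result}, overall verdict or None)."""
    results = {}
    overall = None
    for line in text.splitlines():
        line = line.strip()
        m = TEST_RESULT.match(line)
        if m:
            results[m.group(1)] = m.group(2)
            continue
        m = OVERALL.match(line)
        if m:
            overall = m.group(1)
    return results, overall


def failed_tests(results):
    """Return the names of tests whose result is anything other than 'OK'."""
    return [name for name, result in results.items() if result != "OK"]


# Abbreviated from the sample run above.
SAMPLE = """\
Environment test started:
Starting Environment Daemon test:
Environment daemon running
Environment Daemon test returned OK
Starting PSU test:
PSU 0 present OK
PSU 1 present OK
PSU test returned OK
Starting FAN test:
FAN test returned OK
Environment test PASSED
"""

if __name__ == "__main__":
    results, overall = summarize_env_test(SAMPLE)
    print(overall, failed_tests(results))
```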

Other utilities available :

Display port counters

# getportcounters port or connector (-R can be added to reset the port's counters)

(where port can be 1-36 and connector can be 0A-15A, 0A-ETH, 1A-ETH, or 0B-15B)

Display link status

# listlinkup

Display a port status

# getportstatus port

(where port can be from 1 through 36)

or

# getportstatus connector

(where connector can be 0A-15A, 0A-ETH, 1A-ETH, 0B-15B)

 

 

Internal Port Mapping

 

Connector  0A  <-> Port 20
Connector  1A  <-> Port 22
Connector  2A  <-> Port 24
Connector  3A  <-> Port 26
Connector  4A  <-> Port 28
Connector  5A  <-> Port 30
Connector  6A  <-> Port 35
Connector  7A  <-> Port 33
Connector  8A  <-> Port 31
Connector  9A  <-> Port 14
Connector 10A  <-> Port 16
Connector 11A  <-> Port 12
Connector 12A  <-> Port 18
Connector 13A  <-> Port 09
Connector 14A  <-> Port 07
Connector 15A  <-> Port 05
Connector 0A-ETH
           Bridge 0A-ETH-1
           Bridge 0A-ETH-2
           Bridge 0A-ETH-3
           Bridge 0A-ETH-4
Connector 1A-ETH
           Bridge 1A-ETH-1
           Bridge 1A-ETH-2
           Bridge 1A-ETH-3
           Bridge 1A-ETH-4
Connector  0B  <-> Port 19
Connector  1B  <-> Port 21
Connector  2B  <-> Port 23
Connector  3B  <-> Port 25
Connector  4B  <-> Port 27
Connector  5B  <-> Port 29
Connector  6B  <-> Port 36
Connector  7B  <-> Port 34
Connector  8B  <-> Port 32
Connector  9B  <-> Port 13
Connector 10B  <-> Port 15
Connector 11B  <-> Port 17
Connector 12B  <-> Port 11 
Connector 13B  <-> Port 10 
Connector 14B  <-> Port 08 
Connector 15B  <-> Port 06
Connector 0B-FC  <-> (unused)
Connector 1B-FC  <-> (unused)
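
Tools such as getportcounters identify an IB link by switch port number (for example "port 28"), whereas cables are located by connector name. The Python sketch below transcribes the connector-to-port table above and adds a reverse lookup. The names are hypothetical and the data is copied from this section only.

```python
# Connector -> I4 switch port, transcribed from the table above.
CONNECTOR_TO_PORT = {
    "0A": 20, "1A": 22, "2A": 24, "3A": 26, "4A": 28, "5A": 30,
    "6A": 35, "7A": 33, "8A": 31, "9A": 14, "10A": 16, "11A": 12,
    "12A": 18, "13A": 9, "14A": 7, "15A": 5,
    "0B": 19, "1B": 21, "2B": 23, "3B": 25, "4B": 27, "5B": 29,
    "6B": 36, "7B": 34, "8B": 32, "9B": 13, "10B": 15, "11B": 17,
    "12B": 11, "13B": 10, "14B": 8, "15B": 6,
}

# Reverse table: I4 switch port -> connector.
PORT_TO_CONNECTOR = {port: conn for conn, port in CONNECTOR_TO_PORT.items()}


def connector_for_port(port):
    """Return the front-panel connector for an I4 switch port.

    Returns None for ports 1 to 4, which are wired internally to the
    BridgeX ASICs and have no IB connector of their own.
    """
    return PORT_TO_CONNECTOR.get(port)


if __name__ == "__main__":
    print(connector_for_port(28))
    print(connector_for_port(1))
```

The 32 external connectors account for switch ports 5 through 36; the remaining four ports of the 36-port I4 ASIC are the internal BridgeX links.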

 

Physical Connector Numbering (on front panel)

 

                                                                                   ETH 
  0A  1A  2A     3A  4A  5A     6A  7A  8A     9A 10A 11A     12A 13A 14A    15A  0A 1A
  -----------    -----------    -----------    -----------    -----------    -----------
 |   |   |   |  |   |   |   |  |   |   |   |  |   |   |   |  |   |   |   |  |   |   |   |
  -----------    -----------    -----------    -----------    -----------    -----------

  -----------    -----------    -----------    -----------    -----------    -----------
 |   |   |   |  |   |   |   |  |   |   |   |  |   |   |   |  |   |   |   |  |   |   |   |
  -----------    -----------    -----------    -----------    -----------    -----------
  0B  1B  2B     3B  4B  5B     6B  7B  8B     9B 10B 11B     12B 13B 14B    15B  0B 1B
                                                                                 Not Used

 

 

Performance


 

 

Lab


 

 

Contacts


 

 

Proactive


 

Further Reading

Sun Network QDR InfiniBand Gateway Switch Documentation Library

 

Sun Network QDR InfiniBand Gateway Switch Getting Started Guide                                              821-1192 (shipped with switch)

Sun Network QDR InfiniBand Gateway Switch Product Notes                                                      821-1190

Sun Network QDR InfiniBand Gateway Switch Installation Guide                                                 821-1186

Sun Network QDR InfiniBand Gateway Switch Administration Guide                                               821-1187

Sun Network QDR InfiniBand Gateway Switch Service Manual                                                     821-1188

Sun Network QDR InfiniBand Gateway Switch Command Reference                                                  821-1189

Sun Network QDR InfiniBand Gateway Switch Safety and Compliance Guide                                        821-1191

Oracle Integrated Lights Out Manager (ILOM) 3.0 Supplement for the Sun Network QDR Infiniband Gateway Switch 821-1545

Sun System Handbook

Exalogic Library

EDP Training

NM2-GW Nano Magnum 2 Gate Way Switch
    NM2-GW TOI: Configuring Ethernet
    NM2-GW TOI: Firmware Upgrade
    NM2-GW TOI: Hardware Architecture
    NM2-GW TOI: Software Architecture

 

Course Title:         Engineering TOI: Infiniband Overview and Driver Update for TSC Session 1
Course Description:
This TOI will provide an overview of the latest Infiniband technology and the latest driver stacks implemented in Solaris. This TOI will also discuss the new realigned Infiniband SR (Service Request) coverage between TSC networking and storage driver group.

Replay Duration: 94 minutes
Availability:         Employee
EDP Link: http://edp.oraclecorp.com/lms/faces/lms/edp/plancurrassignments.jspx?planid=178481
iLearning Link: http://oukc.oracle.com/static09/opn/login/?t=checkusercookies|r=-1|c=1234484941


Course Title:         Engineering TOI: InfiniBand Session 2
Course Description:
Introduction to Ethernet over InfiniBand technology using Oracle's InfiniBand Gateway Switch.
An overview of building blocks inside this Gateway switch, Virtual HUBs, Virtual NICs and use of Link Aggregation. How to administer EoIB VNICs in a datacenter and their use on compute nodes running Solaris and Linux. Introduction to ILOM snapshots from InfiniBand switches and their use in problem analysis.

Replay Duration: 94 minutes
Availability:         Partner and Employee
EDP Link: http://edp.oraclecorp.com/lms/faces/lms/edp/plancurrassignments.jspx?planid=176825
iLearning Link: http://oukc.oracle.com/static09/opn/login/?t=checkusercookies|r=-1|c=1218512234


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.