Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-75-1368260.1
Update Date:2015-12-15
Keywords:

Solution Type  Troubleshooting Sure

Solution  1368260.1 :   Troubleshooting Information for Switches Included with Oracle Exalogic Racks  


Related Items
  • Oracle Exalogic Elastic Cloud Software
  • Oracle Exalogic Elastic Cloud X2-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Exalogic>MW: Exalogic Core


Troubleshooting Information for Switches Included with Oracle Exalogic Racks: Sun Network QDR InfiniBand Gateway Switch, Sun Datacenter InfiniBand Switch 36 and Cisco Catalyst 4948 Ethernet management switch

In this Document
Purpose
Troubleshooting Steps
 1) Sun Datacenter InfiniBand Switch 36 (Sun NM2-36p)
 2) Sun Network QDR InfiniBand Gateway Switch (Sun NM2-GW )
 3) Cisco Catalyst 4948 Ethernet management switch (Cisco 4948 Switch) 


Applies to:

Oracle Exalogic Elastic Cloud Software - Version 1.0.0.0.0 and later
Oracle Exalogic Elastic Cloud X2-2 Hardware - Version X2 to X2 [Release X2]
Information in this document applies to any platform.
***Checked for relevance on 15-Jan-2014***

Purpose

This document outlines the commands and logs needed to troubleshoot, resolve and report issues with the networking hardware included in the Exalogic X2-2 rack.

Troubleshooting Steps

Oracle Exalogic racks include the following switches. The number of each varies with the rack configuration (Quarter, Half or Full rack).

1) Sun Datacenter InfiniBand Switch 36 (Sun NM2-36p)

Documentation: http://download.oracle.com/docs/cd/E19197-01/index.html

This switch has the following ports:

(36x) QDR InfiniBand ports (QSFP)
(1x) GbE management port (BASE-T)

This switch is a full 36-port QDR InfiniBand switch. It is included only in half and full rack configurations. It comes from the factory unwired because it is used only in multi-rack configurations, where it serves to create a "fat-tree" fabric architecture. The switch includes configuration options for limiting access from other racks.

Troubleshooting information:

On the SDS 36 InfiniBand switch, the following log files are available to troubleshoot issues:

/var/log/messages
/var/log/opensm.log
/var/log/opensm-subnet.lst
(the opensm files may not contain much information if the SDS 36 is not the Subnet Manager Master)


And the following utilities can be used to collect data for troubleshooting:

/usr/local/bin/version
/usr/local/bin/env_test
/usr/local/bin/listlinkup
/usr/bin/ibdiagnet -skip dup_guids -pm
/usr/sbin/ibcheckerrors -v (collect output)


When opening a service request, please collect all the log files above, the output of the utilities, and all files that the utilities create in /tmp. Some of the files created in /tmp are listed here:

ibdiagnet.db
ibdiagnet.lst
ibdiagnet.pm
ibdiagnet.pkey
ibdiagnet_ibis.log
ibdiagnet.fdbs
ibdiagnet.mcfdbs
ibdiagnet.sm
ibdiagnet.log
ldalog
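
The collection steps above can be sketched as a small shell function. This is an illustration only, not an Oracle-supplied tool: the function name collect_switch_data, the directory layout and the collection.log file are assumptions made for this example. The log paths and utility command lines are the ones listed in this document.

```shell
#!/bin/sh
# Sketch only: gather the logs and utility output named above into one directory.
# The directory layout and collection.log file are illustrative assumptions.

collect_switch_data() {
    out_dir=$1
    mkdir -p "$out_dir/logs" "$out_dir/cmd" "$out_dir/tmp" || return 1
    : > "$out_dir/collection.log"

    # Copy the log files listed in this document; note any that are absent.
    for f in /var/log/messages /var/log/opensm.log /var/log/opensm-subnet.lst; do
        if [ -r "$f" ]; then
            cp "$f" "$out_dir/logs/" || echo "copy failed: $f" >> "$out_dir/collection.log"
        else
            echo "skipped (not readable): $f" >> "$out_dir/collection.log"
        fi
    done

    # Run each utility as listed above, keeping stdout and stderr together.
    while IFS= read -r line; do
        set -- $line                      # split into program and arguments
        if [ -x "$1" ]; then
            "$@" > "$out_dir/cmd/$(basename "$1").out" 2>&1 < /dev/null \
                || echo "non-zero exit: $line" >> "$out_dir/collection.log"
        else
            echo "skipped (not found): $1" >> "$out_dir/collection.log"
        fi
    done <<EOF
/usr/local/bin/version
/usr/local/bin/env_test
/usr/local/bin/listlinkup
/usr/bin/ibdiagnet -skip dup_guids -pm
/usr/sbin/ibcheckerrors -v
EOF

    # Pick up the files the utilities leave behind in /tmp.
    for f in /tmp/ibdiagnet.* /tmp/ibdiagnet_ibis.log /tmp/ldalog; do
        if [ -f "$f" ]; then
            cp "$f" "$out_dir/tmp/"
        fi
    done
    return 0
}
```

Calling, for example, collect_switch_data /tmp/sr_data (a hypothetical path) leaves everything in one directory that can be attached to the service request. Anything that could not be collected is recorded in collection.log.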

2) Sun Network QDR InfiniBand Gateway Switch (Sun NM2-GW )

Documentation: http://download.oracle.com/docs/cd/E19671-01/index.html

This switch has the following ports:

(32x) QDR InfiniBand ports (QSFP)
(8x) 10GbE ports
(1x) GbE management port (BASE-T)

The gateway is a full 32-port QDR InfiniBand switch that is deployed redundantly in each Exalogic rack. It also has eight 10GbE ports that are bridged (not switched) to the IB fabric.

The gateway serves two roles in Exalogic:
1) Core InfiniBand switching: all IB-connected components are switched through the gateways.
2) Connectivity to the datacenter 10GbE client network.

Exadata racks (ED) do not have these switches; ED uses the NM2-36p for IPoIB and RDS communication between the DB nodes and cells. By default, client-side connectivity is provided by the onboard 1GbE or 10GbE interfaces local to the DB nodes. The use of EoIB is currently unsupported.

Troubleshooting information:

The following commands are available from all hosts in the IB subnet :

  • ibdiagnet -ls 10 -lw 4x
Used to verify that all links are running 4x QDR IB


-ls is the link speed; 10 means 10 Gbps, the raw per-lane link speed of QDR.
-lw is the link width; 4x is the link width used with these products.
The output of the command states whether any links in the IB subnet are not running at 4x QDR speed. One possible cause is a cable that is not correctly seated in its connector.
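
As a worked example of what the -ls and -lw values imply, the following sketch multiplies lanes by per-lane speed. The function name ib_link_rates is hypothetical. The data-rate figure relies on a fact not stated in this document: QDR InfiniBand uses 8b/10b encoding, so 8 of every 10 raw bits carry data.

```shell
#!/bin/sh
# Worked arithmetic for the -ls and -lw values above.
# Assumption: QDR uses 8b/10b encoding, so 8 of every 10 raw bits carry data.

ib_link_rates() {
    lanes=$1        # link width, e.g. 4 for "4x"
    lane_gbps=$2    # raw per-lane speed in Gbps, e.g. 10 for QDR
    raw=$((lanes * lane_gbps))
    data=$((raw * 8 / 10))
    echo "raw=${raw}Gbps data=${data}Gbps"
}

ib_link_rates 4 10   # prints: raw=40Gbps data=32Gbps
ib_link_rates 1 10   # a link degraded to 1x prints: raw=10Gbps data=8Gbps
```

A link that has fallen back to 1x carries a quarter of the expected bandwidth, which is why ibdiagnet is asked to flag anything below 4x.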

  • ibhosts : Command to show CAs in the IB subnet:

This command lists the names of the HCAs, and also lists the BridgeX devices together with the names of their gateways.

 

  • ibswitches: Command to show switches in the IB subnet

This command lists the names of the IB switches in the IB subnet, including the IB switches inside the gateways.

 

  • ibnodes: Command to show all IB devices in the IB subnet (CAs and Switches)

 

  • ibnetdiscover : Command to show connectivity in the IB subnet

This command shows the IB devices (CAs and switches) in the IB subnet and the connections between them.


All of the commands listed above as available from all hosts in the IB subnet also work when run from the service processor of the NM2GW.
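
The discovery commands above can be captured in one pass with a sketch like the following. The function name capture_ib_inventory and the one-file-per-command naming are assumptions for this example. Only the four command names come from this document, and each is run without arguments.

```shell
#!/bin/sh
# Sketch: save the output of the four discovery commands above, one file each.
# The output directory and file naming are illustrative assumptions.

capture_ib_inventory() {
    out_dir=$1
    mkdir -p "$out_dir" || return 1
    for cmd in ibhosts ibswitches ibnodes ibnetdiscover; do
        if command -v "$cmd" > /dev/null 2>&1; then
            # Keep stdout and stderr together so error text is not lost.
            "$cmd" > "$out_dir/$cmd.txt" 2>&1 \
                || echo "$cmd exited with an error" >&2
        else
            echo "$cmd not found on this host" >&2
        fi
    done
    return 0
}
```

Taking one snapshot before and one after a cabling change, each into its own directory, allows the two ibnetdiscover.txt files to be compared with diff to see which connections changed.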

All the above information can also be collected by running RDA in the Exalogic Environment.


The following is a selection of diagnostic commands which can be executed at the Service Processor of the NM2GW :

  • showunhealthy: Will check all environment sensors in the NM2GW and state OK if no errors
  • env_test : Will do a full environment test and show the results
  • showvnics : Will show the virtual NIC resources on the gateway
  • showvlans : Will show the virtual LAN resources for the

NM2GW - IB Connectivity verification:

- IB connectivity can be verified with the ibnetdiscover command, which is available both on hosts running the IB software stack and on the NM2GW
- In addition, the NM2GW provides further commands that can be used for IB connectivity verification:
  • generatetopology: Generates an IB subnet topology file that describes the IB subnet in a readable format
  • matchtopology: Matches the current IB subnet topology against a topology file that is provided as input to the command
  • showtopology: Shows the current IB subnet topology to the user

 

3) Cisco Catalyst 4948 Ethernet management switch (Cisco 4948 Switch) 

Documentation: http://www.cisco.com/en/US/products/ps6021/tsd_products_support_troubleshoot_and_alerts.html

This switch connects all components in Exalogic together on a 1GbE management network.

The components of the Exalogic rack connect to this switch as follows:

  • Compute nodes - NET0/eth0
  • 7320 Storage - NET0 of each head, NET1/2 for clustering
  • InfiniBand Switches/Gateways - Management port
  • Power Distribution Units - Management port

This switch is typically uplinked to the datacenter for remote access to the nodes, and is also used by network management tools (ASR, OpsCenter, etc.).

Troubleshooting information:

The following can be collected to assist troubleshooting.

show logging
show running-config
show tech-support


To redirect the output of any 'show' command to a file, append | redirect and a destination URL to the command in privileged EXEC mode:

show <command> | redirect url

 

Syntax Description :

| redirect url

The addition of this syntax redirects the command output to the file location specified in the URL. The pipe (|) is required.

The Cisco IOS File System (IFS) uses URLs to specify the location of a file system, directory, and file. Typical URL elements include:

prefix:[directory/]filename

Prefixes can be local file locations, such as flash: or disk0:. Alternatively, you can specify network locations using the following syntax:

ftp:[[//[username[:password]@]location]/directory]/filename

tftp:[[//location]/directory]/filename

Note: The rcp: prefix is not supported.
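
To make the URL syntax concrete, the following shell sketch assembles the IOS command line for a given prefix. It runs on an administrator's workstation, not on the switch, and only prints the text to type at the IOS prompt. The function name ios_redirect_cmd, the server address 192.0.2.10 (a documentation-only address) and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: build a "show ... | redirect url" line from the syntax described above.
# Usage: ios_redirect_cmd "<show command>" <prefix> <path> [location]
# Only prints the command text; nothing is sent to a switch.

ios_redirect_cmd() {
    show_cmd=$1
    prefix=$2
    path=$3
    location=$4
    case $prefix in
        ftp|tftp)
            # Network prefixes need a server location.
            if [ -z "$location" ]; then
                echo "location required for $prefix" >&2
                return 1
            fi
            printf '%s | redirect %s://%s/%s\n' "$show_cmd" "$prefix" "$location" "$path"
            ;;
        flash|disk0)
            # Local prefixes take the form prefix:[directory/]filename.
            printf '%s | redirect %s:%s\n' "$show_cmd" "$prefix" "$path"
            ;;
        *)
            echo "unsupported prefix: $prefix" >&2
            return 1
            ;;
    esac
}

ios_redirect_cmd "show tech-support" tftp exalogic/4948-tech-support.txt 192.0.2.10
# prints: show tech-support | redirect tftp://192.0.2.10/exalogic/4948-tech-support.txt
ios_redirect_cmd "show logging" flash 4948-logging.txt
# prints: show logging | redirect flash:4948-logging.txt
```

The rcp prefix falls through to the unsupported case, matching the note above.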

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.