Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2162835.1
Update Date:2018-01-03
Keywords:

Solution Type  Technical Instruction Sure

Solution  2162835.1 :   How to fix the problem of ping failure or communication failure over ipoib interface when ibping works  


Related Items
  • Oracle Exadata Hardware
  • Oracle Exalogic Elastic Cloud Software
  • Sun Datacenter InfiniBand Switch 36
  • Sun Network QDR InfiniBand Gateway Switch
  • Sun Infiniband HCA
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband




In this Document
Goal
Solution
References


Applies to:

Sun Datacenter InfiniBand Switch 36 - Version All Versions and later
Sun Network QDR InfiniBand Gateway Switch - Version All Versions and later
Oracle Exadata Hardware - Version 11.1.0.6 and later
Sun Infiniband HCA - Version All Versions and later
Oracle Exalogic Elastic Cloud Software - Version 1.0.0.0.0 and later
Information in this document applies to any platform.

Goal

 In an InfiniBand network, communication between nodes over IPoIB interfaces can fail for several reasons. A typical symptom is that ping over the IPoIB interface fails while ibping between the same two nodes, using the LIDs of the ports, works. The ibstat output shows that the IB ports are up and active. This document provides guidelines for fixing such an issue.

Solution

 There are several possible reasons why ping over an IPoIB interface fails while ibping works. The following are some of them.

      (Note: It is assumed that IB ports are up and active and ibping works)

     1. The IB ports of the two nodes are not in the same IB partition.
     2. The ipoib flag is not set in the IB partition.
     3. The MTU of the interface (IB layer MTU) does not match the MTU of the IB partition at the switch.
     4. The Master subnet manager is not present in the IB fabric, or it is in a limbo state, or the SA database of the SM master is corrupted.

 

1. The IB ports of the two hosts are not in the same IB partition.

    To check this, first find the port GUIDs of the ports by running the following command on each host

       #ibstat

         Example:

           # ibstat
           CA 'mlx4_0' <<<<<<< HCA card name
           CA type: MT26428
           Number of ports: 2
           Firmware version: 2.7.8130 <<<<<< HCA firmware
           Hardware version: b0
           Node GUID: 0x0021280001cf8e4e
           System image GUID: 0x0021280001cf8e51
           Port 1:
                    State: Active <<<<<<<<<<<<<<<<<<<
                    Physical state: LinkUp <<<<<<<<<<<<<<<
                    Rate: 40
                    Base lid: 8 <<<<<<<<<<<< Lid of this port
                    LMC: 0
                    SM lid: 1 <<<<<<<<<<<< Lid of the Master subnet manager
                    Capability mask: 0x02510868
                    Port GUID: 0x0021280001cf8e4f <<<<<< GUID of this port
                    Link layer: InfiniBand
           Port 2:
                    State: Active
                    Physical state: LinkUp
                    Rate: 40
                    Base lid: 19
                    LMC: 0
                    SM lid: 1
                    Capability mask: 0x02510868
                    Port GUID: 0x0021280001cf8e50
                    Link layer: InfiniBand
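
    When several hosts have to be checked, the port GUIDs can be pulled out of saved ibstat output with a short script. The following Python code is a minimal sketch, not part of the original procedure: the function names parse_ibstat and usable_port_guids are hypothetical, and the parser assumes the "Key: value" layout shown in the example above (without the <<<< annotations, which are not part of real ibstat output).

```python
import re

# Sample text copied from the ibstat example in this document.
SAMPLE = """\
CA 'mlx4_0'
CA type: MT26428
Number of ports: 2
Firmware version: 2.7.8130
Hardware version: b0
Node GUID: 0x0021280001cf8e4e
System image GUID: 0x0021280001cf8e51
Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 8
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510868
        Port GUID: 0x0021280001cf8e4f
        Link layer: InfiniBand
Port 2:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 19
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510868
        Port GUID: 0x0021280001cf8e50
        Link layer: InfiniBand
"""


def parse_ibstat(text):
    """Return one dict per port, holding the 'Key: value' fields of that port."""
    ports = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("CA '"):
            # A new HCA section starts; its header fields are not port fields.
            current = None
            continue
        match = re.match(r"Port (\d+):$", line)
        if match:
            current = {"port": int(match.group(1))}
            ports.append(current)
            continue
        if current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return ports


def usable_port_guids(text):
    """Port GUIDs of the ports that are both Active and LinkUp."""
    return [
        port["Port GUID"]
        for port in parse_ibstat(text)
        if port.get("State") == "Active" and port.get("Physical state") == "LinkUp"
    ]


if __name__ == "__main__":
    for guid in usable_port_guids(SAMPLE):
        print(guid)
```

    The GUIDs it prints are the values to look for in the partition listing on the Master switch.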

    If the host is running Solaris, the following command also shows the port GUIDs and the partitions (PKEYS) to which each port belongs

         #dladm show-ib

             Example:

                # dladm show-ib
                LINK HCAGUID PORTGUID PORT STATE PKEYS
                net6 10E00001444698 10E00001444699 1 up FFFF
                net7 10E00001444698 10E0000144469A 2 up FFFF
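
    Note that the two examples above print GUIDs in different styles: ibstat shows 0x0021280001cf8e4f (0x prefix, 16 hex digits, lower case), while dladm show-ib shows 10E00001444699 (no prefix, leading zeros dropped, upper case). Before comparing GUIDs from different tools, it helps to bring them to one form. The following Python code is a minimal sketch under stated assumptions: normalize_guid and parse_dladm_show_ib are hypothetical helper names, the column layout is taken from the example above, and splitting PKEYS on commas is an assumption for ports that belong to more than one partition.

```python
# Sample text copied from the dladm show-ib example in this document.
SAMPLE = """\
LINK HCAGUID PORTGUID PORT STATE PKEYS
net6 10E00001444698 10E00001444699 1 up FFFF
net7 10E00001444698 10E0000144469A 2 up FFFF
"""


def normalize_guid(guid):
    """Return the GUID as 0x followed by 16 lower-case hex digits."""
    # int(..., 16) accepts the value with or without a leading 0x.
    return "0x{:016x}".format(int(guid, 16))


def parse_dladm_show_ib(text):
    """Return one dict per data row of dladm show-ib output."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    ports = []
    for fields in data:
        row = dict(zip(header, fields))
        ports.append({
            "link": row["LINK"],
            "port_guid": normalize_guid(row["PORTGUID"]),
            "state": row["STATE"],
            # Assumption: several PKEYs are separated by commas.
            "pkeys": row["PKEYS"].split(","),
        })
    return ports


if __name__ == "__main__":
    for port in parse_dladm_show_ib(SAMPLE):
        print(port["link"], port["port_guid"], port["state"], port["pkeys"])
```

    With both sides normalized, a GUID taken from dladm show-ib can be compared directly with one taken from ibstat or from the switch.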

    Then, log in to the IB switch that is running as the current Master and run the following command to list the IB partitions in the IB fabric

          #smpartition list active

     For connectivity between the two hosts, the port GUIDs of both hosts must belong to the same partition. Check the output of the above command and
     make sure that this condition is met. If the port GUID of one of the hosts is missing from the partition it is supposed to belong to, add it by running the following
     commands on the IB switch that is running as the Master.

           #smpartition start
           #smpartition add -n <name of the partition> -port <port guid>
           #smpartition commit

 

2. The ipoib flag is not set in the IB partition.

    To check this, log in to the IB switch running as Master and run the following command

          #smpartition list active

           Example:

                #smpartition list active
                # Sun DCS IB partition config file
                # This file is generated, do not edit
                #! version_number : 5
                Default=0x7fff, ipoib : <<<<<<<< ipoib flag
                ALL_CAS=full,
                ALL_SWITCHES=full,
                SELF=full;
                SUN_DCS=0x0001, ipoib : <<<<<<<< ipoib flag
                ALL_SWITCHES=full;
                ib_part_10 = 0x7010,defmember=full: ;
                ib_part_20 = 0x7020,defmember=full: ;
                ib_part_30 = 0x7030,defmember=full: ;
                ib_part_40 = 0x7040,defmember=full: ;

         If the ipoib flag is missing from a partition, add it by running the following commands on the switch running as Master.

                #smpartition start
                #smpartition modify -n <name of the ib partition> -flag ipoib
                #smpartition commit
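
         On a fabric with many partitions, the listing can be scanned for partitions that lack the ipoib flag. The following Python code is a minimal sketch, not an Oracle tool: parse_partitions and partitions_without_ipoib are hypothetical names, and the parser assumes only the layout visible in the example above (comment lines start with #, each definition ends with a semicolon, and the name, pkey, and flags come before the first colon).

```python
# Sample text copied from the smpartition list active example in this document.
SAMPLE = """\
# Sun DCS IB partition config file
# This file is generated, do not edit
#! version_number : 5
Default=0x7fff, ipoib :
ALL_CAS=full,
ALL_SWITCHES=full,
SELF=full;
SUN_DCS=0x0001, ipoib :
ALL_SWITCHES=full;
ib_part_10 = 0x7010,defmember=full: ;
ib_part_20 = 0x7020,defmember=full: ;
ib_part_30 = 0x7030,defmember=full: ;
ib_part_40 = 0x7040,defmember=full: ;
"""


def parse_partitions(text):
    """Map partition name to its pkey, flags, and member entries."""
    # Drop comment lines, then treat the rest as ';'-terminated definitions.
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    partitions = {}
    for chunk in " ".join(lines).split(";"):
        if not chunk.strip():
            continue
        header, _, members = chunk.partition(":")
        fields = [field.strip() for field in header.split(",")]
        name, _, pkey = fields[0].partition("=")
        partitions[name.strip()] = {
            "pkey": pkey.strip(),
            "flags": [field for field in fields[1:] if field],
            "members": [m.strip() for m in members.split(",") if m.strip()],
        }
    return partitions


def partitions_without_ipoib(text):
    """Names of the partitions whose flag list does not contain ipoib."""
    return [
        name
        for name, part in parse_partitions(text).items()
        if "ipoib" not in part["flags"]
    ]


if __name__ == "__main__":
    for name in partitions_without_ipoib(SAMPLE):
        print(name)
```

         For the example listing, the script reports ib_part_10, ib_part_20, ib_part_30, and ib_part_40, because only Default and SUN_DCS carry the ipoib flag there.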

 

3. The MTU of the interface (IB layer MTU) does not match the MTU of the IB partition at the switch

         For ping to work between two hosts, the MTU of the IB port has to match the MTU of the IB partition.
         Refer to the following document for details and for how to fix a mismatch.

               Doc ID 1988452.1 : dladm show-part shows link down over an IB partition

 

4. The Master subnet manager is not present in the IB fabric, or it is in a limbo state, or the SA database of the SM master is corrupted.

         Run the following command on any IB switch to find out which switch is the current master

             #getmaster

         Regardless of which switch the command is run on, it should always report the same master. There must be only one master in an IB fabric.
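
         If the getmaster results from several switches have been collected, the "only one master" condition can be checked mechanically. The following Python code is a minimal sketch: distinct_masters and fabric_has_single_master are hypothetical names, the switch names and master identifiers in the usage example are invented placeholders, and extracting the master identifier from the getmaster output is left to the reader because that output format is not shown in this document.

```python
def distinct_masters(reported):
    """Return the sorted set of master identifiers.

    `reported` maps a switch name to the master identifier (for example a
    GUID or host name) that getmaster reported on that switch.
    """
    return sorted({value.strip().lower() for value in reported.values()})


def fabric_has_single_master(reported):
    """True only if every switch reported the same, single master."""
    return len(distinct_masters(reported)) == 1


if __name__ == "__main__":
    # Placeholder data: switch names and identifiers are illustrative only.
    reported = {
        "switch-a": "master-id-1",
        "switch-b": "master-id-1",
        "switch-c": "master-id-2",
    }
    print(distinct_masters(reported))
    print(fabric_has_single_master(reported))
```

         More than one distinct identifier, or a switch that reports no master at all, points to the condition described in this section.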

         The SM master may not be functioning well, or may be in a limbo state, because of bug 17482244 or for some other reason.
         The SA database may also be corrupted because of issues with the SM master or the fabric.
         Ping over IPoIB does not work under those circumstances.
         An ibping that uses the port GUID may also fail. Refer to the following document for testing with ibping as well as ping over IPoIB.

              Doc ID 2016560.1 : Troubleshooting communication issues over an Infiniband fabric Using ibping, ping, and rds-ping

         To test using the port GUID of the destination, run the ibping command on the client as follows:

             #ibping -G <port guid of the destination>

         To fix this, reboot the switch that is running as the current Master. Alternatively, disabling the SM on this switch may also help.
         Either action makes the Master role move from the current switch to a different switch.
         After this, check whether ping over IPoIB works.

 

 

References

<NOTE:2016560.1> - Troubleshooting communication issues over an Infiniband fabric Using ibping, ping, and rds-ping
<NOTE:1988452.1> - dladm show-part shows link down over an IB partition

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.