Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2265048.1
Update Date:2017-05-25
Keywords:

Solution Type  Problem Resolution Sure

Solution  2265048.1 :   IB (INFINIBAND) PORT Is Null(missing) On DB Node  


Related Items
  • Exadata X4-2 Hardware
  • Exadata X3-2 Hardware
  • Exadata X5-2 Hardware
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband




In this Document
Symptoms
Cause
Solution


Created from <SR 3-14840316250>

Applies to:

Exadata X4-2 Hardware - Version All Versions and later
Exadata X3-2 Hardware - Version All Versions and later
Exadata X5-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms

The IB ports are missing: the command below returns no output.

[root@exadbadm01 ~]# dbmcli -e list ibport
[root@exadbadm01 ~]#



Cause

[root@exadbadm01 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
  default gid: fe80:0000:0000:0000:0010:e000:0134:f621
  base lid: 0x9
  sm lid: 0x1
  state: 4: ACTIVE
  phys state: 5: LinkUp
  rate: 40 Gb/sec (4X QDR)
  link_layer: IB

Infiniband device 'mlx4_0' port 2 status:
  default gid: fe80:0000:0000:0000:0010:e000:0134:f622
  base lid: 0xa
  sm lid: 0x1
  state: 4: ACTIVE
  phys state: 5: LinkUp
  rate: 40 Gb/sec (4X QDR)
  link_layer: IB

The ibstatus output shows that both IB ports on node exadbadm01 are LinkUp and ACTIVE, which confirms that the IB ports are available and working.
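
The same check can be scripted. The sketch below is only an illustration: the function name active_port_count is hypothetical, and it assumes the ibstatus output format shown above, counting the ports whose logical state is ACTIVE.

```shell
# Hypothetical helper: count ports reported as ACTIVE in ibstatus output read from stdin.
# Assumes the "state: 4: ACTIVE" line format shown above; "phys state" lines are not counted.
active_port_count() {
    grep -c '^[[:space:]]*state:.*ACTIVE'
}

# Usage on a DB node:
#   ibstatus | active_port_count
```

For the output above, the count would be 2, one per port of mlx4_0.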
 
All other IB commands fail while loading shared libraries, as shown below.

[root@exadbadm01 ~]# ibstat
ibstat: error while loading shared libraries: libosmcomp.so.3: cannot open shared object file: No such file or directory

[root@exadbadm01 ~]# ibswitches
/usr/sbin/ibnetdiscover: error while loading shared libraries: libosmcomp.so.3: cannot open shared object file: No such file or directory

Some libraries or RPM packages that are present on the good node exadbadm02 appear to be missing on node exadbadm01.

[root@exadbadm02 ~]# ls -al /usr/lib64/libosmcomp.so.3
lrwxrwxrwx 1 root root 19 May 5 20:14 /usr/lib64/libosmcomp.so.3 -> libosmcomp.so.3.0.9
[root@exadbadm01 ~]# ls -al /usr/lib64/libosmcomp.so.3
ls: cannot access /usr/lib64/libosmcomp.so.3: No such file or directory
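
The loader error names only the first library it cannot resolve. A minimal sketch for listing every unresolved dependency of a failing binary, assuming ldd is available on the node; the helper name unresolved_libs is made up for this example.

```shell
# Hypothetical helper: reduce ldd output (read from stdin) to the names of
# libraries the dynamic loader cannot find ("<name> => not found" lines).
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage on the affected node:
#   ldd /usr/sbin/ibnetdiscover | unresolved_libs
#   ldd "$(command -v ibstat)" | unresolved_libs
```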

[root@exadbadm02 ~]# ls -al /usr/lib64 | wc -l
611
[root@exadbadm01 ~]# ls -al /usr/lib64 | wc -l
507
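
The line counts above show that roughly a hundred entries are absent on exadbadm01 (611 versus 507), but not which ones. One possible way to name them is to diff the two directory listings. This is a sketch under assumptions: missing_on_bad_node and the /tmp file names are hypothetical, and root ssh access between the nodes is assumed.

```shell
# Hypothetical helper: print names that appear in the good node's listing but not
# in the bad node's. Both arguments are text files with one file name per line.
missing_on_bad_node() {
    good_sorted=$(mktemp) || return 1
    bad_sorted=$(mktemp) || return 1
    LC_ALL=C sort "$1" > "$good_sorted"
    LC_ALL=C sort "$2" > "$bad_sorted"
    # comm -23 keeps only the lines unique to the first (good) file.
    LC_ALL=C comm -23 "$good_sorted" "$bad_sorted"
    rm -f "$good_sorted" "$bad_sorted"
}

# Usage, run from exadbadm01 (the /tmp file names are placeholders):
#   ssh exadbadm02 ls -1 /usr/lib64 > /tmp/lib64.good
#   ls -1 /usr/lib64 > /tmp/lib64.bad
#   missing_on_bad_node /tmp/lib64.good /tmp/lib64.bad
```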


 

Solution

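The symlink /usr/lib64/libosmcomp.so.3 and the versioned library it points to (for example /usr/lib64/libosmcomp.so.3.0.5) are shipped by the opensm-libs package. The version suffix depends on the installed package version; on the good node above it is 3.0.9. On the affected node, this package is corrupt or not installed properly.

A properly installed opensm-libs package should list files similar to the following: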

# rpm -ql opensm-libs-3.3.15-6.mlnx1.5.5r2.el6.x86_64
/usr/lib64/libopensm.so.5
/usr/lib64/libopensm.so.5.0.0
/usr/lib64/libosmcomp.so.3
/usr/lib64/libosmcomp.so.3.0.5
/usr/lib64/libosmvendor.so.3
/usr/lib64/libosmvendor.so.3.0.7
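
To see which packaged files are actually absent on a node, the file list from rpm -ql can be tested path by path. This is a sketch, not the procedure recorded in the SR; report_missing_files is a hypothetical helper name.

```shell
# Hypothetical helper: read file paths on stdin and report those that do not
# exist on this node. A dangling symlink is also reported, because [ -e ]
# follows the link.
report_missing_files() {
    while IFS= read -r path; do
        [ -e "$path" ] || printf 'MISSING %s\n' "$path"
    done
}

# Usage on the affected node:
#   rpm -ql opensm-libs | report_missing_files
# rpm -V opensm-libs performs a fuller verification of the installed package.
```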

A Linux engineer helped the customer copy the missing libraries from a known good node to the problematic node.
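
This document does not record the exact commands that were used. One way such a copy could look is sketched below, assuming the library is first staged from exadbadm02 and that the symlink layout should match the good node (libosmcomp.so.3 -> libosmcomp.so.3.0.9). The function restore_lib and the staging path are hypothetical.

```shell
# Hypothetical helper: copy one versioned library into a target directory and
# recreate the soname symlink that points at it.
#   $1 = path to the staged library file
#   $2 = destination directory
#   $3 = symlink name to create in the destination directory
restore_lib() {
    cp -p "$1" "$2/" || return 1
    ln -sfn "$(basename "$1")" "$2/$3"
}

# Usage on exadbadm01 as root (the staging directory is a placeholder):
#   mkdir -p /tmp/staging
#   scp exadbadm02:/usr/lib64/libosmcomp.so.3.0.9 /tmp/staging/
#   restore_lib /tmp/staging/libosmcomp.so.3.0.9 /usr/lib64 libosmcomp.so.3
#   ldconfig                 # refresh the dynamic loader cache
#   dbmcli -e list ibport    # re-run the failing command to confirm
```

The same steps would be repeated for each library found to be missing.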




  Copyright © 2018 Oracle, Inc.  All rights reserved.