Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1528092.1
Update Date: 2017-08-01

Solution Type  Problem Resolution Sure

Solution 1528092.1: Oracle Fabric Interconnect (Formerly Xsigo): IB Symbol Errors Indicating IB Hardware Issues


Related Items
  • Exadata X3-2 Quarter Rack
  • Oracle Fabric Interconnect F1-15
  • Oracle Fabric Interconnect F1-4
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Oracle Virtual Networking




In this Document
Symptoms
Cause
Solution


Applies to:

Oracle Fabric Interconnect F1-4 - Version All Versions to All Versions [Release All Releases]
Exadata X3-2 Quarter Rack - Version All Versions to All Versions [Release All Releases]
Oracle Fabric Interconnect F1-15 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

One possible indication of a bad IB cable, bad IB port, or bad HCA is a link running at a lower rate than it should (for example, SDR instead of DDR or QDR).
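
To put the rate difference in numbers, here is a minimal sketch that converts a link label such as 4xSDR into its aggregate signaling rate. The per-lane figures (2.5, 5 and 10 Gb/s for SDR, DDR and QDR) are the standard InfiniBand signaling rates; the function name `signaling_rate_gbps` is hypothetical and is not part of any Oracle or OFED tool.

```python
import re

# Per-lane signaling rates in Gb/s for the InfiniBand speeds named above.
LANE_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}


def signaling_rate_gbps(link: str) -> float:
    """Return the aggregate signaling rate for a label such as '4xSDR'."""
    match = re.fullmatch(r"(\d+)x(SDR|DDR|QDR)", link)
    if match is None:
        raise ValueError(f"unrecognized link label: {link!r}")
    width, speed = int(match.group(1)), match.group(2)
    # Aggregate rate is the lane count multiplied by the per-lane rate.
    return width * LANE_GBPS[speed]
```

With these figures, a 4x link that trains at SDR signals at half the rate of the same link at DDR, and a quarter of the rate at QDR.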

Cause

 IB Hardware Malfunction

Solution

In Oracle Fabric Interconnect versions prior to 2.8.5, you must log in to the Oracle Fabric Interconnect CLI as user 'root' to run the commands below, which check the current IB errors and then clear them.  After clearing the errors, run 'ibcheckerrors' again to see whether the error counters immediately start incrementing.

# ibcheckerrors
# ibclearerrors

In Oracle Fabric Interconnect versions 2.8.5 and above, you can log in to the Oracle Fabric Interconnect CLI as user 'admin' and run these commands:

The command below includes the 'ibcheckerrors' results in its output:
# show diagnostics ofed

This command is the equivalent of 'ibclearerrors' when run as user 'admin':
# set diagnostics ib-clear-counters

After clearing the counters, run 'show diagnostics ofed' again to see the 'ibcheckerrors' output.  Check whether the "SymbolErrors" counter starts incrementing again immediately, as in the example below:

Error check on lid 24 (Infiniscale-IV Mellanox Technologies) port all: FAILED
#warn: counter SymbolErrors = 202 (threshold 10) lid 24 port 26
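
When the output covers many ports, the warning lines can be picked out programmatically. The following is a minimal sketch, assuming the warnings have exactly the shape shown in the example above and that the text was captured after the counters were cleared. The function names `parse_counters` and `suspect_ports` are hypothetical helpers, not part of the Oracle Fabric Interconnect CLI or OFED.

```python
import re
from typing import Dict, List, Tuple

# Matches warnings in the shape shown in the example above, e.g.
# "#warn: counter SymbolErrors = 202 (threshold 10) lid 24 port 26"
WARN_RE = re.compile(
    r"#warn: counter (\w+) = (\d+)\s+\(threshold (\d+)\)\s+lid (\d+) port (\d+)"
)


def parse_counters(text: str) -> Dict[Tuple[int, int, str], int]:
    """Map (lid, port, counter name) to the reported value."""
    counters = {}
    for match in WARN_RE.finditer(text):
        name, value, _threshold, lid, port = match.groups()
        counters[(int(lid), int(port), name)] = int(value)
    return counters


def suspect_ports(after_clear: str) -> List[Tuple[int, int, int]]:
    """Return (lid, port, value) for each SymbolErrors warning.

    Assumption: `after_clear` was captured shortly after the counters
    were cleared, so any warning present means the counter is climbing.
    """
    return sorted(
        (lid, port, value)
        for (lid, port, name), value in parse_counters(after_clear).items()
        if name == "SymbolErrors"
    )
```

Applied to the example above, `suspect_ports` would return a single entry for lid 24, port 26 with a value of 202.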

The output below shows the HCA attached to port 26 of the switch at lid 24. Note that the link is running in SDR mode rather than DDR. This indicates a bad IB cable, a bad IB port, or a bad HCA. The entry that follows it shows the expected IB rate:

vendid=0x2c9
devid=0xbd36
sysimgguid=0x2264ffff342838
switchguid=0x2264ffff342838(2264ffff342838)
Switch 32 "S-002264ffff342838" # "Infiniscale-IV Mellanox Technologies" base port 0 lid 24 lmc 0
[26] "H-0002c9030008ff00"[1](2c9030008ff01) # "MT25408 ConnectX Mellanox Technologies" lid 26 4xSDR

Expected rate from same switch:

[23] "H-0002c90300090078"[1](2c90300090079) # "MT25408 ConnectX Mellanox Technologies" lid 30 4xDDR
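
The same comparison can be made across every port of a switch. The following is a minimal sketch, assuming port entries in the shape of the two lines shown above and assuming DDR is the expected speed, as in this example. The names `PortLink`, `parse_links` and `slow_links` are hypothetical and not part of any Oracle or OFED tool.

```python
import re
from typing import List, NamedTuple


class PortLink(NamedTuple):
    port: int      # switch port number
    peer: str      # peer node identifier, e.g. "H-0002c9030008ff00"
    peer_lid: int  # lid of the attached peer
    width: int     # lane count, e.g. 4
    speed: str     # "SDR", "DDR" or "QDR"


# Ordering used to decide whether one speed is lower than another.
SPEED_RANK = {"SDR": 0, "DDR": 1, "QDR": 2}

# The opening bracket is optional so that truncated lines still parse.
PORT_RE = re.compile(
    r'^\s*\[?(\d+)\]\s+"([^"]+)".*\blid (\d+) (\d+)x(SDR|DDR|QDR)\s*$',
    re.MULTILINE,
)


def parse_links(text: str) -> List[PortLink]:
    """Extract one PortLink per port entry; other lines are ignored."""
    return [
        PortLink(int(port), peer, int(lid), int(width), speed)
        for port, peer, lid, width, speed in PORT_RE.findall(text)
    ]


def slow_links(links: List[PortLink], expected_speed: str = "DDR") -> List[PortLink]:
    """Return links whose speed ranks below the expected speed (assumed DDR)."""
    floor = SPEED_RANK[expected_speed]
    return [link for link in links if SPEED_RANK[link.speed] < floor]
```

For the two entries above, only port 26 (4xSDR) would be reported as slow; port 23 (4xDDR) meets the assumed expectation.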


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.