Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type Problem Resolution Sure Solution
1988445.1 : ibqueryerrors Reports SymbolErrors on One or More of the IB switch or HCA port
Created from <SR 3-10238951403>
Applies to:
Big Data Appliance X3-2 In-Rack Expansion - Version All Versions and later
Big Data Appliance X5-2 Full Rack - Version All Versions and later
Big Data Appliance X5-2 Hardware - Version All Versions and later
Big Data Appliance X3-2 Full Rack - Version All Versions and later
Exadata X3-2 Hardware - Version All Versions and later
Linux x86-64
Symptoms
Note: The information in this note applies to BDA as well as Exadata, Exalogic, SuperCluster, Private Cloud Appliance, Zero Data Loss Recovery Appliance, and standalone InfiniBand switches.
ibqueryerrors reports [SymbolErrors] on one of the IB switches.
Note: This is not limited to Gateway switches. The same applies to any IB switch or HCA port.
For example:
[root@bdasw-ib2 ~]# ibqueryerrors.pl -rR -s PortRcvSwitchRelayErrors,PortXmitDiscards,PortXmitWait,VL15Dropped
Suppressing: PortRcvSwitchRelayErrors,PortXmitDiscards,PortXmitWait,VL15Dropped
Errors for * "SUN IB QDR GW switch bdasw-ib2 *.*.*.*"
GUID 0x002128d02ccac0a0 port 13: [SymbolErrors == 2]
Link info: 68 13[ ] ==( 4X 10.0 Gbps)==> * 36[ ] "SUN DCS 36P QDR bdasw-ibs01 *.*.*.*"
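When the report covers many ports, it can help to reduce it to only the ports that show SymbolErrors. The following is a minimal sketch, not part of the original note: the function name symbol_error_ports is hypothetical, and the pattern assumes the report lines look like the example above (a GUID, a port number, and a bracketed "SymbolErrors == N" counter on one line).

```shell
#!/bin/sh
# Sample text copied from the example above; the "*" fields are redacted in the note.
sample='Errors for * "SUN IB QDR GW switch bdasw-ib2 *.*.*.*"
GUID 0x002128d02ccac0a0 port 13: [SymbolErrors == 2]'

# symbol_error_ports (hypothetical helper): read ibqueryerrors-style text on
# stdin and print "<guid> <port> <count>" for each line reporting SymbolErrors.
# Assumes the line layout shown in the example; other layouts are not handled.
symbol_error_ports() {
    sed -n 's/.*GUID \(0x[0-9a-fA-F]*\) port \([0-9]*\):.*\[SymbolErrors == \([0-9]*\)].*/\1 \2 \3/p'
}

printf '%s\n' "$sample" | symbol_error_ports
```

On a switch, the same function could be fed from the command itself, for example by piping the ibqueryerrors.pl invocation shown above into symbol_error_ports.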
Cause
Symbol errors are almost always caused by a poorly seated or defective cable. In rare cases they can be caused by a defective switch port.
Solution
Open an SR to check the gateway.
# ibclearcounters
# ibclearerrors
# ibdiagnet -c 100 -P all=1
For example: bdasw-ib2# getportcounters 13
or bdasw-ib2# getportcounters 6b
Note: If you need a mapping of the port numbers to the port labels on the switch where the operations are being performed, get this with:
bdasw-ib2# dcsport -printconnectors
DCS-GW connectors:
Connector 0A maps to Switch port 20
Connector 1A maps to Switch port 22
Connector 2A maps to Switch port 24
Connector 3A maps to Switch port 26
Connector 4A maps to Switch port 28
Connector 5A maps to Switch port 30
Connector 6A maps to Switch port 35
Connector 7A maps to Switch port 33
Connector 8A maps to Switch port 31
Connector 9A maps to Switch port 14
Connector 10A maps to Switch port 16
Connector 11A maps to Switch port 12
Connector 12A maps to Switch port 18
Connector 13A maps to Switch port 9
Connector 14A maps to Switch port 7
Connector 15A maps to Switch port 5
...
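The mapping above can also be searched in the other direction, from a switch port number reported by ibqueryerrors back to the connector label printed on the switch. This is a small sketch under stated assumptions: connector_for_port is a hypothetical helper name, the sample holds only three of the entries listed above, and the input is assumed to have one "Connector X maps to Switch port N" entry per line.

```shell
#!/bin/sh
# Three entries taken from the mapping above, one per line.
mapping='Connector 0A maps to Switch port 20
Connector 1A maps to Switch port 22
Connector 9A maps to Switch port 14'

# connector_for_port PORT (hypothetical helper): read the mapping on stdin and
# print the connector label whose switch port number equals PORT.
connector_for_port() {
    awk -v port="$1" '$1 == "Connector" && $NF == port { print $2 }'
}

printf '%s\n' "$mapping" | connector_for_port 14
```

With the sample data this prints 9A. A port that is not in the supplied mapping produces no output, so an empty result means the lookup needs the full dcsport -printconnectors listing.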
Output after the counters are cleared looks like:
bdasw-ib2# getportcounters 6b
Port counters for connector 6B Switch port 36
SymbolErrors.....................0
LinkRecovers.....................0
LinkDowned.......................0
RcvErrors........................0
RcvRemotePhysErrors..............0
RcvSwRelayErrors.................0
XmtDiscards......................0
XmtConstraintErrors..............0
RcvConstraintErrors..............0
LinkIntegrityErrors..............0
ExcBufOverrunErrors..............0
VL15Dropped......................0
XmtData..........................0
RcvData..........................0
XmtPkts..........................0
RcvPkts..........................0
XmtWait..........................0
Ensure symbol errors are zero.
4. If you want to verify the IB port counters from a server, do the following:
a) Run ibnetdiscover to discover the InfiniBand topology:
# ibnetdiscover
b) Then run perfquery to query the InfiniBand port counters. For example, on a node you can use the following perfquery command, where 13 is the LID and 1 is the port obtained from ibnetdiscover:
# perfquery 13 1
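The "Ensure symbol errors are zero" check on the getportcounters output can be scripted when several ports need to be verified. The following is only a sketch: symbol_errors is a hypothetical helper, the sample is the first few lines of the getportcounters output shown above, and the pattern assumes the "SymbolErrors....N" layout from that output. perfquery output on a server may be laid out differently and is not handled here.

```shell
#!/bin/sh
# First lines of the getportcounters output shown above.
counters='Port counters for connector 6B Switch port 36
SymbolErrors.....................0
LinkRecovers.....................0
LinkDowned.......................0'

# symbol_errors (hypothetical helper): read getportcounters-style text on
# stdin and print only the numeric value of the SymbolErrors counter.
symbol_errors() {
    sed -n 's/^SymbolErrors\.\.*\([0-9][0-9]*\)$/\1/p'
}

count=$(printf '%s\n' "$counters" | symbol_errors)
case "$count" in
    0)  echo "SymbolErrors is zero" ;;
    '') echo "no SymbolErrors line found" ;;
    *)  echo "SymbolErrors is $count, check the cable on this port" ;;
esac
```

Keeping the parsing in a small function means the same check can be reused on output saved from each port in turn.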
Attachments
This solution has no attachment