Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure
Solution 1573312.1 : "sminfo" On The Infiniband Switch Reports: ibwarn: [11010] mad_rpc: _do_madrpc failed; dport
sminfo on the IB switch reports "mad_rpc:_do_madrpc failed"
Created from <SR 3-7564255315>

Applies to:
Exadata X3-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms
Executing the sminfo command returns ibwarn and iberror messages:
ibwarn: [11010] mad_rpc: _do_madrpc failed; dport (Lid 2)
sminfo: iberror: failed: query
Messages are also logged in /var/log/messages when the switch or opensmd is restarted.
Also, most of the switch ports and host HCA ports are in the INIT state. This can be identified from the output of the following switch command:
#/usr/bin/ibdiagnet -skip dup_guids -pm

Changes
The partition file may have been inadvertently deleted, and/or the partition valid flag in /conf/configvalid may have recently changed to false (0).

Cause
If the switch is running a firmware version lower than 2.0, the partition valid flag in /conf/configvalid may be set to false (0) and stay in this state even after a reboot. This happens if partitiond is not able to signal the SM when the SM becomes master. If this happens, the SM will not be fully operational. An error in /var/log/opensm.log shows that partitiond is not able to find a valid partition file:
partitiond: No valid partition file
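
The log check described above can be scripted. The following is a minimal sketch, assuming a POSIX shell with grep is available on the switch; the function name has_partition_error is hypothetical, while the log path and message text are the ones quoted in this note.

```shell
#!/bin/sh
# Sketch only: has_partition_error is a hypothetical helper name,
# not a command shipped with the switch firmware.

# Succeed (exit status 0) if the log contains the partitiond error
# quoted in this note. The default path is the one named above.
has_partition_error() {
    log_file="${1:-/var/log/opensm.log}"
    # -F matches the text literally, -q suppresses output.
    grep -qF "partitiond: No valid partition file" "$log_file" 2>/dev/null
}
```

Called without arguments, for example `has_partition_error && echo "partitiond error found"`, the helper reads /var/log/opensm.log. A different path can be passed as the first argument to check a copied log.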

Solution
Check whether the switch is running a firmware version lower than 2.0. If so, check whether the partition file exists.
If it exists, check whether the configvalid file has the value set to '1'.
If the configvalid file is set to '0', change the value to '1':
#disablesm
#echo 1 > /conf/configvalid
#enablesm
If the switch is running firmware version 2.0 or newer, check whether this switch is running opensm:
#service opensmd status
If it is running opensm, check the output of the following command:
#smnodes list
The output must be identical to that on the IB switch currently running as Master, and it must contain the management IP addresses of all IB switches running opensm in this IB fabric. Once that is verified and fixed, propagate the IB partitions to all IB switches running opensm by running the following two commands on the IB switch running as the Master:
#smpartition start
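
The two checks in this solution can be sketched as small shell helpers: reading the flag in /conf/configvalid, and comparing saved copies of the smnodes list output from two switches. This is an illustration under stated assumptions, not an Oracle-supplied script. The function names are hypothetical, and the comparison assumes the output of each switch has first been saved to a file.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers,
# not commands provided by the switch firmware.

# Print the partition valid flag from a configvalid file, or "missing".
read_configvalid() {
    conf_file="${1:-/conf/configvalid}"   # default path taken from this note
    if [ ! -f "$conf_file" ]; then
        echo "missing"
        return 1
    fi
    # Drop whitespace so a trailing newline does not affect comparisons.
    tr -d '[:space:]' < "$conf_file"
    echo
}

# Succeed only when the flag is '0', the case this note says to correct.
needs_configvalid_fix() {
    [ "$(read_configvalid "${1:-/conf/configvalid}")" = "0" ]
}

# Succeed when two saved copies of the smnodes list output are byte-identical.
same_smnodes_list() {
    cmp -s "$1" "$2"
}
```

On a switch running firmware lower than 2.0, `needs_configvalid_fix` returning success indicates that the disablesm, echo, enablesm sequence above applies. The partition file check is not covered because this note does not give the file's path.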

Attachments
This solution has no attachment