Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2209786.1 : OVN (Xsigo), Two F1-15 Fabric Interconnects Show Duplicate HCA GUID, OpenSM Won't Come Out of Discover
Created from <SR 3-13550566031>

Applies to:
Oracle Fabric Interconnect F1-15 - Version All Versions to All Versions [Release All Releases]
Oracle Fabric Interconnect F1-4 - Version All Versions to All Versions [Release All Releases]
Oracle Fabric Interconnect F1-15
Information in this document applies to any platform.

Symptoms
In a meshed IB fabric, OpenSM never leaves the Discover state to become Master or Standby on either of the two Fabric Interconnects.

Example:
    admin@xsigo1[xsigo] show diagnostics sm-info
    - SM is running on xsigo1
    - SM Lid 2
    - SM Guid 0x13970201001960
    - SM key 0x0
    - SM priority 0
    - SM State DISCOVER

opensm.log shows:
    Nov 07 15:12:36 873584 [B5688B70] 0x01 -> Directed Path Dump of 3 hop path: Path = 0,1,1,36
    Nov 07 15:12:36 873599 [B5688B70] 0x01 -> Directed Path Dump of 3 hop path: Path = 0,1,1,36
    Nov 07 15:12:36 919884 [B5688B70] 0x81 -> report_duplicated_guid: ERR 0D01: Found duplicated node GUID.

Changes
The problem was found when moving from two separate IB fabrics (both Fabric Interconnects configured as OpenSM master, not meshed) to a single meshed IB fabric (one OpenSM Master and one Standby). Following the instructions to set the subnet manager to false (#set system is-subnet-manager false) and then running "guid2lid" to zero out the LID table left one or both Fabric Interconnects in the "Discover" state.

Cause
The HCAs in both Fabric Interconnects had the same port GUID. This is a manufacturing defect. At this time it is not known whether this was a one-time defect or whether other Front Panels in inventory also have duplicated HCA port GUIDs.

How to find out whether the HCAs in both Fabric Interconnects have the same HCA port GUID:
Log in to the Fabric Interconnect Command Line Interface (CLI) as user 'root' on both Fabric Interconnects and run the command below.

Example:
    root@xsigo1:~# cat /sys/class/infiniband/mlx4_0/ports/1/gids/0
    root@xsigo2:/# cat /sys/class/infiniband/mlx4_0/ports/1/gids/0

Solution
Replace the Front Panel in one of the Fabric Interconnects using this KB: How to replace a Gen2 Front Panel on Oracle Fabric Interconnects (Xsigo) (Doc ID 1663431.1)

NOTE: The customer reported that, after one of the Front Panels was replaced, all server-profiles had to be disconnected and reconnected ("Disconnect and Reconnect") before all server-profiles and v-star devices came fully up/up.
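The GID comparison described under Cause can also be done offline once the two `gids/0` values have been copied from each Fabric Interconnect. The following is a minimal Python sketch, not an Oracle tool. It assumes the usual Linux sysfs GID layout of eight colon-separated groups of four hex digits, in which the last four groups (the low 64 bits) carry the port GUID. The function names and the sample GID values are hypothetical placeholders, not output from real hardware.

```python
HEX_DIGITS = set("0123456789abcdef")


def port_guid_from_gid(gid: str) -> str:
    """Return the low 64 bits of a sysfs GID string as 16 hex digits.

    Assumes the form xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx, where the
    first four groups are the subnet prefix and the last four groups are
    the port GUID.
    """
    groups = gid.strip().lower().split(":")
    if len(groups) != 8 or any(len(g) != 4 for g in groups):
        raise ValueError(f"unexpected GID format: {gid!r}")
    if not set("".join(groups)) <= HEX_DIGITS:
        raise ValueError(f"non-hex characters in GID: {gid!r}")
    return "".join(groups[4:])


def same_port_guid(gid_a: str, gid_b: str) -> bool:
    """True when both GIDs carry the same port GUID."""
    return port_guid_from_gid(gid_a) == port_guid_from_gid(gid_b)


# Placeholder values for illustration only (not taken from real hardware).
gid_xsigo1 = "fe80:0000:0000:0000:0011:2233:4455:6677"
gid_xsigo2 = "fe80:0000:0000:0000:0011:2233:4455:6677"
gid_other = "fe80:0000:0000:0000:0011:2233:4455:6688"

print(same_port_guid(gid_xsigo1, gid_xsigo2))  # duplicate port GUID
print(same_port_guid(gid_xsigo1, gid_other))   # distinct port GUIDs
```

If `same_port_guid` returns True for the values read from the two Fabric Interconnects, that matches the duplicate-GUID condition described under Cause.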
References
<NOTE:1663431.1> - How to replace a Gen2 Front Panel on Oracle Fabric Interconnects (Xsigo)

Attachments
This solution has no attachment