Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2040991.1
Update Date:2018-01-03

Solution Type  Problem Resolution Sure

Solution  2040991.1 :   Solaris Netname Changed - Solaris 11.1 IO Domain with Xsigo/OVN F1-15 VNICs - Several Reasons  


Related Items
  • Oracle Fabric Interconnect F1-15
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Oracle Virtual Networking




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-10472446071>

Applies to:

Oracle Fabric Interconnect F1-15 - Version All Versions and later
Oracle Solaris on x86-64 (64-bit)
Oracle Solaris on SPARC (64-bit)

Symptoms

After an outage in a customer data center running a Xsigo / F1-15 / Solaris environment, the recovery included:

  i) Reboot of the Solaris/SPARC IO Domains (storage had hung, and the IO Domain was rebooted to recover from the hang state)

and

  ii) Disconnect/reconnect of Server Profiles in the OFM/XFM GUI manager for the F1-15.

Afterwards, the netnames of the VNICs connected through the Xsigo/F1-15 were found to have changed unexpectedly on the Solaris hosts.

 

The netnames had to be corrected manually before operations could resume. An example follows:


Netname mapping after rebooting the IO Domain:

net17             Ethernet             up         32000  full      xsvnic2
net18             Ethernet             up         10000  full      xsvnic3
net19             Ethernet             up         32000  full      xsvnic7
net20             Ethernet             up         32000  full      xsvnic5
net21             Ethernet             up         32000  full      xsvnic0
net22             Ethernet             up         10000  full      xsvnic1
net23             Ethernet             up         32000  full      xsvnic6
net24             Ethernet             up         32000  full      xsvnic4

  


Netname mapping after restoring the original assignments (arrows mark the links that had changed):

net21             global    Ethernet             up         32000  full      xsvnic0
net25             global    Ethernet             up         10000  full      xsvnic1 -------->
net22             global    Ethernet             up         32000  full      xsvnic6 -------->
net24             global    Ethernet             up         32000  full      xsvnic4
net17             global    Ethernet             up         32000  full      xsvnic2
net20             global    Ethernet             up         32000  full      xsvnic5
net23             global    Ethernet             up         32000  full      xsvnic7 -------->
net18             global    Ethernet             up         10000  full      xsvnic3
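The two listings above can be compared mechanically rather than by eye. The following is a minimal Python sketch, not part of the original resolution. It assumes the listings resemble `dladm show-phys` output, with the link name in the first column and the device name in the last, and it ignores the arrow annotations. The names `parse_links` and `changed_links` are hypothetical helpers introduced for illustration.

```python
# Listings copied from this document; the column layout is an assumption.
AFTER_REBOOT = """\
net17             Ethernet             up         32000  full      xsvnic2
net18             Ethernet             up         10000  full      xsvnic3
net19             Ethernet             up         32000  full      xsvnic7
net20             Ethernet             up         32000  full      xsvnic5
net21             Ethernet             up         32000  full      xsvnic0
net22             Ethernet             up         10000  full      xsvnic1
net23             Ethernet             up         32000  full      xsvnic6
net24             Ethernet             up         32000  full      xsvnic4
"""

ORIGINAL = """\
net21             global    Ethernet             up         32000  full      xsvnic0
net25             global    Ethernet             up         10000  full      xsvnic1 -------->
net22             global    Ethernet             up         32000  full      xsvnic6 -------->
net24             global    Ethernet             up         32000  full      xsvnic4
net17             global    Ethernet             up         32000  full      xsvnic2
net20             global    Ethernet             up         32000  full      xsvnic5
net23             global    Ethernet             up         32000  full      xsvnic7 -------->
net18             global    Ethernet             up         10000  full      xsvnic3
"""


def parse_links(listing):
    """Map device name -> link name for each row of a listing."""
    mapping = {}
    for line in listing.splitlines():
        # Drop annotation tokens made only of '-' and '>' characters.
        fields = [f for f in line.split() if f.strip("->")]
        if len(fields) < 2:
            continue  # blank or unparseable row
        mapping[fields[-1]] = fields[0]
    return mapping


def changed_links(expected, observed):
    """Return {device: (expected_link, observed_link)} where they differ."""
    return {
        dev: (expected[dev], observed[dev])
        for dev in sorted(expected)
        if dev in observed and expected[dev] != observed[dev]
    }


changed = changed_links(parse_links(ORIGINAL), parse_links(AFTER_REBOOT))
for dev, (want, got) in changed.items():
    print(f"{dev}: expected {want}, found {got}")
```

On the data shown here, the three devices reported are the same three rows marked with arrows in the corrected listing.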

  

 

Changes

  i) Reboot of the Solaris/SPARC IO Domains (after an issue with hung storage, the IO Domain was rebooted to recover from the hang state)

and/or

  ii) Disconnect/reconnect of Server Profiles in the OFM/XFM GUI manager for the F1-15.

 

Cause

There are two independent causes here:


i) The netname change after the reboot of the IO Domain was caused by a bug in Solaris 11.1 relating to a timing issue in "nwamd": the system would purge physical datalinks that appeared to be no longer present, via a call to "dladm init-phys". This was ultimately tracked down to Bug 18354100, which is corrected in Solaris 11.2 SRU 8.4 and above, and in Solaris 11.3.

ii) The netname change after the disconnect/reconnect of Server Profiles in the OFM/XFM GUI manager is a consequence of the OFM/XFM design: HCA ports are assigned to VNICs essentially at random on reconnect. This causes no issue for other operating systems, which use only the VNIC name for the node/device, whereas Solaris uses the device-port-ID as part of the netname mapping. The workaround is to use Down/Up to recover a Server Profile, instead of Disconnect/Connect. An RFE has been filed to correct this in a future version of OFM/XFM; refer to Bug 20844134 / RFE 19519892.

 

Solution

 i) Update to Solaris 11.2 SRU 8.4 or later, or to Solaris 11.3.

ii) When working with Solaris hosts, use Down/Up on the Server Profile instead of Disconnect/Connect. Alternatively, use the CLI for disconnect/connect, which gives more control over HCA port assignment.
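If the netnames have already shifted before either measure is in place, the manual correction described under Symptoms has to cope with names that are swapped among links (for example, net22 is wanted by one device while still held by another). The source does not say how the correction was performed. The sketch below is one possible approach, assuming the standard Solaris `dladm rename-link` subcommand is used and that each affected link is first parked under a temporary name. The function `plan_renames` and the `tmpnet` prefix are hypothetical. Note that `dladm` generally refuses to rename a link that is in use, so any IP interface on the link may need to be removed first; the printed commands should be reviewed before being run on a host.

```python
def plan_renames(current, desired, temp_prefix="tmpnet"):
    """Build dladm rename-link commands that turn `current` into `desired`.

    Both arguments map device name -> link name. Links are renamed in two
    phases so that a wanted name is always free when it is assigned.
    """
    moved = sorted(
        dev for dev in desired
        if dev in current and current[dev] != desired[dev]
    )
    # Phase 1: park every affected link under a temporary name.
    park = [
        f"dladm rename-link {current[dev]} {temp_prefix}{i}"
        for i, dev in enumerate(moved)
    ]
    # Phase 2: move each parked link to its intended name.
    place = [
        f"dladm rename-link {temp_prefix}{i} {desired[dev]}"
        for i, dev in enumerate(moved)
    ]
    return park + place


# Mappings taken from the example listings in this document.
current = {"xsvnic1": "net22", "xsvnic6": "net23", "xsvnic7": "net19"}
desired = {"xsvnic1": "net25", "xsvnic6": "net22", "xsvnic7": "net23"}

for command in plan_renames(current, desired):
    print(command)
```

The two-phase ordering is chosen for simplicity: it costs a few extra renames but avoids having to work out a collision-free order by hand.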

 

References

<BUG:19519892> - REDESIGN ALGORITHM FOR ALLOCATING PORTS IN CLOUDS AND SERVER PROFILES TO HCAS
<BUG:20900286> - "LINK" - "DEVICE" NAME MAPPING WAS CHANGED UNEXPECTEDLY AFTER FORCE REBOOT.
<BUG:20844134> - OFM: SERVER PROFILE DISCONNECT-RECONNECT INCONSISTENT BEHAVIOR
<BUG:18354100> - 60+ SECOND INSTALL DELAY DUE TO NWAMD

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.