Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Technical Instruction Sure Solution
1517629.1 : How to Perform an Oracle Fabric Interconnect Firmware (XgOS) Upgrade
Applies to:
Oracle Fabric Interconnect F1-4 - Version All Versions and later
Oracle Fabric Interconnect F1-15 - Version All Versions and later
Oracle Virtual Compute Appliance X3-2 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal
Oracle Fabric Interconnect Firmware Upgrade Best Practices

Solution
DISPATCH INSTRUCTIONS
WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?:

Several pre-upgrade checks can be performed before the XgOS upgrade.

Before commencing the XgOS upgrade, review this KB article to see whether it is relevant to the version of XgOS being installed: Oracle Virtual Networking - After Upgrade to 4.0.7 XgOS, QDR IB Switch F/W is Not Changed (Doc ID 2204491.1)

Make sure the number of vnics/vhbas per IO card matches across both Fabric Interconnects. This helps to find server-profiles that are not configured for redundancy. From the user 'admin' CLI, run:

    show iocards
Note that the 'v-resources' column is the number of vnics or vhbas terminated. Example:

    admin@f1-4-sc11-a[xsigo] show iocards
    slot  state               descr  type                     v-resources
    -------------------------------------------------------------------------------
    1     up/resourceMissing         sanFc2Port4GbLrCard      1
    2     up/up                      nwEthernet10Port1GbCard  0
    3     up/up                      nwEthernet1Port10GbCard  1
    3 records displayed
    admin@f1-4-sc11-a[xsigo]
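The slot-by-slot comparison of v-resources can be scripted once the 'show iocards' output from each Fabric Interconnect has been saved as text. The following is a minimal sketch in Python, not an Oracle tool. It assumes the output has the same layout as the example (slot number first, v-resources count last). The function names parse_iocards and iocard_mismatches are hypothetical, and OUTPUT_B is invented sample data for the second Fabric Interconnect.

```python
def parse_iocards(output):
    """Map slot number -> v-resources count from saved 'show iocards' text."""
    counts = {}
    for line in output.splitlines():
        fields = line.split()
        # Assumption: data rows start with the slot number and end with the
        # v-resources count; header, prompt and summary lines do not.
        if len(fields) >= 3 and fields[0].isdigit() and fields[-1].isdigit():
            counts[int(fields[0])] = int(fields[-1])
    return counts


def iocard_mismatches(output_a, output_b):
    """Return {slot: (count_a, count_b)} for slots whose counts differ."""
    a, b = parse_iocards(output_a), parse_iocards(output_b)
    return {
        slot: (a.get(slot), b.get(slot))
        for slot in sorted(set(a) | set(b))
        if a.get(slot) != b.get(slot)
    }


# Output copied from the example above (first Fabric Interconnect).
OUTPUT_A = """\
slot state descr type v-resources
1 up/resourceMissing sanFc2Port4GbLrCard 1
2 up/up nwEthernet10Port1GbCard 0
3 up/up nwEthernet1Port10GbCard 1
3 records displayed
"""

# Hypothetical output from the second Fabric Interconnect (slot 3 differs).
OUTPUT_B = """\
slot state descr type v-resources
1 up/resourceMissing sanFc2Port4GbLrCard 1
2 up/up nwEthernet10Port1GbCard 0
3 up/up nwEthernet1Port10GbCard 0
3 records displayed
"""

print(iocard_mismatches(OUTPUT_A, OUTPUT_B))
```

Any slot reported here terminates a different number of vnics or vhbas on the two Fabric Interconnects and is worth checking for a server-profile that lacks redundancy.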
Additionally, compare server-profiles across BOTH Fabric Interconnects to make sure that the vnic and vhba counts match. EXAMPLE:

    admin@f1-4-sc11-a[xsigo] show server-profile

Check the contents of the logs to see whether any warnings or errors are being logged repeatedly. Running 'showlog <logfilename>' is like running 'tail -f' on live logs: it displays the most recent events that are being logged. Run the commands below as user 'admin':

    showlog user.log
    showlog syslog.log
    showlog xvnd.log
    showlog ib.log
    showlog opensm.log
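If log text has been captured to a file, repeated warnings or errors can be counted offline rather than by eye. The following Python sketch is illustrative only. The log line format is not documented here, so the sample lines are hypothetical, and the helper names find_log_spew and has_link_down are assumptions. The only string taken from this document is 'link=0', which is the link-loss marker in xvnd.log.

```python
import re
from collections import Counter

# Assumption: problem lines contain "warn" or "error" in any letter case.
KEYWORDS = re.compile(r"warn|error", re.IGNORECASE)


def find_log_spew(log_text, threshold=3):
    """Return {normalized message: count} for messages seen >= threshold times."""
    counts = Counter()
    for line in log_text.splitlines():
        if KEYWORDS.search(line):
            # Replace digit runs so that lines differing only in timestamps
            # or counters are grouped as the same message.
            counts[re.sub(r"\d+", "#", line.strip())] += 1
    return {msg: n for msg, n in counts.items() if n >= threshold}


def has_link_down(xvnd_log_text):
    """True if the text contains the 'link=0' marker."""
    return re.search(r"\blink=0\b", xvnd_log_text) is not None


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = """\
12:00:01 WARNING port 3 flap
12:00:02 WARNING port 3 flap
12:00:03 WARNING port 3 flap
12:00:04 INFO ok
"""

print(find_log_spew(SAMPLE_LOG))
print(has_link_down("vnic eth1 link=0"))
```

The threshold of 3 is arbitrary; a message that recurs at all during a quiet period may still justify opening an SR.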
If repetitive warnings, errors, or questionable messages keep repeating in the logs (log spew), contact the Oracle Support Hotline and open an SR. Do not proceed with the XgOS upgrade if you see an ongoing, current loss of vnics, vhbas or link (link=0 in xvnd.log). You may need to open an SR with Oracle Support if you cannot resolve a failure to fail over when performing 'set server-profile * down' or when setting specific server-profiles down a few at a time.

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE?:

To perform the XgOS upgrade, follow these instructions after reading in full the Product Notes for the specific XgOS version.

Check the OpenSM state on each Fabric Interconnect:

    show diagnostics opensm-param
Look at "SM State" to see whether the Fabric Interconnect node is Master or Standby (meshed IB Fabric). In a dual IB Fabric, both Fabric Interconnects will be Master. OpenSM should run <only> on the internal OVN F1-15 IB Switches, not on any external Sun NM2 IB Switch. The example output of 'show diagnostics opensm-param' below shows three IB Switches, with an external Sun IB Switch running OpenSM as "Master" with priority 14:

    admin@f1-4-sca11-a[xsigo] show diagnostics opensm-param
    OpenSM $ Current log level is 0x3
    OpenSM $ Current sm-priority is 0
    OpenSM $ OpenSM Version : OpenSM 3.3.13
    SM State : Master
    SM Priority : 0
    SA State : Ready
    Routing Engine : minhop
    Loaded event plugins : <none>

    PerfMgr state/sweep state : Disabled/Sleeping

    MAD stats
    ---------
    QP0 MADs outstanding : 0
    QP0 MADs outstanding (on wire) : 0
    QP0 MADs rcvd : 20984033
    QP0 MADs sent : 20984006
    QP0 unicasts sent : 2160272
    QP0 unknown MADs rcvd : 0
    SA MADs outstanding : 0
    SA MADs rcvd : 182194380
    SA MADs sent : 182194380
    SA unknown MADs rcvd : 0
    SA MADs ignored : 0

    Subnet flags
    ------------
    Ignore existing lfts : 0
    Subnet Init errors : 0
    In sweep hop 0 : 0
    First time master sweep : 0
    Coming out of standby : 0

    Known SMs
    Port GUID         SM State  Priority
    ---------         --------  --------
    0x1397020100xxxx  Standby   0         SELF
    0x10e0650f3fxxxx  Master    14
    0x1397020100xxxx  Standby   0

In the Known SMs list, the entries whose GUID begins with 0x1397 are the OVN F1-15 internal IB Switches. The entry whose GUID begins with 0x10e0 is the external Sun NM2 IB Switch, and it is shown as the OpenSM "Master". Its priority of 14 is higher than the default priority of 0 on the OVN F1-15 IB Switches. ONE of the OVN F1-15 HA pair MUST be the OpenSM master. To resolve this, see the solution immediately below.
1) Log into the upstream Sun NM2 IB Switch that matches the GUID shown under "Known SMs" and disable OpenSM on that external IB Switch. OpenSM must run <only> on one of the internal Mellanox IB Switches in the F1-15 HA pair.
2) Set the OpenSM priority to 15 on ONE of the F1-15 HA pair, using:

    # set diagnostics opensm-param priority 15
    # resweep
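The check for an external OpenSM master can also be run against saved 'show diagnostics opensm-param' output. The Python sketch below is an illustration under stated assumptions. It assumes each Known SMs row has the form "GUID state priority", as in the example output. It also treats the GUID prefix 0x1397 as identifying the internal F1-15 switches, based only on that same example. The names known_sms and external_masters are hypothetical.

```python
import re

# Assumption: a Known SMs row looks like "<GUID> <state> <priority> [SELF]".
ROW = re.compile(r"\s*(0x\S+)\s+(\w+)\s+(\d+)")


def known_sms(output):
    """Return (guid, state, priority) tuples found in the saved output."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match.group(1), match.group(2), int(match.group(3))))
    return rows


def external_masters(output, internal_prefix="0x1397"):
    """Return GUIDs in Master state that do not carry the internal prefix."""
    return [
        guid
        for guid, state, _priority in known_sms(output)
        if state == "Master" and not guid.startswith(internal_prefix)
    ]


# Known SMs rows copied from the example output in this document.
SAMPLE = """\
Known SMs
Port GUID SM State Priority
--------- -------- --------
0x1397020100xxxx Standby 0 SELF
0x10e0650f3fxxxx Master 14
0x1397020100xxxx Standby 0
"""

print(external_masters(SAMPLE))  # prints ['0x10e0650f3fxxxx']
```

An empty list means no external switch is listed as OpenSM master; a non-empty list points at the switch on which OpenSM should be disabled.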
Set server-profiles down in small groups and verify failover:

    set server-profile <profile_name*> down

(Use the * wildcard for pattern matching to set small groups of server-profiles down, in order to verify that failover occurred correctly.) If at any time failover of a vnic, vhba or server-profile did not occur, STOP the pre-upgrade process and correct the failover failure before proceeding. DO NOT initiate the XgOS upgrade if any v-star (vnic or vhba) or server-profile failover fails!

Run the upgrade with 'system upgrade <file.xpf>'. EXAMPLE:

    system upgrade xgos-3.9.2.xpf

Start on Chapter 20, page 369, Upgrading XgOS.

OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE?:

The Fabric Interconnect will reboot as a result of the XgOS upgrade. After it has come fully up, log in as user 'root', wait a few minutes, and then su to admin. Set the server-profiles back up, for example:

    set server-profile profile* up
    set server-profile mysystem* up

Verify that all server-profiles, vnics and vhbas are fully up AND verify that the vhbas and vnics are fully up on the hosts BEFORE starting to upgrade the XgOS of the second Fabric Interconnect.

PLEASE do NOT perform 'system downgrade' if you encounter problems after the upgrade. Instead, open an SR and upload the Fabric Interconnect diagnostic log bundles plus the upstream ethernet switch configs and logs so that the cause of failure can be analyzed.
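When planning the wildcard groups used with 'set server-profile ... down' and 'set server-profile ... up', it can help to preview offline which server-profiles each pattern would select and whether any profile is left out. The Python sketch below does this with the standard fnmatch module. It assumes that the XgOS * wildcard behaves like shell-style globbing, which this document does not confirm. The profile names and the helper names profiles_matching and unmatched_profiles are hypothetical.

```python
from fnmatch import fnmatchcase


def profiles_matching(pattern, profile_names):
    """Return the profile names selected by a shell-style wildcard pattern."""
    return [name for name in profile_names if fnmatchcase(name, pattern)]


def unmatched_profiles(patterns, profile_names):
    """Return the profile names that none of the patterns would select."""
    return [
        name
        for name in profile_names
        if not any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical server-profile names, e.g. copied from 'show server-profile'.
PROFILES = ["profile1", "profile2", "mysystem1", "dbhost1"]

print(profiles_matching("profile*", PROFILES))
print(unmatched_profiles(["profile*", "mysystem*"], PROFILES))
```

Profiles returned by unmatched_profiles would not be covered by the staged groups, so they would need their own pattern before and after the upgrade.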
NOTE: If there is an outage that appears to be due to the upgrade to 4.x XgOS, and the customer deems it necessary to *rollback* or *downgrade* to the previous XgOS version (for instance, from 4.x XgOS to 3.9.x XgOS), there are some things to be aware of. The 'system upgrade' command installs a brand new image, whereas the 'system downgrade' command points to a previous XgOS image.

When the currently installed version is 4.x XgOS, the 'system upgrade 3.9.x-XGOS.xpf' command cannot be used to go back to 3.9.x XgOS. 4.x XgOS has a new version of login encryption, and going back with 'system upgrade' breaks the login to the Fabric Interconnect. The only way to recover a Fabric Interconnect that had 4.x XgOS installed and was then downgraded to 3.9.x XgOS using the 'system upgrade' command is to ship a new Gen2 Front Panel to the customer and then install the requested version of XgOS.

If the customer has an urgent need to downgrade, the 'system downgrade 3.9.x-XGOS.xpf' command *must* be used instead, to avoid losing login access to the Fabric Interconnect when going back to the older 3.9.x XgOS version. The 'system downgrade' command was specifically tested by OVN QA on 4.0.x XgOS, which was then downgraded to 3.9.2. This is the only recent XgOS code branch on which OVN QA tested the 'system downgrade' command.

References
NOTE:2204491.1 - Oracle Virtual Networking - After Upgrade to 4.0.7 XgOS, QDR IB Switch F/W is Not Changed

Attachments
This solution has no attachment