Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

2188787.1: SPARC M5-32 and M6-32 Systems With System Firmware Versions 9.5.3 through 9.6.5.a May Experience Solaris Panic Due to Certain PCIe Fabric Errors
Applies to:

SPARC M6-32
SPARC M5-32
Sun Hardware - Generic
Sun Software - Generic
SPARC
_____________________________________________

Date of Resolved Release: 06-Oct-2016
_____________________________________________

Description

On SPARC M5-32 and M6-32 systems, a problem in system firmware versions 9.5.3 through 9.6.5.a may allow the hardware to enter a state that it interprets as a "surprise removal" of an entire PCIe bus. This condition causes Solaris to panic.

A system and all its hosts are vulnerable to this issue if one or more of the following sequences of events has taken place after an affected version of the firmware was installed:

- Power was removed from the entire chassis (an AC cycle)

Once the system is vulnerable, the issue may occur at any time, but it has been seen to take months before it is triggered.

Occurrence

This issue can occur on the following platforms:

SPARC Platform

- SPARC M5-32/M6-32 systems with Sun System Firmware versions 9.5.3 through 9.6.5.a
Notes:

1. The x86 Platform is not affected by this issue.

To determine the firmware version installed on the system, use the following ILOM command:

    -> show /System system_fw_version

Symptoms

If the described issue occurs, the operating system will panic as shown below, preceded by a PCIe "surprise removal" warning message. The message appears on the console and in the history file (/hostX/console/history), where X is the specific host that panicked:

    WARNING: Link retraining detected in SP port pcieb222
    SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
    panic[cpu0]/thread=2a10009dc20: Fatal error has occured in: PCIe fabric.(0x1)(0x105)
    000002a10009d590 px:px_err_panic+1c4 (208ea000, 1, 105, 1223ec00, 1, 208e7018)
    syncing file systems...

Workaround

There is no workaround for this issue. Please see the "Resolution" section below.

Resolution

This issue is addressed in the following release:

SPARC Platform

- SPARC M5-32/M6-32 system firmware version 9.6.6.a
Note: After loading the firmware, it is important to shut off power to each physical domain and reset the Active SP. Either of the following two procedures can be followed:

A) Update all Physical Domains (PDOMs) at the same time:

1) Power off all running PDOMs:

    -> stop /System

2) Load the new system firmware version 9.6.6.a as described in "Mx-32 - How to update System Firmware (ILOM / HC / POST / HV / OBP / GM )" - <Document:1981675.1>

3) Then power up each physical domain:

    -> start /HOSTx (where x corresponds to 0, 1, 2, or 3)

OR:

B) If it is desirable to keep one or more of the physical domains running during the firmware upgrade:

1) Load the new system firmware version 9.6.6.a as described in "Mx-32 - How to update System Firmware (ILOM / HC / POST / HV / OBP / GM )" - <Document:1981675.1>

2) Then, when possible, power down the PDOM:

    -> stop /HOSTx

3) Then reset the SP:

    -> reset /SP

4) After this, it is safe to power the PDOM back up:

    -> start /HOSTx (where x corresponds to 0, 1, 2, or 3)

Notes:

(a) The SP reset takes approximately 20 minutes to complete before you can continue to start the HOST.

(b) If running POST at max level is not desired because of the longer downtime, temporarily set 'hw_change_level' to 'min' before resetting the SP. This can be done as follows:

    -> set /HOSTx/diag hw_change_level=min

As stated above, the system remains vulnerable until all PDOMs have been powered down, the SP has been reset, and power has been restored to the physical domain.

Patches

<Patch:24736423>

History

06-Oct-2016: Document released, status is Resolved

This issue was caused by a regression introduced by bug 21424759. This change forced

Comments regarding any portion of this document should be submitted to

Internal Contributor/Submitter: marcel.widjaja@oracle.com

References

<BUG:24351722> - MULTIPLE PANIC FATAL ERROR HAS OCCURRED IN: PCIE FABRIC.

Attachments

This solution has no attachment
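
Illustration: checking a firmware version against the affected range

The affected range stated in this document (9.5.3 through 9.6.5.a) can be checked mechanically. The following is a minimal Python sketch, not an Oracle tool. It assumes that a version string has three numeric fields and an optional single-letter suffix (for example 9.6.5.a) and that versions order field by field with the suffix compared last. The function names and the ordering rule are illustrative assumptions. Because the exact output layout of the ILOM show command is not given here, the parser simply takes the first version-like token in whatever text it receives.

```python
import re

# Affected range as stated in this document: 9.5.3 through 9.6.5.a.
AFFECTED_MIN = "9.5.3"
AFFECTED_MAX = "9.6.5.a"

# Assumed version layout: three numeric fields plus an optional letter suffix.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:\.([a-z]))?")


def parse_fw_version(text):
    """Return (major, minor, micro, letter) for the first version-like token."""
    match = _VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"no firmware version found in {text!r}")
    major, minor, micro, letter = match.groups()
    # A missing suffix becomes "", which sorts before any letter.
    return (int(major), int(minor), int(micro), letter or "")


def is_affected(text):
    """True if the version found in text lies inside the affected range."""
    version = parse_fw_version(text)
    return parse_fw_version(AFFECTED_MIN) <= version <= parse_fw_version(AFFECTED_MAX)


if __name__ == "__main__":
    for candidate in ("9.5.3", "9.6.5.a", "9.6.6.a"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

A version inside the range only indicates exposure. As described above, a system on fixed firmware remains vulnerable until the PDOMs have been powered down and the SP has been reset.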
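
Illustration: scanning a console history for the panic signature

The symptom described in this document is a PCIe fabric panic preceded by a link retraining warning. The sketch below, offered as a hedged example only, searches a saved copy of the console history for that sequence. The function name is hypothetical, the two marker strings are copied from the sample output in this document (including the spelling "occured" as it appears there), and it is an assumption that each message sits on its own line in the captured text.

```python
# Marker strings copied from the sample console output in this document.
WARNING_MARKER = "WARNING: Link retraining detected"
PANIC_MARKER = "Fatal error has occured in: PCIe fabric"  # spelling as in the sample


def find_pcie_fabric_panic(lines):
    """Return (warning_index, panic_index) for the first PCIe fabric panic
    that follows a link retraining warning, or None if there is none."""
    last_warning = None
    for index, line in enumerate(lines):
        if WARNING_MARKER in line:
            last_warning = index
        elif PANIC_MARKER in line and last_warning is not None:
            return (last_warning, index)
    return None


if __name__ == "__main__":
    sample = [
        "WARNING: Link retraining detected in SP port pcieb222",
        "SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major",
        "panic[cpu0]/thread=2a10009dc20: Fatal error has occured in: "
        "PCIe fabric.(0x1)(0x105)",
    ]
    print(find_pcie_fabric_panic(sample))
```

A match is only a hint that the panic may be the one described here. Other PCIe fabric errors can produce similar messages, so the installed firmware version and power history still need to be checked.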