Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

2380939.1 : SPARC M5, M6, and M7 Servers Physical Domain(s) may Unexpectedly Reset Following the Reset of the Active Service Processor
Applies to:

SPARC - Sun System Firmware
Sun Hardware - Generic
SPARC M5-32
SPARC M6-32
SPARC M7-8
Information in this document applies to any platform.
SPARC

Date of Resolved Release: 30-Mar-2018

Description

Physical Domains on SPARC M5, M6, and M7 servers with certain firmware (as listed below) may unexpectedly reset following an active Service Processor (SP) reset. This can occur either when a user explicitly requests an SP reset (e.g., 'reset /SP') or during a System Firmware upgrade that initiates the reset of the Active SP.

Occurrence

This issue can occur on the following platforms:

SPARC Platform
Note: To determine the firmware version installed on the system, use the following ILOM command:

-> show /HOST sysfw_version
/HOST

Symptoms

When a 'reset /SP' command is issued on the Service Processor, one or more of the hosts may reset and reboot. During a firmware upgrade, this issue can occur if the firmware version being upgraded from is earlier than 9.6.20.b (on SPARC M5/M6) or 9.8.0.d (on SPARC M7).

This issue only occurs if:
In both of the above failure scenarios, the Host never reaches a stopped state. A pending shutdown is retained within the SP Host state transition. When the SP recovers following its reset, it resumes the Host reset and an unexpected domain outage occurs. The aborted state transition can be weeks or months old.

During a normal reset or stop/start of the Host, a 'Host stopped' message is evident in the Host status log and SP event log. To verify whether the Host actually stopped, use the following commands to check the Host's status log and SP event log:

Host status log:

-> show /HOST0/status_history/list
20180316 11:57:35: status='Host shutting down'
20180316 11:58:34: status='Solaris panicking'
20180316 11:58:55: status='Solaris rebooting'
20180316 11:58:56: status='Host stopped' <<< Host stopped indication
20180316 11:58:59: status='Standby'
20180316 11:59:00: Shutdown Host in progress

SP event log:

-> show /SP/logs/event/list
719 Fri Mar 16 11:58:59 2018 System Log minor Host ID 0: Standby
718 Fri Mar 16 11:58:56 2018 System Log minor Host ID 0: Host stopped
717 Fri Mar 16 11:58:55 2018 System Log minor Host ID 0: Solaris rebooting

Note: If the 'Host stopped' message is absent from the Host status list, then the SP is unaware that the Host reset or restarted, and a pending reboot action remains in effect on the SP. This means that the next time the SP is reset, a command to reset the Host is issued.
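The check described above can be sketched as a small script. This is only an illustration, not an Oracle tool: the function names `parse_status_history` and `unconfirmed_shutdowns` are hypothetical, the line format is inferred from the sample status logs in this document, and the log text is assumed to have been captured from the ILOM session by some other means and to be ordered oldest first.

```python
import re

# Status strings as they appear in the sample logs in this document.
SHUTTING_DOWN = "Host shutting down"
STOPPED = "Host stopped"

# Assumed line format: "YYYYMMDD HH:MM:SS: status='<text>'" (trailing text ignored).
STATUS_RE = re.compile(r"^(\d{8} \d{2}:\d{2}:\d{2}): status='([^']*)'")


def parse_status_history(text):
    """Return (timestamp, status) pairs; lines that do not match are skipped."""
    entries = []
    for line in text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries


def unconfirmed_shutdowns(entries):
    """Return timestamps of 'Host shutting down' entries that are not
    followed by 'Host stopped' before the next shutdown or the end of the log."""
    unconfirmed = []
    pending = None
    for timestamp, status in entries:
        if status == SHUTTING_DOWN:
            if pending is not None:
                unconfirmed.append(pending)
            pending = timestamp
        elif status == STOPPED:
            pending = None
    if pending is not None:
        unconfirmed.append(pending)
    return unconfirmed


# Sample data copied from the status logs in this document.
STOPPED_SAMPLE = """\
20180316 11:57:35: status='Host shutting down'
20180316 11:58:34: status='Solaris panicking'
20180316 11:58:55: status='Solaris rebooting'
20180316 11:58:56: status='Host stopped' <<< Host stopped indication
20180316 11:58:59: status='Standby'
20180316 11:59:00: Shutdown Host in progress
"""

NOT_STOPPED_SAMPLE = """\
20180316 09:57:14: status='Host shutting down'
20180316 09:57:59: status='Solaris panicking'
20180316 09:58:26: status='Solaris rebooting'
20180316 09:58:41: status='Solaris rebooting'
20180316 09:58:44: status='OpenBoot initializing'
20180316 09:59:00: status='OpenBoot Primary Boot Loader'
20180316 09:59:06: status='OpenBoot Primary Boot Loader'
20180316 09:59:27: status='OpenBoot Running OS Boot'
20180316 10:01:10: status='Solaris running'
"""

if __name__ == "__main__":
    for name, sample in (("stopped", STOPPED_SAMPLE), ("not stopped", NOT_STOPPED_SAMPLE)):
        pending = unconfirmed_shutdowns(parse_status_history(sample))
        print(name, "->", pending if pending else "no unconfirmed shutdown")
```

A non-empty result only suggests that the SP may still hold a pending Host state transition; the status log and SP event log on the system itself remain the reference for deciding whether a stop and start of the Host is needed.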
Host status log:

20180316 09:57:14: status='Host shutting down'
20180316 09:57:59: status='Solaris panicking'
20180316 09:58:26: status='Solaris rebooting'
20180316 09:58:41: status='Solaris rebooting'
20180316 09:58:44: status='OpenBoot initializing'
20180316 09:59:00: status='OpenBoot Primary Boot Loader'
20180316 09:59:06: status='OpenBoot Primary Boot Loader'
20180316 09:59:27: status='OpenBoot Running OS Boot'
20180316 10:01:10: status='Solaris running'

SP event log:

63816 Fri Mar 16 09:58:42 2018 System Log minor Host ID 0: Solaris rebooting
63815 Fri Mar 16 09:58:26 2018 System Log minor Host ID 0: Solaris rebooting
63814 Fri Mar 16 09:58:00 2018 System Log minor Host ID 0: Solaris panicking
63813 Fri Mar 16 09:57:14 2018 System Log minor Host ID 0: Host shutting down
63812 Fri Mar 16 09:57:12 2018 Reset Log major Reset of /HOST0 by root succeeded.

In the above example, the Host panicked and rebooted during the Host reset, so it never reached the stopped state before coming back up.

Workaround

If the Host status log or SP event log does not show the 'Host stopped' message during a Host reset or Host stop/start, schedule downtime at the earliest convenience to stop and start the Host so that the SP is aware of the proper state of the Host.

Resolution

This issue is addressed in the following releases:

SPARC Platform
History

30-Mar-2018: Document released, status is Resolved

This issue is not seen on M8 servers since the minimum ILOM version on the M8 servers is 4.0.0.1.c. The bug listed here addresses only part of the solution; other changes to SP states address the whole issue.

References

<BUG:23309265> - FAILED HOST SHUTDOWN MAY LEAD TO UNEXPECTED HOST SHUTDOWN AFTER SP REBOOT

Attachments

This solution has no attachment