Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2342857.1 : SPARC M7 HOST may panic upon forced SP failover (e.g., reset by Standby)
Created from <SR 3-16418793701>

Applies to:
SPARC M7-8 - Version All Versions to All Versions [Release All Releases]
SPARC M7-16 - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster M7 Hardware - Version All Versions to All Versions [Release All Releases]
Oracle Solaris on SPARC (64-bit)

Symptoms
A HOST may suffer an unexpected drop to the OBP debugger upon a forced SP failover, either through explicit user action or catastrophic SP reset. The HOST console will show the drop to the OBP debugger. Upon selecting 's' to sync, the domain will panic.

panic[cpu37]/thread=2a10a3b9b80: sync initiated
Domain messages immediately preceding the drop to the debugger will include messages such as those below. The key messages to recognize are "Link retraining detected" and "Surprise removal of mga0 detected".

Dec 14 12:38:09 hostname pcie: [ID 297812 kern.info] NOTICE: Live Suspend: port pci.0,0: child dev mga#0(400417ccab8) and descendants

Various issues may cause a forced SP failover. One condition known to trigger the failover is SP memory exhaustion. This condition can only be diagnosed by an Oracle engineer when an SP snapshot is provided.

-> show -script /SP/logs/event/list
SR owners can confirm forced failover in trace messages in the SSM ring.

#SP Trace logs#
SSM 2017-12-14 12:38:44.860240 1382 ssm_priv_link.c:703 [4] no message received in 61 secs
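When a long SP trace capture has to be reviewed, the heartbeat-loss entries can be pulled out with a short script. The following is a minimal sketch, not an Oracle tool: the function name `find_heartbeat_gaps` is hypothetical, and the line layout it matches is inferred only from the single sample trace line above, so other SSM messages may need a different pattern.

```python
import re
from datetime import datetime

# Pattern inferred from the one sample line in this document (assumption):
#   SSM <date> <time.micros> <id> <file>:<line> [n] no message received in N secs
_GAP_RE = re.compile(
    r"SSM\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{1,6})\s+.*"
    r"no message received in (\d+) secs"
)


def find_heartbeat_gaps(lines):
    """Return (timestamp, silent_seconds) for each matching SSM trace line."""
    gaps = []
    for line in lines:
        match = _GAP_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            gaps.append((stamp, int(match.group(2))))
    return gaps


if __name__ == "__main__":
    sample = (
        "SSM 2017-12-14 12:38:44.860240 1382 ssm_priv_link.c:703 [4] "
        "no message received in 61 secs"
    )
    for stamp, seconds in find_heartbeat_gaps([sample]):
        print(f"{stamp.isoformat()} silent for {seconds} s")
```

The timestamps returned can then be compared by hand against the time of the "Live Suspend" host message to confirm that the two events belong to the same incident.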
The issue impacts all System Firmware releases earlier than 9.8.0.d. The SysFW version can be displayed with this command:

-> show /System/Firmware system_fw_version

Changes
Cause
A forced SP failover causes the unintended loss of the HOST. The host encountered a known issue described in <BUG 23621056> - Host dropped to Debugger when running SP force failover. The reason for the SP failover was a watchdog timeout due to memory exhaustion.
Solution
Workaround: Clear any faults; do not replace any hardware.

Fix: Install SysFW 9.8.0.d or higher.

Important Note: For Oracle SuperCluster M7 Hardware (SuperCluster Patch Policy)
A QFSDP release is the supported vehicle for SysFW deployment on SuperCluster. See Doc ID 1567979.1 for details. It may be necessary to seek exception approval for a SysFW upgrade outside a QFSDP release. Never tell a SuperCluster customer to patch an individual component in isolation. SysFW 9.8.0.d or higher is not yet in a QFSDP and must receive exception approval on a case-by-case basis for SuperCluster. Reactive patching is only allowed for critical issues with no easy/viable workaround. For approval, always check with the SuperCluster Maintenance Group first - ssc_maintenance_grp@oracle.com.
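To check whether a reported system_fw_version falls below the 9.8.0.d fix level, the comparison can be sketched as follows. This is an illustration under stated assumptions: the helper names `parse_sysfw` and `is_affected` are hypothetical, the version is assumed to be three numeric fields with an optional trailing lowercase letter (as in "9.8.0.d"), and a missing letter is assumed to sort before "a". The exact text returned by the ILOM command is not shown in this document, so the sketch simply takes the first version-like token in whatever string it is given.

```python
import re

FIX_LEVEL = "9.8.0.d"  # fix level stated in this document

# Assumed layout: major.minor.micro with an optional ".<letter>" suffix.
_VER_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:\.([a-z]))?")


def parse_sysfw(text):
    """Extract the first version-like token as a comparable tuple."""
    match = _VER_RE.search(text)
    if match is None:
        raise ValueError(f"no SysFW version found in {text!r}")
    major, minor, micro, letter = match.groups()
    # An absent letter becomes "", which sorts before any letter (assumption).
    return (int(major), int(minor), int(micro), letter or "")


def is_affected(version_text, fix_level=FIX_LEVEL):
    """True when the version is earlier than the fix level."""
    return parse_sysfw(version_text) < parse_sysfw(fix_level)


if __name__ == "__main__":
    for candidate in ("9.7.2.c", "9.8.0.c", "9.8.0.d", "9.8.1"):
        state = "affected" if is_affected(candidate) else "at or above fix level"
        print(f"{candidate}: {state}")
```

For SuperCluster systems the result only indicates exposure; the upgrade path still follows the QFSDP and exception-approval policy described above.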
References
<NOTE:2064922.1> - ILOM-8000-F7 - the link between the Service Processor and host has a heartbeat failure
<NOTE:2063349.1> - SPARC M7 Series Servers : Interconnect - EoUSB
<NOTE:1967027.1> - SPARC M8 and SPARC M7 Series Servers : Current Issues Page
<BUG:23621056> - HOST DROPPED TO DEBUGGER WHEN RUNNING SP FORCE FAILOVER
<NOTE:1567979.1> - Oracle SuperCluster Supported Software Versions - All Hardware Types

Attachments
This solution has no attachment