Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2245358.1 : Solaris sun4v domains may panic after 1101 days of uptime
Applies to:

SPARC M7-16 - Version All Versions and later
Oracle SuperCluster M7 Hardware - Version All Versions and later
SPARC T5-2 - Version All Versions and later
SPARC M5-32 - Version All Versions and later
SPARC S7-2 - Version All Versions and later
Information in this document applies to any platform.

This issue applies to SPARC servers of machine implementation "sun4v". The Solaris command 'uname -i' can be used to display the machine implementation. Hypervisor 1.0 introduced the bug, but the panic can only occur on servers with Hypervisor 1.12.x or later. All "sun4v" servers use a hypervisor.

Symptoms

A bug exists in hypervisor (HV) 1.12.x or later which may cause a domain to panic after 1101 days of uptime. The HV version can be displayed with the Solaris command 'ldm -V | grep ")Hypervisor"'.

Various types of panic might be evident in the HOST console log, including "panic: send_mondo_set: timeout".

The ILOM event log (-> show /SP/logs/event/list) may have sufficient history to check the HOST uptime.

Example (event log):

225 Thu Jan 26 14:17:22 2017 System Log minor

The Solaris GNU date command can be used to calculate the date 1101 days before the panic date. Many date calculators that display the duration between two dates are also available via Internet search.

If the ILOM event log does not have sufficient history to check when "HV started" it may still be possible to

If either of the two techniques identifies a period of run time, and the duration to panic is at or near 1101 days, then bug 23193383 has likely been manifested.
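The date arithmetic described above can be sketched as follows. This is a minimal illustration, not the original example from this note: it assumes GNU date is available (on Solaris it is commonly installed as /usr/gnu/bin/date, which is an assumption to verify locally), the function names `days_before` and `days_between` are hypothetical helpers, and the dates are illustrative values based on the event log timestamp shown earlier.

```shell
#!/bin/sh
# Assumption: GNU date. On Solaris, set DATE_CMD to the GNU binary,
# for example DATE_CMD=/usr/gnu/bin/date (verify the path on your system).
DATE_CMD="${DATE_CMD:-date}"

# Hypothetical helper: print the date N days before a YYYY-MM-DD date.
# Works in UTC epoch seconds so daylight saving time cannot skew the result.
days_before() {
    start=$("$DATE_CMD" -u -d "$1" +%s)
    "$DATE_CMD" -u -d "@$(( $start - $2 * 86400 ))" +%Y-%m-%d
}

# Hypothetical helper: print the number of whole days between two dates.
days_between() {
    start=$("$DATE_CMD" -u -d "$1" +%s)
    end=$("$DATE_CMD" -u -d "$2" +%s)
    echo $(( ($end - $start) / 86400 ))
}

# Illustrative values: a panic on 2017-01-26 and a candidate HV start date.
days_before 2017-01-26 1101            # prints 2014-01-21
days_between 2014-01-21 2017-01-26     # prints 1101
```

If the date printed by `days_before` matches the "HV started" entry found in the logs, or `days_between` returns a value at or near 1101, the uptime pattern is consistent with this bug.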
In a live service processor the "Restricted Shell" can be used to check the host status logs in

Example:

WARNING: The "Restricted Shell" account is provided solely

## check for available host status logs

## egrep for "HV start" and "panic" in the desired HOST status log

Cause

A bug was introduced into Hypervisor 1.0 which causes miscalculation of various cyclic operations, but the panic will only occur on HV 1.12.x and later.

Solution

Upgrade the service processor System Firmware (SysFW) to a release which includes the bug fix. SysFW versions that include the bug fix are listed below for various servers:

M5-32: 9.6.7.a

The list above is not an exhaustive list of all impacted Oracle servers. Any server which uses hypervisor 1.12.x or later can be impacted. If the server is of implementation "sun4v" it may be impacted. Check the server implementation with the Solaris command 'uname -i'.

To find patches which include the hypervisor fix, search the server patch README manifests for SysFW patches which include the fix for bug 23193383 or a Backport bug. Patch descriptions will list bug 23193383 or a Backport bug in the patch README manifest.

SysFW releases are available via the Oracle Technology Network. Firmware download links and release history for Oracle systems can be found on the Oracle Technology Network at the release history URL listed under References.

If SysFW cannot be upgraded, the HOST can be stopped and restarted before it reaches 1101 days of uptime. This resets the exposure window and prevents the domain panic for another 1101 days of uptime. Upgrading SysFW is the preferred resolution.
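Where a planned restart is used as the workaround, the restart deadline can be estimated from the "HV started" date. The following is a minimal sketch under stated assumptions: GNU date is available (for example as /usr/gnu/bin/date on Solaris, to be verified locally), the helper names `restart_deadline` and `days_remaining` are hypothetical, and the dates are illustrative rather than taken from a real system.

```shell
#!/bin/sh
# Assumption: GNU date. On Solaris, set DATE_CMD to the GNU binary.
DATE_CMD="${DATE_CMD:-date}"
LIMIT_DAYS=1101   # uptime at which the panic may occur, per this note

# Hypothetical helper: print the date on which uptime reaches LIMIT_DAYS,
# given the HV start date (YYYY-MM-DD) taken from the ILOM or host logs.
restart_deadline() {
    hv_start=$("$DATE_CMD" -u -d "$1" +%s)
    "$DATE_CMD" -u -d "@$(( $hv_start + $LIMIT_DAYS * 86400 ))" +%Y-%m-%d
}

# Hypothetical helper: print the days left before the limit is reached.
# The second argument is the reference date and defaults to the current time.
days_remaining() {
    hv_start=$("$DATE_CMD" -u -d "$1" +%s)
    now=$("$DATE_CMD" -u -d "${2:-now}" +%s)
    echo $(( $LIMIT_DAYS - ($now - $hv_start) / 86400 ))
}

# Illustrative values only.
restart_deadline 2014-01-21              # prints 2017-01-26
days_remaining 2014-01-21 2016-01-21     # prints 371
```

The HOST restart should be scheduled comfortably before the printed deadline, since the note describes the panic as occurring at or near 1101 days.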
References

<BUG:23193383> - CYCLICS MISBEHAVE AFTER TWO YEARS OF UPTIME
http://www.oracle.com/technetwork/systems/patches/firmware/release-history-jsp-138416.html
<NOTE:1554086.1> - Fujitsu M10-1/M10-4/M10-4S XSCF Control Package (XCP) Firmware Image Software Version Matrix Information
<NOTE:1540816.1> - SPARC M5-32 and M6-32 Servers: Firmware Image Software Version Matrix Information
<NOTE:1967048.1> - SPARC M8 and SPARC M7 Series Servers : Firmware Image Software Version Matrix Information