Date of Workaround Release: 22-Feb-2017
Date of Resolved Release: 10-Apr-2017
__________________________________________
Description
An ILOM fan control change for SPARC S7-2 and S7-2L servers in Sun System Firmware version 9.7.4 may cause CPU-related FMA ereports with the error condition 'PMC_tjshut', accompanied by Host console messages for 'SPSUN4V-8000-84'. These can occur during power on (start /SYS), while the system is idle, or under light load. In addition, affected systems may shut down.
The affected system firmware patches 25373802 (S7-2) and 25373803 (S7-2L) for system firmware 9.7.4 are no longer available for download and have been WITHDRAWN.
Occurrence
This issue can occur on the following platforms:
SPARC Platform
- SPARC S7-2 with Firmware version 9.7.4 (patch 25373802)
- SPARC S7-2L with Firmware version 9.7.4 (patch 25373803)
Note: The Netra Platform is not affected by this issue. No other platforms or systems are affected by this issue.
To determine the firmware version installed on the system, use the following ILOM command:
-> show /HOST sysfw_version
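The value returned by this command can be compared against the affected release (9.7.4) and the fixed release (9.7.5.b) named in this document. The following is a minimal Python sketch of that comparison. It assumes the output contains the firmware version as a dotted number, for example "Sun System Firmware 9.7.4 ..."; the exact output layout, the helper name classify_sysfw, and the sample string are assumptions for illustration, not taken from ILOM documentation.

```python
import re

AFFECTED = "9.7.4"    # withdrawn release named in this document
RESOLVED = "9.7.5.b"  # fixed release named in this document


def classify_sysfw(text):
    """Return 'affected', 'resolved', 'other', or 'unknown' for a version string."""
    # Assumption: the version is the first run of dot-separated numbers,
    # optionally followed by a single-letter suffix such as ".b".
    match = re.search(r"\b(\d+(?:\.\d+)+(?:\.[a-z])?)", text)
    if match is None:
        return "unknown"
    version = match.group(1)
    # Exact comparison only: this sketch does not guess about releases
    # that the document does not mention.
    if version == AFFECTED:
        return "affected"
    if version == RESOLVED:
        return "resolved"
    return "other"


if __name__ == "__main__":
    # Hypothetical sample of the command output, for illustration only.
    sample = "sysfw_version = Sun System Firmware 9.7.4 2016/12/08 10:00"
    print(classify_sysfw(sample))
```

A result of 'other' means only that the version is not one of the two releases listed here; it says nothing about whether that release contains the fan control change.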
Symptoms
Should the described issue occur, the following symptoms may be seen:
Host console logs will show results similar to the following:
SUNW-MSG-ID: SPSUN4V-8000-84, TYPE: Problem, VER: 1, SEVERITY: Critical
EVENT-TIME: Thu Jan 26 00:40:40 HKT 2017
PLATFORM: SPARC S7-2, CSN: AK00390530, HOSTNAME: ORACLESP-AK00111111
SOURCE: fdd, REV: 1.0
EVENT-ID: 26d2b49c-556e-c96f-bee1-800662c37be4
DESC: An event was received indicating a fault was diagnosed by another fault manager.
AUTO-RESPONSE: Refer to the document at http://support.oracle.com/msg/SPSUN4V-8000-84.
IMPACT: Refer to the document at http://support.oracle.com/msg/SPSUN4V-8000-84.
REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event. Please refer to the associated reference document at
http://support.oracle.com/msg/SPSUN4V-8000-84 for the latest service procedures and policies regarding this diagnosis.
2017-01-26 01:17:20 0:10:4> ERROR: Redstate trap occurred on socket 0 strand 84
2017-01-26 01:17:20 0:10:4> NOTICE:
Redstate handler finished
2017-01-26 01:17:36 SP> NOTICE: Abort boot due to /SYS/MB. Power Cycle Host
Ereports reported on the system will be similar to the following:
2017-01-26/01:01:15 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
__tod-0 = 0x5888d9eb
__tod-1 = 0xb1e77a0
tstate = 0x4411081403
htstate = 0x0
ehdl = 0x1000000004bb
tpc = 0x10022664
tl = 0x1
tt = 0x60
stick = 0x258a0be9916
chip-seq-id = 0x4bb
cpuid = 0x0
diagnose = 0x1
error-condition = PMC_tjshut
reported-by = Hypervisor
ps-pmc-esr = 0x401
ps-scc-err-count-logging-reg-0 = 0x0
ps-scc-err-count-logging-reg-1 = 0x0
ps-scc-err-count-val-reg-0 = 0x0
ps-scc-err-count-val-reg-1 = 0x0
ps-scc-temp-reg-0 = 0xf47d
ps-scc-temp-reg-1 = 0xf67d
system_component_firmware_manufacturer = Oracle Corporation
system_component_firmware_versions = (ILOM)3.2.8.1.a,(POST)5.5.4,(OBP)4.40.4,(HV)1.17.4
system_component_firmware_releases = (ILOM)2016.12.08,(POST)2016.12.08,(OBP)2016.12.08,(HV)2016.12.08
[(flash)root@bur-sn1-312-sp:/persist/host_logs]# fmdump -e
2017-01-26/01:00:56 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:00:57 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:00:58 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:00:59 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:01:00 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:01:01 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
...
2017-01-26/01:01:10 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:01:11 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:01:12 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:01:13 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:01:14 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:01:15 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
…
2017-01-26/01:17:22 ereport.chassis.pok.fault-info@/SYS/MB
2017-01-26/01:17:22 ereport.chassis.pok.fault-info@/SYS/MB
2017-01-26/01:17:23 ereport.chassis.pok.fault-asserted@/SYS/MB
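The pattern in the 'fmdump -e' listing above (a rapid run of gchip-uc ereports against /SYS/MB/CMP0, followed by chassis pok faults against /SYS/MB) can be summarized from a saved copy of the output. The following is a minimal Python sketch, assuming each ereport line has the two-field form "<date>/<time> <class>@<resource>" as in the sample above; the function name count_ereports is hypothetical.

```python
from collections import Counter


def count_ereports(fmdump_text):
    """Count ereports per (class, resource) in saved 'fmdump -e' output."""
    counts = Counter()
    for line in fmdump_text.splitlines():
        parts = line.split()
        # Skip prompts, elision markers, and anything that is not
        # "<timestamp> ereport.<class>@<resource>".
        if len(parts) != 2 or not parts[1].startswith("ereport."):
            continue
        ereport_class, _, resource = parts[1].partition("@")
        counts[(ereport_class, resource)] += 1
    return counts


if __name__ == "__main__":
    sample = """\
2017-01-26/01:00:56 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:00:57 ereport.cpu.generic-sparc.gchip-uc@/SYS/MB/CMP0
2017-01-26/01:17:23 ereport.chassis.pok.fault-asserted@/SYS/MB
"""
    for (ereport_class, resource), n in count_ereports(sample).items():
        print(n, ereport_class, resource)
```

A high count for ereport.cpu.generic-sparc.gchip-uc is only a hint. Confirming this issue still requires the 'error-condition = PMC_tjshut' field in the detailed ereport and an affected firmware version.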
Workaround
There is no workaround for this issue.
Resolution
This issue is addressed in the following releases:
SPARC Platform
- SPARC S7-2 Servers with Firmware version 9.7.5.b (patch 25790079)
- SPARC S7-2L Servers with Firmware version 9.7.5.b (patch 25790080)
Patches
<Patch 25790079>
<Patch 25790080>
History
22-Feb-2017: Document released, status is Workaround
07-Mar-2017: Updated "Note" in Contributing Factors to state that Netra is not affected
10-Apr-2017: Update Resolution section for patches, issue is Resolved
The ILOM change that caused this issue is tracked as
CR:24824221 - Cloud server IFC support for S7 systems.
To track the patch process, please refer to the following Internal page:
https://stbeehive.oracle.com/content/dav/st/Systems%20Security%20and%20Release%20Management/Public%20Documents/Sun%20System%20Firmware/Charts%20%26%20Roadmaps/SysFW-Chart.html
The MOS document which tracks updates to the FW can be found at: (Internal Only)
SPARC S7-2 (1 or 2 Processor) and S7-2L Firmware and Patches <Document:2111742.1>
Questions regarding any portion of this document should be addressed to
sunalertpublication_us_grp@oracle.com, with a copy to the
submitter/responsible engineer listed below.
Internal Contributor/Submitter: anthony.rulli@oracle.com
Internal Eng Responsible Engineer: anthony.rulli@oracle.com
Oracle Knowledge Analyst: david.mariotto@oracle.com
Internal Eng Business Unit Group: SPARC
Internal Associated SRs:
Internal Resolution Patches: 25790079, 25790080
References
<BUG:25505837> - EREPORT.CPU.GENERIC-SPARC.GCHIP-UC IS GENERATED FOR TEMP UNDER TJSHUT ON S7-2