Asset ID: 1-79-1524707.1
Update Date: 2017-11-20
Keywords:
Solution Type
Predictive Self-Healing Sure
Solution 1524707.1: M10-cpu.fe - Internal fatal error within a SPARC64 X CPU chip core
Related Items
- Fujitsu M10-4
- Fujitsu M10-1
- Fujitsu M10-4S
Related Categories
- PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun PSH
Applies to:
Fujitsu M10-1
Fujitsu M10-4
Fujitsu M10-4S
SPARC
Purpose
Provide additional information for message ID: M10-cpu.fe
Fujitsu fault codes:
05010000, 05010001, 05010004, 05010101, 05010104, 05010120, 05010132,
05010002, 05010003, 05010005, 05010100, 05010102, 05010103, 05010105,
05010110, 05010130, 05010131, 05010133, 02000300, 02000301, 02000302,
02000801, 02000802, 02000803, 02000b00, 02001415, 02001416
Details
Type
- Hardware Fault
- cpu.fe
Severity
- Critical
Description
- Fault due to an internal fatal error within a SPARC64 X CPU chip core, detected by the CPU chip hardware.
Automated Response
- The domain using this CPU chip is reset.
Impact
- One or more CPU cores, or the entire CPU chip, are deconfigured. On M10-1 systems, this leaves the platform unbootable.
Indicted Hardware
- This error could be due to faulty memory.
- Before replacing the MBU/CMU, check for faulty memory. If faulty DIMMs are present, replace them and verify that the fault on the MBU/CMU has cleared.
- If there are no faulty DIMMs then replace the MBU/CMU.
- For M10-1 systems the Motherboard is marked for replacement. For M10-4 and M10-4S systems, the related Motherboard (CMUU or CMUL) is marked for replacement.
Faults detected while running POST fall into the following categories:
- fe-stick-start-err: 02000302 The stick register did not start, based on a comparison with Tick
- fe-stick-incr-err: 02000300 The stick register is not being incremented over time
- fe-stick-stop-err: 02000301 The stick register did not stop despite a request to do so
- fe-l2cache-way-deconfigured: 02000801, 02000803 At least one way of the L2 cache has been deconfigured due to an L2 cache fault (detected by checking ASI_AFSR)
- fe-l2cache-ue: UE (Uncorrectable Error) detected in L2 cache (detected by checking ASI_AFSR)
- fe-bus-err: 02000802 Bus error or timeout detected (detected by checking ASI_AFSR)
- fe-stchg-chip-err: 02000b00 A CPU chip level error was detected when checking ASI_STCHG
- fe-unexpected-trap: 02001415, 02001416 Unexpected trap occurred due to this CPU chip
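The POST categories above can be held in a small lookup table when sorting fault codes from event output. The following is a minimal sketch in Python, not part of any Oracle or Fujitsu tool; the names POST_FAULT_CATEGORIES and post_category are hypothetical. fe-l2cache-ue is left out because this article lists no fault code for it.

```python
# POST-time fault codes and their categories, transcribed from the list above.
# fe-l2cache-ue is omitted: the article gives no fault code for that category.
# The table and function names are illustrative, not part of any vendor tool.
POST_FAULT_CATEGORIES = {
    "02000300": "fe-stick-incr-err",
    "02000301": "fe-stick-stop-err",
    "02000302": "fe-stick-start-err",
    "02000801": "fe-l2cache-way-deconfigured",
    "02000803": "fe-l2cache-way-deconfigured",
    "02000802": "fe-bus-err",
    "02000b00": "fe-stchg-chip-err",
    "02001415": "fe-unexpected-trap",
    "02001416": "fe-unexpected-trap",
}


def post_category(fault_code):
    """Return the POST category for a fault code, or None if it is not listed."""
    # Normalise case and whitespace so "02000B00 " matches "02000b00".
    return POST_FAULT_CATEGORIES.get(fault_code.strip().lower())
```

A code that returns None here, such as any of the 0501xxxx codes, is not one of the POST-time categories listed above.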
On M10-4/4S systems with four CPU chips in a box, deconfiguring a CPU chip triggers a change to the PCIe fabric configuration. This takes place only when ioreconfigure is set to true with the setpparmode command.
The PCIe fabric configuration changes as follows:
- When CPU#0 on CMUL is deconfigured:
Configuration of PCIe switch 0 is changed to allow access from CPU#0 on CMUU to built-in SAS chip, USB chip and GbE i/f #0 and #1.
Configuration of PCIe switch 1 is changed to allow access from CPU#0 on CMUU to the first PCIe slot.
- When CPU#1 on CMUL is deconfigured:
Configuration of PCIe switch 2 is changed to allow access from CPU#1 on CMUU to the fourth and fifth PCIe slots.
Configuration of PCIe switch 3 is changed to allow access from CPU#1 on CMUU to the eighth and ninth PCIe slots.
- When CPU#0 on CMUU is deconfigured:
Configuration of PCIe switch 0 is changed to allow access from CPU#0 on CMUL to built-in GbE i/f #2 and #3.
Configuration of PCIe switch 1 is changed to allow access from CPU#0 on CMUL to the second and third PCIe slots.
- When CPU#1 on CMUU is deconfigured:
Configuration of PCIe switch 2 is changed to allow access from CPU#1 on CMUL to the sixth and seventh PCIe slots.
Configuration of PCIe switch 3 is changed to allow access from CPU#1 on CMUL to the tenth and eleventh PCIe slots.
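The re-routing rules above form a fixed mapping from the deconfigured CPU chip to the chip that takes over its I/O. As a minimal sketch in Python, the mapping can be written as a table; the names PCIE_FAILOVER and rerouted_resources are hypothetical, and the ioreconfigure argument only models the setpparmode setting rather than reading it from the system.

```python
# Key: (board, CPU number) of the deconfigured chip.
# Value: the chip that takes over, and what each PCIe switch re-routes to it.
# Transcribed from the list above; names are illustrative only.
PCIE_FAILOVER = {
    ("CMUL", 0): (("CMUU", 0), {
        0: "built-in SAS chip, USB chip and GbE i/f #0 and #1",
        1: "first PCIe slot",
    }),
    ("CMUL", 1): (("CMUU", 1), {
        2: "fourth and fifth PCIe slots",
        3: "eighth and ninth PCIe slots",
    }),
    ("CMUU", 0): (("CMUL", 0), {
        0: "built-in GbE i/f #2 and #3",
        1: "second and third PCIe slots",
    }),
    ("CMUU", 1): (("CMUL", 1), {
        2: "sixth and seventh PCIe slots",
        3: "tenth and eleventh PCIe slots",
    }),
}


def rerouted_resources(board, cpu, ioreconfigure=True):
    """List (switch, takeover chip, resources) for a deconfigured CPU chip.

    Returns an empty list when ioreconfigure is not true, since the fabric
    is only changed in that mode.
    """
    if not ioreconfigure:
        return []
    takeover, switches = PCIE_FAILOVER[(board.upper(), cpu)]
    return [(sw, takeover, res) for sw, res in sorted(switches.items())]
```

For example, rerouted_resources("CMUL", 1) reports that switches 2 and 3 hand the fourth, fifth, eighth and ninth PCIe slots to CPU#1 on CMUU.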
Suggested Action for System Administrator
- The recommended service action for this event is to schedule replacement of the affected component(s) at the earliest possible convenience. Although the hardware may still be functioning, it is neither intended nor recommended that the faulted component(s) remain in the system for a prolonged period of time.
Refer to the following document for the latest procedures for displaying event content in preparation for submitting a service request and applying any post-repair actions that may be required.
PSH Procedural Article for Fujitsu M10 Diagnosis (Doc ID 1525156.1)
