Asset ID: 1-72-2002924.1
Update Date: 2017-10-05
Solution Type: Problem Resolution Sure
Solution 2002924.1: System went down showing errors SUN4V-8002-PX and SUN4V-8002-MJ
Related Categories: PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T4
Created from <SR 3-10571215051>
Applies to:
SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Symptoms
The server shut down suddenly.
Changes
Before the shutdown, the system logged several ereports:
Apr 12 00:57:54.0340 ereport.cpu.generic-sparc.c2c-link -> chip-to-chip link correctable error
Apr 14 14:42:24.0906 ereport.cpu.generic-sparc.membuf-other -> miscellaneous correctable memory buffer error
Apr 14 14:42:24.0912 ereport.cpu.generic-sparc.membuf-crc-failover -> Buffer-on-Board lane failover error
Apr 14 15:45:51.5967 ereport.cpu.generic-sparc.membuf-crc -> Buffer-on-Board recoverable CRC error
Apr 14 16:22:25.3912 ereport.cpu.generic-sparc.inconsistent -> error data received with internally inconsistent format or values
Apr 14 16:22:23.5733 ereport.cpu.generic-sparc.c2c-prot-uc -> chip-to-chip link protocol uncorrectable error
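When reviewing a list like this, it can help to order the ereports chronologically and pick out the uncorrectable ones. The following is a minimal Python sketch, not an Oracle tool. The names `parse_ereport` and `LINE_RE` are hypothetical, and the year 2015 is an assumption taken from the ILOM fault timestamp, since the ereport lines themselves carry no year.

```python
import re
from datetime import datetime

# Ereport lines copied from the list above.
EREPORT_LINES = [
    "Apr 12 00:57:54.0340 ereport.cpu.generic-sparc.c2c-link ------->chip-to-chip link correctable error",
    "Apr 14 14:42:24.0906 ereport.cpu.generic-sparc.membuf-other ------->miscellaneous correctable memory buffer error",
    "Apr 14 14:42:24.0912 ereport.cpu.generic-sparc.membuf-crc-failover ------->Buffer-on-Board lane failover error",
    "Apr 14 15:45:51.5967 ereport.cpu.generic-sparc.membuf-crc -------> Buffer-on-Board recoverable CRC error",
    "Apr 14 16:22:25.3912 ereport.cpu.generic-sparc.inconsistent -------->error data received with internally inconsistent format or values",
    "Apr 14 16:22:23.5733 ereport.cpu.generic-sparc.c2c-prot-uc --------> chip-to-chip link protocol uncorrectable error",
]

# timestamp, ereport class, arrow of any length, free-text description
LINE_RE = re.compile(
    r"^(\w{3} \d{1,2} \d{2}:\d{2}:\d{2}\.\d+)\s+(ereport\.\S+)\s+-+>\s*(.*)$"
)


def parse_ereport(line, year=2015):
    """Split one line into (datetime, class, description).

    The year is an assumption: the lines above do not include one.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized ereport line: {line!r}")
    stamp, ereport_class, description = match.groups()
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
    return when, ereport_class, description


# Sort by timestamp; the last two entries in the source are out of order.
events = sorted(parse_ereport(line) for line in EREPORT_LINES)

# Match on the description text. "uncorrectable" is checked explicitly
# because "correctable" is a substring of it.
uncorrectable = [cls for _, cls, desc in events if "uncorrectable" in desc]

for when, cls, desc in events:
    print(when.strftime("%b %d %H:%M:%S.%f")[:-2], cls, "->", desc)
```

Sorted this way, the `c2c-prot-uc` ereport appears just before the `inconsistent` ereport, and it is the only entry whose description marks it as uncorrectable.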
Cause
ILOM marked Memory Riser 0 (MR0) as faulted.
The system shut down on the uncorrectable error (UE) to avoid data corruption. The UE was triggered by a defective memory riser, as shown by the CRC errors on the interconnect between a memory buffer and its memory controller.
ILOM reported a fault only against MR0, but the system also reported a possible FRU sensor or I2C bus problem on CMP1 at /SYS/MB/CMP1/V_VSB.
The ILOM fault record for MR0 follows:
Target | Property | Value
/SP/faultmgmt/1 | fru | /SYS/MB/CMP1/MR0
/SP/faultmgmt/1/faults/0 | class | fault.memory.memlink-failover
/SP/faultmgmt/1/faults/0 | sunw-msg-id | SUN4V-8002-PX
/SP/faultmgmt/1/faults/0 | component | /HOST
/SP/faultmgmt/1/faults/0 | uuid | fd0b19d2-63a8-630a-abf5-fa719f5eec61
/SP/faultmgmt/1/faults/0 | timestamp | 2015-04-14/12:42:37
/SP/faultmgmt/1/faults/0 | system_component_serial_number | 1302BDY936
/SP/faultmgmt/1/faults/0 | system_component_part_number | 31414200+1+1
/SP/faultmgmt/1/faults/0 | system_component_name | SPARC T4-2
/SP/faultmgmt/1/faults/0 | system_component_manufacturer | Oracle Corporation
/SP/faultmgmt/1/faults/0 | chassis_serial_number | 1302BDY936
/SP/faultmgmt/1/faults/0 | chassis_part_number | 31414200+1+1
/SP/faultmgmt/1/faults/0 | chassis_name | SPARC T4-2
/SP/faultmgmt/1/faults/0 | chassis_manufacturer | Oracle Corporation
/SP/faultmgmt/1/faults/0 | system_serial_number | 1302BDY936
/SP/faultmgmt/1/faults/0 | system_part_number | 31414200+1+1
/SP/faultmgmt/1/faults/0 | system_name | SPARC T4-2
/SP/faultmgmt/1/faults/0 | system_manufacturer | Oracle Corporation
/SP/faultmgmt/1/faults/0 | fru_name | MEM_RISER
/SP/faultmgmt/1/faults/0 | fru_manufacturer | 7696 MITAC COMPUTER LTD GUANGDONG CN
/SP/faultmgmt/1/faults/0 | fru_serial_number | 489089M+1247TA0CNN
/SP/faultmgmt/1/faults/0 | fru_rev_level | 02
/SP/faultmgmt/1/faults/0 | fru_part_number | 7051516
/SP/faultmgmt/1/faults/0 | mod-version | 1.16
/SP/faultmgmt/1/faults/0 | mod-name | eft
/SP/faultmgmt/1/faults/0 | fault_diagnosis | /HOST
/SP/faultmgmt/1/faults/0 | severity | Major
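Raw ILOM fault listings of this kind wrap long cells onto a continuation row that begins with `faults/0`, which splits property names and values such as the UUID across two lines. The following is a minimal Python sketch of how such rows could be rejoined before searching them. The function names `split_cells` and `rejoin` are hypothetical, and the rule for when to reinsert a space in a wrapped value is a heuristic assumption, not documented ILOM behavior.

```python
# A few raw rows in the wrapped three-column layout (target | property | value).
RAW = """\
/SP/faultmgmt/1 | fru | /SYS/MB/CMP1/MR0
/SP/faultmgmt/1/ | sunw-msg-id | SUN4V-8002-PX
faults/0 | |
/SP/faultmgmt/1/ | uuid | fd0b19d2-63a8-630a-abf5-fa719f5eec
faults/0 | | 61
/SP/faultmgmt/1/ | system_component_seri | 1302BDY936
faults/0 | al_number |
/SP/faultmgmt/1/ | fru_manufacturer | 7696 MITAC COMPUTER LTD GUANGDONG
faults/0 | | CN
"""


def split_cells(line):
    """Return exactly three stripped cells for one pipe-delimited row."""
    cells = [cell.strip() for cell in line.split("|")]
    cells += [""] * (3 - len(cells))
    return cells[:3]


def rejoin(raw):
    """Merge continuation rows into the row they belong to.

    Returns a dict mapping property name -> (target, value).
    """
    rows = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        target, prop, value = split_cells(line)
        if target.startswith("/SP/") or not rows:
            # A full path starts a new record.
            rows.append([target, prop, value])
            continue
        # Continuation row: append each fragment to the previous record.
        rows[-1][0] += target
        rows[-1][1] += prop
        if value:
            # Heuristic (assumption): multi-word values were wrapped at a
            # space, single tokens such as a UUID were cut mid-token.
            separator = " " if " " in rows[-1][2] else ""
            rows[-1][2] += separator + value
    return {prop: (target, value) for target, prop, value in rows}


fault = rejoin(RAW)
print(fault["uuid"][1])
print(fault["system_component_serial_number"][1])
```

With the rows rejoined, fields such as `fru`, `sunw-msg-id`, and `uuid` can be looked up by their full property names.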
Solution
Replacing Memory Riser 0 (MR0) on CMP1 resolves the issue.