Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2002924.1
Update Date:2017-10-05
Keywords:

Solution Type  Problem Resolution Sure

Solution  2002924.1 :   System went down showing Errors SUN4V-8002-PX and SUN4V-8002-MJ


Related Items
  • SPARC T4-2
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T4




In this Document
Symptoms
Changes
Cause
Solution


Created from <SR 3-10571215051>

Applies to:

SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The server shut down suddenly.

Changes

Before the shutdown, the system logged several ereports:

Apr 12 00:57:54.0340 ereport.cpu.generic-sparc.c2c-link             -------> chip-to-chip link correctable error
Apr 14 14:42:24.0906 ereport.cpu.generic-sparc.membuf-other         -------> miscellaneous correctable memory buffer error
Apr 14 14:42:24.0912 ereport.cpu.generic-sparc.membuf-crc-failover  -------> Buffer-on-Board lane failover error
Apr 14 15:45:51.5967 ereport.cpu.generic-sparc.membuf-crc           -------> Buffer-on-Board recoverable CRC error
Apr 14 16:22:25.3912 ereport.cpu.generic-sparc.inconsistent         -------> error data received with internally inconsistent format or values
Apr 14 16:22:23.5733 ereport.cpu.generic-sparc.c2c-prot-uc          -------> chip-to-chip link protocol uncorrectable error
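
The progression from correctable errors to the final uncorrectable one is easier to see once the ereports are sorted by time. The following is only a minimal Python sketch of that idea, not an Oracle tool. It makes two assumptions that do not come from this document: the year 2015 is borrowed from the ILOM fault timestamp because the ereport lines carry no year, and a class name ending in "-uc" is treated as uncorrectable, which matches the single uncorrectable class listed here but may not hold for every ereport class. The names parse_ereports and is_uncorrectable are hypothetical.

```python
import re
from datetime import datetime

# Ereport lines copied from the list above, with the annotations removed.
EREPORTS = """\
Apr 12 00:57:54.0340 ereport.cpu.generic-sparc.c2c-link
Apr 14 14:42:24.0906 ereport.cpu.generic-sparc.membuf-other
Apr 14 14:42:24.0912 ereport.cpu.generic-sparc.membuf-crc-failover
Apr 14 15:45:51.5967 ereport.cpu.generic-sparc.membuf-crc
Apr 14 16:22:25.3912 ereport.cpu.generic-sparc.inconsistent
Apr 14 16:22:23.5733 ereport.cpu.generic-sparc.c2c-prot-uc
"""

# Timestamp (month, day, time with fractional seconds), then the ereport class.
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d+)\s+(ereport\.\S+)")


def parse_ereports(text, year=2015):
    """Return (datetime, class) tuples sorted by time.

    The year is an assumption: the ereport lines do not include one.
    """
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrelated lines
        stamp, ereport_class = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
        events.append((when, ereport_class))
    return sorted(events)


def is_uncorrectable(ereport_class):
    """Assumed heuristic: class names ending in '-uc' are uncorrectable."""
    return ereport_class.endswith("-uc")


if __name__ == "__main__":
    for when, ereport_class in parse_ereports(EREPORTS):
        flag = "UE" if is_uncorrectable(ereport_class) else "  "
        print(when.strftime("%b %d %H:%M:%S.%f"), flag, ereport_class)
```

Sorting matters here because the last two ereports are listed out of chronological order: by timestamp, the c2c-prot-uc event precedes the inconsistent event.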

Cause

ILOM marked Memory Riser 0 (MR0) as faulted.

The system went down on the uncorrectable error (UE) in order to avoid data corruption. The UE was triggered by a defective memory riser, as shown by the CRC errors in the interconnect between a memory buffer and its memory controller.

ILOM shows a fault only against MR0, but the system also reports a possible FRU sensor or I2C bus problem on CMP1, at /SYS/MB/CMP1/V_VSB.

 


/SP/faultmgmt/1          | fru                            | /SYS/MB/CMP1/MR0
/SP/faultmgmt/1/faults/0 | class                          | fault.memory.memlink-failover
/SP/faultmgmt/1/faults/0 | sunw-msg-id                    | SUN4V-8002-PX
/SP/faultmgmt/1/faults/0 | component                      | /HOST
/SP/faultmgmt/1/faults/0 | uuid                           | fd0b19d2-63a8-630a-abf5-fa719f5eec61
/SP/faultmgmt/1/faults/0 | timestamp                      | 2015-04-14/12:42:37
/SP/faultmgmt/1/faults/0 | system_component_serial_number | 1302BDY936
/SP/faultmgmt/1/faults/0 | system_component_part_number   | 31414200+1+1
/SP/faultmgmt/1/faults/0 | system_component_name          | SPARC T4-2
/SP/faultmgmt/1/faults/0 | system_component_manufacturer  | Oracle Corporation
/SP/faultmgmt/1/faults/0 | chassis_serial_number          | 1302BDY936
/SP/faultmgmt/1/faults/0 | chassis_part_number            | 31414200+1+1
/SP/faultmgmt/1/faults/0 | chassis_name                   | SPARC T4-2
/SP/faultmgmt/1/faults/0 | chassis_manufacturer           | Oracle Corporation
/SP/faultmgmt/1/faults/0 | system_serial_number           | 1302BDY936
/SP/faultmgmt/1/faults/0 | system_part_number             | 31414200+1+1
/SP/faultmgmt/1/faults/0 | system_name                    | SPARC T4-2
/SP/faultmgmt/1/faults/0 | system_manufacturer            | Oracle Corporation
/SP/faultmgmt/1/faults/0 | fru_name                       | MEM_RISER
/SP/faultmgmt/1/faults/0 | fru_manufacturer               | 7696 MITAC COMPUTER LTD GUANGDONG CN
/SP/faultmgmt/1/faults/0 | fru_serial_number              | 489089M+1247TA0CNN
/SP/faultmgmt/1/faults/0 | fru_rev_level                  | 02
/SP/faultmgmt/1/faults/0 | fru_part_number                | 7051516
/SP/faultmgmt/1/faults/0 | mod-version                    | 1.16
/SP/faultmgmt/1/faults/0 | mod-name                       | eft
/SP/faultmgmt/1/faults/0 | fault_diagnosis                | /HOST
/SP/faultmgmt/1/faults/0 | severity                       | Major
 

 

Solution

Replacing Memory Riser 0 (MR0) on CMP1 resolves the issue.


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.