Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1578111.1
Update Date:2017-05-01

Solution Type  Problem Resolution Sure

Solution  1578111.1 :   T5240 seeing a recurrence of SUN4V-8001-KX following a CPU fatal error  


Related Items
  • Sun SPARC Enterprise T5240 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T5xx0


A T5240 sees many SUN4V-8001-KX errors after a CPU fatal error.

In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-7653510813>

Applies to:

Sun SPARC Enterprise T5240 Server - All Versions
Information in this document applies to any platform.

Symptoms

The system will most likely experience at least one fatal error:

Jun 30 12:13:35: Chassis |major   : "Jun 30 12:13:35 ERROR: [CMP1  ] Received Fatal Error"

The system will then report many occurrences of the FMA error SUN4V-8001-KX ("The number of level 2 cache control errors has exceeded acceptable levels."):

Jun 30 12:27:23: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"
Jun 01 15:43:50: Chassis |critical: "Host has been powered off"
Jun 27 17:39:32: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"
Jun 27 17:47:11: Chassis |major   : "Host has been powered on"
Jun 27 17:53:58: Chassis |major   : "Host is running"
Jul 08 14:18:56: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"
Jul 08 15:18:48: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"
Jul 08 15:19:47: Chassis |major   : "Host has been powered on"
Jul 08 15:26:47: Chassis |major   : "Host is running"
Jul 11 21:26:16: Chassis |critical: "Host has been powered off"
Jul 11 22:23:05: Chassis |major   : "Host has been powered on"
Jul 11 22:29:48: Chassis |major   : "Host is running"
Jul 12 15:11:46: Chassis |critical: "Host has been powered off"
Jul 12 15:21:53: Audit   |major   : "Upgrade Succeeded"  <-------- At this point the firmware was updated. It didn't resolve the issue.
Jul 12 15:24:44: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"
Jul 12 15:29:32: Chassis |major   : "Host has been powered on"
Jul 12 15:36:53: Chassis |major   : "Host is running"
Aug 08 19:13:36: Reset   |major   : "Reset of /SP initiated by root.  Success unless failure noted."
Aug 08 19:15:49: Chassis |major   : "Aug  8 19:15:49 ERROR: Unable to connect to snmpd: No such file or directory"
Aug 08 19:16:10: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"
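
When reviewing a long event log, it can help to tally the fatal errors and fault message IDs instead of reading every line. The following is a minimal Python sketch, assuming the log has been saved as text in the same format as the excerpt above. The function name summarize_events and both regular expressions are illustrative only and are not part of any Oracle tool.

```python
import re
from collections import Counter

# Matches event lines in the format shown in the Symptoms section, e.g.
#   Jun 30 12:27:23: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"
# The pattern is an assumption derived from that excerpt only.
EVENT_RE = re.compile(
    r'^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}): '
    r'(?P<cls>\w+)\s*\|(?P<sev>\w+)\s*: '
    r'"(?P<msg>.*)"'
)

# Matches a fault message ID such as SUN4V-8001-KX inside the event text.
MSGID_RE = re.compile(r'MSGID: (?P<msgid>[A-Z0-9]+-[A-Z0-9]+-[A-Z0-9]+)')


def summarize_events(lines):
    """Return (fatal_error_timestamps, Counter of MSGID occurrences)."""
    fatal_stamps = []
    msgid_counts = Counter()
    for line in lines:
        event = EVENT_RE.match(line.strip())
        if event is None:
            continue  # skip blank lines and anything not in the expected format
        message = event.group("msg")
        if "Received Fatal Error" in message:
            fatal_stamps.append(event.group("stamp"))
        fault = MSGID_RE.search(message)
        if fault is not None:
            msgid_counts[fault.group("msgid")] += 1
    return fatal_stamps, msgid_counts


if __name__ == "__main__":
    sample = [
        'Jun 30 12:13:35: Chassis |major   : "Jun 30 12:13:35 ERROR: [CMP1  ] Received Fatal Error"',
        'Jun 30 12:27:23: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"',
        'Jul 11 21:26:16: Chassis |critical: "Host has been powered off"',
        'Jul 12 15:24:44: Chassis |major   : "Host detected fault, MSGID: SUN4V-8001-KX"',
    ]
    fatal, counts = summarize_events(sample)
    print("Fatal errors at:", fatal)
    for msgid, count in counts.items():
        print(msgid, "seen", count, "time(s)")
```

A fatal error followed by repeated SUN4V-8001-KX entries, including entries logged after a firmware update, matches the pattern described in this document.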

Cause

The issue is probably caused by a problem in the CPU, possibly in the L2 cache.

Solution

1) Clear all the errors in FMA and ILOM. For instructions, see "How To Clear FMA faults from Solaris[TM] and SC (System Controller) on T1000/T2000 T5120/T5220/T5140/T5240/T5440, T3-1/T3-2/T3-4, T4-1/T4-2/T4-4" (Doc ID 1004229.1).

2) If the error recurs after it has been cleared, open an SR with Oracle Support to have the system evaluated. An Explorer output and an ILOM snapshot will be required for further diagnosis of the issue.

References

<NOTE:1004229.1> - How To Clear FMA faults from Solaris[TM] and SC (System Controller) on T1000/T2000 T5120/T5220/T5140/T5240/T5440, T3-1/T3-2/T3-4, T4-1/T4-2/T4-4

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.