Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1296435.1
Update Date:2017-09-07
Keywords:

Solution Type  Problem Resolution Sure

Solution  1296435.1 :   Sun SPARC[TM] Enterprise M4000/M5000 - MBU_B and MEMB being faulted with SCF-8004-8X, SCF-8000-1D, and SCF-8005-MJ errors.  


Related Items
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M5000 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Mx000
  • _Old GCS Categories>Sun Microsystems>Servers>OPL Servers




In this Document
Symptoms
Changes
Cause
Solution
References


Applies to:

Sun SPARC Enterprise M5000 Server - Version Not Applicable and later
Sun SPARC Enterprise M4000 Server - Version Not Applicable and later
Information in this document applies to any platform.

Symptoms

The showlogs monitor output contains the following errors:

Jan 20 14:59:35 XSCF Warning: /UNSPECIFIED:SCF:spurious unit interrupt
Jan 20 15:00:00 XSCF last message repeated 11 times
Jan 20 15:01:53 XSCF Alarm: /MBU_B/MEMB#5:ANALYZE:MAC detected clock fatal failure
Jan 20 15:02:41 XSCF Warning: /MBU_B:SCF:SC test error
Jan 20 15:02:57 XSCF Warning: /MBU_B:SCF:SC test error
Jan 20 15:03:16 XSCF Warning: /MBU_B:SCF:SC test error
Jan 20 15:03:17 XSCF Warning: /MBU_B:SCF:SC test error

FMA will record these same events as:

Jan 20 20:59:48.7163 051fbd14-0465-492b-aa42-6a705e31f05f SCF-8004-8X
Jan 20 21:01:46.3456 10355763-1a55-424c-ba69-bd2f3789e3d4 SCF-8000-1D
Jan 20 21:02:26.2880 12bdb2c7-6b8c-49ff-b37f-0ec60d958d40 SCF-8005-MJ

or
Jan 20 20:59:48.7163 ereport.chassis.SPARC-Enterprise.asic.cpu.power.intr-fail
Jan 20 21:01:46.3456 ereport.chassis.SPARC-Enterprise.if.fe-asic-clk
Jan 20 21:02:26.2880 ereport.chassis.SPARC-Enterprise.asic.sc.test

The key error is SCF-8000-1D, which indicates a clock distribution error.
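As an illustration of how the signature above could be checked against a saved FMA listing, here is a minimal Python sketch. It assumes the listing has been pasted into a string; the function name `scan_fma_text` and the MSG-ID pattern are hypothetical and are derived only from the IDs shown in this document, not from any XSCF tooling.

```python
import re

# MSG-IDs taken from the Symptoms section of this document.
SIGNATURE_IDS = ("SCF-8004-8X", "SCF-8000-1D", "SCF-8005-MJ")
KEY_ID = "SCF-8000-1D"  # the clock distribution error

# Assumed shape of a MSG-ID, inferred from the examples in this document.
MSG_ID_RE = re.compile(r"\bSCF-[0-9A-Z]{4}-[0-9A-Z]{2}\b")


def scan_fma_text(text):
    """Return (signature IDs found, whether the key ID is present)."""
    seen = set(MSG_ID_RE.findall(text))
    found = [msg_id for msg_id in SIGNATURE_IDS if msg_id in seen]
    return found, KEY_ID in seen


sample = """\
Jan 20 20:59:48.7163 051fbd14-0465-492b-aa42-6a705e31f05f SCF-8004-8X
Jan 20 21:01:46.3456 10355763-1a55-424c-ba69-bd2f3789e3d4 SCF-8000-1D
Jan 20 21:02:26.2880 12bdb2c7-6b8c-49ff-b37f-0ec60d958d40 SCF-8005-MJ
"""

found, key_present = scan_fma_text(sample)
print("signature IDs found:", found)
print("clock distribution error present:", key_present)
```

A match on SCF-8000-1D alone is only a hint; the hardware diagnosis still belongs to the service provider.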

Additional failure scenario:

On rare occasions, the 'clock fatal failure' message is not seen and the following failure signature appears instead. The solution below still applies.

Showlogs monitor:

Nov  5 20:56:01 lonshspltp10a-m Alarm: /MBU_B/MEMB#0,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:11 lonshspltp10a-m Alarm: /MBU_B/MEMB#1,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:32 lonshspltp10a-m Alarm: /MBU_B/MEMB#2,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:45 lonshspltp10a-m Alarm: /MBU_B/MEMB#3,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:56:50 lonshspltp10a-m Warning: /MBU_B:SCF:Abnormal reaction of LSI (compare)
Nov  5 20:57:08 lonshspltp10a-m Alarm: /MBU_B/MEMB#4,/MBU_B:SCF:Critical low voltage error(detector=187)
Nov  5 20:57:13 lonshspltp10a-m Warning: /MBU_B:SCF:Abnormal reaction of LSI (compare)
Nov  5 20:57:16 lonshspltp10a-m Warning: /MBU_B:SCF:Abnormal reaction of LSI (compare)

The following FMA MSG-IDs are seen:
   UUID: 2841095a-2956-4f63-b3e4-ec38124de691 MSG-ID: SCF-8004-3Y
   UUID: 68506c5e-c87b-4d36-93bc-a5100fb7faf3 MSG-ID: SCF-8002-K2

Showstatus:
*   MBU_B Status:Faulted;
*       MEMB#0 Status:Faulted;
*       MEMB#1 Status:Faulted;
*       MEMB#2 Status:Faulted;
*       MEMB#3 Status:Faulted;
*       MEMB#4 Status:Faulted;

Losing power from IOU#0 may cause other errors that are not listed here; these should also be considered victims of the power loss.
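To collect the faulted units from a saved showstatus listing such as the one above, a short Python sketch follows. It assumes the output has been pasted into a string and that each line has the `NAME Status:STATE;` shape shown in this document; the helper name `faulted_units` is hypothetical.

```python
import re

# Matches lines such as "*   MBU_B Status:Faulted;" in pasted showstatus output.
STATUS_RE = re.compile(r"^\*?\s*(\S+)\s+Status:(\w+);", re.MULTILINE)


def faulted_units(showstatus_text):
    """Return the unit names whose status is 'Faulted', in listing order."""
    return [unit for unit, status in STATUS_RE.findall(showstatus_text)
            if status == "Faulted"]


sample = """\
*   MBU_B Status:Faulted;
*       MEMB#0 Status:Faulted;
*       MEMB#1 Status:Faulted;
*       MEMB#2 Status:Faulted;
*       MEMB#3 Status:Faulted;
*       MEMB#4 Status:Faulted;
"""

units = faulted_units(sample)
print("faulted units:", units)
```

The resulting list is useful mainly as a record of which faults will need clearing later; when IOU#0 is the cause, this document treats these units as victims rather than as parts to replace.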

Changes

 

Cause

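IOU#0 supplies power to the MEMB and to the system clock. A power problem in IOU#0 therefore shows up as faults on the MBU_B and MEMB rather than on the IOU itself.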

Solution

Contact your service provider. This is a known condition that requires replacement hardware.

The FMA messages in XCP1111 now include the IOU as a suspect component. 

If the system is running a version below XCP1111, an XCP upgrade should be scheduled.

IOU#0 should be replaced by an authorized service provider.

The other faulted components are not faulty and should not be replaced.

On XCP versions prior to XCP1115, service mode is needed to run clearfault on all faulted components, which should include the MBU and a MEMB.

The Action Plan should include a pointer to:
Sun SPARC(R)Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers: Fault clearing and LEDs behavior (Doc ID 1007101.1), and should include the service mode password if possible.
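The two version thresholds in this section (XCP1111 and XCP1115) can be turned into a small planning check. The Python sketch below is illustrative only: it assumes that XCP versions can be compared by their four-digit number, and the function names `xcp_number` and `xcp_actions` are hypothetical.

```python
import re


def xcp_number(version):
    """Extract the four-digit number from a string such as 'XCP1111'."""
    match = re.search(r"(\d{4})", version)
    if match is None:
        raise ValueError(f"no XCP version number found in {version!r}")
    return int(match.group(1))


def xcp_actions(version):
    """Map an XCP version to the follow-up actions named in this document."""
    number = xcp_number(version)
    return {
        # Below XCP1111 the FMA messages do not list the IOU as a suspect.
        "schedule_xcp_upgrade": number < 1111,
        # Before XCP1115, clearing the faults requires service mode.
        "service_mode_for_clearfault": number < 1115,
    }


for version in ("XCP1093", "XCP1111", "XCP1115"):
    print(version, xcp_actions(version))
```

IOU#0 replacement by an authorized service provider is required regardless of the XCP version, so it is not part of the returned flags.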

 

References

<NOTE:1007101.1> - Sun SPARC(R)Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers: Fault clearing and LEDs behavior

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.