Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1021874.1
Update Date: 2017-11-29
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1021874.1 :   SCF-8004-G4 - A component in a FRU signaled it has failed.  


Related Items
  • Sun SPARC Enterprise M8000 Server
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M3000 Server
  • Sun SPARC Enterprise M9000-32 Server
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise M9000-64 Server
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun PSH

PreviouslyPublishedAs
SCF-8004-G4


Applies to:

Sun SPARC Enterprise M3000 Server - Version All Versions and later
Sun SPARC Enterprise M4000 Server - Version All Versions and later
Sun SPARC Enterprise M8000 Server - Version All Versions and later
Sun SPARC Enterprise M9000-32 Server - Version All Versions and later
Sun SPARC Enterprise M5000 Server - Version All Versions and later
All Platforms

Purpose

Provide additional information for message ID: SCF-8004-G4

Scope

 

Details

Predictive Self-Healing Article
A component in a FRU signaled it has failed.

Type

Fault
  fault.chassis.device.fail

Severity

Critical

Description

A component in a FRU has signaled it has failed. Please consult the detail section of the knowledge article for additional information.

Automated Response

The component in the FRU will be deconfigured (which may cause the FRU to be deconfigured). Please consult the detail section of the knowledge article for additional information.

Impact

Domains may be reset, the platform may become unbootable, or the platform may be powered down. Please consult the detail section of the knowledge article for additional information.

Suggested Action for System Administrator

Components can be marked faulty for this error when there are problems with the power supplied to the system.
Components marked faulty because of input power problems must have the fault cleared (by a service technician if the XCP level is below XCP 1115), but they should not be replaced.
Verify that the power provided to the system is stable, then schedule a repair action to replace the affected Field Replaceable Unit (FRU).

Details

A component in a FRU has signaled it has failed.

The FRU or device that has signaled that it has failed can be identified from the FMRI of the fault event.

The following section provides a list of the potential FRUs and devices
that can signal a failure condition, and specifies the impact and automated
actions taken by the platform or domain in response to the fault.



   SPARC Enterprise M3000 platform:

      DDC (DC to DC converter) on Motherboard Unit (MBU):

         Platform is powered down and will not be powered up.
         Platform becomes unbootable.


   SPARC Enterprise M4000/M5000 platforms:


      PSU:
 
         PSU is deconfigured.

         If there are now insufficient operational PSUs to power the platform,
         then the platform is powered down and the platform becomes unbootable.
         Otherwise, no other action is taken.


      DDC (DC to DC converter) on a CPU Module:
 
         CPU Module is powered down and the domain using the CPU Module is reset.
         CPU Module is deconfigured.


      DDC_A on a Motherboard:
      DDC_A on an IOU:
      DDC_B on a Motherboard:
      DDC_B on a DDC Riser on an IOU:

         Platform is powered down and will not be powered up.
         Platform becomes unbootable.




   SPARC Enterprise M8000/M9000 platforms:


      PSU:
 
         PSU is deconfigured.

         If there are insufficient operational PSUs to power the platform,
         then the platform is powered down and the platform becomes unbootable.
         Otherwise, no other action is taken.


      DDC_A on a Backplane (BP_A):

         If there remain sufficient operational DDC_As, then the DDC_A is deconfigured.
         If there are insufficient operational DDC_As, then the platform is powered down
         and the platform becomes unbootable.


      DDC (DC to DC converter) on a CPU Module (CPUM):

         DDCs on the CPU Module are redundant.

         If there remain sufficient operational DDCs on the CPU Module,
         then no action is taken and nothing is deconfigured.

         Otherwise, the CPU Module is powered down and the domain using the CPU Module is reset.
         CPU Module is deconfigured.


      DDC on a CPU/Memory Unit (CMU):

         DDCs on the CMU are redundant.

         If there remain sufficient operational DDCs on the CMU,
         then no action is taken and nothing is deconfigured.

         Otherwise, the CMU is powered down and the domains using the CMU are reset.
         CMU is deconfigured, along with all of its associated I/O.


      Slow Start Module (SSM) on a CPU/Memory Unit (CMU):

         CMU is powered down and the domains using the CMU are reset.
         CMU is deconfigured, along with all of its associated I/O.


      DDC on a Crossbar Unit (XBU):

         DDCs on an XBU are redundant.

         If there remain sufficient operational DDCs on an XBU,
         then no action is taken and nothing is deconfigured.

         Otherwise, the platform is reset and the crossbar way is deconfigured.

            If this is the first crossbar way being deconfigured,
            then the platform will operate with reduced performance.

            If a crossbar way has already been deconfigured,
            then the platform will be powered down.


      Slow Start Module (SSM) on a Crossbar Unit (XBU):

         XBU is powered off.
         Platform is reset and the crossbar way is deconfigured.

            If this is the first crossbar way being deconfigured,
            then the platform will operate with reduced performance.

            If a crossbar way has already been deconfigured,
            then the platform will be powered down.


      DDC on an IO Unit (IOU):

         DDCs on an IOU are redundant.

         If there remain sufficient operational DDCs on an IOU,
         then no action is taken and nothing is deconfigured.

         Otherwise, the IOU is powered off and the domains using the IOU are reset.
         IOU is deconfigured.



      Slow Start Module (SSM) on an IO Unit (IOU):

         IOU is powered off and the domains using the IOU are reset.
         IOU is deconfigured.



      DDC on a Clock Unit (CLKU):

         DDCs on a Clock Unit are N+1 redundant.

         If there remain sufficient operational DDCs on a Clock Unit,
         then no action is taken and nothing is deconfigured.

         If this is the active Clock Unit and there are insufficient operational
         DDCs on the active Clock Unit, then the active Clock Unit is powered off,
         the platform is reset, and the active Clock Unit is deconfigured.
 
         If this is the standby Clock Unit and there are insufficient operational
         DDCs on the standby Clock Unit, then the standby Clock Unit
         is powered off and no further action is taken.
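
The redundancy rules listed above for the M8000/M9000 platforms follow a common pattern, which the following Python sketch tries to make explicit. It is an illustration only and is not how the XSCF firmware implements the response. The function names and the `required` threshold are hypothetical: the article says only that "sufficient" operational DDCs must remain and gives no numbers.

```python
def redundant_ddc_response(unit: str, operational: int, required: int) -> str:
    """Response when a DDC fails on a unit whose DDCs are redundant.

    `required` is a hypothetical threshold; the article only says
    "sufficient" and does not state how many DDCs are needed.
    """
    if operational >= required:
        return "no action taken; nothing deconfigured"
    return f"{unit} powered down and deconfigured; domains using it are reset"


def crossbar_way_response(ways_already_deconfigured: int) -> str:
    """Response when an XBU failure forces a crossbar way to be deconfigured."""
    if ways_already_deconfigured == 0:
        # First crossbar way lost: the platform keeps running, but slower.
        return "platform reset; operates with reduced performance"
    return "platform powered down"


print(redundant_ddc_response("CMU", operational=2, required=2))
print(redundant_ddc_response("CMU", operational=1, required=2))
print(crossbar_way_response(0))
print(crossbar_way_response(1))
```

The sketch covers only the CMU-style and crossbar cases. PSU, Clock Unit, and non-redundant failures such as an SSM follow the per-component rules given in the list.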




Step 1. Collect the Fault Message  (two methods)


   Single-line fault message displayed on the XSCF console:

   Mar 27 15:24:40 nwk-dc2-1-sc0 fmd: SOURCE: sde, REV: 1.12, CSN: nwk-dc2-1  
   EVENT-ID: c2837d94-fecf-41bd-a99d-d6e7ce08ead4
   Refer to http://www.sun.com/msg/SCF-8004-G4 for detailed information.


   Complete fault message using 'fmdump -m' on the XSCF console:

   MSG-ID:  SCF-8004-G4, TYPE: Fault, VER: 1, SEVERITY: Critical
   EVENT-TIME: Tue Mar 27 15:24:40 PDT 2007
   PLATFORM: SPARC-Enterprise, CSN: nwk-dc2-1, HOSTNAME: nwk-dc2-1-sc0
   SOURCE: sde, REV: 1.12
   EVENT-ID: c2837d94-fecf-41bd-a99d-d6e7ce08ead4
   DESC: A component in a FRU has signaled it has failed.
   Please consult the detail section of the knowledge article for additional information.
   Refer to http://www.sun.com/msg/SCF-8004-G4 for more information.
   AUTO-RESPONSE: The component in the FRU will be deconfigured (which may cause the FRU to be deconfigured). Please
   consult the detail section of the knowledge article for additional information.
   IMPACT: Domains may be reset, the platform may become unbootable, or the platform may be powered down.
   Please consult the detail section of the knowledge article for additional information.
   REC-ACTION: Schedule a repair action to replace the affected Field Replaceable Unit (FRU),
   the identity of which can be determined using fmdump -v -u EVENT_ID.
   Please consult the detail section of the knowledge article for additional information.
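
When the fault message has been captured to a file or a terminal log, the fields needed for Step 2 can be pulled out with a short script. The Python sketch below is a workstation-side helper, not an XSCF feature. It assumes the message has the layout shown above, and the function names `summarize_fault` and `followup_command` are hypothetical.

```python
import re


def summarize_fault(message: str) -> dict:
    """Extract MSG-ID, SEVERITY and EVENT-ID from captured 'fmdump -m' text."""
    patterns = {
        "msg_id": r"MSG-ID:\s*([\w-]+)",
        "severity": r"SEVERITY:\s*(\w+)",
        "event_id": r"EVENT-ID:\s*([0-9a-fA-F-]{36})",
    }
    summary = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, message)
        # A field that is absent from the captured text is reported as None.
        summary[field] = match.group(1) if match else None
    return summary


def followup_command(event_id: str) -> str:
    """Build the command that the REC-ACTION text says identifies the FRU."""
    return f"fmdump -v -u {event_id}"


sample = """\
MSG-ID:  SCF-8004-G4, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Tue Mar 27 15:24:40 PDT 2007
SOURCE: sde, REV: 1.12
EVENT-ID: c2837d94-fecf-41bd-a99d-d6e7ce08ead4
"""

summary = summarize_fault(sample)
print(summary)
print(followup_command(summary["event_id"]))
```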


Step 2. Collect the output from the 'fmdump -v -u EVENT_ID' command
 


   SPARC Enterprise platform example:

   xscf> fmdump -v -u c2837d94-fecf-41bd-a99d-d6e7ce08ead4

         TIME                 UUID                                 MSG-ID
         Mar 27 15:24:40.5019 c2837d94-fecf-41bd-a99d-d6e7ce08ead4 SCF-8004-G4
  100%  fault.chassis.device.fail

        Problem in: hc:///chassis=0/psu=0
           Affects: hc:///chassis=0/psu=0
               FRU: hc://:product-id=SPARC-Enterprise:chassis-id=nwk-dc2-1:
                    server-id=nwk-dc2-1-sc0:
                    part=CA01022-0690:revision=31/component=/PSU#0
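
The "Problem in" line carries the FMRI that identifies the failed device. As a rough illustration, the Python sketch below extracts that FMRI from captured output and splits it into name/instance pairs. It assumes the `hc:///name=instance/...` form shown in the example above; the function names are hypothetical, and other FMRI forms are not handled.

```python
import re
from typing import List, Optional, Tuple


def problem_fmri(fmdump_output: str) -> Optional[str]:
    """Return the FMRI from the 'Problem in:' line, or None if it is absent."""
    match = re.search(r"Problem in:\s*(\S+)", fmdump_output)
    return match.group(1) if match else None


def parse_hc_fmri(fmri: str) -> List[Tuple[str, str]]:
    """Split an FMRI such as hc:///chassis=0/psu=0 into (name, instance) pairs."""
    prefix = "hc:///"
    if not fmri.startswith(prefix):
        raise ValueError(f"not an hc:/// FMRI: {fmri!r}")
    pairs = []
    for segment in fmri[len(prefix):].split("/"):
        if not segment:
            continue  # ignore empty path segments
        name, _, instance = segment.partition("=")
        pairs.append((name, instance))
    return pairs


sample = """\
  100%  fault.chassis.device.fail

        Problem in: hc:///chassis=0/psu=0
           Affects: hc:///chassis=0/psu=0
"""

fmri = problem_fmri(sample)
print(fmri)
print(parse_hc_fmri(fmri))
```

In this example the last pair names the failed device (`psu`, instance `0`), which matches the PSU#0 component reported on the FRU line.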



Step 3. Contact your Authorized Service Provider.



If you require additional information, please refer to Document 1002526.1.

 


  Copyright © 2018 Oracle, Inc.  All rights reserved.