Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1021914.1
Update Date: 2014-07-07
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1021914.1 :   SCF-8005-CA - Fatal error detected within a SC chip.  


Related Items
  • Sun SPARC Enterprise M8000 Server
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M3000 Server
  • Sun SPARC Enterprise M9000-32 Server
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise M9000-64 Server
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun PSH

Previously Published As
SCF-8005-CA


Applies to:

Sun SPARC Enterprise M3000 Server - All Versions
Sun SPARC Enterprise M4000 Server - All Versions
Sun SPARC Enterprise M5000 Server - All Versions
Sun SPARC Enterprise M8000 Server - All Versions
Sun SPARC Enterprise M9000-32 Server - All Versions
Sun SPARC Enterprise M9000-64 Server - All Versions
Oracle Solaris on SPARC (64-bit)

Purpose

Provides additional information for message ID SCF-8005-CA.

Details

Predictive Self-Healing Article
Fatal error detected within a SC chip.

Type

Fault
  fault.chassis.SPARC-Enterprise.asic.sc.fe

Severity

Critical

Description

A fatal error was detected within a SC chip.

Automated Response

Some number of CPU chips, some number of MAC chips, and some amount of IO will be deconfigured. The platform may become unbootable. Please consult the detail section of the knowledge article for additional information.

Impact

One or more domains will be reset. The platform may become unbootable. Please consult the detail section of the knowledge article for additional information.

Suggested Action for System Administrator

Schedule a repair action to replace the affected Field Replaceable Unit (FRU), the identity of which can be determined using fmdump -v -u EVENT_ID. Please consult the detail section of the knowledge article for additional information.

Details

A fatal error was detected within a SC chip.


   SPARC Enterprise M3000 platform:

      The JSC chip replaces both the SC chip and the MAC chip.

      The type of fault can be determined from the "Affects:" FMRI in the fmdump output.
      The types of faults that can be detected are:
   
         -Common logic internal to the JSC chip;
         -Part of the JSC chip that interfaces to an IO channel;
         -Part of the JSC chip that interfaces to a CPU chip.


         Faults in common logic internal to the JSC chip:

            Platform becomes unbootable.


         Faults in the part of the JSC chip that interfaces to an IO channel:

            Domain using the IO Channel is reset.
            IO Channel will be deconfigured after the domain is reset.

            IO Channel 0 provides access to all internal I/O and PCI-Express slots 0, 1, 2, and 3,
            so all IO on the platform is deconfigured and the platform becomes unbootable.


         Faults in the part of the JSC chip that interfaces to a CPU chip:

            Domain using the CPU chip is reset.
            CPU chip will be deconfigured after the domain is reset.




   SPARC Enterprise M4000 platform:


      The type of fault can be determined from the "Affects:" FMRI in the fmdump output.
      The types of faults that can be detected are:
   
         -Common logic internal to the SC chip;
         -Part of the SC chip that interfaces to an IO channel;
         -Part of the SC chip that interfaces to a CPU chip;
         -Part of the SC chip that interfaces to a MAC chip;
         -Part of the SC chip that interfaces to a specific XSB.

   
         Faults in common logic internal to the SC chip:

            Platform becomes unbootable.


         Faults in the part of the SC chip that interfaces to an IO channel:

            Domain using the IO Channel is reset.
            IO Channel will be deconfigured after the domain is reset.


               Availability of I/O devices depends on which IO Channel was deconfigured.
   
                  IO Channel 0 provides access to PCI-X slot 0, PCI-Express slots 1 and 2,
                  and all the internal I/O.

                  IO Channel 1 provides access to PCI-Express slots 3 and 4.


         Faults in the part of the SC chip that interfaces to a CPU chip:

            Domain using the CPU chip is reset.
            CPU chip will be deconfigured after the domain is reset.


         Faults in the part of the SC chip that interfaces to a MAC chip:

            Domain using the MAC chip is reset.
            8 DIMMs associated with this MAC will be deconfigured after the domain is reset.


         Faults in the part of the SC chip that interfaces to a specific XSB:

            Domain using the XSB is reset.
            XSB will be deconfigured after the domain is reset.




   SPARC Enterprise M5000 platform:


      The type of fault can be determined from the "Affects:" FMRI in the fmdump output.
      The types of faults that can be detected are:
   
         -Common logic internal to the SC chip;
         -Part of the SC chip that interfaces to an IO channel;
         -Part of the SC chip that interfaces to a CPU chip;
         -Part of the SC chip that interfaces to a MAC chip;
         -Part of the SC chip that interfaces to a specific XSB.

   
         Faults in common logic internal to the SC chip:

            All domains are reset.

            One-half of the Motherboard with the faulty SC chip will be deconfigured,
            along with its associated I/O, after the domains are reset.


         Faults in the part of the SC chip that interfaces to an IO channel:

            Domain using the IO Channel is reset.
            IO Channel will be deconfigured after the domain is reset.


               Availability of I/O devices depends on which IO Channel was deconfigured.
            
                  IO Channel 0 provides access to PCI-X slot 0, PCI-Express slots 1 and 2,
                  and all the internal I/O.

                     IOU#0: Internal I/O includes the DVD drive, the DAT drive, HDD#0 and HDD#1.
                     IOU#1: Internal I/O includes HDD#2 and HDD#3.

                  IO Channel 1 provides access to PCI-Express slots 3 and 4.
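
The M5000 channel-to-device notes above can be captured as a small lookup, which may help when estimating what a domain loses after an IO Channel is deconfigured. This is only a sketch: the names `CHANNEL_SLOTS`, `INTERNAL_IO`, and `unavailable_devices` are hypothetical, and it assumes the slot layout is the same on IOU#0 and IOU#1, with only the internal I/O differing.

```python
# Illustrative lookup transcribed from the M5000 notes; not an Oracle tool.
# Assumption: both IOUs share the same channel-to-slot layout.
CHANNEL_SLOTS = {
    0: ["PCI-X slot 0", "PCI-Express slot 1", "PCI-Express slot 2"],
    1: ["PCI-Express slot 3", "PCI-Express slot 4"],
}

# Internal I/O is reached through IO Channel 0 and differs per IOU.
INTERNAL_IO = {
    0: ["DVD drive", "DAT drive", "HDD#0", "HDD#1"],
    1: ["HDD#2", "HDD#3"],
}


def unavailable_devices(iou, io_channel):
    """List the devices that become unreachable when a channel is deconfigured."""
    if iou not in INTERNAL_IO:
        raise ValueError(f"unknown IOU: {iou}")
    if io_channel not in CHANNEL_SLOTS:
        raise ValueError(f"unknown IO Channel: {io_channel}")
    devices = list(CHANNEL_SLOTS[io_channel])
    if io_channel == 0:
        # Only channel 0 carries the internal I/O.
        devices.extend(INTERNAL_IO[iou])
    return devices


if __name__ == "__main__":
    for dev in unavailable_devices(iou=1, io_channel=0):
        print(dev)
```

Keeping the slots and the internal I/O in separate tables mirrors the way the notes describe them, so either table can be corrected without touching the other.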


         Faults in the part of the SC chip that interfaces to a CPU chip:

            Domain using the CPU chip is reset.
            CPU chip will be deconfigured after the domain is reset.


         Faults in the part of the SC chip that interfaces to a MAC chip:

            Domain using the MAC chip is reset.
            8 DIMMs associated with this MAC will be deconfigured after the domain is reset.


         Faults in the part of the SC chip that interfaces to a specific XSB:

            Domain using the XSB is reset.
            XSB will be deconfigured after the domain is reset.

 


   SPARC Enterprise M8000/M9000 platforms:


      The type of fault can be determined from the "Affects:" FMRI in the fmdump output.
      The types of faults that can be detected are:
   
         -Common logic internal to the SC chip;
         -Part of the SC chip that interfaces to an IO channel;
         -Part of the SC chip that interfaces to a CPU chip;
         -Part of the SC chip that interfaces to a MAC chip;
         -Part of the SC chip that interfaces to a specific XSB.

   
         Faults in common logic internal to the SC chip:

            Domains using the CMU are reset.
            CMU and all its associated I/O will be deconfigured after the domains are reset.


         Faults in the part of the SC chip that interfaces to an IO channel:

            Domain using the IO Channel is reset.
            IO Channel will be deconfigured after the domain is reset.


               Availability of I/O devices depends on which IO Channel was deconfigured.

                  IO Channel 0 on IOC chip 0 provides access to PCI-Express slots 0 and 1 on this IOU.

                  IO Channel 1 on IOC chip 0 provides access to PCI-Express slots 2 and 3 on this IOU.

                  IO Channel 0 on IOC chip 1 provides access to PCI-Express slots 4 and 5 on this IOU.

                  IO Channel 1 on IOC chip 1 provides access to PCI-Express slots 6 and 7 on this IOU.
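
The four M8000/M9000 mappings above follow a regular pattern: each IOC chip covers four consecutive slots and each IO Channel covers two of them. A minimal sketch of that arithmetic follows. The function name `pcie_slots_for_channel` is hypothetical, and the formula is simply inferred from the four cases listed, not taken from platform documentation.

```python
def pcie_slots_for_channel(ioc_chip, io_channel):
    """Return the two PCI-Express slot numbers behind one IO Channel on an IOU.

    Inferred from the listed cases: first slot = 4 * ioc_chip + 2 * io_channel.
    """
    if ioc_chip not in (0, 1):
        raise ValueError(f"IOC chip must be 0 or 1, got {ioc_chip}")
    if io_channel not in (0, 1):
        raise ValueError(f"IO Channel must be 0 or 1, got {io_channel}")
    first = 4 * ioc_chip + 2 * io_channel
    return (first, first + 1)


if __name__ == "__main__":
    for chip in (0, 1):
        for channel in (0, 1):
            slots = pcie_slots_for_channel(chip, channel)
            print(f"IOC chip {chip}, IO Channel {channel}: slots {slots[0]} and {slots[1]}")
```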


         Faults in the part of the SC chip that interfaces to a CPU chip:

            Domain using the CPU chip is reset.
            CPU chip will be deconfigured after the domain is reset.


         Faults in the part of the SC chip that interfaces to a MAC chip:

            Domains using the MAC chip are reset.
            16 DIMMs associated with this MAC will be deconfigured after the domains are reset.

    
         Faults in the part of the SC chip that interfaces to a specific XSB:
   
            Domain using the XSB is reset.
            XSB (CPU, Memory, and I/O) for the CMU in question will be deconfigured
            after the domain is reset.



The recommended service action for this event is to schedule the replacement of the affected FRU.


Step 1. Collect the fault message (use one of the following methods):


   Single-line fault message displayed on the XSCF console:

   Mar 20 21:43:19 san-ff2-21-0 fmd: SOURCE: sde, REV: 1.12, CSN: 7860000772  
   EVENT-ID: 19a29925-4abf-4231-974a-1ba16ff9c848
   Refer to http://www.sun.com/msg/SCF-8005-CA for detailed information.


   Complete fault message using 'fmdump -m' on the XSCF console:

   MSG-ID: SCF-8005-CA, TYPE: Fault, VER: 1, SEVERITY: Critical
   EVENT-TIME: Tue Mar 20 21:43:19 UTC 2007
   PLATFORM: SPARC-Enterprise, CSN: 7860000772, HOSTNAME: san-ff2-21-0
   SOURCE: sde, REV: 1.12
   EVENT-ID: 19a29925-4abf-4231-974a-1ba16ff9c848
   DESC: A fatal error was detected within a SC chip.
   Refer to http://www.sun.com/msg/SCF-8005-CA for more information.
   AUTO-RESPONSE: Some number of CPU chips, some number of MAC chips,
   and some amount of IO will be deconfigured.  The platform may become unbootable.
   Please consult the detail section of the knowledge article for additional information.
   IMPACT: One or more domains will be reset. The platform may become unbootable.
   Please consult the detail section of the knowledge article for additional information.
   REC-ACTION: Schedule a repair action to replace the affected Field Replaceable Unit (FRU),
   the identity of which can be determined using fmdump -v -u EVENT_ID.
   Please consult the detail section of the knowledge article for additional information.
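
Step 2 needs the EVENT-ID from the message collected in Step 1. When the console text has been saved to a file or log, a short script can pull it out. The sketch below is illustrative only: `extract_ids` and `fmdump_command` are hypothetical helper names, and the regular expressions assume the message layout shown in the samples above.

```python
import re

# Assumes the EVENT-ID is a standard 8-4-4-4-12 hexadecimal UUID, as in the sample.
EVENT_ID_RE = re.compile(
    r"EVENT-ID:\s*([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
)
# Assumes message IDs look like SCF-8005-CA (prefix, four digits, two characters).
MSG_ID_RE = re.compile(r"\b(SCF-\d{4}-[0-9A-Z]{2})\b")


def extract_ids(text):
    """Return (msg_id, event_id) from captured console text; None where absent."""
    msg = MSG_ID_RE.search(text)
    event = EVENT_ID_RE.search(text)
    return (msg.group(1) if msg else None, event.group(1) if event else None)


def fmdump_command(event_id):
    """Build the Step 2 command line for a given event ID."""
    return f"fmdump -v -u {event_id}"


SAMPLE = """\
Mar 20 21:43:19 san-ff2-21-0 fmd: SOURCE: sde, REV: 1.12, CSN: 7860000772
EVENT-ID: 19a29925-4abf-4231-974a-1ba16ff9c848
Refer to http://www.sun.com/msg/SCF-8005-CA for detailed information.
"""

if __name__ == "__main__":
    msg_id, event_id = extract_ids(SAMPLE)
    print(msg_id)
    print(fmdump_command(event_id))
```

The script only prepares the command string; the command itself is still run on the XSCF console as described in Step 2.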


Step 2. Collect the output from the 'fmdump -v -u EVENT_ID' command


        xscf> fmdump -v -u 19a29925-4abf-4231-974a-1ba16ff9c848

        TIME                 UUID                                 MSG-ID
        Mar 20 21:43:19.4692 19a29925-4abf-4231-974a-1ba16ff9c848 SCF-8005-CA
  100%  fault.chassis.SPARC-Enterprise.asic.sc.fe

        Problem in: hc:///chassis=0/cmu=0/sc=0
           Affects: hc:///chassis=0/cmu=0
               FRU: hc://:product-id=SPARC-Enterprise:chassis-id=7860000772:
                    server-id=san-ff2-21-0:
                    part=CA20393-B50X 001AA:revision=0101/component=/MBU_B
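
The "Problem in:", "Affects:", and "FRU:" fields are the ones the platform notes above rely on, so it can be convenient to extract them from saved output before contacting the service provider. The following is a rough sketch, not a supported parser: `parse_fmdump_verbose` and `last_component` are hypothetical names, and the code assumes the FRU value may wrap onto following non-blank lines, as it does in the sample.

```python
import re


def parse_fmdump_verbose(text):
    """Extract the fault class, 'Problem in', 'Affects' and 'FRU' fields."""
    result = {"fault_class": None, "problem_in": None, "affects": None, "fru": None}
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        certainty = re.match(r"\d+%\s+(\S+)", line)
        if certainty:
            result["fault_class"] = certainty.group(1)
        elif line.startswith("Problem in:"):
            result["problem_in"] = line.split(":", 1)[1].strip()
        elif line.startswith("Affects:"):
            result["affects"] = line.split(":", 1)[1].strip()
        elif line.startswith("FRU:"):
            parts = [line.split(":", 1)[1].strip()]
            # Assumption: wrapped FRU text continues until the next blank line.
            while i + 1 < len(lines) and lines[i + 1].strip():
                i += 1
                parts.append(lines[i].strip())
            result["fru"] = "".join(parts)
        i += 1
    return result


def last_component(fmri):
    """Split the final path element of an hc FMRI into (name, instance)."""
    name, _, instance = fmri.rstrip("/").rsplit("/", 1)[-1].partition("=")
    return (name, instance)


SAMPLE = """\
        TIME                 UUID                                 MSG-ID
        Mar 20 21:43:19.4692 19a29925-4abf-4231-974a-1ba16ff9c848 SCF-8005-CA
  100%  fault.chassis.SPARC-Enterprise.asic.sc.fe

        Problem in: hc:///chassis=0/cmu=0/sc=0
           Affects: hc:///chassis=0/cmu=0
               FRU: hc://:product-id=SPARC-Enterprise:chassis-id=7860000772:
                    server-id=san-ff2-21-0:
                    part=CA20393-B50X 001AA:revision=0101/component=/MBU_B
"""

if __name__ == "__main__":
    fields = parse_fmdump_verbose(SAMPLE)
    print(fields["fault_class"])
    print(last_component(fields["affects"]))
    print(fields["fru"])
```

In this sample the last element of the "Affects:" FMRI is `cmu=0`. How that element maps to the fault types listed for each platform is left to the reader and the service provider, since the article does not spell out the mapping.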


Step 3. Contact your Authorized Service Provider.



If you require additional information, please refer to Document 1002526.1.

 


  Copyright © 2018 Oracle, Inc.  All rights reserved.