Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1525156.1
Update Date:2017-03-22
Keywords:

Solution Type  Technical Instruction Sure

Solution  1525156.1 :   Fujitsu M10/M12 Servers: Error Log Specified FRU Replacements  


Related Items
  • Fujitsu M10-4S
  • Fujitsu M10-4
  • Fujitsu SPARC M12-2
  • Fujitsu SPARC M12-2S
  • Fujitsu M10-1
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Fujitsu M10




Applies to:

Fujitsu M10-4S - Version All Versions to All Versions [Release All Releases]
Fujitsu M10-1 - Version All Versions to All Versions [Release All Releases]
Fujitsu M10-4 - Version All Versions to All Versions [Release All Releases]
Fujitsu SPARC M12-2 - Version All Versions to All Versions [Release All Releases]
Fujitsu SPARC M12-2S - Version All Versions to All Versions [Release All Releases]
All Platforms

Goal

Investigating a FRU indictment specified by an M10/M12 error message.

This document details how to initiate a Service Action Plan to investigate whether a hardware component should be replaced, as implicated by the XSCF fault codes on an M10/M12 system.

NOTE:  The implicated hardware component(s) is referred to as a Field Replaceable Unit (FRU) throughout this document.

Solution

This document makes a few assumptions:

  • An error event caused an automated recovery action to take place on a system (panic, reboot, errors, etc.).
  • The diagnosis engine determined that one or more FRUs are faulty and may have automatically disabled or deconfigured the suspect FRU(s).
  • The diagnosis engine produced a DIAGCODE which, when looked up in My Oracle Support, recommends replacing a FRU and may refer to this document.

1. Collect the DIAGCODE Fault Message.

The output can be displayed using "showlogs error". Example output is as follows:

Date: Nov 08 06:41:50 UTC 2012
    Code: 40002000-0053000000ff0000ff-019100a00000000000000000
    Status: Warning                Occurred: Nov 08 06:41:50.460 UTC 2012
    FRU : /BB#9/CMUL
    Msg: Hardware access error.
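
As an illustration only, the example entry above can be split into its labelled fields with a short script. This is a minimal sketch that assumes entries always use the labels shown in the example (Date, Code, Status, Occurred, FRU, Msg); the function name parse_error_entry is hypothetical and is not part of any XSCF or Oracle tool.

```python
import re

# Example entry copied from the "showlogs error" output shown above.
SAMPLE = """\
Date: Nov 08 06:41:50 UTC 2012
    Code: 40002000-0053000000ff0000ff-019100a00000000000000000
    Status: Warning                Occurred: Nov 08 06:41:50.460 UTC 2012
    FRU : /BB#9/CMUL
    Msg: Hardware access error.
"""

# Assumption: only the labels seen in the example entry are recognised.
_FIELD = re.compile(r"\b(Date|Code|Status|Occurred|FRU|Msg)\s*:\s*")


def parse_error_entry(text):
    """Return a dict mapping each label in one error entry to its value."""
    entry = {}
    for line in text.splitlines():
        matches = list(_FIELD.finditer(line))
        for i, match in enumerate(matches):
            # A value runs until the next label on the same line, or to line end.
            end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
            entry[match.group(1)] = line[match.end():end].strip()
    return entry


if __name__ == "__main__":
    for label, value in parse_error_entry(SAMPLE).items():
        print(f"{label}: {value}")
```

Keeping the fields in a dictionary makes it easier to quote the Code and FRU values accurately when logging a Service Request.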

2. Extract the message ID from "Code".

    Code: 40002000-0053000000ff0000ff-019100a00000000000000000
  In this example, the message ID is 019100a0 (the first eight characters of the third hyphen-separated field), which can be looked up in My Oracle Support.
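
The extraction can be sketched in a few lines. This sketch assumes, based solely on the single example above, that the message ID is always the first eight characters of the third hyphen-separated field; the function name extract_message_id is hypothetical.

```python
def extract_message_id(code):
    """Return the message ID embedded in a "Code" value.

    Assumption (inferred from the one documented example): the Code has
    three hyphen-separated fields and the message ID is the first eight
    characters of the third field.
    """
    fields = code.strip().split("-")
    if len(fields) != 3 or len(fields[2]) < 8:
        raise ValueError(f"unexpected Code format: {code!r}")
    return fields[2][:8]


if __name__ == "__main__":
    print(extract_message_id("40002000-0053000000ff0000ff-019100a00000000000000000"))
```

For the example Code above, the function returns 019100a0, matching the value given in this step.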

3. Collect the fault information to prepare to log a Service Request:

  • The "DIAGCODE" from step 1.
  • Collect a snapshot <Document 2097446.1>
    • If a snapshot cannot be gathered, collect the output of the following commands:
      version -c xcp -v
      showpcl -a
      showboards -a
      showhardconf
      showpparstatus -a
      showcod
      showcodusage
      showlogs error
      showlogs monitor
      showlogs -p # console
      showlogs -p # panic

  • Specify whether the first FRU in DIAGCODE output has been recently serviced, replaced, or errored.
  • Specify your contact information so the Oracle Support Services engineer can contact you to schedule the service.
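
When a snapshot is not available, a simple checklist helps confirm that every fallback command output listed above has been captured before the Service Request is logged. The sketch below is illustrative only: the command strings are copied verbatim from the list above (including the "#" placeholder, which is left as written), and the helper name missing_outputs is hypothetical.

```python
# Command strings copied verbatim from the fallback list above.
FALLBACK_COMMANDS = [
    "version -c xcp -v",
    "showpcl -a",
    "showboards -a",
    "showhardconf",
    "showpparstatus -a",
    "showcod",
    "showcodusage",
    "showlogs error",
    "showlogs monitor",
    "showlogs -p # console",
    "showlogs -p # panic",
]


def missing_outputs(collected):
    """Return the commands whose output is absent or empty.

    `collected` maps a command string to the text captured for it.
    """
    return [cmd for cmd in FALLBACK_COMMANDS
            if not collected.get(cmd, "").strip()]


if __name__ == "__main__":
    # Placeholder text stands in for real captured output.
    captured = {"showhardconf": "(captured output)", "showlogs error": "(captured output)"}
    for cmd in missing_outputs(captured):
        print("still needed:", cmd)
```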

4. Contact Oracle Support Services or your local service representative and open a "Service Request".

  • Attach the collected snapshot to the SR or upload the information to Oracle (see <Document 1020199.1>) with the SR number as the prefix to the file name.

5. Review FRU Replacement Methods information to prepare your configuration for the FRU replacement.

  • Components can be replaced using several FRU Replacement Methods, depending on which platform is involved, the specific FRU in question, and whether it is redundantly configured.
  • Review <Document 1526831.1> to become familiar with these replacement methods and prepare for the service.

6. An Oracle Support Services Engineer may need additional data to be collected. If so, they will specify the data to collect.

Please assist in capturing requested data so Oracle can resolve your issue with as little delay as possible. The most likely data requested will be:

  • PPAR Explorer: Data Collection for SPARC M10 Servers ( 1153444.1 )

 


Internal Only - Oracle Support Services Steps

1. Verify the fault event requires FRU replacement.

Confirm that the fault event message and all collected data are from the same date and implicate the same FRU component.
Confirm whether the implicated FRU(s) resources have been disabled or deconfigured.
Use the Predictive Self-Healing (PSH) Knowledge Article to confirm the event requires FRU replacement.
Check the Product Issues Page for breaking news or known issues.
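
The first check above (same date, same FRU) can be expressed as a small consistency test. This is a sketch under stated assumptions: each event is represented as a dictionary with an "occurred" datetime and a "fru" string, a representation chosen here for illustration, and the function name events_consistent is hypothetical.

```python
from datetime import datetime


def events_consistent(events):
    """Return True if all events share one calendar date and one FRU.

    Each event is a dict with an "occurred" datetime and a "fru" string
    (an illustrative representation, not a format produced by XSCF).
    """
    if not events:
        return False
    dates = {event["occurred"].date() for event in events}
    frus = {event["fru"] for event in events}
    return len(dates) == 1 and len(frus) == 1


if __name__ == "__main__":
    events = [
        {"occurred": datetime(2012, 11, 8, 6, 41, 50), "fru": "/BB#9/CMUL"},
        {"occurred": datetime(2012, 11, 8, 6, 42, 5), "fru": "/BB#9/CMUL"},
    ]
    print(events_consistent(events))
```

A False result does not rule out a replacement; it indicates that the data should be reviewed before the fault is attributed to a single FRU.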

2. Verify that the error message contains the list of FRU indictments for this fault event.

The list of FRUs is displayed in the order in which they are intended to be replaced (by percentage of likelihood).
Replacements should proceed from the likeliest FRU to the least likely.
Prepare to have the Primary FRU replaced (proceed to the next step).

3. Verify the FRU replacement method that can be used for the specific FRU requiring service and the configuration in question.

The customer may have specified a desired method to use, so verify whether the desired method is possible.
Reference:  <Document 1526831.1> M10 CRU / FRU Replacement Methods

4. Create the Service Action Plan and report the recommendations to the Customer/End User.

Use the Action Plan Creator Tool to create the Service Action Plan.
Provide the Customer/End User with the summary of the action plan, including any changes to the FRU which will be replaced and/or the method in which the replacement will proceed.

5. Dispatch the replacement to the appropriate field resources and choose the appropriate Canned Action Plan in ATR.

Reference the service manual for the platform type and FRU in question if needed.

 
6. Contact the Customer/End User and confirm the fix.

Confirm that the FRU replacement resolved the issue and no errors have repeated for at least 24 hours.
Confirm all resources are re-enabled and configured into the domain(s) properly.
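
The 24-hour observation window can be checked mechanically once the replacement time and any later error timestamps are known. The following is a minimal sketch; the function name fix_confirmed and its parameters are hypothetical, and the timestamps are assumed to have been gathered from the error log by other means.

```python
from datetime import datetime, timedelta


def fix_confirmed(replaced_at, error_times, now, quiet_period=timedelta(hours=24)):
    """Return True if the quiet period has elapsed with no new errors.

    replaced_at: datetime when the FRU replacement completed.
    error_times: datetimes of logged error events.
    now:         datetime at which the check is made.
    """
    if now - replaced_at < quiet_period:
        return False  # Not enough time has passed to confirm the fix.
    # Any error logged after the replacement means the fix is not confirmed.
    return not any(replaced_at < t <= now for t in error_times)


if __name__ == "__main__":
    replaced = datetime(2012, 11, 9, 10, 0, 0)
    earlier_error = [datetime(2012, 11, 8, 6, 41, 50)]
    print(fix_confirmed(replaced, earlier_error, replaced + timedelta(hours=25)))
```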

7. If the exact same fault event repeats, go back to step 2 and replace the next likeliest FRU listed in the error log output.

If the same error persists and all FRUs in the list have been replaced or you are unsure of the next steps, collaborate with the next level of support for further investigation.
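
The replace-in-order-then-escalate logic described above can be sketched as follows. The function name next_action and the second FRU path in the usage example are hypothetical and used only for illustration; the FRU list is assumed to be already ordered from likeliest to least likely, as reported in the error log.

```python
def next_action(fru_list, replaced):
    """Decide the next step when the same fault event repeats.

    fru_list: FRUs in the order reported by the error log (likeliest first).
    replaced: collection of FRUs that have already been replaced.
    Returns ("replace", fru) for the next candidate, or ("escalate", None)
    when every listed FRU has already been replaced.
    """
    for fru in fru_list:
        if fru not in replaced:
            return ("replace", fru)
    return ("escalate", None)


if __name__ == "__main__":
    # "/BB#9/CMUU" is a made-up second entry for illustration.
    suspects = ["/BB#9/CMUL", "/BB#9/CMUU"]
    print(next_action(suspects, replaced={"/BB#9/CMUL"}))
```

An "escalate" result corresponds to collaborating with the next level of support.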

 


  Copyright © 2018 Oracle, Inc.  All rights reserved.