Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1381773.1
Update Date:2017-11-27
Keywords:

Solution Type  Technical Instruction Sure

Solution  1381773.1 :   How to clear FMA logs on the ILOM or Solaris on x86 platforms  


Related Items
  • Sun Fire X2200 M2 Server
  • Exadata X4-2 Hardware
  • Sun Fire X4150 Server
  • Sun Fire X4440 Server
  • Sun Fire X4540 Server
  • Sun Fire X4275 Server
  • Sun Fire X4200 Server
  • Sun Fire X4250 Server
  • Sun Fire X4200 M2 Server
  • Exalogic Elastic Cloud X4-2 Hardware
  • Sun Fire X4240 Server
  • Sun Fire X4600 M2 Server
  • Sun Fire X2270 Server
  • Sun Fire X4140 Server
  • Sun Fire X4470 Server
  • Sun Server X2-8
  • Sun Fire X4170 Server
  • Sun Fire X4100 M2 Server
  • Sun Fire X2100 M2 Server
  • Sun Fire X4270 M2 Server
  • Sun Fire X2250 Server
  • Sun Fire X4270 Server
  • Sun Fire X4640 Server
  • Sun Fire X2270 M2 Server
  • Sun Fire X4600 Server
  • Sun Fire X4100 Server
  • Sun Fire X2100 Server
  • Sun Fire X4800 Server
  • Sun Fire X4500 Server
  • Sun Fire X4170 M2 Server
  • Sun Server X2-4
  • Sun Fire X4450 Server
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP




In this Document
Goal
Solution
References


Applies to:

Sun Fire X4170 Server - Version Not Applicable and later
Sun Fire X2100 Server - Version Not Applicable and later
Sun Fire X2200 M2 Server - Version Not Applicable and later
Sun Fire X4200 M2 Server - Version Not Applicable and later
Sun Fire X4800 Server - Version Not Applicable and later
Information in this document applies to any platform.

Goal

How to clear FMA logs on the ILOM or Solaris on x86 platforms

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE ENGINEER NEED:(IS A SITE ENGINEER AVAILABLE?)

X64 hardware and Solaris 10 Trained.

TASK COMPLEXITY:0

TIME ESTIMATE: 20 minutes

FIELD ENGINEER INSTRUCTIONS

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? :

Need access to the ILOM and Solaris. System can be powered on.

WHAT ACTION DOES THE ENGINEER NEED TO TAKE:

This article provides standard procedures for viewing details of a hardware fault diagnosed by the ILOM-based fault managers.

Information contained in this article includes the preparation required when opening a service request and actions required to modify the fault status after completion of the repair action.

Reference doc: PSH Procedural Article for ILOM-Based Diagnosis (Doc ID 1155200.1)

Section A - Displaying Fault Event Information

This section describes specific procedures for viewing the details of a diagnosed fault, such as the impacted resources and the replaceable parts that have been identified as faulty. Perform these procedures before manually submitting a service request.

The Fault Management Shell is the preferred method for displaying the details of a diagnosed fault. However, support for this command shell varies depending on the ILOM release level and the server product model.

Determine whether the Fault Management Shell is supported on your product by logging in to the ILOM command line interface as root and executing the command indicated in the procedures below.

Note: The host name may be substituted in place of the IP address of the Service Processor when logging into the ILOM CLI.

Sample:

% ssh -l root <IP address of Service Processor>

 

Alternatively, if no IP address has been configured for the ILOM, access it via the Serial MGT port.

How to access ILOM via Serial MGT port:
1. Connect a terminal (or a PC running terminal emulation software) to the server serial port.
2. Ensure that the server hardware is installed and cables are inserted.
3. Verify that your terminal, laptop, PC, or terminal server is operational.
4. Configure the terminal device or the terminal emulation software running on a laptop or PC to the following settings:
* 8,N,1: eight data bits, no parity, one stop bit
* 9600 baud
* Disable hardware flow control (CTS/RTS)
* Disable software flow control (XON/XOFF)
5. Connect a null serial modem cable from the server's back panel RJ45 serial port to a terminal device (if not connected already).
6. Press Enter on the terminal device to establish a connection between the terminal device and the ILOM service processor (SP). The following prompt appears.
->
7. Type the default user name root, and then type the default password: changeme to log in to the ILOM SP.
The ILOM displays a default command prompt, indicating that you have successfully logged in:
->

You are now logged in to the ILOM.

-> show /SP/faultmgmt/shell

/SP/faultmgmt/shell
Targets:

 

The above indicates the Fault Management Shell is supported. Proceed to section A.1 Using the Fault Management Shell.

-> show /SP/faultmgmt/shell
show: No such target /SP/faultmgmt/shell

 

The above indicates the Fault Management Shell is not supported on your product. Proceed to section A.2 Using the ILOM Command Line Interface.
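
When this check is scripted, the two outcomes above can be told apart from the captured command output. The following is a minimal Python sketch, assuming the text returned by 'show /SP/faultmgmt/shell' has already been captured as a string. The function name fault_shell_supported is hypothetical, and the matching relies only on the two sample outputs shown above.

```python
def fault_shell_supported(show_output: str) -> bool:
    """Guess from captured 'show /SP/faultmgmt/shell' output whether the
    Fault Management Shell exists on this ILOM.

    Assumption: an unsupported ILOM answers with a 'No such target' error,
    as in the sample above. Output that echoes the target path without
    that error is treated as supported.
    """
    text = show_output.strip()
    if "No such target" in text:
        return False
    return "/SP/faultmgmt/shell" in text


supported = "/SP/faultmgmt/shell\nTargets:\n"
unsupported = "show: No such target /SP/faultmgmt/shell"

print("A.1" if fault_shell_supported(supported) else "A.2")
print("A.1" if fault_shell_supported(unsupported) else "A.2")
```

Empty output is treated as unsupported, so a failed capture falls back to the standard command line procedure in section A.2.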

Section A.1 Using the Fault Management Shell

The following procedure assumes you are logged into the ILOM command line interface as root per the instructions above.

* Enter the fault management shell to obtain pertinent information about the fault.

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell(y/n)? y

faultmgmtsp>

 

* Use the 'fmadm faulty' command to identify the faulty component/FRU.

Example 1

The Example output shown below identifies the suspect FRU as "/SYS/FANBD/FM0", which represents the full physical path to the FRU. The hierarchical path "/SYS" represents the chassis, "FANBD" represents the fan board, and "FM0" represents the Fan Module.

For the example below the fan does not contain a FRUID so the part number and serial number are displayed as 'unknown'. When this information is available, these fields will contain valid information. See Example 2 below.

faultmgmtsp> fmadm faulty
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2010-08-17/20:19:09 c1060771-8f6e-eb1f-a65c-bb47d261a1d4 SPT-8000-3R    Major

Fault class : fault.chassis.device.fan.fail

FRU : /SYS/FANBD/FM0
(Part Number: unknown)
(Serial Number: unknown)

Description : Fan tachometer speed is below its normal operating range.
..

Action : The administrator should review the ILOM event log for
additional information pertaining to this diagnosis. Please
refer to the Details section of the Knowledge Article for
additional information.

 

Example 2
The Example 2 output shown below identifies the suspect FRU as "/SYS/MB". The hierarchical path "/SYS" represents the chassis, and "MB" represents the motherboard.

faultmgmtsp> fmadm faulty
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2010-08-30/14:44:36 2a4e3a37-b243-e071-8b26-f65cb5d015f1 SPT-8000-DH    Critical

Fault class : fault.chassis.voltage.fail

FRU : /SYS/MB
(Part Number: 541-3857-07)
(Serial Number: xxxxxxx-xxxxxxxxxx)

..
Action : The administrator should review the ILOM event log for
additional information pertaining to this diagnosis. Please
refer to the Details section of the Knowledge Article for
additional information.
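
When preparing a service request, the fields shown in the examples above (UUID, message ID, severity, fault class, FRU, part number, and serial number) can be pulled out of the captured text. The following is a minimal Python sketch, assuming the 'fmadm faulty' output has been saved as a string and follows the layout of the two examples. The function name parse_fmadm_faulty and the dictionary key names are hypothetical, and output from other ILOM releases may differ.

```python
import re
from typing import Dict, List

# Summary row: timestamp, UUID, message ID, severity.
ROW = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+(?P<severity>\S+)$"
)
# Labelled lines such as 'FRU : /SYS/MB' or '(Part Number: 541-3857-07)'.
FIELD = re.compile(
    r"^\(?(?P<key>Fault class|FRU|Part Number|Serial Number)\s*:\s*"
    r"(?P<value>[^)]*?)\)?$"
)


def parse_fmadm_faulty(output: str) -> List[Dict[str, str]]:
    """Return one dict per fault found in captured 'fmadm faulty' text."""
    faults: List[Dict[str, str]] = []
    for raw in output.splitlines():
        line = raw.strip()
        row = ROW.match(line)
        if row:
            faults.append(row.groupdict())  # a summary row starts a new fault
            continue
        field = FIELD.match(line)
        if field and faults:
            # Attach labelled fields to the most recent fault record.
            key = field.group("key").lower().replace(" ", "_")
            faults[-1][key] = field.group("value").strip()
    return faults


SAMPLE = """\
2010-08-30/14:44:36 2a4e3a37-b243-e071-8b26-f65cb5d015f1 SPT-8000-DH Critical

Fault class : fault.chassis.voltage.fail

FRU : /SYS/MB
(Part Number: 541-3857-07)
(Serial Number: xxxxxxx-xxxxxxxxxx)
"""

for fault in parse_fmadm_faulty(SAMPLE):
    print(fault["msgid"], fault["severity"], fault["fru"], fault["part_number"])
```

Header and separator lines do not match either pattern and are skipped, so the whole captured screen can be passed in unchanged.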

 

Section A.2 Using the Standard ILOM Command Line Interface


The following procedure assumes you are logged into the ILOM command line interface as root per the instructions above.

Use the commands described below to identify the faulty component/FRU.

The sample output shown below in Steps 1-3 identifies the suspect FRU as "/SYS/MB/P0", which represents the full physical path to the FRU: "SYS" represents the chassis, "MB" represents the motherboard, and "P0" represents the processor.

Refer to either the service label on top cover or silk screen labeling on the motherboard to locate processor "P0".

Step 1. List all known faults in the system.

Example:

-> show /SP/faultmgmt

/SP/faultmgmt
Targets:
0 (/SYS/MB/P0)
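
The numbered targets in this listing can be collected programmatically. The following is a minimal Python sketch, assuming the output of 'show /SP/faultmgmt' has been captured as a string and that each fault appears as an index followed by the FRU path in parentheses, as in the example. The function name list_faulted_targets is hypothetical.

```python
import re
from typing import Dict

# One fault target per line: an index, then the FRU path in parentheses.
TARGET = re.compile(r"^\s*(\d+)\s+\((/\S+)\)\s*$")


def list_faulted_targets(output: str) -> Dict[int, str]:
    """Map each fault index to the FRU path shown next to it."""
    targets: Dict[int, str] = {}
    for line in output.splitlines():
        match = TARGET.match(line)
        if match:
            targets[int(match.group(1))] = match.group(2)
    return targets


SAMPLE = """\
/SP/faultmgmt
Targets:
0 (/SYS/MB/P0)
"""

print(list_faulted_targets(SAMPLE))
```

An empty result means no fault targets were listed in the captured output.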

 

Step 2. List the state of a faulted processor


Example:

-> show /SYS/MB/P0

/SYS/MB/P0
Targets:
D0
D1
D2
D3
D4
D5
D6
D7
D8
PRSNT
SERVICE

Properties:
type = Host Processor
fru_name = Genuine Intel(R) CPU 000 @ 2.67GHz
fru_manufacturer = Intel
fru_version = 04
fru_part_number = 060A
fault_state = Faulted
clear_fault_action = (none)
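
The property of interest in this listing is fault_state. The following is a minimal Python sketch that reads the Properties block from captured 'show' output and reports whether the target is faulted. It assumes the "name = value" layout shown in the example. The function names parse_properties and is_faulted are hypothetical.

```python
from typing import Dict


def parse_properties(output: str) -> Dict[str, str]:
    """Collect 'name = value' pairs listed under the 'Properties:' heading."""
    props: Dict[str, str] = {}
    in_props = False
    for raw in output.splitlines():
        line = raw.strip()
        if line == "Properties:":
            in_props = True
        elif in_props and " = " in line:
            name, value = line.split(" = ", 1)
            props[name.strip()] = value.strip()
        elif in_props and line.endswith(":"):
            in_props = False  # another heading ends the Properties block
    return props


def is_faulted(props: Dict[str, str]) -> bool:
    """True when fault_state reads 'Faulted', as in the example above."""
    return props.get("fault_state", "").lower() == "faulted"


SAMPLE = """\
/SYS/MB/P0
Targets:
PRSNT
SERVICE

Properties:
type = Host Processor
fru_manufacturer = Intel
fault_state = Faulted
clear_fault_action = (none)
"""

props = parse_properties(SAMPLE)
print(props["type"], is_faulted(props))
```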

 

Step 3. List the contents of the ILOM event log

Example:

-> show /SP/logs/event/list

6313 Sun Dec 28 09:54:57 2008 Fault Fault critical
Fault detected at time = Sun Dec 28 09:54:57 2008.
The suspect component: /SYS/MB/P0 has fault.cpu.intel.l1itlb with probability=100.
Refer to http://www.sun.com/msg/SPX86-8000-TX for details.
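
The suspect component, fault class, probability, and message ID can be extracted from an event log entry of this form. The following is a minimal Python sketch, assuming the entry text has been captured as a string and is worded exactly as in the example. The function name parse_fault_event and the key names are hypothetical.

```python
import re
from typing import Dict, Optional, Union

SUSPECT = re.compile(
    r"The suspect component:\s*(?P<component>\S+)\s+has\s+"
    r"(?P<fault_class>\S+)\s+with probability=(?P<probability>\d+)"
)
# The message ID is the last path element of the 'Refer to' URL.
MSG_URL = re.compile(r"/msg/(?P<msgid>[A-Z0-9]+(?:-[A-Z0-9]+)+)")


def parse_fault_event(entry: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the fault details from one event log entry, or None if the
    entry does not name a suspect component."""
    suspect = SUSPECT.search(entry)
    if suspect is None:
        return None
    info: Dict[str, Union[str, int]] = dict(suspect.groupdict())
    info["probability"] = int(suspect.group("probability"))
    url = MSG_URL.search(entry)
    if url:
        info["msgid"] = url.group("msgid")
    return info


SAMPLE = """\
6313 Sun Dec 28 09:54:57 2008 Fault Fault critical
Fault detected at time = Sun Dec 28 09:54:57 2008.
The suspect component: /SYS/MB/P0 has fault.cpu.intel.l1itlb with probability=100.
Refer to http://www.sun.com/msg/SPX86-8000-TX for details.
"""

print(parse_fault_event(SAMPLE))
```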

 


Section B - Post-Repair Procedures


This section describes specific procedures that may be required to modify the status of faults that have been repaired and return impacted resources to normal operation.

On some products the ILOM fault management function can determine whether the associated FRUs have been replaced and automatically clear the associated fault status. In some cases it cannot, and the fault will have to be cleared manually.

To determine whether the fault is still present, run the same commands applied in section A.1 or A.2 (Fault Management Shell or ILOM Command Line Interface) as appropriate. If the fault is no longer present, no further action is required. If it is still present, follow the procedures described in Section B.1 or B.2 to manually clear the fault.

In some cases evidence of this same fault may also be stored by the Solaris fault manager. If Solaris was in fact the operating system running, then follow the procedures in Section C of the following document to determine if additional post-repair action is required:
Reference doc:
PSH Procedural Article for Solaris FMA-Based Diagnosis (Doc ID 1173733.1)

Section B.1 Using Fault Management Shell to Clear the Fault

* Enter the fault management shell.


-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n) ? y

faultmgmtsp>

 

* Use 'fmadm' to clear the fault.

Usage: fmadm <subcommand>
    acquit <FRU>                : acquit faults on a FRU
    acquit <UUID>               : acquit faults associated with UUID
    acquit <FRU> <UUID>         : acquit faults specified by
                                  (FRU, UUID) combination
    replaced <FRU>              : replaced faults on a FRU
    repaired <FRU>              : repaired faults on a FRU
    repair <FRU>                : repair faults on a FRU

Note:
   - replaced is used when a CRU / FRU has been replaced
   - repaired is used when a CRU / FRU was repaired without being replaced,
     e.g. by re-seating it, tightening a connector, straightening a connector pin, or re-flashing firmware
   - acquit is used when the CRU / FRU has been determined not to be the source of the fault

faultmgmtsp> fmadm replaced <FRU>
faultmgmtsp> fmadm repair <FRU>
faultmgmtsp> fmadm acquit <FRU>
faultmgmtsp> fmadm repaired <FRU>

Example:
faultmgmtsp> fmadm replaced /SYS/MB

faultmgmtsp> fmadm acquit <UUID>
Example:
faultmgmtsp> fmadm acquit 9df39f93-f356-6d26-e081-e4f3a9872c2f
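
The choice among 'replaced', 'repaired', and 'acquit' follows from what was done to the component. The following is a minimal Python sketch that builds the matching command line for the faultmgmtsp> prompt. The RepairOutcome names and the fmadm_clear_command function are hypothetical. The sketch assumes FRU paths begin with "/" as in the examples, and it leaves out the 'repair' subcommand because the note above does not describe when to use it.

```python
from enum import Enum


class RepairOutcome(Enum):
    """What was done to the component; values are fmadm subcommands."""
    REPLACED = "replaced"      # the CRU / FRU was swapped for a new part
    REPAIRED = "repaired"      # fixed in place, e.g. re-seated or re-flashed
    NOT_AT_FAULT = "acquit"    # the component was found not to be the cause


def fmadm_clear_command(outcome: RepairOutcome, target: str) -> str:
    """Build the command to type at the faultmgmtsp> prompt.

    'target' is a FRU path such as /SYS/MB. For acquit it may instead be
    the fault UUID, matching the usage text above.
    """
    target = target.strip()
    if not target:
        raise ValueError("a FRU path or UUID is required")
    if outcome is not RepairOutcome.NOT_AT_FAULT and not target.startswith("/"):
        raise ValueError("replaced and repaired take a FRU path")
    return f"fmadm {outcome.value} {target}"


print(fmadm_clear_command(RepairOutcome.REPLACED, "/SYS/MB"))
print(fmadm_clear_command(RepairOutcome.NOT_AT_FAULT,
                          "9df39f93-f356-6d26-e081-e4f3a9872c2f"))
```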

 
Section B.2 Using the ILOM Command Line Interface to Clear the Fault


Log in to the ILOM command line interface as 'root' and use the following commands to clear the fault.

Example:

-> set /SYS/MB/P0 clear_fault_action=true
Are you sure you want to clear /SYS/MB/P0 (y/n)? y
Set 'clear_fault_action' to 'true'
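
When this step is scripted, the command can be built from the faulted target path and the result checked against the confirmation line shown above. The following is a minimal Python sketch, assuming the target is a /SYS path taken from the fault listing and that the confirmation text matches the example exactly. Both function names are hypothetical.

```python
def clear_fault_command(target: str) -> str:
    """Build the ILOM CLI command that clears the fault on one target."""
    target = target.strip()
    if not target.startswith("/SYS"):
        # Assumption: faulted components live under /SYS, as in the examples.
        raise ValueError("expected a component path under /SYS")
    return f"set {target} clear_fault_action=true"


def clear_was_confirmed(output: str) -> bool:
    """True if the captured output contains the confirmation line."""
    return "Set 'clear_fault_action' to 'true'" in output


print(clear_fault_command("/SYS/MB/P0"))
print(clear_was_confirmed("Set 'clear_fault_action' to 'true'"))
```

After the confirmation appears, the commands from section A.2 can be run again to verify that the fault is no longer listed.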

 
How to clear FMA logs in Solaris

Please refer to PSH Procedural Article for Solaris FMA-Based Diagnosis (Doc ID 1173733.1) for full details on displaying fault event information, post-repair procedures, using fmadm acquit, using fmadm repaired, and returning affected resources to operation.


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.