Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
1643464.1 : [SPARC T3/T4/T5 and T7] OBP reports "One or more resources have been retired, please run 'show faulty' on the SP" on console
Applies to:
SPARC T5-2 - Version All Versions to All Versions [Release All Releases]
SPARC T5-4 - Version All Versions to All Versions [Release All Releases]
SPARC T5-8 - Version All Versions to All Versions [Release All Releases]
SPARC T4-1 - Version All Versions to All Versions [Release All Releases]
SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms
When a SPARC T3, T4 or T5 system is powered on (start /SYS), the following WARNING message is logged on the console of the system [LDOM service or guest domain]. It indicates that one or more system components have been disabled or degraded.

Console Message
----------------
WARNING: One or more resources have been retired, please run 'show faulty' on the SP.

Example 1 [ console message ]
SPARC T5-8, No Keyboard
Copyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.35.5.a, 1.9987 TB memory available, Serial #103641986.
Ethernet address 0:10:e0:2d:73:82, Host ID: 862d7382.
WARNING: One or more resources have been retired, please run 'show faulty' on the SP.
Boot device: disk File and args:
SunOS Release 5.11 Version 11.1 64-bit
Copyright (c) 1983, 2012, Oracle and/or its affiliates. All rights reserved.
/

Cause
System components may be "degraded" by the ILOM (FDD) or Solaris (FMA) fault engine. A system component may also be "disabled" by a user. Once a component has been degraded or disabled, it is no longer visible in OBP and Solaris.

To search for disabled or degraded components from ILOM, the following ILOM CLI command may be used:
show -l all /SYS current_config_state==(disabled,degraded)

Multiple components can be manually disabled from the ILOM CLI. The ILOM CLI command "show components" lists all the components that can be disabled or degraded on the platform.

The following example indicates that the PCI slot component was manually disabled by the operator.
Example 2 [ list user disabled component ]
-> show -l all /SYS current_config_state==(disabled,degraded)
/SYS/RCSA/PCIE9
Targets:
CAR
Properties:
type = Slot
requested_config_state = Disabled
current_config_state = Disabled
disable_reason = By user
Commands:
cd
show
->
The following example indicates that the PCI slot component was degraded by the system Fault Engine.
Example 3 [ list degraded component ]
-> show -l all /SYS current_config_state==(disabled,degraded)
/SYS/RCSA/PCIE9
Targets:
CAR
Properties:
type = Slot
requested_config_state = Enabled
current_config_state = Disabled
disable_reason = Diagnosed faulty
Commands:
cd
show
->
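The two listings above differ only in requested_config_state and disable_reason, so captured output can be sorted mechanically. The following is a minimal sketch, assuming the output has been saved as plain text in the layout shown above. The function names parse_show_output and classify are hypothetical helpers, not part of ILOM, and only the disable_reason values that appear in this document are recognized.

```python
def parse_show_output(text):
    """Parse captured 'show -l all /SYS ...' output into {path: {property: value}}.

    Assumes the plain-text layout shown in the examples above.
    """
    components = {}
    current = None
    in_properties = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("/SYS"):
            # A component path starts a new record.
            current = components.setdefault(line, {})
            in_properties = False
        elif line == "Properties:":
            in_properties = True
        elif line.endswith(":"):
            # Other section headers such as "Targets:" or "Commands:".
            in_properties = False
        elif in_properties and current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return components


def classify(properties):
    """Label a component using the disable_reason values seen in this document."""
    reason = properties.get("disable_reason", "").lower()
    if reason == "by user":
        return "user-disabled"
    if reason == "diagnosed faulty":
        return "fault-engine"
    return "other"


SAMPLE = """\
-> show -l all /SYS current_config_state==(disabled,degraded)
/SYS/RCSA/PCIE9
Targets:
CAR
Properties:
type = Slot
requested_config_state = Disabled
current_config_state = Disabled
disable_reason = By user
Commands:
cd
show
->
"""

if __name__ == "__main__":
    for path, props in parse_show_output(SAMPLE).items():
        print(path, classify(props))
```

A component labelled "user-disabled" is a candidate for the manual re-enable procedure, while "fault-engine" points to the fault management procedure.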
Solution
For components that have been manually disabled: manually re-enabling a component from the ILOM CLI requires a system restart.
STEP 1. set <component label> requested_config_state=enabled
STEP 2. stop /SYS
STEP 3. start /SYS
Example 3 [ re-enabling a component ]
-> show -l all /SYS current_config_state==(disabled,degraded)
/SYS/RCSA/PCIE9
Targets:
CAR
Properties:
type = Slot
requested_config_state = Disabled
current_config_state = Disabled
disable_reason = By user
Commands:
cd
show
->
-> set /SYS/RCSA/PCIE9 requested_config_state=enabled
Set 'requested_config_state' to 'enabled'
-> show -d properties /SYS/RCSA/PCIE9
/SYS/RCSA/PCIE9
Properties:
type = Slot
requested_config_state = Enabled
current_config_state = Disabled
disable_reason = Configuration Rules
->
-> stop -f /SYS
Are you sure you want to immediately stop /SYS (y/n)? y
Stopping /SYS immediately
-> show -d properties /SYS/RCSA/PCIE9
/SYS/RCSA/PCIE9
Properties:
type = Slot
requested_config_state = Enabled
current_config_state = Enabled
disable_reason = None
-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS
->
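When several user-disabled components need to be re-enabled in one restart, the command sequence from STEP 1 to STEP 3 can be generated rather than typed. The following is a minimal sketch: reenable_commands is a hypothetical helper that only builds the ILOM CLI lines as strings for review and does not connect to the SP.

```python
def reenable_commands(component_paths):
    """Build the ILOM CLI lines for STEP 1 to STEP 3, in order.

    One 'set' line is emitted per component, followed by a single
    stop/start of /SYS. Returns an empty list if there is nothing to do.
    """
    paths = list(component_paths)
    if not paths:
        return []
    commands = [f"set {path} requested_config_state=enabled" for path in paths]
    commands.append("stop /SYS")
    commands.append("start /SYS")
    return commands


if __name__ == "__main__":
    for line in reenable_commands(["/SYS/RCSA/PCIE9"]):
        print(line)
```

As in the example above, it is worth running "show -d properties" on each component after the stop and confirming that current_config_state reads Enabled before issuing the start.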
For components that have been degraded by the system fault engine, the suspected faulty component can be determined by running "show faulty" or by starting the ILOM fault management shell [/SP/faultmgmt/shell]. Once the suspected faulty components have been replaced, or have been verified not to be faulty, the following procedure can be carried out.
STEP 1. Shut down the platform
stop /SYS
STEP 2. Enter the system fault management shell
start /SP/faultmgmt/shell
STEP 3. List the faulty components reported by the system
fmadm faulty
STEP 4. Acquit or repair the faulty events using the UUID
fmadm acquit <uuid>
STEP 5. Verify that there are no degraded components in the ILOM fault management shell
fmadm faulty -r
STEP 6. Exit the fault management shell
exit
STEP 7. Re-verify that there are no degraded components in ILOM
show faulty
STEP 8. Start the platform
start /SYS
Example 4 [ The following example was carried out after replacing /SYS/MB/PCIE6 ]
SPARC T5-2, No Keyboard
Copyright (c) 1998, 2013, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.35.4, 255.0000 GB memory available, Serial #104142XXX.
Ethernet address 0:10:e0:35:XX:XX, Host ID: 8635XXXX.
WARNING: One or more resources have been retired, please run 'show faulty' on the SP.
Boot device: disk File and args:
-> stop /System
Are you sure you want to stop /System (y/n)? y
Stopping /System
-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
faultmgmtsp> fmadm faulty
------------------- ------------------------------------ -------------- --------
Time UUID msgid Severity
------------------- ------------------------------------ -------------- --------
2014-02-28/00:29:15 56d8bb58-0b42-426b-dcb8-f318462c438c PCIEX-8000-0A Critical
Problem Status : solved
Diag Engine : [unknown]
System
Manufacturer : Oracle Corporation
Name : SPARC T5-2
Part_Number : 31845050+1+1
Serial_Number : AK00107XXX
----------------------------------------
Suspect 1 of 1
Fault class : fault.io.pciex.device-interr
Certainty : 100%
Affects : /SYS/MB/PCIE8
Status : faulted
FRU
Status : not present
Location : /SYS/MB/PCIE8
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T5-2
Part_Number : 31845050+1+1
Serial_Number : AK00107XXX
Description : A fault has been diagnosed by the Host Operating System.
Response : The service required LED on the chassis and on the affected
FRU may be illuminated.
Impact : No SP impact.
Action : Refer to the associated reference document at
http://support.oracle.com/msg/PCIEX-8000-0A for the latest
service procedures and policies regarding this diagnosis.
------------------- ------------------------------------ -------------- --------
Time UUID msgid Severity
------------------- ------------------------------------ -------------- --------
2014-02-28/00:29:22 90374df4-2819-6e8d-cac7-982b2a90e8ed PCIEX-8000-0A Critical
Problem Status : solved
Diag Engine : [unknown]
System
Manufacturer : Oracle Corporation
Name : SPARC T5-2
Part_Number : 31845050+1+1
Serial_Number : AK00107XXX
----------------------------------------
Suspect 1 of 1
Fault class : fault.io.pciex.device-interr
Certainty : 100%
Affects : /SYS/MB/PCIE6
Status : faulted
FRU
Status : not present
Location : /SYS/MB/PCIE6
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T5-2
Part_Number : 31845050+1+1
Serial_Number : AK00107XXX
Description : A fault has been diagnosed by the Host Operating System.
Response : The service required LED on the chassis and on the affected
FRU may be illuminated.
Impact : No SP impact.
Action : Refer to the associated reference document at
http://support.oracle.com/msg/PCIEX-8000-0A for the latest
service procedures and policies regarding this diagnosis.
faultmgmtsp>
faultmgmtsp> fmadm repair /SYS/MB
faultmgmtsp> fmadm acquit /SYS/MB
faultmgmtsp> fmadm acquit 90374df4-2819-6e8d-cac7-982b2a90e8ed
faultmgmtsp> fmadm acquit 56d8bb58-0b42-426b-dcb8-f318462c438c
faultmgmtsp> fmadm faulty -r
No faults found
faultmgmtsp> fmadm rotate errlog
faultmgmtsp> fmadm rotate fltlog
faultmgmtsp> exit
>
-> show faulty
Target | Property | Value
---------------------------------------------+-----------------------------------------------------+----------------------------------------------------------------------------
-> start /SYS
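In the example above, each fault event is acquitted by its UUID. The following is a minimal sketch of pulling the UUIDs and affected components out of saved "fmadm faulty" output and printing the matching "fmadm acquit" lines. It assumes the column layout shown in the listing above. parse_fmadm_faulty and acquit_commands are hypothetical helper names, and the script only produces text for review.

```python
import re

# Event header line: time, UUID, msgid, severity (layout taken from the listing above).
EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\s+"
    r"(\S+)\s+(\S+)$"
)
AFFECTS_RE = re.compile(r"^Affects\s*:\s*(\S+)")


def parse_fmadm_faulty(text):
    """Return one dict per fault event found in the captured output."""
    events = []
    for raw in text.splitlines():
        line = raw.strip()
        header = EVENT_RE.match(line)
        if header:
            events.append({
                "time": header.group(1),
                "uuid": header.group(2),
                "msgid": header.group(3),
                "severity": header.group(4),
                "affects": [],
            })
            continue
        affects = AFFECTS_RE.match(line)
        if affects and events:
            # Attach the affected component to the most recent event.
            events[-1]["affects"].append(affects.group(1))
    return events


def acquit_commands(events):
    """Build one 'fmadm acquit <uuid>' line per event."""
    return [f"fmadm acquit {event['uuid']}" for event in events]


SAMPLE = """\
2014-02-28/00:29:15 56d8bb58-0b42-426b-dcb8-f318462c438c PCIEX-8000-0A Critical
Fault class : fault.io.pciex.device-interr
Affects : /SYS/MB/PCIE8
2014-02-28/00:29:22 90374df4-2819-6e8d-cac7-982b2a90e8ed PCIEX-8000-0A Critical
Fault class : fault.io.pciex.device-interr
Affects : /SYS/MB/PCIE6
"""

if __name__ == "__main__":
    for command in acquit_commands(parse_fmadm_faulty(SAMPLE)):
        print(command)
```

Acquitting is only appropriate once the suspect component has been replaced or verified not to be faulty, as stated in the procedure above.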
There are situations where a single DIMM fault disables other DIMMs because of minimum DIMM configuration requirements. In the following example, the console reports messages indicating that the bank could not be configured due to configuration rules.
2014-05-04 17:13:56 2:0:0> NOTICE: SPARC-T5 Revision 1.2 Speed 3600MHz
2014-05-04 17:15:05 0:0:0> NOTICE: Initializing Memory
2014-05-04 17:16:30 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB4/CH1/D0: DIMM is not populated in order on the BOB. Not configured
2014-05-04 17:16:31 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB0/CH0/D0: DIMM population chip symmetry rule violation. Not configured
2014-05-04 17:16:32 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB0/CH1/D0: DIMM population chip symmetry rule violation. Not configured
2014-05-04 17:16:32 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB2/CH0/D0: DIMM population chip symmetry rule violation. Not configured
2014-05-04 17:16:33 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB2/CH1/D0: DIMM population chip symmetry rule violation. Not configured
2014-05-04 17:16:34 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB6/CH0/D0: DIMM population chip symmetry rule violation. Not configured
2014-05-04 17:16:35 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB6/CH1/D0: DIMM population chip symmetry rule violation. Not configured
2014-05-04 17:17:13 0:0:0> NOTICE: Initializing MCU 0 Memory Link 0
2014-05-04 17:17:30 0:0:0> NOTICE: Initializing MCU 0 Memory Link 1
If the "fmadm repair" or "fmadm acquit" command does not re-enable the other DIMMs disabled due to the "Symmetry Rule", manually clear each DIMM with the following command:
set <COMPONENT PATH> clear_fault_action=true
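Because the clear command has to be repeated for every affected DIMM, the component paths can be collected from the saved console output. The following is a minimal sketch, assuming the "ERROR: <path>: <reason>. Not configured" message format shown in the console log above. unconfigured_dimms and clear_commands are hypothetical helper names, and only DIMMs reported with a symmetry rule violation are selected.

```python
import re

# Matches "ERROR: <path>: <reason>. Not configured" as seen in the console log.
NOT_CONFIGURED_RE = re.compile(
    r"ERROR:\s+(/SYS/\S+?):\s+(.*?)\.\s+Not configured"
)


def unconfigured_dimms(console_text):
    """Return (path, reason) pairs for DIMMs reported as 'Not configured'."""
    found = []
    for path, reason in NOT_CONFIGURED_RE.findall(console_text):
        if (path, reason) not in found:
            found.append((path, reason))
    return found


def clear_commands(dimms):
    """Build one ILOM 'set ... clear_fault_action=true' line per symmetry-rule DIMM."""
    return [
        f"set {path} clear_fault_action=true"
        for path, reason in dimms
        if "symmetry rule" in reason.lower()
    ]


SAMPLE = """\
2014-05-04 17:16:30 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB4/CH1/D0: DIMM is not populated in order on the BOB. Not configured
2014-05-04 17:16:31 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB0/CH0/D0: DIMM population chip symmetry rule violation. Not configured
"""

if __name__ == "__main__":
    for command in clear_commands(unconfigured_dimms(SAMPLE)):
        print(command)
```

The DIMM reported as not populated in order is deliberately left out of the generated list, since the text above describes the manual clear only for DIMMs disabled by the symmetry rule.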
This behavior may be verified from the following source file:
/nyx-1.3.x/src/hostconfig/common/src/gmd_config.c

References
<NOTE:1614738.1> - [SPARC T4/T5/M5 and M6] FMA I/O retirement : PCI devices can be seen from OBP but disappear when System Boots up into Solaris