Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1643464.1 : [SPARC T3/T4/T5 and T7] OBP reports "One or more resources have been retired, please run 'show faulty' on the SP" on console
Applies to:

SPARC T5-2 - Version All Versions to All Versions [Release All Releases]
SPARC T5-4 - Version All Versions to All Versions [Release All Releases]
SPARC T5-8 - Version All Versions to All Versions [Release All Releases]
SPARC T4-1 - Version All Versions to All Versions [Release All Releases]
SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

When a SPARC T3, T4 or T5 system is powered on ( start /SYS ), the following WARNING message is logged on the console of the system [LDOM service or guest domain]. It indicates that one or more system components have been disabled or degraded.

Console Message
----------------
WARNING: One or more resources have been retired, please run 'show faulty' on the SP.

Example 1 [ console message ]

    SPARC T5-8, No Keyboard
    Copyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.
    OpenBoot 4.35.5.a, 1.9987 TB memory available, Serial #103641986.
    Ethernet address 0:10:e0:2d:73:82, Host ID: 862d7382.

    WARNING: One or more resources have been retired, please run 'show faulty' on the SP.

    Boot device: disk File and args:
    SunOS Release 5.11 Version 11.1 64-bit
    Copyright (c) 1983, 2012, Oracle and/or its affiliates. All rights reserved.
    /

Cause

System components may be "degraded" by the ILOM (FDD) or Solaris (FMA) fault engine. A system component may also be "disabled" by a user. Once a component has been degraded or disabled, it is no longer visible in OBP and Solaris.

To search for disabled or degraded components from ILOM, the following ILOM CLI command may be used:

    show -l all /SYS current_config_state==(disabled,degraded)

Multiple components can be manually disabled from the ILOM CLI. The ILOM CLI command "show components" lists all the components that can be disabled or degraded on the platform.

The following example indicates that the PCI slot component was manually disabled by the operator.
Example 2 [ list user disabled component ]

    -> show -l all /SYS current_config_state==(disabled,degraded)

     /SYS/RCSA/PCIE9
        Targets:
            CAR

        Properties:
            type = Slot
            requested_config_state = Disabled
            current_config_state = Disabled
            disable_reason = By user

        Commands:
            cd
            show

    ->

The following example indicates that the PCI slot component was degraded by the system Fault Engine.

Example 3 [ list degraded component ]

    -> show -l all /SYS current_config_state==(disabled,degraded)

     /SYS/RCSA/PCIE9
        Targets:
            CAR

        Properties:
            type = Slot
            requested_config_state = Enabled
            current_config_state = Disabled
            disable_reason = Diagnosed faulty

        Commands:
            cd
            show

    ->

Solution

For components that have been manually disabled:

Manually re-enabling a component from the ILOM CLI requires a system restart.

STEP 1. set <component label> requested_config_state=enabled
STEP 2. stop /SYS
STEP 3. start /SYS

Example 3 [ re-enabling a component ]

    -> show -l all /SYS current_config_state==(disabled,degraded)

     /SYS/RCSA/PCIE9
        Targets:
            CAR

        Properties:
            type = Slot
            requested_config_state = Disabled
            current_config_state = Disabled
            disable_reason = By user

        Commands:
            cd
            show

    ->
    -> set /SYS/RCSA/PCIE9 requested_config_state=enabled
    Set 'requested_config_state' to 'enabled'

    -> show -d properties /SYS/RCSA/PCIE9

     /SYS/RCSA/PCIE9
        Properties:
            type = Slot
            requested_config_state = Enabled
            current_config_state = Disabled
            disable_reason = Configuration Rules

    ->
    -> stop -f /SYS
    Are you sure you want to immediately stop /SYS (y/n)? y
    Stopping /SYS immediately

    -> show -d properties /SYS/RCSA/PCIE9

     /SYS/RCSA/PCIE9
        Properties:
            type = Slot
            requested_config_state = Enabled
            current_config_state = Enabled
            disable_reason = None

    -> start /SYS
    Are you sure you want to start /SYS (y/n)? y
    Starting /SYS

    ->

For components that have been degraded by the system fault engine:

The suspected faulty component can be determined by running "show faulty" or by starting the ILOM fault management shell [/SP/faultmgmt/shell].
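The disable_reason property in the listings above is what separates the two repair paths. As an illustration only, the following Python sketch parses captured output of the "show -l all /SYS current_config_state==(disabled,degraded)" command and sorts each component by that property. The function names parse_components and classify are hypothetical, the script works on saved text and does not talk to ILOM, and it assumes the output layout matches the examples in this document. Only the two reason strings seen in those examples are recognised; anything else is flagged for manual review.

```python
def parse_components(text):
    """Parse captured ILOM 'show' output into {target_path: {property: value}}.

    Assumes each component starts with a line beginning with /SYS and that
    properties appear as 'name = value' lines, as in the examples above.
    """
    components = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("/SYS"):
            current = components.setdefault(line, {})
        elif current is not None and " = " in line:
            key, value = line.split(" = ", 1)
            current[key.strip()] = value.strip()
    return components


def classify(props):
    """Map disable_reason to the procedure that applies (hypothetical labels)."""
    reason = props.get("disable_reason", "")
    if reason == "By user":
        return "re-enable"       # set requested_config_state=enabled, restart
    if reason == "Diagnosed faulty":
        return "fault-repair"    # replace or verify the part, then clear the fault
    return "review"              # any other reason: inspect manually


# Captured output copied from Example 2 in this document.
SAMPLE = """\
-> show -l all /SYS current_config_state==(disabled,degraded)

 /SYS/RCSA/PCIE9
    Targets:
        CAR

    Properties:
        type = Slot
        requested_config_state = Disabled
        current_config_state = Disabled
        disable_reason = By user

    Commands:
        cd
        show
"""

if __name__ == "__main__":
    for path, props in parse_components(SAMPLE).items():
        print(path, "->", classify(props))
```

The sketch only reports which procedure looks applicable; the decision to re-enable or repair a component remains with the operator.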
Once the suspected faulty components have been replaced, or have been verified not to be faulty, the following procedure can be carried out.

STEP 1. Shut down the platform
    stop /SYS
STEP 2. Enter the system fault management shell
    start /SP/faultmgmt/shell
STEP 3. List the faulty components reported by the system
    fmadm faulty
STEP 4. Acquit or repair the fault events using the UUID
    fmadm acquit <uuid>
STEP 5. Verify that there are no degraded components in the ILOM fault management shell
    fmadm faulty -r
STEP 6. Exit the fault management shell
    exit
STEP 7. Re-verify that there are no degraded components in ILOM
    show faulty
STEP 8. Start the platform
    start /SYS

Example 4 [ The following example was carried out after replacing /SYS/MB/PCIE6 ]

    SPARC T5-2, No Keyboard
    Copyright (c) 1998, 2013, Oracle and/or its affiliates. All rights reserved.
    OpenBoot 4.35.4, 255.0000 GB memory available, Serial #104142XXX.
    Ethernet address 0:10:e0:35:XX:XX, Host ID: 8635XXXX.

    WARNING: One or more resources have been retired, please run 'show faulty' on the SP.

    Boot device: disk File and args:

    -> stop /System
    Are you sure you want to stop /System (y/n)? y
    Stopping /System

    -> start /SP/faultmgmt/shell
    Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

    faultmgmtsp> fmadm faulty
    ------------------- ------------------------------------ -------------- --------
    Time                UUID                                 msgid          Severity
    ------------------- ------------------------------------ -------------- --------
    2014-02-28/00:29:15 56d8bb58-0b42-426b-dcb8-f318462c438c PCIEX-8000-0A  Critical

    Problem Status : solved
    Diag Engine    : [unknown]
    System
       Manufacturer  : Oracle Corporation
       Name          : SPARC T5-2
       Part_Number   : 31845050+1+1
       Serial_Number : AK00107XXX

    ----------------------------------------
    Suspect 1 of 1
       Fault class : fault.io.pciex.device-interr
       Certainty   : 100%
       Affects     : /SYS/MB/PCIE8
       Status      : faulted

       FRU
          Status   : not present
          Location : /SYS/MB/PCIE8
          Chassis
             Manufacturer  : Oracle Corporation
             Name          : SPARC T5-2
             Part_Number   : 31845050+1+1
             Serial_Number : AK00107XXX

    Description : A fault has been diagnosed by the Host Operating System.

    Response    : The service required LED on the chassis and on the affected
                  FRU may be illuminated.

    Impact      : No SP impact.

    Action      : Refer to the associated reference document at
                  http://support.oracle.com/msg/PCIEX-8000-0A for the latest
                  service procedures and policies regarding this diagnosis.

    ------------------- ------------------------------------ -------------- --------
    Time                UUID                                 msgid          Severity
    ------------------- ------------------------------------ -------------- --------
    2014-02-28/00:29:22 90374df4-2819-6e8d-cac7-982b2a90e8ed PCIEX-8000-0A  Critical

    Problem Status : solved
    Diag Engine    : [unknown]
    System
       Manufacturer  : Oracle Corporation
       Name          : SPARC T5-2
       Part_Number   : 31845050+1+1
       Serial_Number : AK00107XXX

    ----------------------------------------
    Suspect 1 of 1
       Fault class : fault.io.pciex.device-interr
       Certainty   : 100%
       Affects     : /SYS/MB/PCIE6
       Status      : faulted

       FRU
          Status   : not present
          Location : /SYS/MB/PCIE6
          Chassis
             Manufacturer  : Oracle Corporation
             Name          : SPARC T5-2
             Part_Number   : 31845050+1+1
             Serial_Number : AK00107XXX

    Description : A fault has been diagnosed by the Host Operating System.

    Response    : The service required LED on the chassis and on the affected
                  FRU may be illuminated.

    Impact      : No SP impact.

    Action      : Refer to the associated reference document at
                  http://support.oracle.com/msg/PCIEX-8000-0A for the latest
                  service procedures and policies regarding this diagnosis.

    faultmgmtsp>
    faultmgmtsp> fmadm repair /SYS/MB
    faultmgmtsp> fmadm acquit /SYS/MB
    faultmgmtsp> fmadm acquit 90374df4-2819-6e8d-cac7-982b2a90e8ed
    faultmgmtsp> fmadm acquit 56d8bb58-0b42-426b-dcb8-f318462c438c
    faultmgmtsp> fmadm faulty -r
    No faults found
    faultmgmtsp> fmadm rotate errlog
    faultmgmtsp> fmadm rotate fltlog
    faultmgmtsp> exit

    -> show faulty
    Target                | Property               | Value
    ----------------------+------------------------+---------------------------

    -> start /SYS
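When several faults are listed, collecting the UUIDs by hand for the "fmadm acquit <uuid>" step is error-prone. The following Python sketch shows one possible way to pull the summary rows out of saved "fmadm faulty" output and print the matching acquit commands for review. The names fault_rows and acquit_commands are hypothetical, the row layout (time, UUID, msgid, severity) is assumed to match Example 4, and the script only produces text. It does not connect to the service processor, and the commands should be issued only after the suspect parts have been replaced or verified.

```python
import re

# One summary row of 'fmadm faulty': time, UUID, msgid, severity.
ROW = re.compile(
    r"^(\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(\S+)\s+(\S+)\s*$"
)


def fault_rows(text):
    """Return (time, uuid, msgid, severity) tuples in the order they appear."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groups())
    return rows


def acquit_commands(text):
    """Build one 'fmadm acquit <uuid>' string per fault row (not executed)."""
    return ["fmadm acquit " + uuid for _, uuid, _, _ in fault_rows(text)]


# Summary rows copied from Example 4; the detail sections are omitted here.
SAMPLE = """\
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2014-02-28/00:29:15 56d8bb58-0b42-426b-dcb8-f318462c438c PCIEX-8000-0A  Critical
Problem Status : solved
2014-02-28/00:29:22 90374df4-2819-6e8d-cac7-982b2a90e8ed PCIEX-8000-0A  Critical
Problem Status : solved
"""

if __name__ == "__main__":
    for command in acquit_commands(SAMPLE):
        print(command)
```

Keeping the output as plain strings leaves the operator in control of what is actually typed into the faultmgmtsp> prompt.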
In some situations, a single DIMM fault disables other DIMMs because of minimum DIMM configuration requirements. In the following example, the console reports messages indicating that the bank could not be configured because of configuration rules.
    2014-05-04 17:13:56 2:0:0> NOTICE: SPARC-T5 Revision 1.2 Speed 3600MHz
    2014-05-04 17:15:05 0:0:0> NOTICE: Initializing Memory
    2014-05-04 17:16:30 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB4/CH1/D0: DIMM is not populated in order on the BOB. Not configured
    2014-05-04 17:16:31 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB0/CH0/D0: DIMM population chip symmetry rule violation. Not configured
    2014-05-04 17:16:32 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB0/CH1/D0: DIMM population chip symmetry rule violation. Not configured
    2014-05-04 17:16:32 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB2/CH0/D0: DIMM population chip symmetry rule violation. Not configured
    2014-05-04 17:16:33 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB2/CH1/D0: DIMM population chip symmetry rule violation. Not configured
    2014-05-04 17:16:34 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB6/CH0/D0: DIMM population chip symmetry rule violation. Not configured
    2014-05-04 17:16:35 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB6/CH1/D0: DIMM population chip symmetry rule violation. Not configured
    2014-05-04 17:17:13 0:0:0> NOTICE: Initializing MCU 0 Memory Link 0
    2014-05-04 17:17:30 0:0:0> NOTICE: Initializing MCU 0 Memory Link 1
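To see at a glance which DIMMs were left unconfigured and why, the console messages above can be grouped by reason. The Python sketch below is one way to do that on a saved console log. The function name unconfigured_dimms is hypothetical, and the pattern assumes the message layout shown in this example ("ERROR: <path>: <reason>. Not configured"); other firmware versions may word these messages differently.

```python
import re

# Matches 'ERROR: <component path>: <reason>. Not configured' as in the log above.
ERR = re.compile(r"ERROR:\s+(/SYS/\S+):\s+(.*?)\.\s+Not configured")


def unconfigured_dimms(log):
    """Group component paths of unconfigured DIMMs by the reported reason."""
    by_reason = {}
    for match in ERR.finditer(log):
        path, reason = match.group(1), match.group(2)
        by_reason.setdefault(reason, []).append(path)
    return by_reason


# Three lines copied from the console messages above.
SAMPLE = """\
2014-05-04 17:16:30 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB4/CH1/D0: DIMM is not populated in order on the BOB. Not configured
2014-05-04 17:16:31 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB0/CH0/D0: DIMM population chip symmetry rule violation. Not configured
2014-05-04 17:16:32 2:0:0> ERROR: /SYS/PM1/CM0/CMP/BOB0/CH1/D0: DIMM population chip symmetry rule violation. Not configured
"""

if __name__ == "__main__":
    for reason, paths in unconfigured_dimms(SAMPLE).items():
        print(reason)
        for path in paths:
            print("   ", path)
```

The resulting list of component paths gives the set of DIMMs to check once the originally faulted DIMM has been dealt with.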
If the "fmadm repair" or "fmadm acquit" command does not re-enable the other DIMMs that were disabled due to the "Symmetry Rule", manually clear each DIMM with the following command:

    set <COMPONENT PATH> clear_fault_action=true
This behavior may be verified from the following:

    /nyx-1.3.x/src/hostconfig/common/src/gmd_config.c

References

<NOTE:1614738.1> - [SPARC T4/T5/M5 and M6] FMA I/O retirement : PCI devices can be seen from OBP but disappear when System Boots up into Solaris

Attachments

This solution has no attachment