Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type Predictive Self-Healing Sure Solution 1554985.1 : How to investigate the Auto Service Request "ASR:Probable power supply failure" on X4150, X4250 and X4450 systems
This article describes the activity required by a System Administrator to verify whether action has to be taken when an x86 ASR power supply alarm has occurred on an X4x50-based system.
Applies to:
Sun Fire X4450 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire X4150 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun Fire X4250 Server - Version Not Applicable to Not Applicable [Release N/A]
Sun SPARC Enterprise T5440 Server
x86

Purpose
This article describes the activity required by a System Administrator to verify whether action has to be taken on an x86 ASR power supply alarm.

Scope
This document is intended for system administrators and support personnel.

Details
Auto Service Request (ASR) provides automatic failure detection and SR creation for Oracle x86 systems. See http://www.oracle.com/us/asr/index.html for more information on ASR.

Description of the ASR Event:
Power supply events can be either transient or persistent. They can be generated by external changes and actions, most notably by the removal of AC from a power supply. Additional checks may need to be performed in order to understand the cause of this ASR event.
If a persistent failure has occurred, or if a power event or events cannot be explained by changes in the supplied power or by work being carried out on the machine, then further investigation by a support engineer may be required.
If the event has been caused by changes in site power or a similar event, then no action need be taken.
Please find an example ASR alarm at the bottom of this document.

How to verify if there is a genuine Power Supply issue:
Step 1: Identify the system that experienced this ASR Alarm. The Auto Service Request (ASR) will be logged against the serial number of the machine that generated the alarm. The information provided by the alarm will contain the hostname of the machine or of the machine's Service Processor.
Step 2: To verify whether a power supply has a persistent failure, perform one of Step 2a, Step 2b or Step 2c of this document. The ASR alarm will identify the number of the power supply that generated the alarm. This power supply can be checked using one of the methods below:
"value = State Asserted" for PWROK indicates the power supply DC output is functioning
"value = State Asserted" for VINOK indicates the AC input to the power supply is present
If the AC input for a particular power supply is "State Deasserted", then the corresponding DC output of the power supply (PWROK) will also be "State Deasserted", and the site infrastructure should be checked to discover the reason for the loss of AC.
Note: on a server which is powered off, PWROK will be in state "Deasserted" for all power supplies.
If, on a system that is powered up, VINOK is "Asserted" (On) for a power supply but PWROK is "Deasserted" (Off) for that power supply, then the draft SR should be promoted to a full SR.
Step 2a
If the utility ipmitool is available, the ipmitool command "sensor" can be used to check the incoming AC and the DC output of a power supply. It can also be used to check whether a particular supply is indicating a fault. Example:
ipmitool -H <sp_address or name> -U root sensor
<output omitted>
PS0/VINOK | 0x2 | discrete | 0x0200| na | na | na | na | na | na
<output omitted>
Note: "0x2" indicates "Asserted" and "0x1" indicates "Deasserted".
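The Step 2a check can also be scripted. The following is a minimal Python sketch, not an Oracle tool: it assumes the text printed by the "sensor" command has already been captured, and that discrete sensor lines follow the pipe-separated layout of the example above. The function name parse_discrete_sensors and the "Unknown" fallback are hypothetical; the 0x2/0x1 mapping is taken from the note above.

```python
# Mapping taken from the note above: 0x2 = Asserted, 0x1 = Deasserted.
STATES = {"0x2": "Asserted", "0x1": "Deasserted"}


def parse_discrete_sensors(sensor_output):
    """Return {sensor name: state} for the discrete sensors found in
    captured `ipmitool ... sensor` text (hypothetical helper)."""
    result = {}
    for line in sensor_output.splitlines():
        # Columns are separated by '|': name, reading, type, ...
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 3 or fields[2] != "discrete":
            continue  # skip omitted output and non-discrete sensors
        # Any reading other than 0x1/0x2 is reported as "Unknown".
        result[fields[0]] = STATES.get(fields[1].lower(), "Unknown")
    return result


# Sample line copied from the example above.
sample = "PS0/VINOK | 0x2 | discrete | 0x0200| na | na | na | na | na | na"
print(parse_discrete_sensors(sample))
```

Readings other than "0x1" and "0x2" are deliberately reported as "Unknown" rather than guessed, since this document only defines those two values.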
Step 2b
Log on to the Command Line Interface (CLI) of the server's service processor and use the commands below to check the relevant power supply status under /SYS/PSx, where "x" is the power supply implicated in the ASR alarm.
"value = State Asserted" under PWROK indicates the power supply DC output is functioning
"value = State Asserted" under VINOK indicates the AC input to the power supply is present
Note: the following output will vary slightly according to the version of ILOM.
If the AC input for a particular power supply is "State Deasserted", then the corresponding DC output of the power supply (PWROK) will also be "State Deasserted", and the site infrastructure should be checked to discover the reason for the loss of AC.
The status of the hardware fault indicators on the power supply can also be checked; see the examples below:
-> show -d properties /SYS/PS0/VINOK
Example: to check for a Power Supply Fan Fault:
-> show /SYS/PS0
/SYS/PS0
Targets:
CUR_FAULT
FAN_FAULT
INPUT_POWER
I_IN
I_OUT
OUTPUT_POWER
PRSNT
PWROK
TEMP_FAULT
VINOK
VOLT_FAULT
V_IN
V_OUT
<output omitted>
-> show -d properties /SYS/PS0/FAN_FAULT
Step 2c
Log on to the Browser User Interface (BUI) of the ILOM.
Navigate to the System Monitoring tab > Sensor Readings.
Then select "Type: Power Supply" as the filter and check the relevant /SYS/PSx/PWROK and /SYS/PSx/VINOK entries.
"value = State Asserted" for the relevant power supply's PWROK indicates the power supply DC output is functioning
"value = State Asserted" for the relevant power supply's VINOK indicates AC to the power supply is present
Check whether any of the fault indicators are asserted, which would indicate a hardware failure.
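Whichever of the three methods is used, the interpretation of VINOK, PWROK and the fault indicators in Step 2 follows the same rules. They can be summarised as a small decision function; this is a minimal Python sketch of the rules stated in this document, and the function name, parameter names and returned messages are hypothetical.

```python
def assess_power_supply(vinok_asserted, pwrok_asserted,
                        system_powered_on, fault_indicator_asserted=False):
    """Summarise the Step 2 rules for one power supply (hypothetical helper)."""
    if not vinok_asserted:
        # No AC input: PWROK will also be deasserted, so look at the site first.
        return "AC input missing: check site infrastructure"
    if fault_indicator_asserted:
        # For example FAN_FAULT, TEMP_FAULT, VOLT_FAULT or CUR_FAULT.
        return "hardware fault indicated: engage a support engineer"
    if not system_powered_on:
        # PWROK is deasserted on every supply while the server is off.
        return "inconclusive: PWROK is always deasserted when powered off"
    if not pwrok_asserted:
        # AC present on a running system but no DC output.
        return "promote the draft SR to a full SR"
    return "AC input and DC output are both present"


# A powered-up system with AC present but no DC output from the supply.
print(assess_power_supply(vinok_asserted=True, pwrok_asserted=False,
                          system_powered_on=True))
```

The AC check comes first because a missing AC input explains a deasserted PWROK on its own, which is why the document directs attention to the site infrastructure in that case.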
Step 3: If a failure is not persistent, no further action is required. The SR will close in 14 days.
If the failure has been verified as persistent or is a cause for concern, engage a support engineer by one of the following methods:
a) Update the SR. A support engineer will be assigned to assist.
b) Phone your local Oracle support number and request that the SR be assigned to the next available engineer.
If ILOM is version 3.0 or above, upload a snapshot file to allow further analysis; otherwise upload the output of the relevant commands above and, if possible, the output of the ipmitool command "sel elist".
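The choice of data to upload in Step 3 depends only on the ILOM version. As a minimal Python sketch, assuming the version is available as a dotted numeric string (for example "3.0.6.10"; that format is an assumption, as are the function name and the returned descriptions):

```python
from typing import List


def evidence_to_upload(ilom_version: str) -> List[str]:
    """Return the data Step 3 asks for, given a dotted ILOM version string
    (hypothetical helper; assumes purely numeric version components)."""
    parts = [int(part) for part in ilom_version.split(".")]
    # Pad so that a bare "3" compares as (3, 0).
    major, minor = (parts + [0, 0])[:2]
    if (major, minor) >= (3, 0):
        return ["ILOM snapshot file"]
    return [
        "output of the relevant Step 2 commands",
        "output of ipmitool 'sel elist' (if possible)",
    ]


print(evidence_to_upload("2.0.2.6"))
```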
Example alarm:
Hostname: SP_Host_name

Attachments
This solution has no attachment