Asset ID: 1-72-1987972.1
Update Date: 2017-04-07
Solution Type: Problem Resolution Sure Solution
1987972.1: A Production T5120 Server Is Down Frequently Due To The Sensing Of Critical Voltage
Related Items
- Sun SPARC Enterprise T5120 Server
Related Categories
- PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T5xx0
- _KM>Content>Documentation
Created from <SR 3-10393322941>
Applies to:
Sun SPARC Enterprise T5120 Server - Version All Versions to All Versions [Release All Releases]
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on SPARC (32-bit)
Symptoms
The server shuts down repeatedly because of a voltage issue.
Cause
The value on the V_VDDIO voltage sensor is out of specification. Every time the reading reaches the 1.20 V upper critical threshold, the host is shut down. The shutdown itself is normal protective behavior; the voltage reaching the threshold is not.
The voltage margining circuit for the VDDIO rail (nominally 1.1 V) on the motherboard, which is used to margin power rails within the system, is not operating at the correct value. VDDIO is one of the core voltages, and an incorrect value could lead to unpredictable behavior.
You can confirm this in the logs, for example:
--------------------------------------------------------------------------------
Voltage sensors (in Volts):
--------------------------------------------------------------------------------
Sensor                 Status   Voltage  LowSoft  LowWarn  HighWarn  HighSoft
--------------------------------------------------------------------------------
/SYS/MB/V_+3V3_STBY    OK          3.37     3.13     3.17      3.53      3.60
/SYS/MB/V_+3V3_MAIN    OK          3.36     3.06     3.10      3.49      3.53
/SYS/MB/V_+12V0_MAIN   OK         12.22    10.90    11.15     12.85     13.10
/SYS/MB/V_VDDIO        WARNING     1.18     1.00     1.02      1.18      1.20
/SYS/MB/V_VCORE        OK          1.21     1.10     1.14      1.30      1.31
/SYS/MB/V_VMEML        OK          1.84     1.64     1.68      1.93      1.98
/SYS/MB/V_VMEMR        OK          1.85     1.64     1.68      1.93      1.98
/SYS/MB/V_VBAT         OK          3.10       --     2.69        --        --
--------------------------------------------------------------------------------
Mar 09 02:48:25: IPMI |major : "ID = 733e : 03/09/2015 : 02:48:25 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:48:37: IPMI |major : "ID = 7340 : 03/09/2015 : 02:48:37 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:48:55: IPMI |major : "ID = 7342 : 03/09/2015 : 02:48:55 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:49:18: IPMI |major : "ID = 7344 : 03/09/2015 : 02:49:18 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:49:46: IPMI |major : "ID = 7346 : 03/09/2015 : 02:49:46 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:49:58: IPMI |major : "ID = 7348 : 03/09/2015 : 02:49:58 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:50:15: IPMI |major : "ID = 734a : 03/09/2015 : 02:50:15 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:51:00: Fault |critical: "SP detected fault at time Mon Mar 9 02:51:00 2015. V_VDDIO at /SYS/MB has reached high critical threshold."
Mar 09 02:51:02: Chassis |critical: "Critical voltage value : host is being shut down"
Mar 09 02:51:09: IPMI |major : "ID = 734c : 03/09/2015 : 02:51:09 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:52:04: Chassis |major : "V_VDDIO at /SYS/MB has reached high warning threshold."
Mar 09 02:52:06: IPMI |major : "ID = 734e : 03/09/2015 : 02:52:06 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:52:29: IPMI |major : "ID = 7350 : 03/09/2015 : 02:52:29 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:52:46: IPMI |major : "ID = 7352 : 03/09/2015 : 02:52:46 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:52:57: IPMI |major : "ID = 7354 : 03/09/2015 : 02:52:57 : Voltage : /MB/V_VDDIO : Upper Critical going high : reading 1.20 >= threshold 1.20 Volts"
Mar 09 02:53:05: Chassis |critical: "Host has been powered off"
Mar 09 08:29:01: Chassis |major : "Host has been powered on"
Mar 09 08:33:45: Chassis |major : "Host is running"
Mar 09 08:39:54: Chassis |major : "V_VDDIO at /SYS/MB has reached high warning threshold."
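When the event log is long, it can help to count how often each sensor crosses a threshold before the host is powered off. The following is a minimal Python sketch, not an Oracle tool. It assumes the log lines have the same layout as the excerpt above; the names `EVENT_RE`, `count_voltage_events`, and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Matches IPMI voltage events in the layout shown in the excerpt above.
# Assumption: other ILOM firmware versions may format these lines differently.
EVENT_RE = re.compile(
    r"Voltage : (?P<sensor>\S+) : (?P<event>[^:]+?) : "
    r"reading (?P<reading>[\d.]+) (?P<op>[<>]=) "
    r"threshold (?P<threshold>[\d.]+) Volts"
)


def count_voltage_events(lines):
    """Count threshold events per (sensor, event type) pair."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[(match.group("sensor"), match.group("event"))] += 1
    return counts


# Three lines copied from the log excerpt; only two are IPMI voltage events.
SAMPLE = [
    'Mar 09 02:48:25: IPMI |major : "ID = 733e : 03/09/2015 : 02:48:25 : '
    'Voltage : /MB/V_VDDIO : Upper Critical going high : '
    'reading 1.20 >= threshold 1.20 Volts"',
    'Mar 09 02:51:02: Chassis |critical: '
    '"Critical voltage value : host is being shut down"',
    'Mar 09 02:51:09: IPMI |major : "ID = 734c : 03/09/2015 : 02:51:09 : '
    'Voltage : /MB/V_VDDIO : Upper Critical going high : '
    'reading 1.20 >= threshold 1.20 Volts"',
]

if __name__ == "__main__":
    for (sensor, event), n in count_voltage_events(SAMPLE).items():
        print(f"{sensor}: {event} x{n}")
```

A sensor that repeatedly reports "Upper Critical going high" shortly before "host is being shut down", as /MB/V_VDDIO does here, is the one to investigate.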
Voltage sensors:
----------------------------------------------------------------
Location   Sensor         Status
----------------------------------------------------------------
SYS/MB     V_VMEML        ok
SYS/MB     V_VMEMR        ok
SYS/MB     V_+3V3_STBY    ok
SYS/MB     V_VCORE        ok
SYS/MB     V_+3V3_MAIN    ok
SYS/MB     V_VDDIO        warning (1.176volts )
SYS/MB     V_+12V0_MAIN   ok
SYS/MB     V_VBAT         ok
SYS/PS0    V_IN_MAIN      ok
SYS/PS0    V_OUT_MAIN     ok
SYS/PS1    V_IN_MAIN      ok
SYS/PS1    V_OUT_MAIN     ok
From the snapshot:
/MB/V_VDDIO | 1.176 | Volts | nc | 0.996 | 0.996 | 1.020 | 1.176 | 1.200 | 1.212
/SYS/MB/V_VDDIO
    Properties:
        type = Voltage
        ipmi_name = /MB/V_VDDIO
        class = Threshold Sensor
        value = 1.176 Volts
        upper_nonrecov_threshold = 1.212 Volts
        upper_critical_threshold = 1.200 Volts
        upper_noncritical_threshold = 1.176 Volts
        lower_noncritical_threshold = 1.020 Volts
        lower_critical_threshold = 0.996 Volts
        lower_nonrecov_threshold = 0.996 Volts
        alarm_status = warning
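The relationship between the reading and the thresholds above can be illustrated with a short Python sketch. This is an illustration only, not the service processor's actual logic. It assumes a reading counts as crossing a threshold when it is equal to or beyond it, which is consistent with the "reading 1.20 >= threshold 1.20" log entries and with the warning status at 1.176 V. The names `Thresholds`, `VDDIO`, and `classify`, and the returned labels, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Threshold values in volts, as listed in the sensor properties."""
    lower_nonrecov: float
    lower_critical: float
    lower_noncritical: float
    upper_noncritical: float
    upper_critical: float
    upper_nonrecov: float


# Values copied from the /SYS/MB/V_VDDIO properties above.
VDDIO = Thresholds(0.996, 0.996, 1.020, 1.176, 1.200, 1.212)


def classify(reading: float, t: Thresholds) -> str:
    """Return the most severe threshold band the reading falls into.

    Assumption: a reading equal to a threshold counts as crossing it.
    """
    if reading >= t.upper_nonrecov or reading <= t.lower_nonrecov:
        return "nonrecoverable"
    if reading >= t.upper_critical or reading <= t.lower_critical:
        return "critical"
    if reading >= t.upper_noncritical or reading <= t.lower_noncritical:
        return "warning"
    return "ok"


if __name__ == "__main__":
    for volts in (1.10, 1.176, 1.20):
        print(f"{volts:.3f} V -> {classify(volts, VDDIO)}")
```

With these values, the snapshot reading of 1.176 V sits exactly on the upper non-critical threshold, and only 0.024 V of margin remains before the 1.200 V critical threshold that triggers the shutdown.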
Solution
The motherboard on the system needs to be replaced.