Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1982289.1
Update Date:2018-03-08
Keywords:

Solution Type  Problem Resolution Sure

Solution  1982289.1 :   SPARC T5-4 systems may suffer a power glitch during an AC power cord cycle (disconnecting/connecting power cables) when parallel boot and host auto power on are enabled (which are the default values)


Related Items
  • SPARC T5-4
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T5




In this Document
Symptoms
Changes
Cause
Solution
References


Applies to:

SPARC T5-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Applies only to SPARC T5-4 (does not apply to SPARC T5-2 or SPARC T5-8).

Symptoms

SPARC T5-4 systems may suffer a power glitch during an AC power cord cycle (disconnecting/connecting power cables) when parallel boot (-> show /SP/policy PARALLEL_BOOT) and host auto power on (-> show /SP/policy HOST_AUTO_POWER_ON) are both enabled (which are the default values).

 

Note: This issue has been seen only on SPARC T5-4 systems (not on SPARC T5-2 or SPARC T5-8 systems) with the following system firmware versions: 9.3.0.d (20229460), 9.3.0.b (20034528), 9.2.1.b (19264423). It might also occur on higher or lower firmware versions. To determine the platform and firmware version, use the following commands from the SP (ILOM):

-> show /SYS product_name
   product_name = SPARC T5-4

-> show /HOST sysfw_version
   sysfw_version = Sun System Firmware 9.3.0.d 2014/12/09 14:14
   sysfw_version = Sun System Firmware 9.3.0.b 2014/11/13 17:52
   sysfw_version = Sun System Firmware 9.2.1.b 2014/07/11 14:46

-> show /System model
   model = SPARC T5-4

-> show /System system_fw_version
   system_fw_version = Sun System Firmware 9.3.0.d 2014/12/09 14:14
   system_fw_version = Sun System Firmware 9.3.0.b 2014/11/13 17:52
   system_fw_version = Sun System Firmware 9.2.1.b 2014/07/11 14:46
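When many systems have to be checked, the firmware string returned by the commands above can be compared against the versions named in the note. The following is a minimal Python sketch, assuming the ILOM output has already been captured as text; the function names and the KNOWN_AFFECTED set are illustrative and not part of ILOM. A version missing from the set is not proof that a system is unaffected, because the issue might also occur on other versions.

```python
import re

# Versions explicitly named in this document. This set is an assumption for
# illustration; the issue might also occur on other firmware versions.
KNOWN_AFFECTED = {"9.3.0.d", "9.3.0.b", "9.2.1.b"}


def parse_sysfw_version(line):
    """Extract the version token (for example '9.3.0.d') from a
    sysfw_version or system_fw_version line, or return None."""
    match = re.search(r"Sun System Firmware\s+(\S+)", line)
    return match.group(1) if match else None


def seen_on_version(line):
    """Return True if the line reports a version listed in this document."""
    return parse_sysfw_version(line) in KNOWN_AFFECTED
```

For example, passing the line `sysfw_version = Sun System Firmware 9.3.0.d 2014/12/09 14:14` to `seen_on_version` yields True.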

To check the SP policy settings, use the following command:

-> show /SP/policy
   HOST_AUTO_POWER_ON = enabled
   HOST_COOLDOWN = disabled
   HOST_LAST_POWER_STATE = disabled
   HOST_POWER_ON_DELAY = disabled
   PARALLEL_BOOT = enabled
   VGA_REAR_PORT = disabled
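The condition described in this document (both PARALLEL_BOOT and HOST_AUTO_POWER_ON enabled) can be checked mechanically on captured policy output. The following is a minimal Python sketch, assuming the output consists of `NAME = value` lines as shown above; `parse_policy` and `is_exposed` are hypothetical helper names, not ILOM commands.

```python
def parse_policy(output):
    """Parse 'NAME = value' lines from captured `show /SP/policy` output.

    Lines without '=' and ILOM prompt lines (starting with '->') are skipped.
    """
    policy = {}
    for line in output.splitlines():
        if "=" not in line or line.lstrip().startswith("->"):
            continue
        name, _, value = line.partition("=")
        policy[name.strip()] = value.strip()
    return policy


def is_exposed(policy):
    """Return True when both policies named in this document are enabled."""
    return (
        policy.get("PARALLEL_BOOT") == "enabled"
        and policy.get("HOST_AUTO_POWER_ON") == "enabled"
    )
```

With the default settings listed above, `is_exposed(parse_policy(text))` returns True; it returns False once either policy is set to disabled.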

 

After an AC power cord cycle (disconnecting/connecting power cables) with parallel boot (-> show /SP/policy PARALLEL_BOOT) and host auto power on (-> show /SP/policy HOST_AUTO_POWER_ON) both enabled (which are the default values), the following events are reported:

System firmware 9.3.0.d
- @persist@faultdiags@ereports.log
 xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS
 [unrecognized]
 REG_0x2d20010 = 0x0
 REG_0x2d20011 = 0x0
 detector = /SYS/DC_GLITCH
 hidden = true
- @usr@local@bin@fmdump_-ev.out
 xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS
- @persist@faultdiags@faults.log
 -
- @usr@local@bin@fmdump_-v.out
 -
- @usr@local@bin@fmadm_faulty.out
 -
- @coredump@sp_trace@logs@CRIT.log
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:322 host_sys_irq_cb: received power glitch irq
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20010 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20011 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20012 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20013 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20000 : 0x10
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20001 : 0x90
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20008 : 0xf9
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20009 : 0x0f
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000a : 0xff
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000b : 0xff
- @usr@local@bin@spshexec_show_-script_@X@logs@event@list.out
 xxx xxx xxx xx xx:xx:xx xxxx System Log minor Host: Standby
 xxx xxx xxx xx xx:xx:xx xxxx Power Off major Power to /SYS has been turned off by: SP, Reason: Power glitch detected
 xxx xxx xxx xx xx:xx:xx xxxx HOST Fault critical Host powerglitch detected, powering off.
 xxx xxx xxx xx xx:xx:xx xxxx System Log minor Host: Warm start
- @usr@local@bin@spshexec_show_faulty.out
 -

 

System firmware 9.3.0.b or 9.2.1.b
- @persist@faultdiags@ereports.log
 xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS
 detector = /SYS/DC_GLITCH
 hidden = true
- @usr@local@bin@fmdump_-ev.out
 xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS
- @persist@faultdiags@faults.log
 xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH
 fault = fault.chassis.voltage.fail@/SYS
 certainty = 100.0 %
 FRU = /SYS
 ASRU = /SYS
 resource = /SYS
 _list_sz = 1
 _list_idx = 0
 diagnosis_engine = fdd 1.0
 system_component_serial_number = xxxxxxxxxx
 system_component_part_number = xxxxxxxx+x+x
 system_component_name = SPARC T5-4
 system_component_manufacturer = Oracle Corporation
 chassis_serial_number = xxxxxxxxxx
 chassis_part_number = xxxxxxxx+x+x
 chassis_name = SPARC T5-4
 chassis_manufacturer = Oracle Corporation
 system_serial_number = xxxxxxxxxx
 system_part_number = xxxxxxxx+x+x
 system_name = SPARC T5-4
 system_manufacturer = Oracle Corporation
 fru_manufacturer = Oracle Corporation
 fru_name = SPARC T5-4
 fru_part_number = xxxxxxxx+x+x
 fru_serial_number = xxxxxxxxxx
- @usr@local@bin@fmdump_-v.out
 xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH
 fault = fault.chassis.voltage.fail@/SYS
 certainty = 100.0 %
 FRU = /SYS
 ASRU = /SYS
 resource = /SYS
- @usr@local@bin@fmadm_faulty.out
 ------------------- ------------------------------------ -------------- --------
 Time UUID msgid Severity
 ------------------- ------------------------------------ -------------- --------
 xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH Critical
 Problem Status : solved
 Diag Engine : fdd 1.0
 System
 Manufacturer : Oracle Corporation
 Name : SPARC T5-4
 Part_Number : xxxxxxxx+x+x
 Serial_Number : xxxxxxxxxx
 ----------------------------------------
 Suspect 1 of 1
 Fault class : fault.chassis.voltage.fail
 Certainty : 100%
 Affects : /SYS
 Status : faulted
 FRU
 Status : faulty
 Location : /SYS
 Manufacturer : Oracle Corporation
 Name : SPARC T5-4
 Part_Number : xxxxxxxx+x+x
 Serial_Number : xxxxxxxxxx
 Chassis
 Manufacturer : Oracle Corporation
 Name : SPARC T5-4
 Part_Number : xxxxxxxx+x+x
 Serial_Number : xxxxxxxxxx
 Description : A chassis voltage supply is operating outside of the allowable range.
 Response : The system will be powered off. The chassis-wide service required LED will be illuminated.
 Impact : The system is not usable until repaired. ILOM will not allow the system to be powered on until repaired.
 Action : Please refer to the associated reference document at http://support.oracle.com/msg/SPT-8000-DH for the latest service procedures and policies regarding this diagnosis.
- @coredump@sp_trace@logs@CRIT.log
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:322 host_sys_irq_cb: received power glitch irq
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20010 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20011 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20012 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20013 : 0x00
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20000 : 0x10
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20001 : 0x90
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20008 : 0xf9
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20009 : 0x0f
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000a : 0xff
 POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000b : 0xff
- @usr@local@bin@spshexec_show_-script_@X@logs@event@list.out
 xxx xxx xxx xx xx:xx:xx xxxx System Log minor Host: Standby
 xxx xxx xxx xx xx:xx:xx xxxx Power Off major Power to /SYS has been turned off by: SP, Reason: Power glitch detected
 xxx xxx xxx xx xx:xx:xx xxxx HOST Fault critical Host powerglitch detected, powering off.
 xxx xxx xxx xx xx:xx:xx xxxx System Log minor Host: Warm start
- @usr@local@bin@spshexec_show_faulty.out
 ------------------- ------------------------------------ -------------- --------
 Time UUID msgid Severity
 ------------------- ------------------------------------ -------------- --------
 xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH Critical
 Problem Status : solved
 Diag Engine : fdd 1.0
 System
 Manufacturer : Oracle Corporation
 Name : SPARC T5-4
 Part_Number : xxxxxxxx+x+x
 Serial_Number : xxxxxxxxxx
 ----------------------------------------
 Suspect 1 of 1
 Fault class : fault.chassis.voltage.fail
 Certainty : 100%
 Affects : /SYS
 Status : faulted
 FRU
 Status : faulty
 Location : /SYS
 Manufacturer : Oracle Corporation
 Name : SPARC T5-4
 Part_Number : xxxxxxxx+x+x
 Serial_Number : xxxxxxxxxx
 Chassis
 Manufacturer : Oracle Corporation
 Name : SPARC T5-4
 Part_Number : xxxxxxxx+x+x
 Serial_Number : xxxxxxxxxx
 Description : A chassis voltage supply is operating outside of the allowable range.
 Response : The system will be powered off. The chassis-wide service required LED will be illuminated.
 Impact : The system is not usable until repaired. ILOM will not allow the system to be powered on until repaired.
 Action : Please refer to the associated reference document at http://support.oracle.com/msg/SPT-8000-DH for the latest service procedures and policies regarding this diagnosis.
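The two log patterns above differ mainly in whether a fault.chassis.voltage.fail fault (SPT-8000-DH) accompanies the glitch ereport. The following is a minimal Python sketch that classifies collected log text by the strings shown in this document; the function name and the returned labels are illustrative assumptions, and a match only indicates that the logs resemble this issue, not that the hardware is healthy.

```python
# Strings taken from the log excerpts in this document.
GLITCH_EREPORT = "ereport.chassis.power.glitch-toomany@/SYS"
POWER_OFF_REASON = "Reason: Power glitch detected"
VOLTAGE_FAULT = "fault.chassis.voltage.fail"


def classify_glitch_logs(log_text):
    """Classify log text against the two patterns shown in this document.

    Returns one of three illustrative labels:
      'no-glitch'         - neither glitch string is present
      'glitch-only'       - glitch reported, no voltage fault
                            (pattern shown for firmware 9.3.0.d)
      'glitch-with-fault' - glitch reported together with the voltage fault
                            (pattern shown for firmware 9.3.0.b and 9.2.1.b)
    """
    has_glitch = GLITCH_EREPORT in log_text or POWER_OFF_REASON in log_text
    if not has_glitch:
        return "no-glitch"
    if VOLTAGE_FAULT in log_text:
        return "glitch-with-fault"
    return "glitch-only"
```

The input can be the concatenated contents of the ereports, faults, and event list files named above.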

 

Changes

 

Cause

The cause is a race condition, which will not be fixed.

The original bug has been closed:

- Bug 16303591: DC_GLITCH occurs for /SYS poweron in parallel with SP reset
  - Status: 84 - Closed, not feasible to fix
  - The glitch is caused by a race condition on the DR power control on PM3 in a T5-4 during auto power-on. It was improperly gated in the FPGA power control glitch detection.
    It does not show up in the glitch registers because the PM3 Present signal gates the POK (Good).

A second bug was then created, and it has also been closed:

- Bug 20521293: DC_GLITCH occurs for /SYS poweron in parallel with SP reset
  - Status: 84 - Closed, not feasible to fix
  - The T5 MB FPGA cannot be updated in the field, so the bug was marked as "Not feasible to fix".
  - The ILOM policy was changed so that, on encountering a glitch, it retries the system power-on. Because the retry is not an AC-on parallel boot power-on, the glitch (if it is not a true system failure) is not encountered again
    and the system operates as expected.
  - As noted in the bug, this document was created to avoid hardware replacement, which should address the issues with this system behavior.

Solution

To prevent this issue from occurring, disable parallel boot prior to power cycling the system:

  

-> show /SP/policy PARALLEL_BOOT
   PARALLEL_BOOT = enabled
-> set /SP/policy PARALLEL_BOOT=disabled
   *** WARNING ***: If PARALLEL_BOOT is set to disabled, then the HOST will no longer be able to power on when SP-less or the SP is in degraded mode.
   Are you sure you want to set PARALLEL_BOOT=disabled (y/n)? y
   Set 'PARALLEL_BOOT' to 'disabled'
-> show /SP/policy PARALLEL_BOOT
   PARALLEL_BOOT = disabled 

Note:
- If the SP is faulted and not accessible (SP-less):
  - Press the power button for more than 5 seconds to start the host (system).
- If the SP is faulted but accessible (SP in degraded mode):
  - Clear the faults from the ILOM CLI and the ILOM fault management shell.
  - Start the system.


If this issue has previously occurred, clear the fault from the Service Processor (ILOM CLI and/or ILOM fault management shell) before starting the system:

-> show faulty
-> set /SYS clear_fault_action=true
-> show faulty

-> start /SP/faultmgmt/shell/
   Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
   faultmgmtsp> fmadm faulty
   faultmgmtsp> fmadm repair /SYS
   faultmgmtsp> fmadm faulty
   faultmgmtsp> exit

-> start /SYS

 

If the issue persists, please open a service request with Oracle Support.

 

Please let me (Devrim Sen) know if any customer hits this issue, and provide the following details: SR number, product type, product serial number.

  


  Copyright © 2018 Oracle, Inc.  All rights reserved.