Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1982289.1 : SPARC T5-4 systems may suffer a power glitch during an AC power cord cycle (disconnecting/connecting power cables) when parallel boot and host auto power on are enabled (which are the default values)
Applies to:
SPARC T5-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Applies only to SPARC T5-4 (does not apply to SPARC T5-2 or SPARC T5-8).

Symptoms
SPARC T5-4 systems may suffer a power glitch during an AC power cord cycle (disconnecting/connecting power cables) when parallel boot (-> show /SP/policy PARALLEL_BOOT) and host auto power on (-> show /SP/policy HOST_AUTO_POWER_ON) are enabled (which are the default values).
Note: This issue has been seen only on SPARC T5-4 systems (not on SPARC T5-2 or SPARC T5-8 systems) with the following system firmware versions: 9.3.0.d (20229460), 9.3.0.b (20034528), 9.2.1.b (19264423). It might also occur on higher or lower firmware versions.

To determine the platform and firmware version, use the following commands from the SP (ILOM). The firmware commands return one of the version strings listed:

    -> show /SYS product_name
        product_name = SPARC T5-4

    -> show /HOST sysfw_version
        sysfw_version = Sun System Firmware 9.3.0.d 2014/12/09 14:14
        sysfw_version = Sun System Firmware 9.3.0.b 2014/11/13 17:52
        sysfw_version = Sun System Firmware 9.2.1.b 2014/07/11 14:46

    -> show /System model
        model = SPARC T5-4

    -> show /System system_fw_version
        system_fw_version = Sun System Firmware 9.3.0.d 2014/12/09 14:14
        system_fw_version = Sun System Firmware 9.3.0.b 2014/11/13 17:52
        system_fw_version = Sun System Firmware 9.2.1.b 2014/07/11 14:46

To check the SP policy settings, use the following command:

    -> show /SP/policy
        HOST_AUTO_POWER_ON = enabled
        HOST_COOLDOWN = disabled
        HOST_LAST_POWER_STATE = disabled
        HOST_POWER_ON_DELAY = disabled
        PARALLEL_BOOT = enabled
        VGA_REAR_PORT = disabled
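When the SP output has been captured to a text file, the platform and policy checks above can be combined in a few lines. The following is a minimal Python sketch, not an Oracle tool: the function names `parse_properties` and `is_exposed` are hypothetical, and it assumes the captured output uses the `name = value` layout shown above.

```python
def parse_properties(text):
    """Collect 'name = value' pairs from captured ILOM 'show' output."""
    props = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        name = name.strip()
        # Skip lines without '=' and echoed commands such as '-> set ... X=Y'.
        if sep and name and not name.startswith("->"):
            props[name] = value.strip()
    return props


def is_exposed(product_name, policy):
    """True when the platform and policy match the configuration in this document."""
    return (
        product_name.strip() == "SPARC T5-4"
        and policy.get("PARALLEL_BOOT") == "enabled"
        and policy.get("HOST_AUTO_POWER_ON") == "enabled"
    )


# Sample input copied from the 'show /SP/policy' output in this document.
sample = """\
-> show /SP/policy
    HOST_AUTO_POWER_ON = enabled
    HOST_COOLDOWN = disabled
    HOST_LAST_POWER_STATE = disabled
    HOST_POWER_ON_DELAY = disabled
    PARALLEL_BOOT = enabled
    VGA_REAR_PORT = disabled
"""

policy = parse_properties(sample)
print(is_exposed("SPARC T5-4", policy))
```

The sketch only flags the default configuration described here. It does not check the firmware version, because the issue might also occur on versions other than the three listed.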
After an AC power cord cycle (disconnecting/connecting power cables) when parallel boot (-> show /SP/policy PARALLEL_BOOT) and host auto power on (-> show /SP/policy HOST_AUTO_POWER_ON) are enabled (which are the default values), the following events are reported:

System firmware 9.3.0.d

- @persist@faultdiags@ereports.log

    xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS
        [unrecognized]
        REG_0x2d20010 = 0x0
        REG_0x2d20011 = 0x0
        detector = /SYS/DC_GLITCH
        hidden = true

- @usr@local@bin@fmdump_-ev.out

    xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS

- @persist@faultdiags@faults.log

    -

- @usr@local@bin@fmdump_-v.out

    -

- @usr@local@bin@fmadm_faulty.out

    -

- @coredump@sp_trace@logs@CRIT.log

    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:322 host_sys_irq_cb: received power glitch irq
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20010 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20011 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20012 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20013 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20000 : 0x10
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20001 : 0x90
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20008 : 0xf9
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20009 : 0x0f
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000a : 0xff
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000b : 0xff

- @usr@local@bin@spshexec_show_-script_@X@logs@event@list.out

    xxx xxx xxx xx xx:xx:xx xxxx System Log minor
        Host: Standby
    xxx xxx xxx xx xx:xx:xx xxxx Power Off major
        Power to /SYS has been turned off by: SP, Reason: Power glitch detected
    xxx xxx xxx xx xx:xx:xx xxxx HOST Fault critical
        Host powerglitch detected, powering off.
    xxx xxx xxx xx xx:xx:xx xxxx System Log minor
        Host: Warm start

- @usr@local@bin@spshexec_show_faulty.out

    -
System firmware 9.3.0.b or 9.2.1.b

- @persist@faultdiags@ereports.log

    xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS
        detector = /SYS/DC_GLITCH
        hidden = true

- @usr@local@bin@fmdump_-ev.out

    xxxx-xx-xx/xx:xx:xx ereport.chassis.power.glitch-toomany@/SYS

- @persist@faultdiags@faults.log

    xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH
        fault = fault.chassis.voltage.fail@/SYS
            certainty = 100.0 %
            FRU = /SYS
            ASRU = /SYS
            resource = /SYS
        _list_sz = 1
        _list_idx = 0
        diagnosis_engine = fdd 1.0
        system_component_serial_number = xxxxxxxxxx
        system_component_part_number = xxxxxxxx+x+x
        system_component_name = SPARC T5-4
        system_component_manufacturer = Oracle Corporation
        chassis_serial_number = xxxxxxxxxx
        chassis_part_number = xxxxxxxx+x+x
        chassis_name = SPARC T5-4
        chassis_manufacturer = Oracle Corporation
        system_serial_number = xxxxxxxxxx
        system_part_number = xxxxxxxx+x+x
        system_name = SPARC T5-4
        system_manufacturer = Oracle Corporation
        fru_manufacturer = Oracle Corporation
        fru_name = SPARC T5-4
        fru_part_number = xxxxxxxx+x+x
        fru_serial_number = xxxxxxxxxx

- @usr@local@bin@fmdump_-v.out

    xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH
        fault = fault.chassis.voltage.fail@/SYS
            certainty = 100.0 %
            FRU = /SYS
            ASRU = /SYS
            resource = /SYS

- @usr@local@bin@fmadm_faulty.out

    ------------------- ------------------------------------ -------------- --------
    Time                UUID                                 msgid          Severity
    ------------------- ------------------------------------ -------------- --------
    xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH    Critical

    Problem Status    : solved
    Diag Engine       : fdd 1.0
    System
       Manufacturer   : Oracle Corporation
       Name           : SPARC T5-4
       Part_Number    : xxxxxxxx+x+x
       Serial_Number  : xxxxxxxxxx

    ----------------------------------------
    Suspect 1 of 1
       Fault class  : fault.chassis.voltage.fail
       Certainty    : 100%
       Affects      : /SYS
       Status       : faulted

       FRU
          Status           : faulty
          Location         : /SYS
          Manufacturer     : Oracle Corporation
          Name             : SPARC T5-4
          Part_Number      : xxxxxxxx+x+x
          Serial_Number    : xxxxxxxxxx
          Chassis
             Manufacturer  : Oracle Corporation
             Name          : SPARC T5-4
             Part_Number   : xxxxxxxx+x+x
             Serial_Number : xxxxxxxxxx

    Description : A chassis voltage supply is operating outside of the allowable range.

    Response    : The system will be powered off. The chassis-wide service required LED will be illuminated.

    Impact      : The system is not usable until repaired. ILOM will not allow the system to be powered on until repaired.

    Action      : Please refer to the associated reference document at http://support.oracle.com/msg/SPT-8000-DH for the latest service procedures and policies regarding this diagnosis.

- @coredump@sp_trace@logs@CRIT.log

    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:322 host_sys_irq_cb: received power glitch irq
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20010 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20011 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20012 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20013 : 0x00
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20000 : 0x10
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20001 : 0x90
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20008 : 0xf9
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d20009 : 0x0f
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000a : 0xff
    POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: glitch reg 0x2d2000b : 0xff

- @usr@local@bin@spshexec_show_-script_@X@logs@event@list.out

    xxx xxx xxx xx xx:xx:xx xxxx System Log minor
        Host: Standby
    xxx xxx xxx xx xx:xx:xx xxxx Power Off major
        Power to /SYS has been turned off by: SP, Reason: Power glitch detected
    xxx xxx xxx xx xx:xx:xx xxxx HOST Fault critical
        Host powerglitch detected, powering off.
    xxx xxx xxx xx xx:xx:xx xxxx System Log minor
        Host: Warm start

- @usr@local@bin@spshexec_show_faulty.out

    ------------------- ------------------------------------ -------------- --------
    Time                UUID                                 msgid          Severity
    ------------------- ------------------------------------ -------------- --------
    xxxx-xx-xx/xx:xx:xx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx SPT-8000-DH    Critical

    Problem Status    : solved
    Diag Engine       : fdd 1.0
    System
       Manufacturer   : Oracle Corporation
       Name           : SPARC T5-4
       Part_Number    : xxxxxxxx+x+x
       Serial_Number  : xxxxxxxxxx

    ----------------------------------------
    Suspect 1 of 1
       Fault class  : fault.chassis.voltage.fail
       Certainty    : 100%
       Affects      : /SYS
       Status       : faulted

       FRU
          Status           : faulty
          Location         : /SYS
          Manufacturer     : Oracle Corporation
          Name             : SPARC T5-4
          Part_Number      : xxxxxxxx+x+x
          Serial_Number    : xxxxxxxxxx
          Chassis
             Manufacturer  : Oracle Corporation
             Name          : SPARC T5-4
             Part_Number   : xxxxxxxx+x+x
             Serial_Number : xxxxxxxxxx

    Description : A chassis voltage supply is operating outside of the allowable range.

    Response    : The system will be powered off. The chassis-wide service required LED will be illuminated.

    Impact      : The system is not usable until repaired. ILOM will not allow the system to be powered on until repaired.

    Action      : Please refer to the associated reference document at http://support.oracle.com/msg/SPT-8000-DH for the latest service procedures and policies regarding this diagnosis.
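The CRIT.log excerpts above can be screened with a short script when reviewing collected SP data. The following is a minimal Python sketch under stated assumptions: the helper names `glitch_registers` and `matches_signature` are hypothetical, and treating registers 0x2d20010 through 0x2d20013 reading 0x00 as the pattern to look for is inferred only from the sample logs in this document. It is not an Oracle diagnostic rule.

```python
import re

# Matches lines such as "... host_sys_irq_cb: glitch reg 0x2d20010 : 0x00".
GLITCH_REG = re.compile(r"glitch reg (0x[0-9a-fA-F]+)\s*:\s*(0x[0-9a-fA-F]+)")

# Assumption: the registers that read 0x00 in the sample logs of this document.
ZERO_REGISTERS = range(0x2D20010, 0x2D20014)


def glitch_registers(log_text):
    """Map register address to value for every 'glitch reg' line in the log."""
    return {
        int(addr, 16): int(value, 16)
        for addr, value in GLITCH_REG.findall(log_text)
    }


def matches_signature(log_text):
    """True when a glitch irq was logged and the assumed registers all read 0x00."""
    if "received power glitch irq" not in log_text:
        return False
    regs = glitch_registers(log_text)
    return all(regs.get(addr) == 0 for addr in ZERO_REGISTERS)


# Abbreviated sample built from the CRIT.log lines shown in this document.
prefix = "POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:357 host_sys_irq_cb: "
sample = "\n".join([
    "POD MAX xxxx-xx-xx xx:xx:xx.xxxxxx 1382 host_sys_irq_cb.c:322 "
    "host_sys_irq_cb: received power glitch irq",
    prefix + "glitch reg 0x2d20010 : 0x00",
    prefix + "glitch reg 0x2d20011 : 0x00",
    prefix + "glitch reg 0x2d20012 : 0x00",
    prefix + "glitch reg 0x2d20013 : 0x00",
    prefix + "glitch reg 0x2d20000 : 0x10",
    prefix + "glitch reg 0x2d20008 : 0xf9",
])

print(matches_signature(sample))
```

A match only indicates that the log resembles the examples in this document. A genuine voltage failure can still be present, so the result should be read together with the platform, policy, and event log checks.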
Changes
Cause
Race condition, which will not be fixed. The original bug has been closed:

- Bug 16303591: DC_GLITCH occurs for /SYS poweron in parallel with SP reset
  - Status: 84 - Closed, not feasible to fix
  - This is caused by a race condition on the DR power control on PM3 in a T5-4 when the system is in auto power-on. It was improperly gated in the FPGA power control glitch detection. It does not show up in the glitch registers because the PM3 Present signal is gating the POK (Good).

A new bug was then created, and it has also been closed:

- Bug 20521293: DC_GLITCH occurs for /SYS poweron in parallel with SP reset
  - Status: 84 - Closed, not feasible to fix
  - The T5 MB FPGA can -NOT- be updated in the field. Marking as "Not feasible to fix".
  - The current ILOM policy was changed such that on encountering a "Glitch" it will retry the system power-on. Since the retry is not an AC-on parallel boot power-on, the glitch (if not a true system failure) will not be encountered and the system will operate as expected.
  - As noted in this bug, a document has already been created to avoid hardware replacement, so this should address all the issues with this system behavior.

Solution
To prevent this issue from occurring, disable parallel boot prior to power cycling the system:
    -> show /SP/policy PARALLEL_BOOT
        PARALLEL_BOOT = enabled

    -> set /SP/policy PARALLEL_BOOT=disabled
    *** WARNING ***: If PARALLEL_BOOT is set to disabled, then the HOST will no longer be able to power on when SP-less or the SP is in degraded mode.
    Are you sure you want to set PARALLEL_BOOT=disabled (y/n)? y
    Set 'PARALLEL_BOOT' to 'disabled'

    -> show /SP/policy PARALLEL_BOOT
        PARALLEL_BOOT = disabled

Note:
- If the SP is faulted and not accessible (SP-less):
  - press the power button for more than 5 seconds so that the host (system) is started
- If the SP is faulted and accessible (SP in degraded mode):
  - clear the faults from ILOM and from the ILOM fault management shell
  - start up the system

    -> show faulty
    -> set /SYS clear_fault_action=true
    -> show faulty
    -> start /SP/faultmgmt/shell/
    Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
    faultmgmtsp> fmadm faulty
    faultmgmtsp> fmadm repair /SYS
    faultmgmtsp> fmadm faulty
    faultmgmtsp> exit
    -> start /SYS
If the issue persists, please open a service request with Oracle Support.
Please let me (Devrim Sen) know if any customer is hitting this issue and provide the following details: SR number, product type, product serial number.
Attachments
This solution has no attachment