Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2105476.1
Update Date:2017-10-05
Keywords:

Solution Type  Technical Instruction Sure

Solution  2105476.1 :   How to Perform On Site Diagnosis for a Down System for SPARC T7-1, T7-2, T7-4 and SPARC S7-2, S7-2L, Netra S7-2 Servers  


Related Items
  • SPARC S7-2
  • SPARC T7-1
  • Netra SPARC S7-2
  • SPARC T7-2
  • SPARC T7-4
  • SPARC S7-2L
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: SPARC-CAP VCAP




Oracle Confidential PARTNER - Available to partners (SUN).
Reason: FRU CAP

Applies to:

SPARC T7-2 - Version All Versions and later
SPARC T7-4 - Version All Versions and later
SPARC T7-1 - Version All Versions and later
SPARC S7-2 - Version All Versions and later
SPARC S7-2L - Version All Versions and later
Information in this document applies to any platform.

Goal

To aid Field Engineers in the on-site diagnosis of systems that are down hard.

*****************************************************************************
To report errors or request improvements on this procedure,
please go to http://support.us.oracle.com and put a comment on Doc ID: 2105476.1
*****************************************************************************

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE ENGINEER NEED:
SPARC T7-x, S7-x Service Processor, ILOM/ALOM Application, OS Solaris Skills

TIME ESTIMATE: 120 Minutes

TASK COMPLEXITY: 2

FIELD ENGINEER INSTRUCTIONS

PROBLEM OVERVIEW:
System Down (unable to boot)

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? :

Down Hard, unknown reason.

WHAT ACTION DOES THE ENGINEER NEED TO TAKE:

1. Validate whether the system is powered on or has shut down.
# Are the LEDs lit? If nothing is powered on, then the issue is external to the server.
# Let the customer investigate the system's power source, cords, power supplies, etc. for a potential issue.
# Refer to Doc ID 1018104.1 for help on diagnosing power-on issues on SPARC platforms.

2. Validate that there are no Fault LEDs on System Components.
# When the Service LED is on, the ILOM commands show /SP/faultmgmt and show faulty provide details about any faults that can cause this indicator to be lit.
# If one or more Fault LEDs are on, consider setting the virtual keyswitch to Diag (see step 3) and monitoring POST execution to check whether any faults are reported.

3. Verify the position of the system's keyswitch (it should be 'Normal').

The system keyswitch on SPARC T7-x and S7-x systems is virtual, not physical. To change the position of the virtual keyswitch, use the ILOM command -> set /SYS keyswitch_state=value. You can check the position of the virtual keyswitch via -> show /SYS keyswitch_state.

# The virtual keyswitch should be set to 'Normal' when the system is in normal operation.
# When the keyswitch is set to 'standby', the poweron command and the power button are disabled.
# When the virtual keyswitch is set to 'diag', the system is forced to run service-mode diagnostics. Watch the console or terminal output to confirm that POST is executing. The system should boot after POST completes.
# The 'locked' position of the keyswitch disables the load -source (Firmware Update) and set /HOST send_break_action=break commands, but it does not affect the start /SYS command or the power button.
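
The keyswitch notes above can be condensed into a small lookup for use when reviewing captured console or ILOM output. The following Python sketch is illustrative only: the names KEYSWITCH_NOTES, parse_keyswitch_state and power_on_blocked are hypothetical, and the sample text assumes that show /SYS keyswitch_state prints a line of the form "keyswitch_state = <value>". Confirm the real output on the system before relying on it.

```python
import re

# Notes per keyswitch position, paraphrased from the step 3 text above.
KEYSWITCH_NOTES = {
    "normal": "normal operation",
    "standby": "poweron command and power button are disabled",
    "diag": "service-mode diagnostics (POST) run before boot; watch the console",
    "locked": "firmware update and send_break_action=break are disabled; power-on still works",
}


def parse_keyswitch_state(show_output):
    """Return the lower-cased keyswitch position found in the text, or None."""
    match = re.search(r"keyswitch_state\s*=\s*(\w+)", show_output)
    return match.group(1).lower() if match else None


def power_on_blocked(state):
    """Per the notes above, only the standby position blocks power-on."""
    return state == "standby"


# Assumed output layout (not copied from a real system).
sample = """ /SYS
    Properties:
        keyswitch_state = Standby
"""

state = parse_keyswitch_state(sample)
print(state, "->", KEYSWITCH_NOTES.get(state, "unknown position"))
if power_on_blocked(state):
    print("Set the keyswitch to Normal before attempting to power on.")
```

A lookup like this only restates the four positions described in this step. It does not replace checking the live value with -> show /SYS keyswitch_state.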

4. Request the console messages (extended POST output when possible).
# If you are able to connect to the console, request the output with the messages being displayed. Doc ID 1004222.1 explains how to set up console logging and gather diagnostic information.
# Each customer site is different, so they may have this port attached to a dumb terminal or a terminal concentrator.
# Try to validate whether the system is executing Power On Self Test (POST). If the system is executing POST, the testing MUST complete before the system can be booted. Interrupting POST (to reach the ok prompt faster) will leave the system in an undefined state.

5. If you are able to get to OBP, type "boot" and monitor the boot process.

To troubleshoot known product and boot issues, consult the Product Issues documents: Doc ID 1957783.1 (T7-1), Doc ID 1984007.1 (T7-2), Doc ID 1984030.1 (T7-4), Doc ID 2111741.1 (S7-2, S7-2L and Netra S7-2).

# The /SP/policy PARALLEL_BOOT property, when enabled, allows the host to boot/poweron in parallel with the SP if an auto-power policy (HOST_AUTO_POWER_ON or HOST_LAST_POWER_STATE) is on or a user presses the power button while the SP is in the process of booting. ILOM has to be running in order to allow the host to power on when the power button is pressed or the auto-power policies are set.
# When this property is set to disabled, the SP boots first, then the host boots.
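
The PARALLEL_BOOT behaviour described above can be written out as a small decision function. This Python sketch is a model of the text only, not ILOM code. The function name and its three boolean parameters are hypothetical, and the "no trigger" case (no auto-power policy and no button press) is an assumption, since the text does not describe it.

```python
def host_power_on_timing(parallel_boot_enabled,
                         auto_power_policy_on,
                         button_pressed_during_sp_boot):
    """Describe when the host powers on relative to the SP boot.

    auto_power_policy_on stands for HOST_AUTO_POWER_ON or
    HOST_LAST_POWER_STATE being in effect.
    """
    trigger = auto_power_policy_on or button_pressed_during_sp_boot
    if not trigger:
        # Assumption: without a policy or a button press, nothing requests power-on.
        return "no automatic power-on"
    if parallel_boot_enabled:
        return "host powers on in parallel with the SP boot"
    return "SP boots first, then the host"


for args in [(True, True, False), (True, False, True),
             (False, True, False), (True, False, False)]:
    print(args, "->", host_power_on_timing(*args))
```

When a system appears slow to power on after AC is applied, comparing the observed behaviour with this table can help decide whether the delay is simply the SP finishing its boot.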

6. If the Service Processor is accessible, collect a Snapshot, as it contains critical information for troubleshooting the failure. If you cannot collect a snapshot, as a last resort capture the output of the following ILOM commands, run as the root user:

version,   show /SP/logs/event/list,    show faulty,    show -l all /,    show /HOST/console/history,     show /HOST/console/bootlog

NOTE: See also document: Troubleshooting data needed for T3-x, T4-x, T5-x, T7-x, S7-x and T8-x servers (Doc ID 1470580.1)
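
When a snapshot is not possible, a capture template helps ensure that none of the fallback commands listed in step 6 is missed. The Python sketch below only builds a text template to paste output into. It does not connect to the Service Processor. The function name, the section layout and the SR placeholder are hypothetical.

```python
# Fallback ILOM commands, copied from step 6 above.
FALLBACK_COMMANDS = [
    "version",
    "show /SP/logs/event/list",
    "show faulty",
    "show -l all /",
    "show /HOST/console/history",
    "show /HOST/console/bootlog",
]


def build_capture_template(sr_number, commands=FALLBACK_COMMANDS):
    """Return a text template with one labelled section per command."""
    lines = ["SR: " + sr_number, ""]
    total = len(commands)
    for index, command in enumerate(commands, start=1):
        lines.append("=== [%d/%d] -> %s ===" % (index, total, command))
        lines.append("<paste output here>")
        lines.append("")
    return "\n".join(lines)


# "SR-PLACEHOLDER" is a stand-in, not a real SR number.
print(build_capture_template("SR-PLACEHOLDER"))
```

Keeping the commands in the same order as step 6 makes it easy to check the completed file against this document before attaching it to the SR.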

CAUTION: If, while on site, the Field Engineer is unable to perform the above process, cannot solve the problem or requires additional assistance, collect as much information about the boot failure as possible (console logs, error messages, snapshot, etc.) and update this SR requesting that it be transferred to the next available Engineer. Otherwise the SR may auto-close when the FE completes the site visit.

 

OBTAIN CUSTOMER ACCEPTANCE
- WHAT ACTION DOES THE CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:

Customer should verify system is stable for return to production.

PARTS NOTE:
No parts are required for this action plan. Parts may end up being required, but they are not part of this action plan. Another action plan may be necessary.

REFERENCE INFORMATION:

Product Documentation: Service Manuals, Admin Manuals, Product Notes:

T7-1:  http://docs.oracle.com/cd/E54976_01/index.html
T7-2:  http://docs.oracle.com/cd/E54983_01/index.html
T7-4:  http://docs.oracle.com/cd/E54990_01/index.html
S7-2:  http://docs.oracle.com/cd/E72372_01/index.html
S7-2L: http://docs.oracle.com/cd/E72363_01/index.html
Netra S7-2: http://docs.oracle.com/cd/E72798_01/index.html

 

 

 

References

<NOTE:1470580.1> - Troubleshooting data needed for T3-x, T4-x, T5-x, T7-x, S7-x, T8-x servers

  Copyright © 2018 Oracle, Inc.  All rights reserved.