Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1368728.1
Update Date:2017-11-30
Keywords:

Solution Type  Technical Instruction Sure

Solution  1368728.1 :   How to Perform On Site Diagnosis for a Down System for SPARC Enterprise T5120/T5220, T5140/T5240/T5440/T6320/T6340, Netra T5220 and Netra T5440 Servers :ATR:1368728.1:2  


Related Items
  • Sun SPARC Enterprise T5220 Server
  • Sun SPARC Enterprise T5240 Server
  • Sun Blade T6320 Server Module
  • Sun Netra T5220 Server
  • Sun SPARC Enterprise T5140 Server
  • Sun Netra T5440 Server
  • Sun SPARC Enterprise T5120 Server
  • Sun SPARC Enterprise T5440 Server
  • Sun Blade T6340 Server Module
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: SPARC-CAP VCAP




Oracle Confidential PARTNER - Available to partners (SUN).
Reason: FRU CAP

Applies to:

Sun SPARC Enterprise T5120 Server - Version All Versions and later
Sun SPARC Enterprise T5240 Server - Version All Versions and later
Sun Netra T5220 Server - Version All Versions and later
Sun Blade T6320 Server Module - Version All Versions and later
Sun SPARC Enterprise T5440 Server - Version All Versions and later
Information in this document applies to any platform.

Goal

To aid Field Engineers in the on-site diagnosis of systems that are down hard.

********************************************************************************
To report errors or request improvements on this procedure,
please go to http://support.us.oracle.com and put a comment on Doc ID: 1368728.1
********************************************************************************

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE ENGINEER NEED: (IS A SITE ENGINEER AVAILABLE?)
System Controller, ILOM/ALOM Application, Intermediate Solaris Skills

TIME ESTIMATE: 120 Minutes

TASK COMPLEXITY: 2

FIELD ENGINEER INSTRUCTIONS

PROBLEM OVERVIEW:
System Down (unable to boot)

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? :

Down Hard, unknown reason.

WHAT ACTION DOES THE ENGINEER NEED TO TAKE:

1. Validate whether the system is powered on or has shut down.
# Are the LEDs lit? If nothing is powered on, then the issue is external to the server.
# Have the customer investigate the system's power source, cords, power supplies, etc. for a potential issue.
# Refer to Doc ID 1018104.1 for help on diagnosing power-on issues on SPARC platforms.


2. Validate that there are no Fault LEDs lit on system components.
# When the Service LED is on, the ILOM command show /SP/faultmgmt or the ALOM command showfaults provides details about any faults that can cause this indicator to be lit.
# Also, if one or more Fault LEDs are on, consider setting the virtual keyswitch to Diag (see step 3) and monitoring POST execution to check whether any faults are reported.

3. Verify that the system's virtual keyswitch is set to 'Normal'.
# The system keyswitch on T5x20/T5x40 is virtual, not physical. To change the position of the virtual keyswitch, use the ALOM command sc> setkeyswitch. To check the position, use sc> showkeyswitch. The keyswitch should be set to 'Normal' when the system is in normal operation.
# When the keyswitch is set to 'Standby', the poweron command and the power button are disabled.
# When the virtual keyswitch is set to 'Diag', the system is forced to run service-mode diagnostics; watch the console output to confirm that POST is executing. The system should boot after POST completes.
# The 'Locked' position of the keyswitch disables the flashupdate and break commands, but it does not affect the poweron command or button.
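
The keyswitch behaviour described in step 3 can be summarised as a small lookup. The following is a minimal Python sketch for reference only, not an Oracle tool. The position names are the ones used in the text above and may not match the exact argument spelling that setkeyswitch accepts; the names KEYSWITCH_EFFECTS and poweron_allowed are hypothetical.

```python
# Illustrative summary of step 3; effects are paraphrased from the text above.
# Position names follow this document, not necessarily the setkeyswitch syntax.
KEYSWITCH_EFFECTS = {
    "normal": "Expected position when the system is in normal operation.",
    "standby": "The poweron command and the power button are disabled.",
    "diag": "Forces service-mode diagnostics; watch the console for POST.",
    "locked": "flashupdate and break are disabled; poweron is not affected.",
}


def poweron_allowed(position):
    """Return True if poweron should work with the keyswitch in this position."""
    key = position.strip().lower()
    if key not in KEYSWITCH_EFFECTS:
        raise ValueError("unknown keyswitch position: %r" % position)
    # Per step 3, only 'Standby' blocks the poweron command or button.
    return key != "standby"
```

For example, poweron_allowed("Standby") returns False, which is a quick reminder to check the keyswitch before suspecting hardware when a system ignores the power button.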


4. Request the console messages (extended POST output when possible).
# If you can connect to the console, request the output with the messages being displayed. Doc ID 1004222.1 explains how to set up console logging and gather diagnostic information.
# Each customer site is different; the console port may be attached to a dumb terminal or a terminal concentrator.
# Try to validate whether the system is executing Power On Self Test (POST).
If it is executing POST, the testing MUST complete before the system can be booted. Interrupting POST (to reach the ok prompt faster) will cause the system to go to an undefined state. Once POST completes, type "boot" at the "ok" prompt and monitor the boot process.

5. If you are able to get to OBP, type "boot" and monitor the boot process. To troubleshoot boot issues, refer to Doc ID 1002932.1. For KNOWN BOOT ISSUES, consult the Product Issues documents: Doc ID 1332340.1 (T5120/T5220, Netra T5220), Doc ID 1317808.1 (T5140/T5240, Netra T5440), Doc ID 1340173.1 (T6320) and Doc ID 1340484.1 (T6340).

6. If the Service Processor is accessible, collect a Snapshot, as it will contain valuable information to troubleshoot the failure. If snapshot collection is not possible, collect data from the following ALOM compatibility user commands:

showfru, showhost, showenvironment (just after the poweron command is given), showlogs -v, consolehistory -v, showfaults -v, showplatform

If an ALOM compatibility user is not configured, as a last resort get the ILOM output from the following root user commands:

version, show /SP/logs/event/list, show faulty, show -l all /, show /HOST/console/history, show /HOST/console/bootlog
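
The order of preference in step 6 (Snapshot first, then ALOM compatibility commands, then ILOM root commands) can be sketched as a small helper that returns the checklist to work through. This is an illustrative Python sketch only; it does not connect to a Service Processor, and the function name collection_plan and its parameters are hypothetical. The command strings are copied from the lists above.

```python
# Command lists copied from step 6 above.
ALOM_COMMANDS = [
    "showfru", "showhost", "showenvironment", "showlogs -v",
    "consolehistory -v", "showfaults -v", "showplatform",
]
ILOM_COMMANDS = [
    "version", "show /SP/logs/event/list", "show faulty",
    "show -l all /", "show /HOST/console/history",
    "show /HOST/console/bootlog",
]


def collection_plan(sp_accessible, snapshot_possible, alom_user_configured):
    """Return (method, commands) following the preference order of step 6."""
    if not sp_accessible:
        return ("none", [])            # step 6 only applies with SP access
    if snapshot_possible:
        return ("snapshot", [])        # preferred: a single Snapshot
    if alom_user_configured:
        return ("alom", list(ALOM_COMMANDS))
    return ("ilom", list(ILOM_COMMANDS))  # last resort, as the root user
```

Remember that showenvironment is most useful when run just after the poweron command is given, so it may need to be captured out of list order.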

7. If unsure how to proceed, or unable to perform the above process, collect as much information pertaining to the boot failure as possible (console logs, error messages, etc.), then call back in and request the next available engineer.

NOTE: Refer to the document "Troubleshooting data needed for T5xx0 servers" (Doc ID 1475104.1).
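
As a memory aid, steps 1 through 7 can be read as a first-match decision flow. The Python sketch below is a simplification under stated assumptions: the procedure itself is a sequential checklist, and stopping at the first matching condition is an illustrative choice, not part of the action plan. The Observation fields and the next_action function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    leds_lit: bool        # step 1: is anything powered on?
    service_led_on: bool  # step 2: Service or Fault LED lit?
    keyswitch: str        # step 3: position name as used in this document
    post_running: bool    # step 4: console shows POST executing
    at_ok_prompt: bool    # step 5: OBP "ok" prompt reached


def next_action(obs):
    """Return the first step of the action plan that applies (simplified)."""
    if not obs.leds_lit:
        return "Step 1: check power source, cords and power supplies."
    if obs.service_led_on:
        return "Step 2: review faults (show /SP/faultmgmt or showfaults)."
    if obs.keyswitch.strip().lower() == "standby":
        return "Step 3: keyswitch is in Standby; poweron is disabled."
    if obs.post_running:
        return "Step 4: let POST complete; do not interrupt it."
    if obs.at_ok_prompt:
        return "Step 5: type boot and monitor the boot process."
    return "Steps 6-7: collect SP data and request the next available engineer."
```

A system with LEDs lit, no faults, the keyswitch in 'Normal', and POST still running maps to step 4, that is, wait for POST to finish before attempting to boot.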

OBTAIN CUSTOMER ACCEPTANCE
- WHAT ACTION DOES THE CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:

The customer should verify that the system is stable before it is returned to production.

CAUTION: If, while on site, the Field Engineer cannot solve the problem or requires additional assistance, update this SR to request that it be transferred to the next available Engineer. Otherwise, the SR may auto-close when the FE completes the site visit.


PARTS NOTE:
No parts are required for this Action Plan. Parts may end up being required, but they are not part of this Action Plan; another Action Plan may be necessary.

REFERENCE INFORMATION:

Product Documentation: Service Manuals, Admin Manuals, Product Notes:
T5120/T5220: http://download.oracle.com/docs/cd/E19839-01/index.html
T5140/T5240: http://download.oracle.com/docs/cd/E19712-01/index.html
SE T5440:    http://docs.oracle.com/cd/E19488-01/index.html
Netra T5220: http://download.oracle.com/docs/cd/E19350-01/index.html
Netra T5440: http://download.oracle.com/docs/cd/E19874-01/index.html

T6320:  http://download.oracle.com/docs/cd/E19745-01/index.html
T6340:  http://download.oracle.com/docs/cd/E19826-01/index.html

References

<NOTE:1475104.1> - Troubleshooting data needed for T5xx0 servers

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.