Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1540858.1
Update Date:2017-10-18
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1540858.1 :   Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900: Reference for Improving Remote Diagnosability


Related Items
  • Sun Fire E6900 Server
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire E4900 Server
  • Sun Fire 4800 Server
  • Sun Fire 4810 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: SF-x8x0/Ex900




In this Document
Purpose
Details
 
 Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - Best Practices
 Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - Remote collection of data from both domain and System Controller (SC)
 Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - What data is needed in order to troubleshoot my software or hardware problem?
References


Applies to:

Sun Fire E4900 Server
Sun Fire E6900 Server
Sun Fire 3800 Server
Sun Fire 4800 Server
Sun Fire 4810 Server
Information in this document applies to any platform.

Purpose

This document is a reference for configuring aspects of the server that improve remote diagnosability, enabling quicker, more accurate resolution of hardware and software issues.

Details

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in an appropriate My Oracle Support Community - Oracle Sun Technologies Community - SPARC Legacy Servers.


Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - Best Practices


Sun Fire Midframe & Entry-Level Servers Best Practices Update for Firmware 5.20.x

The Best Practices document covers the following topics (this is its table of contents):

        Introduction
        Platform Configuration
             Configuring the RS-232 Serial Port
             Configuring the Ethernet Port
             Configuring a Switched Private Network
             Configuring the Alarms Port (Entry-Level only)
             Periodic Sun Fire SC Reboots
             Configuring SC Failover (Midframe server only)
             Setting the Date and Time on the Platform
             Configuring SNTP (Midframe Servers only)
             Changing POST Levels and Other Settings

        Configuring the Midframe and Entry-Level Service Processor
             Configuring the SP to receive Log Messages
             Sun MC Software
             Preparing for Firmware Updates
             Explorer Data Collector
             Monitoring Domain Consoles

        Platform and Domain Administration

        Platform Security
             Recommendations for User Authorization
             Serial Port Access
             Telnet and Secure Shell Sessions
             Keyswitch Settings (Midframe Server Only)

        Error Analysis, Diagnosis and Recovery

        Maintenance Functions
             Periodic Server Maintenance
             Restoring the Sun Fire SC Configuration
             Updating the Firmware and Real Time Operating System
             Removing the SC from Platform Use

    
    
The following document gives best practices and step-by-step instructions for configuring a loghost on
Sun Fire[TM] 3800-6800 and E4900/E6900 servers:
'Best Practices' and configuring loghost on Sun Fire[TM] 3800,4800,4900,6800, and E6900 servers [Video] (Doc ID 1008676.1)

It is considered a best practice to configure a loghost for each server and each production domain.
A loghost on a Solaris platform can permanently save messages that are logged in the System Controller's NVRAM buffer.
This ensures that they are not lost, either through a power event or by rolling off the small first-in, first-out buffer in the System Controller.
Properly stored, they can be accessed quickly if a domain outage occurs, even if the server controlling the domain is unresponsive.
When sent to Sun engineers, these files can speed up troubleshooting and help resolve problems quickly and accurately.
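
As a rough illustration of the loghost side, the sketch below prints syslog.conf entries that would send SC messages to one file per source. The facility assignments (local0 for the platform, local1 through local4 for domains a through d), the /var/log/sc directory and the function names are assumptions made for this example. Use the facilities actually configured on your SC, as described in Doc ID 1008676.1.

```shell
#!/bin/sh
# Sketch only: the facility-to-source mapping and LOGDIR are assumptions,
# not values taken from the Best Practices document.

LOGDIR=${LOGDIR:-/var/log/sc}    # hypothetical log directory on the loghost

# Print one "selector<TAB>action" line.
# syslog.conf on Solaris requires a TAB between selector and action.
entry() {
    printf '%s.debug\t%s/%s.log\n' "$1" "$LOGDIR" "$2"
}

# One entry for the platform and one for each of the four possible domains.
gen_loghost_entries() {
    entry local0 platform
    entry local1 domain-a
    entry local2 domain-b
    entry local3 domain-c
    entry local4 domain-d
}

gen_loghost_entries
```

The output would be reviewed and appended to /etc/syslog.conf on the loghost. Each log file must exist before syslogd will write to it, and syslogd must reread its configuration afterwards (for example, svcadm refresh system-log on Solaris 10, or a HUP signal to syslogd on earlier releases).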




Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - Remote collection of data from both domain and System Controller (SC)



See Sun Fire[TM] v1280, 3800, 4800, 4810, 6800, E2900, E4900, E6900 and Netra[TM] 1280, 2900 servers: How to collect scextended or 1280extended Explorer (Doc ID 1019066.1) for a more detailed explanation.

            - Extended explorer has to be run on a system that is able to telnet or ssh into the SC.
            - Frequently, the domain and the SCs are not on the same network.
              When this occurs, you might need to use explorer from the domain and scextended explorer from the loghost for troubleshooting purposes.
            - Oracle Explorer version 6.x or later is recommended.
              The latest version of Explorer can be downloaded via Oracle Explorer Data Collector - Product Information Center (Doc ID 1312847.1).


As "root", from a system with telnet or ssh access to the SC(s), preferably the loghost:

        # /opt/SUNWexplo/bin/explorer -w default,scextended

        NOTES:

        default - runs the default explorer data collection modules on the system. If you only require the SC information, omit this option.
        scextended - runs the System Controller data collection module.
        fru - not shown above because scextended has run it by default since about 2005 (Explorer version 5.0).
        It causes the scextended module to also collect prtfru_-x.out, which includes part and serial number information.
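
The choice between the two module lists can be sketched as a small helper. The function name explorer_command and the "sc-only" keyword are hypothetical, introduced only for this illustration. The explorer path and module names are the ones given above. The helper prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: builds the explorer command line described in the notes.
# "explorer_command" and the "sc-only" mode name are illustrative.

EXPLORER=/opt/SUNWexplo/bin/explorer

# explorer_command MODE
#   MODE "sc-only": collect only System Controller data (default omitted)
#   anything else:  collect the default host data plus SC data
explorer_command() {
    if [ "$1" = "sc-only" ]; then
        modules=scextended
    else
        modules=default,scextended
    fi
    echo "$EXPLORER -w $modules"
}

explorer_command full
explorer_command sc-only
```

Run the printed command as root on a system that can reach the SC.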



Alternatively, the following procedure collects Sun Fire[TM] Midrange Server System Controller (SC) data
when it is not possible to capture scextended or 1280extended Explorer data:

Procedure to manually collect System Controller (SC) level failure data on Sun Fire[TM] v1280, E2900, 3800, 4800, E4900, 6800, E6900, and Netra 1280, 1290 servers. (Doc ID 1003529.1)


1) Log into a system which has access to the Main System Controller (SC) and open a terminal window.

2) Open a script session so the following SC command output will be captured.

        $ script -a /tmp/scdatafile

3) Connect to the platform shell of the Main SC per your configuration's requirements (telnet, console, ssh, tip, etc):

        $ console main-sc
        $ telnet main-sc
        $ ssh main-sc
        $ tip main-sc

    NOTE: Do not reboot the main SC before collecting this data. Doing so may erase critical troubleshooting information in the SC's log buffer.

    NOTE: At firmware 5.18.0, you are offered the option of connecting with secure shell (ssh) only if you have an SC V2 in the IB/SSC.
    In this case only ssh will work, not telnet.

4) From the platform shell, execute the following commands which will be captured in the script session that you opened previously:

       The platform shell of the current master SC will have a prompt of
       schostname:SC>  <--- note the uppercase SC

        showdate
        showsc -v
        connections
        showescape
        showkeyswitch
        showfailover -v

        showcodlicense -v
        showcodlicense -rv
        showcodusage -v

        showplatform -v
        showplatform -vda
        showplatform -vdb
        showplatform -vdc
        showplatform -vdd

        showboards -ev
        showcomponent
        showfru -r manr

        showchs -b  (will fail for firmware below 5.20.15)
        And for each suspect or faulty component:
        showchs -vc /N0/IB6    (for example)

        showdate -v
        showdate -v -d a
        showdate -v -d b
        showdate -v -d c
        showdate -v -d d

        showlogs -v
        showlogs -vp  (the -vp* commands will fail for systems with older SCs)
        showlogs -v -d a
        showlogs -vp -d a
        showlogs -v -d b
        showlogs -vp -d b
        showlogs -v -d c
        showlogs -vp -d c
        showlogs -v -d d
        showlogs -vp -d d

        showerrorbuffer
        showerrorbuffer -p

        showenvironment -ltuv

        history
        showdate -v



        Commands to issue from the platform shell of the current slave SC.
        (For those systems which have both SC0 and SC1.)

        The platform shell of a slave SC will have a prompt of
        schostname:sc>  <--- note the lowercase sc

        showdate
        showenvironment -ltuv
        showfailover -v
        showfru -r manr
        showlogs -v
        showlogs -vp
        showplatform -v
        showsc -v
        showdate -v
        history


        NOTE:  You might need to use "Control right bracket" ("^]") to disconnect, depending on how you have connected to the SC.
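
Several of the master SC commands in step 4 repeat once per domain (a through d). As a convenience, the sketch below prints that per-domain subset as plain text, which could then be pasted into the platform shell. It does not connect to the SC, and the function name per_domain_commands is hypothetical.

```shell
#!/bin/sh
# Sketch only: prints the per-domain commands from step 4 for domains a-d.
# Nothing here contacts the SC; the output is text to paste into the session.

per_domain_commands() {
    for d in a b c d; do
        echo "showplatform -vd$d"
    done
    for d in a b c d; do
        echo "showdate -v -d $d"
    done
    for d in a b c d; do
        echo "showlogs -v -d $d"
        echo "showlogs -vp -d $d"    # fails on systems with older SCs
    done
}

per_domain_commands
```

The remaining commands in step 4 are platform-wide and are entered once each.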

5) Exit the script session to save the collected data:

    Press <control>-D and you should get the message "script /tmp/scdatafile closed", "script done" or a similar message.
    Alternatively, you can also type "exit" at the prompt to close the script session.
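
Before uploading, it can be worth a quick check that the capture file actually contains the main commands. The sketch below is a rough sanity check only: it assumes each typed command is echoed into the script output, it matches substrings, and the function name missing_commands and the selection of commands checked are illustrative.

```shell
#!/bin/sh
# Sketch only: list step 4 commands that do not appear in the capture file.
# Substring match; it does not verify that a command produced useful output.

missing_commands() {
    file=$1
    for cmd in "showsc -v" "showfailover -v" "showboards -ev" \
               "showcomponent" "showerrorbuffer" "showenvironment -ltuv"; do
        # -F treats the command as a fixed string, not a regular expression.
        # On older Solaris releases, fgrep may be needed instead of grep -F.
        grep -F "$cmd" "$file" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Example: missing_commands /tmp/scdatafile
```

An empty result means every checked command was found. Any command printed should be rerun in a new script session before the file is uploaded.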

6) Upload the data file (scdatafile in this example) using the instructions in Oracle Diagnostic File Upload (Doc ID 1547088.2).

See Procedure to manually collect System Controller (SC) level failure data on Sun Fire[TM] v1280, E2900, 3800, 4800, E4900, 6800, E6900, and Netra 1280, 1290 servers. (Doc ID 1003529.1) for full details.

 

Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - What data is needed in order to troubleshoot my software or hardware problem?



The following document lists the minimum and preferred sets of data required to progress a service call in cases of:

    CPU/Memory Errors
    Disk Errors
    System Crash
    System Hang
    Missing/Disabled Component
    Everything else...

 
Data Requirements reference: What data is needed in order to troubleshoot my software or hardware problem? (Doc ID 1019144.1)

 

ATTENTION: This does not replace the instructions in the previous section of this document:

Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 - Remote collection of data from both domain and System Controller (SC)


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.