Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1359404.1
Update Date:2018-03-14
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1359404.1 :   Explosum - Explorer Hardware summary tool  


Related Items
  • Sun SPARC Enterprise T5120 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Usx/Blade/Netra>SN-SPARC: USx
  • _Old GCS Categories>Sun Microsystems>Servers>Entry-Level Servers
  • _Old GCS Categories>Sun Microsystems>Servers>NEBS-Certified Servers
  • _Old GCS Categories>Sun Microsystems>Servers>CMT Servers




Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Restricted Product Info

Applies to:

Sun SPARC Enterprise T5120 Server - Version Not Applicable and later
Information in this document applies to any platform.

Purpose

This document describes how to run Explosum & provides sample output.

Scope

 

Details

Explosum gathers information from an Explorer & provides an analysis of many known server issues at the end of the summary file, HWsummary.html.  Most output is filtered to show only erroneous entries or a short status of various components.  This tool is targeted at VSP products, but can be very helpful when run on Explorers from other SPARC or X64 products.  It will also work on some X64 SOSreports & SunDiag directories.

It writes the summary to the files HWsummary.html & SWsummary.html, which are placed in the Explorer's top directory.  The HWsummary file does not contain the storage array & certain OS information that is included in the larger SWsummary file.

The tool consists of a script, "es.sh", which performs some UNIX shell commands & then calls 2 compiled C programs, "ce" & "es".  The script expands the fmdump-eV files stored below the fma/var directory & places them into fma.  es then parses the Explorer & converted files for hardware information.  If certain conditions are detected in es, its return code to the script will cause an email to be sent to the affected backline engineers.  FRU & event data will be stored for most platform types if the Explorer contains a valid date & a serial number, & the tool is run from an SR directory on cores3.

The latest SPARC version of this tool is available on cores3:

   explosum .          (if already in the top directory of the explorer)
   explosum explorer-top-directory-path
   explosum . output-directory         (redirect output file)
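
When many Explorers need to be summarized, the three invocation forms above can be wrapped in a small helper.  The following is a minimal sketch in Python, assuming explosum is on the PATH.  The function name explosum_argv is hypothetical & not part of the tool, & combining an explicit Explorer path with an output directory is an assumption (only "explosum . output-directory" is documented above).

```python
from typing import List, Optional


def explosum_argv(explorer_dir: str = ".", output_dir: Optional[str] = None) -> List[str]:
    """Build an argument list matching the documented explosum forms."""
    argv = ["explosum", explorer_dir]
    if output_dir is not None:
        # Third documented form: redirect the summary files elsewhere.
        argv.append(output_dir)
    return argv


print(explosum_argv())                      # run from the Explorer top directory
print(explosum_argv("/path/to/explorer"))   # explicit Explorer path
print(explosum_argv(".", "/tmp/summary"))   # redirected output
```

The returned list can be passed to subprocess.run(argv, check=True) on a host where the tool is installed.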

A known working (usually older) version is on beehive.  The Solaris X64 executable is esx & its shell script is esx.sh.
   https://stbeehive.oracle.com/teamcollab/library/st/SummaryTools/Documents#dcid=334B:3BF0:afrh:38893C00F42F38A1E0404498C8A6612B000AD9E7AE70

Please have the customer run explorers to obtain SP related data to avoid manual data gathering/analysis, as follows:

   ALOM based:   explorer -w default,alomextended (preferred on T5xx0 servers)
   T3 & 4:          explorer -w ipmi,ipmiextended,ilomextended,default

Explorer 7.0 should be used to obtain data on Solaris 11 systems if possible.  Please note that the latest version of Explorer will contain more data to help isolate problems.

Explosum can also be run from a browser by clicking the Explominer link at the ISDE URL:  https://mos-cores.us.oracle.com/collectionviewer/prod/index.php .  Doing this will also add the required patch list to the summary.

 

An FE can download these 3 executables (es.sh, ce, es) from the SummaryTools beehive site & place them into the same directory on a Solaris based server.  The tool is then run by typing "es.sh ." from the top directory of the explorer (provided the executable directory has been added to the PATH variable).

 


Explanation of tool's output:

******************************************************************************
explosum revision 6.22          (Explorer Revision: 6.6)  
Oracle4Ever      SR# 3-1234567890     15/01/27-18:33:
Hostname: tryo     HostID: 8645d49a     Platform: SPARC T5-2     Serial#: AK00182246
******************************************************************************

Internal:  System Config    PCI Config    Disk Config    Net Config    Fault Info    Logs    Analysis
External:  Explosum    Issues    FW    Troubleshooting    SSH


========== System Configuration ============

Lists any known major issues with the loaded version of FW.  If prtdiag output was not obtained, it checks SP related files for the System FW or OBP version.
##### sysconfig/prtdiag-v.out   #####

 Sun System Firmware 7.1.8.a 2009/03/15 14:48
***  Major ILOM memory leaks were fixed by FW 7.2.7 so this should be upgraded soon.
 OBP 4.29.2 2009/03/12 06:53
  This OBP used in system FW: 7.1.8 through 8.d
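
The firmware check shown above amounts to a version comparison.  The sketch below illustrates the idea only & is not the tool's actual logic: it parses the "Sun System Firmware" line from the sample & compares it with 7.2.7, the fix level named in the sample warning.  The function name parse_sysfw is hypothetical, the letter suffix (".a") is ignored, & the comparison is assumed to be meaningful only within the 7.x firmware line.

```python
import re

# Fix level taken from the sample warning above (assumed to apply to 7.x only).
FIXED_IN = (7, 2, 7)


def parse_sysfw(line):
    """Return the numeric firmware version as a tuple of ints, or None."""
    m = re.search(r"Sun System Firmware (\d+(?:\.\d+)*)", line)
    return tuple(int(p) for p in m.group(1).split(".")) if m else None


line = " Sun System Firmware 7.1.8.a 2009/03/15 14:48"
version = parse_sysfw(line)
needs_upgrade = version is not None and version[0] == 7 and version < FIXED_IN
print(version, needs_upgrade)
```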

##### etc/release   #####

Solaris 10 5/08 s10s_u5wos_10 SPARC

 

If Solaris 11 - lists the branch & documentation to determine the version.
##### patch+pkg/pkg_info-l.out See doc 1372094.1 #####

Branch: 0.175.0.6.0.6.0


##### sysconfig/uname-a.out   #####

KJP loaded = 147440-09

 

If ExploMiner was run prior to Explosum, its required patches are listed.
##### ExploMiner_SPARC-T5-2_patches.nobody#####

This summary contains only REQUIRED Patches and their DEPENDENT Patches
Required: 120812-32 OpenGL 1.5: OpenGL Patch for Solaris
Required: 150011-04 VM Server for SPARC 3.0 ldmd patch
Required: 150400-20 SunOS 5.10: Kernel Patch

 

Lists each time KJP was upgraded.
##### patch+pkg/patch_listing   #####

Aug 23  2009 137137-09  S10 U6 10/08
Aug 23  2009 138888-01  S10 U7 Point
Mar 13  2011 139555-08  S10 U7 5/09
Mar 13  2011 141444-09  S10 U8 10/09
Mar 13  2011 142909-17  S10 U9 09/10
Mar 13  2011 144488-04  S10 U10 Point
Aug 21  2011 144488-17  S10 U10 Point
Jun 24 01:32 144500-19  S10 U10 8/11
Jun 24 01:36 147440-09  S10 U11 Point
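
The KJP history above is easy to post-process if the rows are needed elsewhere.  This is a minimal sketch, assuming the three-part layout seen in the sample (date, patch ID, release label) & assuming the last row is the most recent upgrade.  The name parse_kjp_history is hypothetical.  Note that the date column mixes "Mon DD YYYY" & "Mon DD HH:MM" forms, so it is kept as text rather than parsed.

```python
import re

ROW = re.compile(r"^(\w{3}\s+\d+\s+\S+)\s+(\d{6}-\d{2})\s+(.*)$")

SAMPLE = """\
Aug 23  2009 137137-09  S10 U6 10/08
Mar 13  2011 142909-17  S10 U9 09/10
Jun 24 01:32 144500-19  S10 U10 8/11
Jun 24 01:36 147440-09  S10 U11 Point
"""


def parse_kjp_history(text):
    """Split each listing row into date text, patch ID and release label."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            when, patch, release = m.groups()
            rows.append({"when": " ".join(when.split()),
                         "patch": patch,
                         "release": release.strip()})
    return rows


history = parse_kjp_history(SAMPLE)
latest = history[-1]["patch"]   # assumption: rows are listed oldest first
print(latest)
```

In the sample, the last row agrees with the "KJP loaded" value reported from uname-a.out.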

 

Lists required FMA & CMT patches.
##### patch+pkg/patch-list   #####

119578-30  FMA
126897-02  FMA
127755-01  FMA
145961-01  FMA
FMADM 146582-02 missing
FMD 147778-01 missing
FMD 147790-01 missing
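
To pull only the missing patches out of this section, the rows ending in "missing" can be filtered.  A minimal sketch, assuming the three-field "component patch-ID missing" layout shown in the sample; missing_patches is a hypothetical name.

```python
SAMPLE = """\
119578-30  FMA
145961-01  FMA
FMADM 146582-02 missing
FMD 147778-01 missing
FMD 147790-01 missing
"""


def missing_patches(text):
    """Return (component, patch ID) pairs for rows flagged as missing."""
    found = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2] == "missing":
            found.append((fields[0], fields[1]))
    return found


for component, patch in missing_patches(SAMPLE):
    print(component, patch)
```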

 

========== LDom Configuration ============


##### sysconfig/ldm_-V.out #####

Logical Domains Manager (v 3.1.0.1)



##### sysconfig/virtinfo-a.out#####

Domain name: primary
Control domain: prod01

 ##### sysconfig/ldm_list_-l.out  (lists VCPU utilizations) #####

 NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
 primary active -n-cv- UART 32 32G 1.9% 2.0% 88d 23h 51m
 infra01 bound ------ 5000 64 32G 
 infra11 active -n---- 5002 64 64G 0.1% 0.1% 8d 24m
 midtier01 bound ------ 5001 32 32G 
 midtier11 active -n---- 5003 64 64G 0.0% 0.0% 8d 37m
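
The domain table above can be reduced to the two facts usually wanted first: which domains are not running, & how many VCPUs are allocated in total.  A minimal sketch, assuming the whitespace-separated layout of the sample (bound domains simply have no UTIL, NORM or UPTIME columns); parse_domains is a hypothetical name.

```python
SAMPLE = """\
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 32 32G 1.9% 2.0% 88d 23h 51m
infra01 bound ------ 5000 64 32G
infra11 active -n---- 5002 64 64G 0.1% 0.1% 8d 24m
midtier01 bound ------ 5001 32 32G
midtier11 active -n---- 5003 64 64G 0.0% 0.0% 8d 37m
"""


def parse_domains(text):
    """Parse name, state, VCPU count and memory from each domain row."""
    domains = []
    for line in text.splitlines()[1:]:      # skip the header row
        f = line.split()
        if len(f) >= 6:
            domains.append({"name": f[0], "state": f[1],
                            "vcpu": int(f[4]), "memory": f[5]})
    return domains


domains = parse_domains(SAMPLE)
not_running = [d["name"] for d in domains if d["state"] != "active"]
total_vcpu = sum(d["vcpu"] for d in domains)
print(not_running, total_vcpu)
```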



##### sysconfig/svcs-av.out  (lists only LDom related services) #####

STATE NSTATE STIME CTID FMRI
online - Oct_30 32 svc:/ldoms/agents:default
online - Oct_30 100 svc:/ldoms/vntsd:default
online - Oct_30 93 svc:/ldoms/ldmd:default



##### sysconfig/ldm_list-devices_-a.out See doc 1020212.1 #####

Number of cores: 32

 

Lists only items which have problems.
##### Tx000/showenvironment   #####

Supply     Status            Fan_Fault  Temp_Fault  Volt_Fault  Cur_Fault
/SYS/PS1    No Input Power       OFF       OFF          OFF         OFF



========== FRU Configuration ============

Displays FRU board numbers, failed components, & indication of possible non-certified DIMMs using manufacturer part number.  Link to doc for CMT certified DIMMs listed.
##### Tx000/showfru     See doc 1411086.1 #####

            Part       Manufacturer        Part #         Ser #               Max Temp         Status
             /SYS/MB  Mitac Internat     5111392-02  AU01UL               101 (28 degrees C)  0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
            /SYS/PDB  FOXCONN            5017697-09  G05KFH               101 (28 degrees C)  0x00 (OK)
         /SYS/PADCRD  FOXCONN            5111255-03  A10YC9               101 (28 degrees C)  0x00 (OK)
          /SYS/SASBP  FOXCONN            5111256-01  A20TLN               101 (28 degrees C)  0x00 (OK)
         /SYS/FANBD0  FOXCONN            5017695-04  E07T59               101 (28 degrees C)  0x00 (OK)
         /SYS/FANBD1  FOXCONN            5017695-04  E07T99               101 (28 degrees C)  0x00 (OK)
            /SYS/PS0  Power-One          3002138-03  A718CU
            /SYS/PS1  Power-One          3002138-03  A718CZ

              DIMM               Manufacturer       Vendor Part #      Part #     Ser #              Status
       /SYS/MB/CMP0/BR0/CH0/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1091A63A            0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
       /SYS/MB/CMP0/BR0/CH0/D1  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  10A1A65B            0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
       /SYS/MB/CMP0/BR0/CH1/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  10C1A63A            0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
       /SYS/MB/CMP0/BR0/CH1/D1  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1031A66B            0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
       /SYS/MB/CMP0/BR1/CH0/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1041A635            0x00 (OK)
       /SYS/MB/CMP0/BR1/CH0/D1  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1051A65D            0x00 (OK)
       /SYS/MB/CMP0/BR1/CH1/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  10B1A673            0x00 (OK)
       /SYS/MB/CMP0/BR1/CH1/D1  Hynix Semicond  HMP31GF7AFR4C-Y5D5    0000000   33306C58            0x00 (OK)     DIMM possibly not certified!!!
       /SYS/MB/CMP1/BR0/CH0/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  10B1A634            0x00 (OK)
       /SYS/MB/CMP1/BR0/CH0/D1  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1041A65C            0x00 (OK)
       /SYS/MB/CMP1/BR0/CH1/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1051A65A            0x00 (OK)
       /SYS/MB/CMP1/BR0/CH1/D1  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  10C1A65B            0x00 (OK)
       /SYS/MB/CMP1/BR1/CH0/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  10B1A65B            0x00 (OK)
       /SYS/MB/CMP1/BR1/CH0/D1  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1061A65D            0x00 (OK)
       /SYS/MB/CMP1/BR1/CH1/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1031A65C            0x00 (OK)
       /SYS/MB/CMP1/BR1/CH1/D1  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1081A65D            0x00 (OK)
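
The DIMM table above can be scanned for rows that need attention.  The sketch below flags any DIMM whose status code is non-zero or that carries a trailing note such as the "possibly not certified" marker.  It assumes columns are separated by two or more spaces, as in the sample; flag_dimms is a hypothetical name, & no meaning is assigned to the status bits beyond zero versus non-zero.

```python
import re

SAMPLE = """\
/SYS/MB/CMP0/BR0/CH0/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1091A63A            0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
/SYS/MB/CMP0/BR1/CH0/D0  Hynix Semicond  HYMP125L72CP8D5-Y5    511-1151  1041A635            0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D1  Hynix Semicond  HMP31GF7AFR4C-Y5D5    0000000   33306C58            0x00 (OK)     DIMM possibly not certified!!!
"""


def flag_dimms(text):
    """Return (path, status code, note) for DIMM rows that are not clean."""
    flagged = []
    for line in text.splitlines():
        cols = re.split(r"\s{2,}", line.strip())
        if len(cols) < 6 or not cols[0].startswith("/SYS/"):
            continue
        status_code = int(cols[5].split()[0], 16)   # e.g. "0x64" -> 100
        note = cols[6] if len(cols) > 6 else ""
        if status_code != 0 or note:
            flagged.append((cols[0], status_code, note))
    return flagged


for path, code, note in flag_dimms(SAMPLE):
    print(path, hex(code), note)
```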

 

Somewhat duplicates the section above BUT indicates SSH FRU components & additional error checking.

##### SHOWFRU tool - Thanks to Doug Baker! #####

################################################################################
 Latest version 1.62 on cores2 at /cores_data/local/bin/showfru
 Report bugs, RFEs or if you have questions email doug.baker@oracle.com
 Further info http://panacea.central.sun.com/twiki/bin/view/Tools/ToolPageShowfru
################################################################################
/SYS/MB          Orderable part 540-7939 02
/SYS/MB                         511-1392 02 AU01UL
/SYS/MB/CMP0/BR0/CH0/D0         511-1151 01 1091A63A 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR0/CH0/D1         511-1151 01 10A1A65B 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR0/CH1/D0         511-1151 01 10C1A63A 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR0/CH1/D1         511-1151 01 1031A66B 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR1/CH0/D0         511-1151 01 1041A635 0x00 (OK)
/SYS/MB/CMP0/BR1/CH0/D1         511-1151 01 1051A65D 0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D0         511-1151 01 10B1A673 0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D1         511-1151 01 10C1A637 0x00 (OK)
/SYS/MB/CMP1/BR0/CH0/D0         511-1151 01 10B1A634 0x00 (OK)
/SYS/MB/CMP1/BR0/CH0/D1         511-1151 01 1041A65C 0x00 (OK)
/SYS/MB/CMP1/BR0/CH1/D0         511-1151 01 1051A65A 0x00 (OK)
/SYS/MB/CMP1/BR0/CH1/D1         511-1151 01 10C1A65B 0x00 (OK)
/SYS/MB/CMP1/BR1/CH0/D0         511-1151 01 10B1A65B 0x00 (OK)
/SYS/MB/CMP1/BR1/CH0/D1         511-1151 01 1061A65D 0x00 (OK)
/SYS/MB/CMP1/BR1/CH1/D0         511-1151 01 1031A65C 0x00 (OK)
/SYS/MB/CMP1/BR1/CH1/D1         511-1151 01 1081A65D 0x00 (OK)
/SYS/PDB         Orderable part 541-2073 09
/SYS/PDB                        501-7697 09 G05KFH
/SYS/PADCRD      Orderable part 541-3513 02
/SYS/PADCRD                     511-1255 03 A10YC9
/SYS/SASBP                      511-1256 01 A20TLN
/SYS/FANBD0      Orderable part 541-2211 04
/SYS/FANBD0                     501-7695 04 E07T59
/SYS/FANBD1      Orderable part 541-2211 04
/SYS/FANBD1                     501-7695 04 E07T99
/SYS/PS0                        300-2138 03 A718CU
/SYS/PS1                        300-2138 03 A718CZ

################################################################################
 CHS History of currently disabled Components, use -v to see full history
################################################################################
Component     : /SYS/MB
Time Stamp    : Thu, Apr 28 2011 14:00:36 GMT
New_Status    : 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
Old_Status    : 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
Initiator     : Fault Management
Component     : 0
Event_Code       : FMA Message R
Fault_Diag_Secs  :
FMA_String       : PCIEX-8000-0A
UUID:            : 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786
DE_Name          : eft
DE_Version       : 1.16
Diagdata         : 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000


################################################################################
 WARNING: Components can get disabled due to software bugs always check if any
 of the known issues apply before replacing hardware
 PTS T2000 http://panacea/twiki/bin/view/Products/ProdIssuesSunFireT2000
 PTS T1000 http://panacea/twiki/bin/view/Products/ProdIssuesSunFireT1000
################################################################################

 

========== PCI Configuration ============

 

Lists CPU config, link to doc to correlate PCI card to Oracle part #, & list of possible problematic components.
##### sysconfig/prtdiag-v.out    See doc 1373995.1 #####

 System Configuration:  Sun Microsystems  sun4v Sun Netra T5220
 Memory size: 3968 Megabytes
CPU ID Frequency Implementation         Status
0      1165 MHz  SUNW,UltraSPARC-T2     on-line  
...
47     1165 MHz  SUNW,UltraSPARC-T2     on-line  

Slot +            Bus   Name +                            Model   
Status            Type  Path                                      
----------------------------------------------------------------------------
MB/NET0           PCIE  network-pciex8086,105e
                        /pci@0/pci@0/pci@1/pci@0/pci@2/network@0
MB/NET1           PCIE  network-pciex8086,105e
                        /pci@0/pci@0/pci@1/pci@0/pci@2/network@0,1
MB/NET2           PCIE  network-pciex8086,105e
                        /pci@0/pci@0/pci@1/pci@0/pci@3/network@0
MB/NET3           PCIE  network-pciex8086,105e
                        /pci@0/pci@0/pci@1/pci@0/pci@3/network@0,1
MB/SASHBA         PCIE  scsi-pciex1000,58                 LSI,1068E
                        /pci@0/pci@0/pci@2/scsi@0
MB/RISER1/PCIE1   PCIE  SUNW,qlc-pciex1077,2432           QLE2462     
                        /pci@0/pci@0/pci@8/pci@0/pci@1/SUNW,qlc@0
MB/RISER1/PCIE1   PCIE  SUNW,qlc-pciex1077,2432           QLE2462
                        /pci@0/pci@0/pci@8/pci@0/pci@1/SUNW,qlc@0,1
MB/RISER0/PCIE0   PCIE  network-pciex108e,abcd            SUNW,pcie-qgc
                        /pci@0/pci@0/pci@8/pci@0/pci@9/network@0
MB/RISER0/PCIE0   PCIE  network-pciex108e,abcd            SUNW,pcie-qgc
                        /pci@0/pci@0/pci@8/pci@0/pci@9/network@0,1
MB/RISER0/PCIE0   PCIE  network-pciex108e,abcd            SUNW,pcie-qgc
                        /pci@0/pci@0/pci@8/pci@0/pci@9/network@0,2
MB/RISER0/PCIE0   PCIE  network-pciex108e,abcd            SUNW,pcie-qgc
                        /pci@0/pci@0/pci@8/pci@0/pci@9/network@0,3
MB                PCIX  usb-pciclass,0c0310
                        /pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0
MB                PCIX  usb-pciclass,0c0310
                        /pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0,1
MB                PCIX  usb-pciclass,0c0320
                        /pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0,2

--- Non OK sensor output ---
SYS/PS1                            I_IN_MAIN      disabled
SYS/PS1                            I_IN_LIMIT     disabled
SYS/PS1                            I_OUT_MAIN     disabled
SYS/PS1                            I_OUT_LIMIT    disabled
SYS/PS1                            V_IN_MAIN      disabled
SYS/PS1                            V_OUT_MAIN     disabled
SYS                                ACT            steady 

 

##### sysconfig/prtpicl-v.out  Part numbers may point to older version cards (bugs: 19263165 & 19355916) #####

      Label               WWN - MAC       Slot     Part#    Status/Drv  Path                                                Version
        /SYS/MB/XGBE0  00.10.e0.3e.94.5a                     ixgbe 0    /pci@300/pci@1/pci@0/pci@1/network@0                
         /SYS/MB/NET1  00.10.e0.3e.94.5b                     ixgbe 1    /pci@300/pci@1/pci@0/pci@1/network@0,1              
        /SYS/MB/PCIE1  90.e2.ba.5a.1d.40    1   375-3617-01  ixgbe 2    /pci@300/pci@1/pci@0/pci@4/network@0                Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012 
        /SYS/MB/PCIE1  90.e2.ba.5a.1d.41    1   375-3617-01  ixgbe 3    /pci@300/pci@1/pci@0/pci@4/network@0,1              Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012 
        /SYS/MB/XGBE1  00.10.e0.3e.94.5c                     ixgbe 4    /pci@3c0/pci@1/pci@0/pci@1/network@0                
         /SYS/MB/NET3  00.10.e0.3e.94.5d                     ixgbe 5    /pci@3c0/pci@1/pci@0/pci@1/network@0,1              
        /SYS/MB/PCIE2  90.e2.ba.5a.1c.40    2   375-3617-01  ixgbe 6    /pci@380/pci@1/pci@0/pci@5/network@0                Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012 
        /SYS/MB/PCIE2  90.e2.ba.5a.1c.41    2   375-3617-01  ixgbe 7    /pci@380/pci@1/pci@0/pci@5/network@0,1              Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012 
        /SYS/MB/PCIE3                       3   371-4306     emlxs 0    /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0             LPe12002-S 
        /SYS/MB/PCIE3                       3   371-4306     emlxs 1    /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,1           LPe12002-S 
        /SYS/MB/PCIE4                       4   371-4306     emlxs 2    /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0             LPe12002-S 
        /SYS/MB/PCIE4                       4   371-4306     emlxs 3    /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0,1           LPe12002-S 

##### sysconfig/fcinfo.out   #####

      WWN               Dev         Model        FW          Serial          State   Speed  Link Sync Sign Prot InvT InvC
10000090fa51454c   /dev/cfg/c11   LPe12002-S LPe12002-  4925382+13440000AP   online    8Gb     7 3079    1    4 3765    0
10000090fa51454d   /dev/cfg/c12   LPe12002-S LPe12002-  4925382+13440000AP   online    8Gb     5 5970    1    4 7017    0
10000090fa51454e    /dev/cfg/c9   LPe12002-S LPe12002-  4925382+13440000B9   online    8Gb     5 2241    1    4 2164    0
10000090fa51454f   /dev/cfg/c10   LPe12002-S LPe12002-  4925382+13440000B9   online    8Gb    22 3420    4   12 2783    4
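
The six trailing columns in the fcinfo summary are error counters.  The sketch below reads them by position & lists the ports with a non-zero InvC value.  It assumes the 13-field layout of the sample, & it assumes the abbreviations stand for the usual fcinfo link statistics (link failure, loss of sync, loss of signal, protocol error, invalid Tx word, invalid CRC); that mapping is not stated in this document.  parse_ports is a hypothetical name.

```python
COUNTERS = ("Link", "Sync", "Sign", "Prot", "InvT", "InvC")

SAMPLE = """\
10000090fa51454c   /dev/cfg/c11   LPe12002-S LPe12002-  4925382+13440000AP   online    8Gb     7 3079    1    4 3765    0
10000090fa51454f   /dev/cfg/c10   LPe12002-S LPe12002-  4925382+13440000B9   online    8Gb    22 3420    4   12 2783    4
"""


def parse_ports(text):
    """Parse each HBA port row into a dict including the six counters."""
    ports = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 13:
            continue                        # not a port row
        port = {"wwn": f[0], "dev": f[1], "state": f[5], "speed": f[6]}
        port.update(zip(COUNTERS, map(int, f[7:])))
        ports.append(port)
    return ports


ports = parse_ports(SAMPLE)
crc_ports = [p["dev"] for p in ports if p["InvC"] > 0]
print(crc_ports)
```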

##### sysconfig/prtconf-v.out  See doc if retired!  See doc 1614738.1 #####

cfg     /dev/cfg/c9     /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0/fp@0,0:fc  
cfg     /dev/cfg/c10    /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,1/fp@0,0:fc  
cfg     /dev/cfg/c11    /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0/fp@0,0:fc  
cfg     /dev/cfg/c12    /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0,1/fp@0,0:fc  



*************************** Disk Configuration *******************************

##### sysconfig/eeprom.out #####

boot-device=/pci@300/pci@1/pci@0/pci@2/scsi@0/disk@w3060943c1859cb2e,0:a disk net
use-nvramrc?=false
nvramrc: data not available.

##### etc/vfstab #####

#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
swap - /tmp tmpfs - yes -
/dev/zvol/dsk/rpool/swap - - swap - no -

Lists internal drives only; the tool attempts to remove entries that belong to known arrays.  Contact Don D if a new array needs to be added.
##### disks/diskinfo (internal volumes only listed) - SWsummary contains external disk info See SWsummary.html#Disk Configuration #####

   Location	Vendor		Product		Rev  Serial #	Dual Port
c2t3060943C1859CB2Ed0	LSI   Logical Volume	3000 LSIInternal	primary  
    c3t3d0	TEAC       DV-W28SS-W      	10A 	primary  
c4t38E02031C0D94CD3d0	LSI   Logical Volume	3000 LSIInternal	primary  
 

Lists internal drives only; the tool attempts to remove entries that belong to known arrays.  Contact Don D if a new array needs to be added.
##### sysconfig/iostat-En.out (internal volumes only listed) - SWsummary contains external disk info See SWsummary.html#Disk Configuration #####
                           Disk           Size   Soft Hard Trans Media Ready NoDev Recov Illeg PFlAn
                               c1t0d0  146.81GB    0    0     0     0     0     0     0     0     0
                               c1t1d0  146.81GB    0    0     0     0     0     0     0     0     0
                               c0t0d0    0.00GB     0    0     0     0     0     0     0     4     0
                               c1t2d0  146.81GB    0    0     0     0     0     0     0     0     0


Lists internal drives only; the tool attempts to remove entries that belong to known arrays.  Contact Don D if a new array needs to be added.
##### disks/format.out (internal volumes only listed) - SWsummary contains external disk info See SWsummary.html#Disk Configuration   #####
c1t0d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
   /pci@0/pci@0/pci@2/scsi@0/sd@0,0
c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
   /pci@0/pci@0/pci@2/scsi@0/sd@1,0
c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
   /pci@0/pci@0/pci@2/scsi@0/sd@2,0

##### etc/path_to_inst  (internal volumes only listed) #####
"/pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0,2/hub@4/device@4/storage@0/disk@0,0" 2 "sd"
"/pci@0/pci@0/pci@2/scsi@0/sd@0,0" 0 "sd"
"/pci@0/pci@0/pci@2/scsi@0/sd@1,0" 1 "sd"
"/pci@0/pci@0/pci@2/scsi@0/sd@2,0" 3 "sd"

 

If a T3 - T7 platform, the disk # to pci path is displayed.
##### sysconfig/prtconf-v.out:  (no output implies possible zpool or HW RAID PCI card)  #####
Disk 5000cca0253b496c - 00000000 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253b496d,0
Disk 5000cca0253b5728 - 00000001 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253b5729,0
Disk 5001517bb28964a2 - 00000002 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5001517bb28964a2,0
Disk 5001517bb289649b - 00000003 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5001517bb289649b,0
Disk 5000cca0253b3b44 - 00000000 - /pci@700/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253b3b45,0
Disk 5000cca0253c4320 - 00000001 - /pci@700/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253c4321,0
Disk 5001517bb28963ef - 00000002 - /pci@700/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5001517bb28963ef,0

 

Lists attached Arrays.
##### sysconfig/prtpicl-v.out#####

Array type = ZFS Storage 7335

 

HW RAID data gathered with Explorer 7.0

##### disks/raidctl_-l_-g.out   #####

Disk    Vendor    Product        Firmware    Capacity    Status    HSP
----------------------------------------------------------------------------
0.0.0    SEAGATE    ST914602SSUN146    0400        136.7G        GOOD    N/A

Disk    Vendor    Product        Firmware    Capacity    Status    HSP
----------------------------------------------------------------------------
0.1.0                    N/AGOOD    N/A

 

Lists Cougar card information if a newer version of Explorer was used.
##### RAIDmanager/getconfig_1.out    See doc 1331121.1 #####

Controller: Optimal Ser #: 00820AA0059 BIOS: 5.2-0 (17757) Battery: Not Installed
Logical Dev: 0 Simple_volume Optimal 285685 MB (0,8) Bootable
Logical Dev: 1 Simple_volume Optimal 285685 MB (0,9)
Logical Dev: 2 5 Optimal 857075 MB (0,10)(0,11)(0,19)(0,18)
Logical Dev: 3 5 Optimal 857075 MB (0,14)(0,15)(0,16)(0,13)
Phys Dev: 0  Online   0,8    SEAGATE  ST930003SSUN300G  0D70  00090370E2HJ
Phys Dev: 1  Online   0,9    SEAGATE  ST930003SSUN300G  0D70  00090370E66S   HDD SMART error!
Phys Dev: 2  Online   0,10   SEAGATE  ST930003SSUN300G  0D70  00090370DACN
Phys Dev: 3  Online   0,11   SEAGATE  ST930003SSUN300G  0D70  00090370C3YD
Phys Dev: 4  Failed   0,12   SEAGATE  ST930003SSUN300G  0D70  00090370E8Y8
Phys Dev: 5  Online   0,13   SEAGATE  ST930003SSUN300G  0D70  00100371NQBQ
Phys Dev: 6  Online   0,14   SEAGATE  ST930003SSUN300G  0D70  00100271JPX2
Phys Dev: 7  Online   0,15   SEAGATE  ST930003SSUN300G  0D70  00090370EX1J
Phys Dev: 8  Online   0,16   SEAGATE  ST930003SSUN300G  0D70  00100271FZEA
Phys Dev: 9  Failed   0,17   SEAGATE  ST930003SSUN300G  0D70  00100271JB0F
Phys Dev: 10 Online   0,18   SEAGATE  ST930003SSUN300G  0D70  00100371MWL8
Phys Dev: 11 Online   0,19   SEAGATE  ST930003SSUN300G  0D70  00090370E1AD 
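
The physical device rows above can be filtered down to the drives that need attention: any device not in the Online state, or any device carrying a trailing note such as "HDD SMART error!".  A minimal sketch, assuming the column order of the sample; flag_phys_devs is a hypothetical name.

```python
import re

PHYS = re.compile(
    r"^Phys Dev:\s*(\d+)\s+(\S+)\s+(\d+,\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*(.*)$"
)

SAMPLE = """\
Phys Dev: 0  Online   0,8    SEAGATE  ST930003SSUN300G  0D70  00090370E2HJ
Phys Dev: 1  Online   0,9    SEAGATE  ST930003SSUN300G  0D70  00090370E66S   HDD SMART error!
Phys Dev: 4  Failed   0,12   SEAGATE  ST930003SSUN300G  0D70  00090370E8Y8
"""


def flag_phys_devs(text):
    """Return (device number, state, serial, note) for suspect drives."""
    flagged = []
    for line in text.splitlines():
        m = PHYS.match(line.strip())
        if not m:
            continue
        num, state, _loc, _vendor, _model, _fw, serial, note = m.groups()
        if state != "Online" or note:
            flagged.append((int(num), state, serial, note))
    return flagged


for dev in flag_phys_devs(SAMPLE):
    print(dev)
```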


##### RAIDmanager/RaidEvt.log  (Please note that disks start a #0 & system labels lowest disk #1) #####

July 28, 2012 8:33:25 PM COT    WRN      402:A01C0S11L--    sbogadm04    S.M.A.R.T.  slot 3, S/N 001041G3M2DE        PFV3M2DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:25 PM COT    WRN      402:A01C0S16L--    sbogadm04    S.M.A.R.T.  slot 8, S/N 001041G3Z7TE        PFV3Z7TE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:25 PM COT    WRN      402:A01C0S17L--    sbogadm04    S.M.A.R.T.  slot 9, S/N 001041G408DE        PFV408DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:25 PM COT    WRN      402:A01C0S19L--    sbogadm04    S.M.A.R.T.  slot 11, S/N 001041G4011E        PFV4011E (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:29 PM COT    INF    19434:A00C-S--L--    sbogadm04    User root logged into sbogadm04 with administrative privileges.
July 28, 2012 8:54:52 PM COT    INF        1:A00C-S--L--    sbogadm04    Successfully updated the controller image: sbogadm04, controller 1.
July 28, 2012 9:15:48 PM COT    INF    10572:A0-1C-S--L--    sbogadm04    Sun StorageTek RAID Manager started on TCP/IP port number 34,571.
July 28, 2012 9:22:29 PM COT    INF    19434:A00C-S--L--    sbogadm04    User root logged into sbogadm04 with administrative privileges.
July 28, 2012 9:15:48 PM COT    INF    10572:A0-1C-S--L--    sbogadm04    Sun StorageTek RAID Manager started on TCP/IP port number 34,571.
July 28, 2012 9:32:08 PM COT    INF    19434:A00C-S--L--    sbogadm04    User root logged into sbogadm04 with administrative privileges.
July 28, 2012 9:36:08 PM COT    WRN      402:A01C0S17L--    sbogadm04    S.M.A.R.T.  slot 9, S/N 001041G408DE        PFV408DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 9:36:08 PM COT    WRN      402:A01C0S11L--    sbogadm04    S.M.A.R.T.  slot 3, S/N 001041G3M2DE        PFV3M2DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 9:36:08 PM COT    WRN      402:A01C0S19L--    sbogadm04    S.M.A.R.T.  slot 11, S/N 001041G4011E        PFV4011E (Vendor: HITACHI Model: H103030SCSUN300G).
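
Because the same S.M.A.R.T. warning repeats each time the RAID Manager starts, counting the warnings per slot & serial number gives a shorter view of the log.  A minimal sketch, assuming the "S.M.A.R.T.  slot N, S/N xxx" wording seen in the sample; smart_warnings is a hypothetical name.

```python
import re
from collections import Counter

SMART = re.compile(r"S\.M\.A\.R\.T\.\s+slot (\d+), S/N (\S+)")

SAMPLE = """\
July 28, 2012 8:33:25 PM COT    WRN      402:A01C0S11L--    sbogadm04    S.M.A.R.T.  slot 3, S/N 001041G3M2DE        PFV3M2DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:25 PM COT    WRN      402:A01C0S17L--    sbogadm04    S.M.A.R.T.  slot 9, S/N 001041G408DE        PFV408DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:29 PM COT    INF    19434:A00C-S--L--    sbogadm04    User root logged into sbogadm04 with administrative privileges.
July 28, 2012 9:36:08 PM COT    WRN      402:A01C0S17L--    sbogadm04    S.M.A.R.T.  slot 9, S/N 001041G408DE        PFV408DE (Vendor: HITACHI Model: H103030SCSUN300G).
"""


def smart_warnings(text):
    """Count S.M.A.R.T. warnings keyed by (slot number, serial number)."""
    counts = Counter()
    for line in text.splitlines():
        m = SMART.search(line)
        if m:
            counts[(int(m.group(1)), m.group(2))] += 1
    return counts


for (slot, serial), n in sorted(smart_warnings(SAMPLE).items()):
    print(slot, serial, n)
```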
 

Lists Niwot card information if a newer version of Explorer was used.
##### RAIDmanager/MegaCli/CfgDsply-aALL.out    See doc 1397311.1 #####

DISK GROUP: 0  278.464 GB  Optimal  Primary-1, Secondary-0, RAID Level Qualifier-0
                       -----ERRORS-----
Disk Slot DevID  Port  Media Other Pred     State         FW        SAS Addr                       Drive                           SMART
  0    0    9  1(path0)    0    0    0  Online, Spun Up  A2B0  0x5000cca025103c25  HITACHI H106030SDSUN300GA2B01205N8XTEB           No
  1    1    8  0(path0)    0    0    0  Online, Spun Up  A2B0  0x5000cca025264115  HITACHI H106030SDSUN300GA2B01205NP15ZB           No


##### RAIDmanager/MegaCli/GetEvents-aALL.out   #####

Success in AdpEventLog

 

Lists Pool status & lower level objects.  Also contains a link to a useful ZFS doc.
##### disks/zfs/zpool_status_-v.out    See doc 1004209.1 #####

Pool: rpool        ONLINE  c0t5000CCA01D87F058d0s0  c0t5000CCA01D8E8AA4d0s0 
Pool: zp-730test1  ONLINE  c0t600144F08F08C858000054C667EF0001d0 
Pool: zp-730test2  ONLINE  c0t600144F08F08C858000054C6681F0004d0

 

Lists Mirrors & submirrors with status & a link to a useful SVM doc.  Also lists lower level objects if in faulted status.
##### disks/svm/metastat-t.out    See doc 1003847.1 #####
d50: Mirror
d51: Submirror of d50    State: Okay         Mon Nov  1 11:54:32 2010
d100: Mirror
d41: Submirror of d100    State: Okay         Mon Nov  1 11:54:32 2010
d10: Mirror
d11: Submirror of d10    State: Okay         Mon Nov  1 11:54:32 2010
d0: Mirror
d1: Submirror of d0    State: Okay         Mon Nov  1 11:54:31 2010
d60: Mirror
d61: Submirror of d60    State: Okay         Mon Nov  1 11:54:33 2010

 

Lists Volumes & a link to a useful vxvm doc.  Also lists lower level objects if in faulted status.

##### disks/vxvm/vxprint-th.out See doc 1004532.1#####

v crash - ENABLED ACTIVE 31462500 ROUND - fsgen
 c0t5000C500334503B3d0 c0t5000C500334505F7d0
v rootvol - ENABLED ACTIVE 83887500 ROUND - root
 c0t5000C500334503B3d0 c0t5000C500334505F7d0
v swapvol - ENABLED SYNC 469762500 ROUND - swap
 c0t5000C500334503B3d0 c0t5000C500334503B3d0 c0t5000C500334505F7d0
v dumpvol - ENABLED ACTIVE 41943040 SELECT - fsgen
 c3t5005076802301E28d1
v oemvol - ENABLED ACTIVE 20971520 SELECT - fsgen
 c3t5005076802301E28d0
v vol01 - ENABLED ACTIVE 10485760 SELECT vol01-01 fsgen
 c3t5005076802301E28d0 c3t5005076802301E28d1 c3t5005076802301E28d2 c3t5005076802301E28d3
... 


##### sysconfig/ifconfig-a.out #####
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone fwgams10
        inet 127.0.0.1 netmask ff000000
nxge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 172.30.184.69 netmask ffffffe0 broadcast 172.30.184.95
        ether 0:21:28:58:1a:3c
nxge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        zone fwgams10
        inet 172.30.184.70 netmask ffffffe0 broadcast 172.30.184.95
nxge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 10.1.1.13 netmask ff000000 broadcast 10.255.255.255
        ether 0:21:28:58:1a:3d

*************************** Fault Information ********************************

##### sysconfig/crash/ls-al_var_crash*.out (no cores if empty) #####
-rw-r--r--   1 root     root           2 May 17 14:20 bounds
-rw-r--r--   1 root     root     1706728 May  8 22:57 unix.0
-rw-r--r--   1 root     root     1706728 May  9 11:56 unix.1
-rw-r--r--   1 root     root     1562250 May 10 13:08 unix.2
-rw-r--r--   1 root     root     1706728 May 10 15:22 unix.3
-rw-r--r--   1 root     root     1588723712 May 10 13:09 vmcore.2
-rw-r--r--   1 root     root     1880809472 May 10 15:23 vmcore.3
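
In the listing above, unix.2 & unix.3 have matching vmcore files while unix.0 & unix.1 do not.  The sketch below pairs the files by their numeric suffix so that incomplete sets stand out.  It assumes the file name is the last field of each ls line; crash_dumps is a hypothetical name, & a missing vmcore is only a hint that the dump may be incomplete, not a diagnosis.

```python
import re

SAMPLE = """\
-rw-r--r--   1 root     root           2 May 17 14:20 bounds
-rw-r--r--   1 root     root     1706728 May  8 22:57 unix.0
-rw-r--r--   1 root     root     1706728 May  9 11:56 unix.1
-rw-r--r--   1 root     root     1562250 May 10 13:08 unix.2
-rw-r--r--   1 root     root     1706728 May 10 15:22 unix.3
-rw-r--r--   1 root     root     1588723712 May 10 13:09 vmcore.2
-rw-r--r--   1 root     root     1880809472 May 10 15:23 vmcore.3
"""


def crash_dumps(text):
    """Return (complete dump numbers, numbers with unix.N but no vmcore.N)."""
    unix, vmcore = set(), set()
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        m = re.fullmatch(r"(unix|vmcore)\.(\d+)", fields[-1])
        if m:
            (unix if m.group(1) == "unix" else vmcore).add(int(m.group(2)))
    return sorted(unix & vmcore), sorted(unix - vmcore)


complete, unix_only = crash_dumps(SAMPLE)
print(complete, unix_only)
```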


##### fma/fmdump.out #####
Apr 02 05:17:43.7549 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786 PCIEX-8000-0A
Apr 04 09:57:49.8446 21c312ab-4c48-ee92-c762-e2680ae35b74 FMD-8000-0W
Apr 04 15:54:22.4906 b3895ac1-e2ad-c58f-f189-f2bf8fb0db53 SUN4V-8002-42
Apr 09 23:59:05.3990 773efc10-2b1b-4961-809d-86f3594da8d0 SUN4V-8000-E2
Apr 13 04:07:34.3343 e6cf8844-2c8c-c0b1-8e21-f273ed3fff76 SUN4V-8000-E2
May 10 03:23:18.1358 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c SUN4V-8002-42
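
Counting the fmdump entries by message ID shows which faults recur.  A minimal sketch, assuming the message ID is the last field of each row as in the sample; count_msgids is a hypothetical name.

```python
from collections import Counter

SAMPLE = """\
Apr 02 05:17:43.7549 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786 PCIEX-8000-0A
Apr 04 09:57:49.8446 21c312ab-4c48-ee92-c762-e2680ae35b74 FMD-8000-0W
Apr 04 15:54:22.4906 b3895ac1-e2ad-c58f-f189-f2bf8fb0db53 SUN4V-8002-42
Apr 09 23:59:05.3990 773efc10-2b1b-4961-809d-86f3594da8d0 SUN4V-8000-E2
Apr 13 04:07:34.3343 e6cf8844-2c8c-c0b1-8e21-f273ed3fff76 SUN4V-8000-E2
May 10 03:23:18.1358 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c SUN4V-8002-42
"""


def count_msgids(text):
    """Count fault events per FMA message ID (last field of each row)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            counts[fields[-1]] += 1
    return counts


msgids = count_msgids(SAMPLE)
repeated = sorted(m for m, n in msgids.items() if n > 1)
print(repeated)
```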

 

Lists CPU & memory related retirements.
##### fma/fmstat-a-mcpumem-retire.out#####

     cpu_blfails 0     failed cpu blacklists
     cpu_blsupp 0    cpu blacklists suppressed
     cpu_fails 0       cpu faults unresolveable
     cpu_flts 0         cpu faults resolved
     page_fails 0     page faults unresolveable
     page_flts 0      page faults resolved

 

Will indicate if the cpumem-diagnosis engine is offline for T5xx0s.
##### fma/fmadm-config.out #####


##### fma/fmdump-e.out (Skips CEs & other benign ereports)  #####
May 17 14:11:07.6296 ereport.fm.ferg.invalid
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.dpe
May 17 14:11:07.5240 ereport.io.pci.mdpe
May 17 14:11:07.5240 ereport.io.pci.sserr
May 17 14:11:07.5240 ereport.io.pciex.tl.ptlp
May 17 14:11:07.5240 ereport.io.pciex.rc.nfe-msg
May 17 14:11:07.5240 ereport.io.pci.dpe
May 17 14:11:07.5240 ereport.io.pci.sec-mdpe
May 17 14:11:07.5240 ereport.io.pciex.a-nonfatal
May 17 14:11:07.5240 ereport.io.pciex.tl.ptlp
May 17 14:11:07.5240 ereport.io.pciex.rc.ce-msg
May 17 14:11:07.5240 ereport.io.pci.dpe
May 17 14:11:07.5240 ereport.io.pci.sec-mdpe
May 17 14:11:07.5240 ereport.io.pciex.a-nonfatal
May 17 14:11:07.5240 ereport.io.pciex.tl.ptlp
May 17 14:11:07.5240 ereport.io.pciex.rc.ce-msg
May 17 14:11:07.6296 ereport.fm.ferg.invalid
May 17 14:30:23.6509 ereport.cpu.ultraSPARC-T2plus.dau
May 17 14:36:42.1586 ereport.fm.ferg.invalid

 

For T3 or T4 servers, lists the mapping from DIMM number to device path.
##### fma/fmtopo-V.out #####
DIMM 0 - /SYS/PM0/CMP0/BOB0/CH0/D0
DIMM 1 - /SYS/PM0/CMP0/BOB0/CH0/D1
DIMM 2 - /SYS/PM0/CMP0/BOB0/CH1/D0
DIMM 3 - /SYS/PM0/CMP0/BOB0/CH1/D1
DIMM 4 - /SYS/PM0/CMP0/BOB1/CH0/D0
DIMM 5 - /SYS/PM0/CMP0/BOB1/CH0/D1
DIMM 6 - /SYS/PM0/CMP0/BOB1/CH1/D0
DIMM 7 - /SYS/PM0/CMP0/BOB1/CH1/D1
DIMM 8 - /SYS/PM0/CMP0/BOB2/CH0/D0
DIMM 9 - /SYS/PM0/CMP0/BOB2/CH0/D1
DIMM 10 - /SYS/PM0/CMP0/BOB2/CH1/D0
DIMM 11 - /SYS/PM0/CMP0/BOB2/CH1/D1
DIMM 12 - /SYS/PM0/CMP0/BOB3/CH0/D0
DIMM 13 - /SYS/PM0/CMP0/BOB3/CH0/D1
DIMM 14 - /SYS/PM0/CMP0/BOB3/CH1/D0
DIMM 15 - /SYS/PM0/CMP0/BOB3/CH1/D1
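
The sample mapping above follows a regular pattern: two DIMMs per channel, two channels per BOB.  The following is a minimal Python sketch of that pattern only, assuming the single PM0/CMP0, 16-DIMM layout shown in this sample.  The function name dimm_path is hypothetical; the summary itself is taken from fmtopo-V.out, and other configurations may not follow this arithmetic.

```python
def dimm_path(dimm: int, pm: int = 0, cmp_: int = 0) -> str:
    """Return the /SYS path for a DIMM number, following the sample layout.

    Assumption: 2 DIMMs per channel, 2 channels per BOB, 4 BOBs, as in the
    16-DIMM sample above. This is not read from fmtopo output.
    """
    if not 0 <= dimm <= 15:
        raise ValueError("sample layout only covers DIMM 0-15")
    bob = dimm // 4             # four DIMMs share one BOB
    channel = (dimm // 2) % 2   # two DIMMs share one channel
    slot = dimm % 2             # D0 or D1 within the channel
    return f"/SYS/PM{pm}/CMP{cmp_}/BOB{bob}/CH{channel}/D{slot}"


print(dimm_path(10))
```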

 

Sorts & counts unum & DIMM entries, listing the first & last event date & the event count for each device.  The date range shows how long each device has been logging events.
##### fma/fmdump-eV.out  (Note: Uses server's timezone) #####
The first fmdump-eV entry is from Apr 13 2011 03:09:52.
---- FIRST DATE ----       ---- LAST DATE ----  COUNT  DEVICE
Apr 13 2011 03:09:52 thru May 17 2011 15:06:18  64214  MB/CMP0/BR0/CH0
Apr 13 2011 03:13:01 thru May 17 2011 15:06:56  54833  MB/CMP0/BR0: CH0/D1/J0600
Apr 13 2011 03:13:13 thru May 17 2011 15:06:18   5426  MB/CMP0/BR0: CH0/D0/J0500
Apr 13 2011 04:07:22 thru May 16 2011 12:30:29     22  MB/CMP0/BR0: CH0/D0/J0500 CH1/D0/J0700
Apr 13 2011 04:07:22 thru May 17 2011 06:26:46    137  MB/CMP0/BR0
May 09 2011 11:31:48 thru May 17 2011 14:30:23     53  MB/CMP0/BR0: CH0/D1/J0600 CH1/D1/J0800
May 13 2011 11:54:57 thru May 17 2011 14:11:07     12  /pci@400
May 13 2011 11:54:57 thru May 17 2011 14:11:07     15  /pci@400/pci@0
May 13 2011 11:54:57 thru May 17 2011 14:11:07     15  /pci@400/pci@0/pci@8
May 13 2011 11:54:57 thru May 17 2011 14:11:07     15  /pci@400/pci@0/pci@8/scsi@0

##### fma/fmDump-eV.out  (Note: Uses TSE's timezone.)  #####
The first fmdump-eV entry is from May 17 2011 03:10:11.
---- FIRST DATE ----       ---- LAST DATE ----  COUNT  DEVICE
May 17 2011 03:10:11 thru May 17 2011 15:06:18   1785  MB/CMP0/BR0/CH0
May 17 2011 03:14:25 thru May 17 2011 15:06:18    153  MB/CMP0/BR0: CH0/D0/J0500
May 17 2011 03:25:40 thru May 17 2011 15:06:56   1891  MB/CMP0/BR0: CH0/D1/J0600
May 17 2011 05:55:42 thru May 17 2011 06:26:46      3  MB/CMP0/BR0
May 17 2011 08:04:41 thru May 17 2011 15:02:23    108
May 17 2011 13:20:07 thru May 17 2011 14:11:07      8  /pci@400
May 17 2011 13:20:07 thru May 17 2011 14:11:07     10  /pci@400/pci@0
May 17 2011 13:20:07 thru May 17 2011 14:11:07     10  /pci@400/pci@0/pci@8
May 17 2011 13:20:07 thru May 17 2011 14:11:07     10  /pci@400/pci@0/pci@8/scsi@0
May 17 2011 13:20:07 thru May 17 2011 14:30:23      3  MB/CMP0/BR0: CH0/D1/J0600 CH1/D1/J0800
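
The two tables above amount to a group-by over (timestamp, device) pairs.  The following Python sketch illustrates that idea under stated assumptions: the events are already extracted as pairs (parsing of fmdump-eV output is not shown), the function name summarize is hypothetical, and the tie-break ordering by device name is an arbitrary choice rather than the tool's actual behavior.

```python
from datetime import datetime

FMT = "%b %d %Y %H:%M:%S"


def summarize(events):
    """Group (timestamp, device) pairs into (first, last, count, device) rows."""
    summary = {}
    for when, device in events:
        first, last, count = summary.get(device, (when, when, 0))
        summary[device] = (min(first, when), max(last, when), count + 1)
    # Order rows by first occurrence; ties are broken by device name.
    return sorted(
        ((first, last, count, device)
         for device, (first, last, count) in summary.items()),
        key=lambda row: (row[0], row[3]),
    )


# Three illustrative events using dates and devices from the sample above.
events = [
    (datetime.strptime("May 17 2011 13:20:07", FMT), "/pci@400"),
    (datetime.strptime("May 17 2011 03:14:25", FMT), "MB/CMP0/BR0: CH0/D0/J0500"),
    (datetime.strptime("May 17 2011 14:11:07", FMT), "/pci@400"),
]
for first, last, count, device in summarize(events):
    print(f"{first.strftime(FMT)} thru {last.strftime(FMT)} {count:6d}  {device}")
```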

##### fma/fmadm-faulty.out #####
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
May 10 03:23:18 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c  SUN4V-8002-42  Critical
Fault class : fault.memory.dimm-ue-imminent 95%
Affects     : mem:///unum=MB/CMP0/BR0/CH0/D0/J0500
                  faulted but still in service
FRU         : "MB/CMP0/BR0/CH0/D0/J0500" (hc://:serial=00AD0110111091A63A:part=511-1151-01-Rev-05/motherboard=0/chip=0/branch=0/dram-channel=0/dimm=0) 95%
                  faulty
Description : A pattern of correctable errors has been observed suggesting the
              potential exists that an uncorrectable error may occur.
              Refer to http://sun.com/msg/SUN4V-8002-42 for more information.


Apr 04 09:57:49 21c312ab-4c48-ee92-c762-e2680ae35b74  FMD-8000-0W    Minor
Fault class : defect.sunos.fmd.nosub
Description : The Solaris Fault Manager received an event from a component to
              which no automated diagnosis software is currently subscribed.
              Refer to http://sun.com/msg/FMD-8000-0W for more information.


Apr 02 05:17:43 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786  PCIEX-8000-0A  Critical
Fault class : fault.io.pciex.device-interr
Affects     : dev:////pci@400
                  faulted but still in service
FRU         : "MB" (hc://:product-id=SUNW,T5240:chassis-id=FML1020004:server-id=fwgams3:serial=0328MSL-1002AV01M0:part=540793902/motherboard=0)
                  faulty
Description : A problem was detected for a PCIEX device.
              Refer to http://sun.com/msg/PCIEX-8000-0A for more information.


Apr 09 23:59:05 773efc10-2b1b-4961-809d-86f3594da8d0  SUN4V-8000-E2  Critical
Fault class : fault.memory.bank max 95%
Affects     : mem:///unum=MB/CMP0/BR0/CH0/D1/J0600
              mem:///unum=MB/CMP0/BR0/CH1/D1/J0800
                  faulted but still in service
FRU         : "MB/CMP0/BR0/CH0/D1/J0600" (hc://:serial=00AD01101110A1A65B:part=511-1151-01-Rev-05/motherboard=0/chip=0/branch=0/dram-channel=0/dimm=1) max 95%
              "MB/CMP0/BR0/CH1/D1/J0800" (hc://:serial=00AD0110111031A66B:part=511-1151-01-Rev-05/motherboard=0/chip=0/branch=0/dram-channel=1/dimm=1) 95%
                  faulty
Description : The number of errors associated with this memory module has
              exceeded acceptable levels.  Refer to
              http://sun.com/msg/SUN4V-8000-E2 for more information.

##### Tx000/showfaults_-v #####
Last POST Run: Thu Apr 28 14:04:43 2011

Post Status: Passed all devices
  ID Time                           FRU               Class             Fault
   1 Apr 28 14:00:36                /SYS/MB                             Host detected fault MSGID: PCIEX-8000-0A  UUID: 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786
   2 Apr 28 14:00:36                /SYS/MB                             Host detected fault MSGID: SUN4V-8000-E2  UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
   3 May 10 08:21:57                /SYS/MB/CMP0/BR0/CH0/D0                   Host detected fault MSGID: SUN4V-8002-42  UUID: 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c
   4 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH0/D0                   Host detected fault MSGID: SUN4V-8000-E2  UUID: e6cf8844-2c8c-c0b1-8e21-f273ed3fff76
   5 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH0/D0                   Host detected fault MSGID: SUN4V-8000-E2  UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
   6 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH0/D1                   Host detected fault MSGID: SUN4V-8000-E2  UUID: 773efc10-2b1b-4961-809d-86f3594da8d0
   7 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH0/D1                   Host detected fault MSGID: SUN4V-8002-42  UUID: b3895ac1-e2ad-c58f-f189-f2bf8fb0db53
   8 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH0/D1                   Host detected fault MSGID: SUN4V-8000-E2  UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
   9 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH1/D0                   Host detected fault MSGID: SUN4V-8000-E2  UUID: e6cf8844-2c8c-c0b1-8e21-f273ed3fff76
  10 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH1/D0                   Host detected fault MSGID: SUN4V-8000-E2  UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
  11 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH1/D1                   Host detected fault MSGID: SUN4V-8000-E2  UUID: 773efc10-2b1b-4961-809d-86f3594da8d0
  12 Apr 28 14:00:36                /SYS/MB/CMP0/BR0/CH1/D1                   Host detected fault MSGID: SUN4V-8000-E2  UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3

##### sysconfig/last-20-reboot.out #####
reboot    system boot                   Tue May 17 14:17
reboot    system down                   Tue May 17 13:41
reboot    system boot                   Tue May 17 13:25
reboot    system down                   Tue May 17 13:19
reboot    system boot                   Mon May 16 13:51
reboot    system down                   Mon May 16 13:42
reboot    system boot                   Fri May 13 13:51
reboot    system down                   Fri May 13 13:39
reboot    system boot                   Fri May 13 11:58
reboot    system down                   Fri May 13 11:58
reboot    system boot                   Fri May 13 11:47
reboot    system down                   Fri May 13 10:43
reboot    system boot                   Wed May 11 11:47
reboot    system down                   Wed May 11 11:43
reboot    system boot                   Wed May 11 11:43
reboot    system down                   Wed May 11 11:38
reboot    system boot                   Wed May 11 11:38
reboot    system down                   Wed May 11 09:26
reboot    system boot                   Wed May 11 09:25
reboot    system down                   Tue May 10 21:32

 

Shows the difference between the ILOM (SC) clock & the host clock.
##### Tx000/showdate #####  (SC normally in UTC!)
SC Date:   Tue May 17 20:02:57 2011
Host Date: Tue May 17 15:04:25 2011
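
The offset between the two clocks can be computed directly from the two lines above.  The following Python sketch is illustrative only; the function names are hypothetical.  Because the SC normally runs in UTC, most of the offset is usually the host's timezone, so the sketch also splits off the remainder after rounding to whole hours.  That split assumes a whole-hour timezone offset, and the two dates are not sampled at the same instant, so the remainder is only a rough indication.

```python
from datetime import datetime, timedelta

FMT = "%a %b %d %H:%M:%S %Y"


def clock_offset(sc_date: str, host_date: str) -> timedelta:
    """Return SC time minus host time."""
    return datetime.strptime(sc_date, FMT) - datetime.strptime(host_date, FMT)


def split_offset(offset: timedelta):
    """Split an offset into whole hours (likely timezone) and leftover seconds."""
    hours = round(offset.total_seconds() / 3600)
    residual = offset.total_seconds() - hours * 3600
    return hours, residual


offset = clock_offset("Tue May 17 20:02:57 2011", "Tue May 17 15:04:25 2011")
print(offset)                # 4:58:32
print(split_offset(offset))  # (5, -88.0)
```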


##### Tx000/showlogs_-v #####
Apr 28 13:53:18: Chassis |critical: "Host has been powered off"
Apr 28 13:57:55: Audit   |major   : "Upgrade Succeeded"
Apr 28 14:00:36: Chassis |major   : "Host detected fault, MSGID: FMD-8000-0W"
Apr 28 14:00:36: Chassis |major   : "Host detected fault, MSGID: SUN4V-8000-E2"
Apr 28 14:00:36: Chassis |major   : "Host detected fault, MSGID: PCIEX-8000-0A"
Apr 28 14:00:36: Chassis |major   : "Host detected fault, MSGID: SUN4V-8002-42"
Apr 28 14:00:36: Chassis |major   : "Host detected fault, MSGID: SUN4V-8000-E2"
Apr 28 14:00:37: Chassis |major   : "Host detected fault, MSGID: SUN4V-8000-E2"
Apr 28 14:00:38: Chassis |major   : "Host has been powered on"
Apr 28 14:02:25: Chassis |major   : "Hot removal of HDD4"
Apr 28 14:02:26: Chassis |major   : "Hot insertion of HDD3"
Apr 28 14:02:28: Chassis |major   : "Hot insertion of HDD2"
Apr 28 14:02:29: Chassis |major   : "Hot removal of HDD7"
Apr 28 14:02:43: Chassis |major   : "Hot insertion of HDD1"
Apr 28 14:02:43: Chassis |major   : "Hot removal of HDD6"
Apr 28 14:02:44: Chassis |major   : "Hot insertion of HDD0"
Apr 28 14:02:44: Chassis |major   : "Hot removal of HDD5"
Apr 28 14:05:32: Chassis |major   : "Host is running"
May 10 08:21:59: Chassis |major   : "Host detected fault, MSGID: SUN4V-8002-42"
May 11 14:22:35: Chassis |critical: "SP Request to Reset Host due to Watchdog"
May 11 14:22:35: Chassis |major   : "Host is running"
May 11 16:44:53: Chassis |critical: "SP Request to Reset Host due to Watchdog"
May 11 16:44:53: Chassis |major   : "Host is running"
sc>

##### messages/messages.1   Filtered log of non-external Storage messages #####
Apr 28 08:02:42 fwgams3 xntpd[410]: [ID 866926 daemon.notice] xntpd exiting on signal 15
Apr 28 08:02:48 fwgams3 syslogd: going down on signal 15
Apr 28 08:06:45 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit    <--- logs filtered to display reboots & errors

Apr 28 06:37:47 fwgams3 xntpd[432]: [ID 866926 daemon.notice] xntpd exiting on signal 15
Apr 28 06:37:47 fwgams3 rpcbind: [ID 564983 daemon.error] rpcbind terminating on signal.
Apr 28 08:37:54 fwgams3 genunix: [ID 672855 kern.notice] syncing file systems...

Apr 28 08:37:55 fwgams3 genunix: [ID 904073 kern.notice]  done
Apr 28 08:38:57 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit

Apr 28 08:49:40 fwgams3 snmpXdmid: [ID 216524 daemon.error] Registration with DMI failed. err = 831.

##### messages/messages    Filtered log of non-external Storage messages #####

May 17 12:51:33 fwgams3 xntpd[295]: [ID 990412 daemon.error] can't open /var/ntp/ntp.drift.TEMP: No space left on device
May 17 12:55:51 fwgams3 tictimed[803]: [ID 768879 user.error] [tictimed] Error opening file /var/log/.lwact.16838

May 17 13:20:07 fwgams3 ^Mpanic[cpu70]/thread=2a10267fca0:
May 17 13:20:07 fwgams3 unix: [ID 198415 kern.notice] Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 13:20:07 fwgams3 unix: [ID 100000 kern.notice]
May 17 13:20:07 fwgams3 genunix: [ID 723222 kern.notice] 000002a10267f700 px:px_err_panic+1ac (1959800, 13c7c00, 63, 2a10267f7b0, 0, 0)

May 17 13:20:07 fwgams3 genunix: [ID 672855 kern.notice] syncing file systems...

May 17 13:23:44 fwgams3 genunix: [ID 851671 kern.notice] dump succeeded
May 17 13:24:47 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit

May 17 13:25:39 fwgams3 savecore: [ID 570001 auth.error] reboot after panic: Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 13:25:39 fwgams3 savecore: [ID 219048 auth.error] not enough space in /var/crash/fwgams3 (715 MB avail, 1934 MB needed)
May 17 13:25:40 fwgams3 savecore: [ID 570001 auth.error] reboot after panic: Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 13:25:40 fwgams3 savecore: [ID 219048 auth.error] not enough space in /var/crash/fwgams3 (715 MB avail, 1934 MB needed)

May 17 13:36:32 fwgams3 snmpXdmid: [ID 216524 daemon.error] Registration with DMI failed. err = 831.
May 17 14:11:07 fwgams3 genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major

May 17 14:11:07 fwgams3 ^Mpanic[cpu70]/thread=2a1026adca0:
May 17 14:11:07 fwgams3 unix: [ID 198415 kern.notice] Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 14:11:07 fwgams3 unix: [ID 100000 kern.notice]
May 17 14:11:07 fwgams3 genunix: [ID 723222 kern.notice] 000002a1026ad700 px:px_err_panic+1ac (1959800, 13c7c00, 63, 2a1026ad7b0, 0, 0)

May 17 14:11:07 fwgams3 genunix: [ID 672855 kern.notice] syncing file systems...

May 17 14:16:02 fwgams3 genunix: [ID 851671 kern.notice] dump succeeded
May 17 14:17:05 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit

May 17 14:17:56 fwgams3 savecore: [ID 570001 auth.error] reboot after panic: Fatal error has occured in: PCIe fabric.(0x0)(0x63)

May 17 14:28:48 fwgams3 snmpXdmid: [ID 216524 daemon.error] Registration with DMI failed. err = 831.
May 17 14:29:37 fwgams3 iscsi: [ID 286457 kern.notice] NOTICE: iscsi connection(129) unable to connect to target SENDTARGETS_DISCOVERY (errno:146)

 

Filters console output for certain events (panics, reboots, ...) & a dated entry soon after.
##### Tx000/consolehistory_-v #####

OpenBoot 4.30.10, 32544 MB memory available, Serial #89659964.

fwgams3 console login: May 17 13:25:43 fwgams3 iscsi: NOTICE: iscsi connection(45) unable to connect to target SENDTARGETS_DISCOVERY (errno:146)

May 17 13:25:43 fwgams3 iscsi: NOTICE: iscsi discovery failure - SendTargets (172.030.184.069)

May 17 13:25:57 fwgams3 pkid[814]: PKI Service has been started

May 17 13:26:02 fwgams3 pkid[1227]: Reflection PKI Services Manager is already running [process id: 814]

panic[cpu70]/thread=2a1026adca0: Fatal error has occured in: PCIe fabric.(0x0)(0x63)
000002a1026ad700 px:px_err_panic+1ac (1959800, 13c7c00, 63, 2a1026ad7b0, 0, 0)
  %l0-3: 0000000000011801 0000000001959800 0000000000000000 0000000000000001
  %l4-7: 0000000000000000 0000000001875c00 0000000000000001 0000000000000000
000002a1026ad810 px:px_err_fabric_intr+1b4 (300055f1540, 0, 200, 1, 63, 200)
  %l0-3: 0000000000000200 0000000001959e60 0000000001959c00 0000000000000001
  %l4-7: 0000000001959e48 0000000001959c00 0000000001959e40 0000000001959c00
000002a1026ad980 px:px_msiq_intr+1e8 (60031c4d7e0, 30003ceb860, 13ba2dc, 0, 1, 300014d5bc8)
  %l0-3: 0000060031cc3e60 00000300055ef800 0000030003ceb860 0000000000000000
  %l4-7: 0000000000000000 00000000034c0000 000002a1026ada80 0000000000000030
syncing file systems

OpenBoot 4.30.10, 32544 MB memory available, Serial #89659964.
fwgams3 console login: May 17 14:18:03 fwgams3 iscsi: NOTICE: iscsi connection(51) unable to connect to target SENDTARGETS_DISCOVERY (errno:146)
May 17 14:18:03 fwgams3 iscsi: NOTICE: iscsi discovery failure - SendTargets (172.030.184.069)
May 17 14:18:13 fwgams3 pkid[683]: PKI Service has been started

May 17 14:18:20 fwgams3 pkid[1119]: Reflection PKI Services Manager is already running [process id: 683]
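
The idea of keeping an event line plus the next dated line can be sketched as follows in Python.  This is a simplified illustration, not the tool's filter: the event patterns (panic[ and OpenBoot), the date pattern, and the function name filter_console are all assumptions, and the real output above clearly keeps more context (for example the panic stack).

```python
import re

# Assumed event markers: kernel panics and OpenBoot banners (reboots).
EVENT = re.compile(r"panic\[|^OpenBoot ")
# Assumed syslog-style timestamp, e.g. "May 17 13:25:43".
DATED = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
    r" [ \d]\d \d\d:\d\d:\d\d\b"
)


def filter_console(lines):
    """Keep each event line and the first dated line that follows it."""
    kept = []
    want_date = False
    for line in lines:
        if EVENT.search(line):
            kept.append(line)
            want_date = True
        elif want_date and DATED.search(line):
            kept.append(line)
            want_date = False
    return kept
```

The dated line gives the event an approximate time, since panic and OpenBoot lines carry no timestamp of their own.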



Analysis of a small number of known problems with T3/T4 or T5xx0 servers.

========== Analysis (if any) ============
Bug 7070900.  Replace failed DIMM.  Full power cycle may stabilize the system.
Bug 7133194.  Some DIMMs erroneously have ereports.  Please open a defect to SunBugTraq (minor issue).

 


List of Analyzable Faults:

Sun Alert 1468850.1:  T4-4 with FW prior 8.1.5 + fault SUN4V-8000-E2 + failures on BOB0/CH1/D0 & BOB1/CH1/D0, "Alert 1468850.1"
T3-2 with fault ILOM-8000-2V  "HH3 T3-2 Risers"
T3-2 with fault SUN4V-8002-PX, "T3-2 Riser2"
7016293 on T3 or T4 with Panic trap 9, "Trap9 7016293"
7108029 on T3 or T4 with Panic Trap 31 + 147440-04 (assumed to be vxio related),  "vxio 147440-04"
7142527 on T3 or T4 with FW prior 8.1.5 + fault SUN4V-8002-US (but could be 7172435), "T3 DYNA LEAK"

7172435 on T3 or T4 with FW 8.1.5 or later + fault SUN4V-8002-US, "ILOM 7172435"
7141035 on T3 or T4 with message Send Mondo, "T3 Mondo"
7151759 on T3 or T4 with fault SUN4V-8002-KQ, "T3 c2c"
7146062 on T3 or T4 with FW prior 8.1.5 + fault SPT-8000-DH, "Power Glitch"
7064258 on T3 or T4 with FW prior 8.1.4.d + message Disconnected command timeout ...,  "T3 Discon Timeout"
7115336 on T3 or T4 with message PCI panic (0x43) , "T3 PCI 0x43"
6863127 on T5xx0 with message PCI panic (0x41) or (0x43) + fault PCIEX-8000-3S + prior KJP 142909-17, "T5x PCI 0x41"
DIMM voltage mismatch on T5xx0, "DIMM Volt"
ILOM memory leak on T5xx0 with FW prior 7.2.7.b + fault SUN4V-8002-SP, "ILOM Mem Leak"
1.5V Hynix DIMM failure on T5240 + Unrecoverable Hardware Panic, "Hynix 1.5V"
1.5V Micron DIMM failure on T5240 + Unrecoverable Hardware Panic, "Micron 1.5V"
Warning if T5140 & FW prior 7.3.1.a + missing 511-1604, "PDB FW"
Warning if T5120 & FW prior 7.3.1.a + missing 511-1604, "PDB FW"
7133194 on CMT if 2 DIMMs have ereport at exact same time, "DupEreport"
7043851 on CMT if IPMItool panic due to bmc:do_vc2bmc , "IPMI BMC"
All CMTs with uncertified DIMMs, "Uncertified DIMM"
All CMTs without 147705-01 + fault PCIEX-8000-KP, "PCIE driver"
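
Each entry in the list above combines a platform, an optional firmware threshold, & a fault or message.  As an illustration, the two SUN4V-8002-US entries (7142527 & 7172435) differ only in the firmware threshold 8.1.5.  The Python sketch below shows one way such a rule could be evaluated.  The function names and the version ordering (numeric fields compared as integers, a trailing letter field sorting after its base release) are assumptions, not the tool's actual logic.

```python
from typing import Optional, Set


def fw_key(version: str):
    """Turn a firmware string such as '8.1.4.d' into a sortable key.

    Assumption: numeric fields compare as integers, and a letter field
    sorts after the same release without a letter.
    """
    return tuple(
        (0, int(part)) if part.isdigit() else (1, part)
        for part in version.split(".")
    )


def classify_8002_us(platform: str, firmware: str, faults: Set[str]) -> Optional[str]:
    """Return the analysis label for fault SUN4V-8002-US on T3/T4, else None."""
    if not platform.startswith(("T3", "T4")):
        return None
    if "SUN4V-8002-US" not in faults:
        return None
    if fw_key(firmware) < fw_key("8.1.5"):
        return "T3 DYNA LEAK"    # 7142527: FW prior 8.1.5
    return "ILOM 7172435"        # 7172435: FW 8.1.5 or later


print(classify_8002_us("T4-4", "8.1.4.d", {"SUN4V-8002-US"}))
```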


Logs

An Explosum/Snapper logging feature places entries like those below on a Santa Clara lab server so that critical faults & conditions can be tracked.  If a '.' follows the SR number, the entry came from snapper, as in the T4-4 entry below; otherwise it came from explosum.

Tue Apr 16 12:00:00 PDT 2013
3-7072747891 T5220 7.0.9.c  S10U6    BEL0818HZZ
3-7073994802 E4900 5.20.14  S10U9    0408HH2215
3-7074228321 T2000 6.3.0    S10U3    0651NNN0ER
3-7074228321 T2000 6.3.0    S10U3    0651NNN0ER
3-7073549901 V440  4.22.33  S9U7
3-7074429541.T4-4 .8.2.2.c           1307BDY72C
3-7019584361 X4102          11.1S5.1 0817ALB1EA   SUNOS-8000-KL
3-7007081871 T5140 7.1.7    S10U6                 PCIEX-8000-3S SUNOS-8000-1L
3-7068823851 T5140 7.2.2.b  S10U6    TFL0801002   FMD-8000-3F
3-7019584361 X4102          11.1S5.1 0817ALB1EA   SUNOS-8000-KL
3-7070119133 T2000 6.5.11   S10U4    0824NNN0D2
3-7057487097 T2000 6.6.7    S10U6    0642NNN0G1   SUN4V-8000-8Q
3-7057487097 T2000 6.6.7    S10U6    0642NNN0G1   SUN4V-8000-8Q
3-6994233271 I86PC          S10U9    GB8050BM5L   SUNOS-8000-FU PCIEX-8000-KP
3-6994233271 I86PC          S10U9    CZ3125JL5H   SUNOS-8000-FU
3-7054275671 T4-4  8.2.1.b  S11S13.4 1246BDY456
3-7075256021 T5120 7.2.7.b  S10U6    BEL08046O0
3-7059605741 X4500          S10U5    0746AMT039   ZFS-8000-D3
3-7053197451 T6340 7.2.6    S10U9
3-7072103291 T5240 7.4.2    S10U9    BDL1048111    DupEreport
3-7053197451 T6340 7.2.6    S10U8    08388N0055
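
The '.' convention described above can be checked mechanically.  The Python sketch below extracts only the SR number & the originating tool from a log line; the function name parse_entry is hypothetical.  The remaining columns are returned unsplit into meaning, because some entries omit the firmware or serial number and the column layout cannot be inferred reliably from whitespace alone.

```python
import re

# SR number, an optional '.' marking a snapper entry, then the remaining text.
ENTRY = re.compile(r"^(?P<sr>\d-\d+)(?P<dot>\.?)\s*(?P<rest>.*)$")


def parse_entry(line: str):
    """Return the SR number, source tool, and raw fields, or None for non-entries."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None  # e.g. the date line that heads a batch of entries
    source = "snapper" if match.group("dot") else "explosum"
    return {
        "sr": match.group("sr"),
        "source": source,
        "fields": match.group("rest").split(),
    }


print(parse_entry("3-7074429541.T4-4 .8.2.2.c           1307BDY72C"))
print(parse_entry("3-7072747891 T5220 7.0.9.c  S10U6    BEL0818HZZ"))
```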



Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.