Asset ID: 1-79-1359404.1
Update Date: 2018-03-14
Solution Type: Predictive Self-Healing Sure
Solution 1359404.1: Explosum - Explorer Hardware summary tool
Related Items
- Sun SPARC Enterprise T5120 Server
Related Categories
- PLA-Support>Sun Systems>SPARC>Usx/Blade/Netra>SN-SPARC: USx
- _Old GCS Categories>Sun Microsystems>Servers>Entry-Level Servers
- _Old GCS Categories>Sun Microsystems>Servers>NEBS-Certified Servers
- _Old GCS Categories>Sun Microsystems>Servers>CMT Servers
Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Restricted Product Info
Applies to:
Sun SPARC Enterprise T5120 Server - Version Not Applicable and later
Information in this document applies to any platform.
Purpose
This document describes how to run Explosum & provides sample output.
Scope
Details
Explosum gathers information from an explorer & provides an analysis of many known server issues at the end of the summary file, HWsummary.html. Most output is filtered to show only erroneous entries or a short status of various components. This tool is targeted at VSP products, but can be very helpful when run on explorers from other SPARC or X64 products. It also works on some X64 SOSreports & SunDiag directories.
The summary is written to the files HWsummary.html & SWsummary.html, which are placed into the explorer's top directory. The HWsummary file does not contain the storage array or certain OS information which is included in the larger SWsummary file.
This tool consists of a script, "es.sh", which performs some UNIX shell commands & then calls 2 compiled C programs, "ce" & "es". The script expands fmdump-eV files stored below the fma/var directory & places them into fma. es then parses the explorer & converted files for hardware information. If es detects certain conditions, its return code to the script causes an email to be sent to the affected backline engineers. FRU & event data will be stored for most platform types if the Explorer contains a valid date & a serial number, & the tool is run from an SR directory in cores3.
The latest SPARC version of this tool is on cores3. It is run in one of the following ways:
explosum . (if already in the top directory of the explorer)
explosum explorer-top-directory-path
explosum . output-directory (redirect output file)
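The directory-path form of the invocation above could be wrapped in a small helper. This is only a sketch: `summarize_explorer` and the `EXPLOSUM` override variable are hypothetical names introduced here. It relies solely on the behaviour stated in this document, namely that HWsummary.html & SWsummary.html are written into the explorer's top directory.

```sh
#!/bin/sh
# Hypothetical wrapper around "explosum explorer-top-directory-path".
# EXPLOSUM may point at another executable (an assumption, handy for testing).
EXPLOSUM=${EXPLOSUM:-explosum}

summarize_explorer() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 2
    fi
    "$EXPLOSUM" "$dir" || return 1
    # Per this document, both summaries land in the explorer's top directory.
    for f in HWsummary.html SWsummary.html; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing $dir/$f" >&2
            return 1
        fi
    done
    # Report where the hardware summary can be opened.
    echo "$dir/HWsummary.html"
}
```

Calling `summarize_explorer /path/to/explorer` (path illustrative) prints the location of HWsummary.html on success & returns non-zero if either summary file is absent.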
A known working version (usually older) is on beehive. The Solaris X64 executable is esx & its shell script is esx.sh.
https://stbeehive.oracle.com/teamcollab/library/st/SummaryTools/Documents#dcid=334B:3BF0:afrh:38893C00F42F38A1E0404498C8A6612B000AD9E7AE70
Please have the customer run explorers to obtain SP related data to avoid manual data gathering/analysis, as follows:
ALOM based: explorer -w default,alomextended (preferred on T5xx0 servers)
T3 & T4: explorer -w ipmi,ipmiextended,ilomextended,default
Explorer 7.0 should be used to obtain data on Solaris 11 systems if possible. Please note that the latest version of Explorer will contain more data to help isolate problems.
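The two explorer invocations listed above can be kept in one place so the right one is handed to the customer. This is a minimal sketch: `explorer_command` & its `alom`/`t3`/`t4` labels are hypothetical, are supplied by the caller & are not detected from the system.

```sh
#!/bin/sh
# Hypothetical helper: print the explorer command line recommended in this
# document for a given platform label. The labels are illustrative inputs.
explorer_command() {
    case $1 in
        alom)  echo "explorer -w default,alomextended" ;;
        t3|t4) echo "explorer -w ipmi,ipmiextended,ilomextended,default" ;;
        *)     echo "unknown platform label: $1" >&2; return 1 ;;
    esac
}
```

For example, `explorer_command alom` prints the ALOM based command line, while an unrecognised label produces an error instead of a guess.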
A method to run Explosum via the browser is to click the Explominer link at ISDE URL: https://mos-cores.us.oracle.com/collectionviewer/prod/index.php . Doing this will also add the required patch list to the summary!
An FE can download these 3 executables (es.sh, ce, es) from the SummaryTools beehive site & place them into the same directory on a Solaris based server. The tool is then run by typing "es.sh ." when in the top directory of the explorer (provided the executable directory has been added to the PATH variable).
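A quick sanity check of such a local install might look like the sketch below. `use_explosum_dir` is a hypothetical name. It only verifies that es.sh, ce & es sit together & are executable, then prepends their directory to PATH.

```sh
#!/bin/sh
# Hypothetical check: confirm es.sh, ce and es are in one directory and are
# executable, then put that directory at the front of PATH.
use_explosum_dir() {
    bindir=$1
    for f in es.sh ce es; do
        if [ ! -x "$bindir/$f" ]; then
            echo "missing or not executable: $bindir/$f" >&2
            return 1
        fi
    done
    PATH=$bindir:$PATH
    export PATH
}
```

After `use_explosum_dir /path/to/tools` succeeds (path illustrative), changing to the explorer's top directory & typing "es.sh ." runs the tool as described above.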
Explanation of tool's output:
******************************************************************************
explosum revision 6.22 (Explorer Revision: 6.6)
Oracle4Ever SR# 3-1234567890 15/01/27-18:33:
Hostname: tryo HostID: 8645d49a Platform: SPARC T5-2 Serial#: AK00182246
******************************************************************************
Internal: System Config PCI Config Disk Config Net Config Fault Info Logs Analysis
External: Explosum Issues FW Troubleshooting SSH
========== System Configuration ============
Lists any known major issues with the loaded version of FW. If prtdiag was not obtained, it checks SP related files for the System FW or OBP version.
##### sysconfig/prtdiag-v.out #####
Sun System Firmware 7.1.8.a 2009/03/15 14:48
*** Major ILOM memory leaks were fixed by FW 7.2.7 so this should be upgraded soon.
OBP 4.29.2 2009/03/12 06:53
This OBP used in system FW: 7.1.8 through 8.d
##### etc/release #####
Solaris 10 5/08 s10s_u5wos_10 SPARC
If Solaris 11 - lists the branch & documentation to determine the version.
##### patch+pkg/pkg_info-l.out See doc 1372094.1 #####
Branch: 0.175.0.6.0.6.0
##### sysconfig/uname-a.out #####
KJP loaded = 147440-09
If ExploMiner was run prior to Explosum, its required patches are listed.
##### ExploMiner_SPARC-T5-2_patches.nobody#####
This summary contains only REQUIRED Patches and their DEPENDENT Patches
Required: 120812-32 OpenGL 1.5: OpenGL Patch for Solaris
Required: 150011-04 VM Server for SPARC 3.0 ldmd patch
Required: 150400-20 SunOS 5.10: Kernel Patch
Lists each time KJP was upgraded.
##### patch+pkg/patch_listing #####
Aug 23 2009 137137-09 S10 U6 10/08
Aug 23 2009 138888-01 S10 U7 Point
Mar 13 2011 139555-08 S10 U7 5/09
Mar 13 2011 141444-09 S10 U8 10/09
Mar 13 2011 142909-17 S10 U9 09/10
Mar 13 2011 144488-04 S10 U10 Point
Aug 21 2011 144488-17 S10 U10 Point
Jun 24 01:32 144500-19 S10 U10 8/11
Jun 24 01:36 147440-09 S10 U11 Point
Lists required FMA & CMT patches.
##### patch+pkg/patch-list #####
119578-30 FMA
126897-02 FMA
127755-01 FMA
145961-01 FMA
FMADM 146582-02 missing
FMD 147778-01 missing
FMD 147790-01 missing
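When the patch-list section is long, the missing patches can be pulled out mechanically. The sketch below assumes the section text has been copied to stdin & that missing entries look exactly like the sample above (component, patch ID, the word "missing"). `missing_patches` is a hypothetical name.

```sh
#!/bin/sh
# Print the patch IDs flagged "missing" in a patch-list section.
# Reads the section text on stdin; the layout is assumed from the sample.
missing_patches() {
    awk 'NF >= 2 && $NF == "missing" { print $(NF - 1) }'
}

# Demo with lines taken from the sample output above.
missing_patches <<'EOF'
119578-30 FMA
145961-01 FMA
FMADM 146582-02 missing
FMD 147778-01 missing
EOF
# prints 146582-02 and 147778-01, one per line
```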
========== LDom Configuration ============
##### sysconfig/ldm_-V.out #####
Logical Domains Manager (v 3.1.0.1)
##### sysconfig/virtinfo-a.out#####
Domain name: primary
Control domain: prod01
##### sysconfig/ldm_list_-l.out file lists VCPU utilizations #####
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 32 32G 1.9% 2.0% 88d 23h 51m
infra01 bound ------ 5000 64 32G
infra11 active -n---- 5002 64 64G 0.1% 0.1% 8d 24m
midtier01 bound ------ 5001 32 32G
midtier11 active -n---- 5003 64 64G 0.0% 0.0% 8d 37m
##### sysconfig/svcs-av.out lists only LDom related services #####
STATE NSTATE STIME CTID FMRI
online - Oct_30 32 svc:/ldoms/agents:default
online - Oct_30 100 svc:/ldoms/vntsd:default
online - Oct_30 93 svc:/ldoms/ldmd:default
##### sysconfig/ldm_list-devices_-a.out See doc 1020212.1 #####
Number of cores: 32
Lists only items which have problems.
##### Tx000/showenvironment #####
Supply Status Fan_Fault Temp_Fault Volt_Fault Cur_Fault
/SYS/PS1 No Input Power OFF OFF OFF OFF
========== FRU Configuration ============
Displays FRU board numbers, failed components, & an indication of possibly non-certified DIMMs based on the manufacturer part number. A link to the doc listing CMT certified DIMMs is included.
##### Tx000/showfru See doc 1411086.1 #####
Part Manufacturer Part # Ser # Max Temp Status
/SYS/MB Mitac Internat 5111392-02 AU01UL 101 (28 degrees C) 0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
/SYS/PDB FOXCONN 5017697-09 G05KFH 101 (28 degrees C) 0x00 (OK)
/SYS/PADCRD FOXCONN 5111255-03 A10YC9 101 (28 degrees C) 0x00 (OK)
/SYS/SASBP FOXCONN 5111256-01 A20TLN 101 (28 degrees C) 0x00 (OK)
/SYS/FANBD0 FOXCONN 5017695-04 E07T59 101 (28 degrees C) 0x00 (OK)
/SYS/FANBD1 FOXCONN 5017695-04 E07T99 101 (28 degrees C) 0x00 (OK)
/SYS/PS0 Power-One 3002138-03 A718CU
/SYS/PS1 Power-One 3002138-03 A718CZ
DIMM Manufacturer Vendor Part # Part # Ser # Status
/SYS/MB/CMP0/BR0/CH0/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1091A63A 0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
/SYS/MB/CMP0/BR0/CH0/D1 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 10A1A65B 0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
/SYS/MB/CMP0/BR0/CH1/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 10C1A63A 0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
/SYS/MB/CMP0/BR0/CH1/D1 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1031A66B 0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
/SYS/MB/CMP0/BR1/CH0/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1041A635 0x00 (OK)
/SYS/MB/CMP0/BR1/CH0/D1 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1051A65D 0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 10B1A673 0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D1 Hynix Semicond HMP31GF7AFR4C-Y5D5 0000000 33306C58 0x00 (OK) DIMM possibly not certified!!!
/SYS/MB/CMP1/BR0/CH0/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 10B1A634 0x00 (OK)
/SYS/MB/CMP1/BR0/CH0/D1 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1041A65C 0x00 (OK)
/SYS/MB/CMP1/BR0/CH1/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1051A65A 0x00 (OK)
/SYS/MB/CMP1/BR0/CH1/D1 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 10C1A65B 0x00 (OK)
/SYS/MB/CMP1/BR1/CH0/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 10B1A65B 0x00 (OK)
/SYS/MB/CMP1/BR1/CH0/D1 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1061A65D 0x00 (OK)
/SYS/MB/CMP1/BR1/CH1/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1031A65C 0x00 (OK)
/SYS/MB/CMP1/BR1/CH1/D1 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1081A65D 0x00 (OK)
Somewhat duplicates the section above BUT indicates SSH FRU components & additional error checking.
##### SHOWFRU tool - Thanks to Doug Baker! #####
################################################################################
Latest version 1.62 on cores2 at /cores_data/local/bin/showfru
Report bugs, RFEs or if you have questions email doug.baker@oracle.com
Further info http://panacea.central.sun.com/twiki/bin/view/Tools/ToolPageShowfru
################################################################################
/SYS/MB Orderable part 540-7939 02
/SYS/MB 511-1392 02 AU01UL
/SYS/MB/CMP0/BR0/CH0/D0 511-1151 01 1091A63A 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR0/CH0/D1 511-1151 01 10A1A65B 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR0/CH1/D0 511-1151 01 10C1A63A 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR0/CH1/D1 511-1151 01 1031A66B 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
/SYS/MB/CMP0/BR1/CH0/D0 511-1151 01 1041A635 0x00 (OK)
/SYS/MB/CMP0/BR1/CH0/D1 511-1151 01 1051A65D 0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D0 511-1151 01 10B1A673 0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D1 511-1151 01 10C1A637 0x00 (OK)
/SYS/MB/CMP1/BR0/CH0/D0 511-1151 01 10B1A634 0x00 (OK)
/SYS/MB/CMP1/BR0/CH0/D1 511-1151 01 1041A65C 0x00 (OK)
/SYS/MB/CMP1/BR0/CH1/D0 511-1151 01 1051A65A 0x00 (OK)
/SYS/MB/CMP1/BR0/CH1/D1 511-1151 01 10C1A65B 0x00 (OK)
/SYS/MB/CMP1/BR1/CH0/D0 511-1151 01 10B1A65B 0x00 (OK)
/SYS/MB/CMP1/BR1/CH0/D1 511-1151 01 1061A65D 0x00 (OK)
/SYS/MB/CMP1/BR1/CH1/D0 511-1151 01 1031A65C 0x00 (OK)
/SYS/MB/CMP1/BR1/CH1/D1 511-1151 01 1081A65D 0x00 (OK)
/SYS/PDB Orderable part 541-2073 09
/SYS/PDB 501-7697 09 G05KFH
/SYS/PADCRD Orderable part 541-3513 02
/SYS/PADCRD 511-1255 03 A10YC9
/SYS/SASBP 511-1256 01 A20TLN
/SYS/FANBD0 Orderable part 541-2211 04
/SYS/FANBD0 501-7695 04 E07T59
/SYS/FANBD1 Orderable part 541-2211 04
/SYS/FANBD1 501-7695 04 E07T99
/SYS/PS0 300-2138 03 A718CU
/SYS/PS1 300-2138 03 A718CZ
################################################################################
CHS History of currently disabled Components, use -v to see full history
################################################################################
Component : /SYS/MB
Time Stamp : Thu, Apr 28 2011 14:00:36 GMT
New_Status : 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
Old_Status : 0x64 (MAINTENANCE REQUIRED, SUSPECT, DEEMED FAULTY)
Initiator : Fault Management
Component : 0
Event_Code : FMA Message R
Fault_Diag_Secs :
FMA_String : PCIEX-8000-0A
UUID: : 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786
DE_Name : eft
DE_Version : 1.16
Diagdata : 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
################################################################################
WARNING: Components can get disabled due to software bugs always check if any
of the known issues apply before replacing hardware
PTS T2000 http://panacea/twiki/bin/view/Products/ProdIssuesSunFireT2000
PTS T1000 http://panacea/twiki/bin/view/Products/ProdIssuesSunFireT1000
################################################################################
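The Tx000/showfru table earlier in this section can be reduced to just the entries needing attention. This is a sketch under assumptions: `flag_frus` is a hypothetical name, the status is taken to be the first field of the form 0x.., & anything other than 0x00 is treated as not OK.

```sh
#!/bin/sh
# List FRUs whose status is not 0x00, plus any DIMM the summary marked as
# possibly not certified. Reads showfru-style lines on stdin.
flag_frus() {
    awk '
        /^\/SYS\// {
            status = ""
            # The status is assumed to be the first field that looks like 0x..
            for (i = 2; i <= NF; i++)
                if ($i ~ /^0x[0-9A-Fa-f]+$/) { status = $i; break }
            if (status != "" && status != "0x00")
                print $1, "status", status
            if ($0 ~ /not certified/)
                print $1, "possibly not certified"
        }
    '
}

# Demo with three lines from the sample table.
flag_frus <<'EOF'
/SYS/MB/CMP0/BR0/CH0/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1091A63A 0x64 (MAINTENANCE REQUIRED, SUSPECT, DE
/SYS/MB/CMP0/BR1/CH0/D0 Hynix Semicond HYMP125L72CP8D5-Y5 511-1151 1041A635 0x00 (OK)
/SYS/MB/CMP0/BR1/CH1/D1 Hynix Semicond HMP31GF7AFR4C-Y5D5 0000000 33306C58 0x00 (OK) DIMM possibly not certified!!!
EOF
```

As the SHOWFRU warning above notes, a flagged component is a prompt to check known issues first, not an automatic replacement decision.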
========== PCI Configuration ============
Lists CPU config, link to doc to correlate PCI card to Oracle part #, & list of possible problematic components.
##### sysconfig/prtdiag-v.out See doc 1373995.1 #####
System Configuration: Sun Microsystems sun4v Sun Netra T5220
Memory size: 3968 Megabytes
CPU ID Frequency Implementation Status
0 1165 MHz SUNW,UltraSPARC-T2 on-line
...
47 1165 MHz SUNW,UltraSPARC-T2 on-line
Slot + Bus Name + Model
Status Type Path
----------------------------------------------------------------------------
MB/NET0 PCIE network-pciex8086,105e
/pci@0/pci@0/pci@1/pci@0/pci@2/network@0
MB/NET1 PCIE network-pciex8086,105e
/pci@0/pci@0/pci@1/pci@0/pci@2/network@0,1
MB/NET2 PCIE network-pciex8086,105e
/pci@0/pci@0/pci@1/pci@0/pci@3/network@0
MB/NET3 PCIE network-pciex8086,105e
/pci@0/pci@0/pci@1/pci@0/pci@3/network@0,1
MB/SASHBA PCIE scsi-pciex1000,58 LSI,1068E
/pci@0/pci@0/pci@2/scsi@0
MB/RISER1/PCIE1 PCIE SUNW,qlc-pciex1077,2432 QLE2462
/pci@0/pci@0/pci@8/pci@0/pci@1/SUNW,qlc@0
MB/RISER1/PCIE1 PCIE SUNW,qlc-pciex1077,2432 QLE2462
/pci@0/pci@0/pci@8/pci@0/pci@1/SUNW,qlc@0,1
MB/RISER0/PCIE0 PCIE network-pciex108e,abcd SUNW,pcie-qgc
/pci@0/pci@0/pci@8/pci@0/pci@9/network@0
MB/RISER0/PCIE0 PCIE network-pciex108e,abcd SUNW,pcie-qgc
/pci@0/pci@0/pci@8/pci@0/pci@9/network@0,1
MB/RISER0/PCIE0 PCIE network-pciex108e,abcd SUNW,pcie-qgc
/pci@0/pci@0/pci@8/pci@0/pci@9/network@0,2
MB/RISER0/PCIE0 PCIE network-pciex108e,abcd SUNW,pcie-qgc
/pci@0/pci@0/pci@8/pci@0/pci@9/network@0,3
MB PCIX usb-pciclass,0c0310
/pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0
MB PCIX usb-pciclass,0c0310
/pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0,1
MB PCIX usb-pciclass,0c0320
/pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0,2
--- Non OK sensor output ---
SYS/PS1 I_IN_MAIN disabled
SYS/PS1 I_IN_LIMIT disabled
SYS/PS1 I_OUT_MAIN disabled
SYS/PS1 I_OUT_LIMIT disabled
SYS/PS1 V_IN_MAIN disabled
SYS/PS1 V_OUT_MAIN disabled
SYS ACT steady
##### sysconfig/prtpicl-v.out Part numbers may point to older version cards (bugs: 19263165 & 19355916) #####
Label WWN - MAC Slot Part# Status/Drv Path Version
/SYS/MB/XGBE0 00.10.e0.3e.94.5a ixgbe 0 /pci@300/pci@1/pci@0/pci@1/network@0
/SYS/MB/NET1 00.10.e0.3e.94.5b ixgbe 1 /pci@300/pci@1/pci@0/pci@1/network@0,1
/SYS/MB/PCIE1 90.e2.ba.5a.1d.40 1 375-3617-01 ixgbe 2 /pci@300/pci@1/pci@0/pci@4/network@0 Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012
/SYS/MB/PCIE1 90.e2.ba.5a.1d.41 1 375-3617-01 ixgbe 3 /pci@300/pci@1/pci@0/pci@4/network@0,1 Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012
/SYS/MB/XGBE1 00.10.e0.3e.94.5c ixgbe 4 /pci@3c0/pci@1/pci@0/pci@1/network@0
/SYS/MB/NET3 00.10.e0.3e.94.5d ixgbe 5 /pci@3c0/pci@1/pci@0/pci@1/network@0,1
/SYS/MB/PCIE2 90.e2.ba.5a.1c.40 2 375-3617-01 ixgbe 6 /pci@380/pci@1/pci@0/pci@5/network@0 Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012
/SYS/MB/PCIE2 90.e2.ba.5a.1c.41 2 375-3617-01 ixgbe 7 /pci@380/pci@1/pci@0/pci@5/network@0,1 Sun Dual 10GbE SFP+ PCIe 2.0 LP FCode 3.01 4/2/2012
/SYS/MB/PCIE3 3 371-4306 emlxs 0 /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0 LPe12002-S
/SYS/MB/PCIE3 3 371-4306 emlxs 1 /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,1 LPe12002-S
/SYS/MB/PCIE4 4 371-4306 emlxs 2 /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0 LPe12002-S
/SYS/MB/PCIE4 4 371-4306 emlxs 3 /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0,1 LPe12002-S
##### sysconfig/fcinfo.out #####
WWN Dev Model FW Serial State Speed Link Sync Sign Prot InvT InvC
10000090fa51454c /dev/cfg/c11 LPe12002-S LPe12002- 4925382+13440000AP online 8Gb 7 3079 1 4 3765 0
10000090fa51454d /dev/cfg/c12 LPe12002-S LPe12002- 4925382+13440000AP online 8Gb 5 5970 1 4 7017 0
10000090fa51454e /dev/cfg/c9 LPe12002-S LPe12002- 4925382+13440000B9 online 8Gb 5 2241 1 4 2164 0
10000090fa51454f /dev/cfg/c10 LPe12002-S LPe12002- 4925382+13440000B9 online 8Gb 22 3420 4 12 2783 4
##### sysconfig/prtconf-v.out See doc if retired! See doc 1614738.1 #####
cfg /dev/cfg/c9 /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0/fp@0,0:fc
cfg /dev/cfg/c10 /pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,1/fp@0,0:fc
cfg /dev/cfg/c11 /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0/fp@0,0:fc
cfg /dev/cfg/c12 /pci@380/pci@1/pci@0/pci@7/SUNW,emlxs@0,1/fp@0,0:fc
*************************** Disk Configuration *******************************
##### sysconfig/eeprom.out #####
boot-device=/pci@300/pci@1/pci@0/pci@2/scsi@0/disk@w3060943c1859cb2e,0:a disk net
use-nvramrc?=false
nvramrc: data not available.
##### etc/vfstab #####
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
swap - /tmp tmpfs - yes -
/dev/zvol/dsk/rpool/swap - - swap - no -
Lists internal drives only, since the tool attempts to remove entries from known arrays. Contact Don D if a new array needs to be added.
##### disks/diskinfo (internal volumes only listed) - SWsummary contains external disk info See SWsummary.html#Disk Configuration #####
Location Vendor Product Rev Serial # Dual Port
c2t3060943C1859CB2Ed0 LSI Logical Volume 3000 LSIInternal primary
c3t3d0 TEAC DV-W28SS-W 10A primary
c4t38E02031C0D94CD3d0 LSI Logical Volume 3000 LSIInternal primary
Lists internal drives only, since the tool attempts to remove entries from known arrays. Contact Don D if a new array needs to be added.
##### sysconfig/iostat-En.out (internal volumes only listed) - SWsummary contains external disk info See SWsummary.html#Disk Configuration #####
Disk Size Soft Hard Trans Media Ready NoDev Recov Illeg PFlAn
c1t0d0 146.81GB 0 0 0 0 0 0 0 0 0
c1t1d0 146.81GB 0 0 0 0 0 0 0 0 0
c0t0d0 0.00GB 0 0 0 0 0 0 0 4 0
c1t2d0 146.81GB 0 0 0 0 0 0 0 0 0
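Disks with non-zero error counters can be picked out of the condensed iostat table above with a short filter. This sketch assumes the column layout shown in the sample (the summarized table, not raw iostat -En output). `disks_with_errors` is a hypothetical name.

```sh
#!/bin/sh
# Report disks that have any non-zero error counter, naming the counter
# using the header row. Reads the condensed table on stdin.
disks_with_errors() {
    awk '
        $1 == "Disk" { for (i = 3; i <= NF; i++) name[i] = $i; next }
        NF >= 3 {
            msg = ""
            for (i = 3; i <= NF; i++)
                if ($i + 0 > 0) {
                    label = (i in name) ? name[i] : "col" i
                    msg = msg " " label "=" $i
                }
            if (msg != "") print $1 ":" msg
        }
    '
}

# Demo with lines from the sample above.
disks_with_errors <<'EOF'
Disk Size Soft Hard Trans Media Ready NoDev Recov Illeg PFlAn
c1t0d0 146.81GB 0 0 0 0 0 0 0 0 0
c0t0d0 0.00GB 0 0 0 0 0 0 0 4 0
EOF
# prints: c0t0d0: Illeg=4
```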
Lists internal drives only, since the tool attempts to remove entries from known arrays. Contact Don D if a new array needs to be added.
##### disks/format.out (internal volumes only listed) - SWsummary contains external disk info See SWsummary.html#Disk Configuration #####
c1t0d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@0/pci@0/pci@2/scsi@0/sd@0,0
c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@0/pci@0/pci@2/scsi@0/sd@1,0
c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@0/pci@0/pci@2/scsi@0/sd@2,0
##### etc/path_to_inst (internal volumes only listed) #####
"/pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0,2/hub@4/device@4/storage@0/disk@0,0" 2 "sd"
"/pci@0/pci@0/pci@2/scsi@0/sd@0,0" 0 "sd"
"/pci@0/pci@0/pci@2/scsi@0/sd@1,0" 1 "sd"
"/pci@0/pci@0/pci@2/scsi@0/sd@2,0" 3 "sd"
On T3 - T7 platforms, the disk # to PCI path mapping is displayed.
##### sysconfig/prtconf-v.out: (no output implies possible zpool or HW RAID PCI card) #####
Disk 5000cca0253b496c - 00000000 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253b496d,0
Disk 5000cca0253b5728 - 00000001 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253b5729,0
Disk 5001517bb28964a2 - 00000002 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5001517bb28964a2,0
Disk 5001517bb289649b - 00000003 - /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5001517bb289649b,0
Disk 5000cca0253b3b44 - 00000000 - /pci@700/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253b3b45,0
Disk 5000cca0253c4320 - 00000001 - /pci@700/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0253c4321,0
Disk 5001517bb28963ef - 00000002 - /pci@700/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5001517bb28963ef,0
Lists attached Arrays.
##### sysconfig/prtpicl-v.out#####
Array type = ZFS Storage 7335
HW RAID data gathered with Explorer 7.0
##### disks/raidctl_-l_-g.out #####
Disk Vendor Product Firmware Capacity Status HSP
----------------------------------------------------------------------------
0.0.0 SEAGATE ST914602SSUN146 0400 136.7G GOOD N/A
Disk Vendor Product Firmware Capacity Status HSP
----------------------------------------------------------------------------
0.1.0 N/AGOOD N/A
Lists Cougar card information if a newer version of Explorer was used.
##### RAIDmanager/getconfig_1.out See doc 1331121.1 #####
Controller: Optimal Ser #: 00820AA0059 BIOS: 5.2-0 (17757) Battery: Not Installed
Logical Dev: 0 Simple_volume Optimal 285685 MB (0,8) Bootable
Logical Dev: 1 Simple_volume Optimal 285685 MB (0,9)
Logical Dev: 2 5 Optimal 857075 MB (0,10)(0,11)(0,19)(0,18)
Logical Dev: 3 5 Optimal 857075 MB (0,14)(0,15)(0,16)(0,13)
Phys Dev: 0 Online 0,8 SEAGATE ST930003SSUN300G 0D70 00090370E2HJ
Phys Dev: 1 Online 0,9 SEAGATE ST930003SSUN300G 0D70 00090370E66S HDD SMART error!
Phys Dev: 2 Online 0,10 SEAGATE ST930003SSUN300G 0D70 00090370DACN
Phys Dev: 3 Online 0,11 SEAGATE ST930003SSUN300G 0D70 00090370C3YD
Phys Dev: 4 Failed 0,12 SEAGATE ST930003SSUN300G 0D70 00090370E8Y8
Phys Dev: 5 Online 0,13 SEAGATE ST930003SSUN300G 0D70 00100371NQBQ
Phys Dev: 6 Online 0,14 SEAGATE ST930003SSUN300G 0D70 00100271JPX2
Phys Dev: 7 Online 0,15 SEAGATE ST930003SSUN300G 0D70 00090370EX1J
Phys Dev: 8 Online 0,16 SEAGATE ST930003SSUN300G 0D70 00100271FZEA
Phys Dev: 9 Failed 0,17 SEAGATE ST930003SSUN300G 0D70 00100271JB0F
Phys Dev: 10 Online 0,18 SEAGATE ST930003SSUN300G 0D70 00100371MWL8
Phys Dev: 11 Online 0,19 SEAGATE ST930003SSUN300G 0D70 00090370E1AD
##### RAIDmanager/RaidEvt.log (Please note that disks start at #0 & the system labels the lowest disk #1) #####
July 28, 2012 8:33:25 PM COT WRN 402:A01C0S11L-- sbogadm04 S.M.A.R.T. slot 3, S/N 001041G3M2DE PFV3M2DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:25 PM COT WRN 402:A01C0S16L-- sbogadm04 S.M.A.R.T. slot 8, S/N 001041G3Z7TE PFV3Z7TE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:25 PM COT WRN 402:A01C0S17L-- sbogadm04 S.M.A.R.T. slot 9, S/N 001041G408DE PFV408DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:25 PM COT WRN 402:A01C0S19L-- sbogadm04 S.M.A.R.T. slot 11, S/N 001041G4011E PFV4011E (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 8:33:29 PM COT INF 19434:A00C-S--L-- sbogadm04 User root logged into sbogadm04 with administrative privileges.
July 28, 2012 8:54:52 PM COT INF 1:A00C-S--L-- sbogadm04 Successfully updated the controller image: sbogadm04, controller 1.
July 28, 2012 9:15:48 PM COT INF 10572:A0-1C-S--L-- sbogadm04 Sun StorageTek RAID Manager started on TCP/IP port number 34,571.
July 28, 2012 9:22:29 PM COT INF 19434:A00C-S--L-- sbogadm04 User root logged into sbogadm04 with administrative privileges.
July 28, 2012 9:15:48 PM COT INF 10572:A0-1C-S--L-- sbogadm04 Sun StorageTek RAID Manager started on TCP/IP port number 34,571.
July 28, 2012 9:32:08 PM COT INF 19434:A00C-S--L-- sbogadm04 User root logged into sbogadm04 with administrative privileges.
July 28, 2012 9:36:08 PM COT WRN 402:A01C0S17L-- sbogadm04 S.M.A.R.T. slot 9, S/N 001041G408DE PFV408DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 9:36:08 PM COT WRN 402:A01C0S11L-- sbogadm04 S.M.A.R.T. slot 3, S/N 001041G3M2DE PFV3M2DE (Vendor: HITACHI Model: H103030SCSUN300G).
July 28, 2012 9:36:08 PM COT WRN 402:A01C0S19L-- sbogadm04 S.M.A.R.T. slot 11, S/N 001041G4011E PFV4011E (Vendor: HITACHI Model: H103030SCSUN300G).
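The RAIDmanager/getconfig_1.out listing in this section can be filtered down to the physical devices that need attention. This is a sketch: `failed_phys_devs` is a hypothetical name, & it assumes the "Phys Dev:" line layout shown in the sample, treating any state other than Online as a problem.

```sh
#!/bin/sh
# List physical devices that are not Online, or that are Online but carry
# the SMART error note. Reads getconfig-style summary lines on stdin.
failed_phys_devs() {
    awk '
        $1 == "Phys" && $2 == "Dev:" {
            if ($4 != "Online")
                print "dev " $3 " at " $5 ": state " $4
            else if ($0 ~ /SMART error/)
                print "dev " $3 " at " $5 ": SMART error"
        }
    '
}

# Demo with lines from the sample above.
failed_phys_devs <<'EOF'
Phys Dev: 0 Online 0,8 SEAGATE ST930003SSUN300G 0D70 00090370E2HJ
Phys Dev: 1 Online 0,9 SEAGATE ST930003SSUN300G 0D70 00090370E66S HDD SMART error!
Phys Dev: 4 Failed 0,12 SEAGATE ST930003SSUN300G 0D70 00090370E8Y8
EOF
```

Keep the numbering note from the RaidEvt.log header in mind when mapping a device number to a physical slot.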
Lists Niwot card info if a newer version of Explorer was used.
##### RAIDmanager/MegaCli/CfgDsply-aALL.out See doc 1397311.1 #####
DISK GROUP: 0 278.464 GB Optimal Primary-1, Secondary-0, RAID Level Qualifier-0
-----ERRORS-----
Disk Slot DevID Port Media Other Pred State FW SAS Addr Drive SMART
0 0 9 1(path0) 0 0 0 Online, Spun Up A2B0 0x5000cca025103c25 HITACHI H106030SDSUN300GA2B01205N8XTEB No
1 1 8 0(path0) 0 0 0 Online, Spun Up A2B0 0x5000cca025264115 HITACHI H106030SDSUN300GA2B01205NP15ZB No
##### RAIDmanager/MegaCli/GetEvents-aALL.out #####
Success in AdpEventLog
Lists Pool status & lower level objects. Also contains a link to a useful ZFS doc.
##### disks/zfs/zpool_status_-v.out See doc 1004209.1 #####
Pool: rpool ONLINE c0t5000CCA01D87F058d0s0 c0t5000CCA01D8E8AA4d0s0
Pool: zp-730test1 ONLINE c0t600144F08F08C858000054C667EF0001d0
Pool: zp-730test2 ONLINE c0t600144F08F08C858000054C6681F0004d0
Lists Mirrors & submirrors with status & a link to a useful SVM doc. Also lists lower level objects if in faulted status.
##### disks/svm/metastat-t.out See doc 1003847.1 #####
d50: Mirror
d51: Submirror of d50 State: Okay Mon Nov 1 11:54:32 2010
d100: Mirror
d41: Submirror of d100 State: Okay Mon Nov 1 11:54:32 2010
d10: Mirror
d11: Submirror of d10 State: Okay Mon Nov 1 11:54:32 2010
d0: Mirror
d1: Submirror of d0 State: Okay Mon Nov 1 11:54:31 2010
d60: Mirror
d61: Submirror of d60 State: Okay Mon Nov 1 11:54:33 2010
Lists Volumes & a link to a useful vxvm doc. Also lists lower level objects if in faulted status.
##### disks/vxvm/vxprint-th.out See doc 1004532.1#####
v crash - ENABLED ACTIVE 31462500 ROUND - fsgen
c0t5000C500334503B3d0 c0t5000C500334505F7d0
v rootvol - ENABLED ACTIVE 83887500 ROUND - root
c0t5000C500334503B3d0 c0t5000C500334505F7d0
v swapvol - ENABLED SYNC 469762500 ROUND - swap
c0t5000C500334503B3d0 c0t5000C500334503B3d0 c0t5000C500334505F7d0
v dumpvol - ENABLED ACTIVE 41943040 SELECT - fsgen
c3t5005076802301E28d1
v oemvol - ENABLED ACTIVE 20971520 SELECT - fsgen
c3t5005076802301E28d0
v vol01 - ENABLED ACTIVE 10485760 SELECT vol01-01 fsgen
c3t5005076802301E28d0 c3t5005076802301E28d1 c3t5005076802301E28d2 c3t5005076802301E28d3
...
##### sysconfig/ifconfig-a.out #####
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
zone fwgams10
inet 127.0.0.1 netmask ff000000
nxge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 172.30.184.69 netmask ffffffe0 broadcast 172.30.184.95
ether 0:21:28:58:1a:3c
nxge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
zone fwgams10
inet 172.30.184.70 netmask ffffffe0 broadcast 172.30.184.95
nxge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 10.1.1.13 netmask ff000000 broadcast 10.255.255.255
ether 0:21:28:58:1a:3d
*************************** Fault Information ********************************
##### sysconfig/crash/ls-al_var_crash*.out (no cores if empty) #####
-rw-r--r-- 1 root root 2 May 17 14:20 bounds
-rw-r--r-- 1 root root 1706728 May 8 22:57 unix.0
-rw-r--r-- 1 root root 1706728 May 9 11:56 unix.1
-rw-r--r-- 1 root root 1562250 May 10 13:08 unix.2
-rw-r--r-- 1 root root 1706728 May 10 15:22 unix.3
-rw-r--r-- 1 root root 1588723712 May 10 13:09 vmcore.2
-rw-r--r-- 1 root root 1880809472 May 10 15:23 vmcore.3
##### fma/fmdump.out #####
Apr 02 05:17:43.7549 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786 PCIEX-8000-0A
Apr 04 09:57:49.8446 21c312ab-4c48-ee92-c762-e2680ae35b74 FMD-8000-0W
Apr 04 15:54:22.4906 b3895ac1-e2ad-c58f-f189-f2bf8fb0db53 SUN4V-8002-42
Apr 09 23:59:05.3990 773efc10-2b1b-4961-809d-86f3594da8d0 SUN4V-8000-E2
Apr 13 04:07:34.3343 e6cf8844-2c8c-c0b1-8e21-f273ed3fff76 SUN4V-8000-E2
May 10 03:23:18.1358 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c SUN4V-8002-42
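To see which diagnoses recur, the fmdump.out lines above can be tallied by message ID. This is a minimal sketch: `count_msgids` is a hypothetical name, & it assumes the message ID is the last field of each line, as in the sample.

```sh
#!/bin/sh
# Count fault events per message ID (last field), most frequent first.
count_msgids() {
    awk 'NF > 0 { n[$NF]++ } END { for (id in n) print n[id], id }' |
        LC_ALL=C sort -k1,1nr -k2,2
}

# Demo with the sample lines above.
count_msgids <<'EOF'
Apr 02 05:17:43.7549 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786 PCIEX-8000-0A
Apr 04 09:57:49.8446 21c312ab-4c48-ee92-c762-e2680ae35b74 FMD-8000-0W
Apr 04 15:54:22.4906 b3895ac1-e2ad-c58f-f189-f2bf8fb0db53 SUN4V-8002-42
Apr 09 23:59:05.3990 773efc10-2b1b-4961-809d-86f3594da8d0 SUN4V-8000-E2
Apr 13 04:07:34.3343 e6cf8844-2c8c-c0b1-8e21-f273ed3fff76 SUN4V-8000-E2
May 10 03:23:18.1358 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c SUN4V-8002-42
EOF
# prints:
# 2 SUN4V-8000-E2
# 2 SUN4V-8002-42
# 1 FMD-8000-0W
# 1 PCIEX-8000-0A
```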
Lists CPU & memory related retirements.
##### fma/fmstat-a-mcpumem-retire.out#####
cpu_blfails 0 failed cpu blacklists
cpu_blsupp 0 cpu blacklists suppressed
cpu_fails 0 cpu faults unresolveable
cpu_flts 0 cpu faults resolved
page_fails 0 page faults unresolveable
page_flts 0 page faults resolved
Will indicate if the cpumem-diagnosis engine is offline for T5xx0s.
##### fma/fmadm-config.out #####
##### fma/fmdump-e.out (Skips CEs & other benign ereports) #####
May 17 14:11:07.6296 ereport.fm.ferg.invalid
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.fabric
May 17 14:11:07.5240 ereport.io.pci.dpe
May 17 14:11:07.5240 ereport.io.pci.mdpe
May 17 14:11:07.5240 ereport.io.pci.sserr
May 17 14:11:07.5240 ereport.io.pciex.tl.ptlp
May 17 14:11:07.5240 ereport.io.pciex.rc.nfe-msg
May 17 14:11:07.5240 ereport.io.pci.dpe
May 17 14:11:07.5240 ereport.io.pci.sec-mdpe
May 17 14:11:07.5240 ereport.io.pciex.a-nonfatal
May 17 14:11:07.5240 ereport.io.pciex.tl.ptlp
May 17 14:11:07.5240 ereport.io.pciex.rc.ce-msg
May 17 14:11:07.5240 ereport.io.pci.dpe
May 17 14:11:07.5240 ereport.io.pci.sec-mdpe
May 17 14:11:07.5240 ereport.io.pciex.a-nonfatal
May 17 14:11:07.5240 ereport.io.pciex.tl.ptlp
May 17 14:11:07.5240 ereport.io.pciex.rc.ce-msg
May 17 14:11:07.6296 ereport.fm.ferg.invalid
May 17 14:30:23.6509 ereport.cpu.ultraSPARC-T2plus.dau
May 17 14:36:42.1586 ereport.fm.ferg.invalid
If T3 or T4, lists the DIMM # to path data.
##### fma/fmtopo-V.out #####
DIMM 0 - /SYS/PM0/CMP0/BOB0/CH0/D0
DIMM 1 - /SYS/PM0/CMP0/BOB0/CH0/D1
DIMM 2 - /SYS/PM0/CMP0/BOB0/CH1/D0
DIMM 3 - /SYS/PM0/CMP0/BOB0/CH1/D1
DIMM 4 - /SYS/PM0/CMP0/BOB1/CH0/D0
DIMM 5 - /SYS/PM0/CMP0/BOB1/CH0/D1
DIMM 6 - /SYS/PM0/CMP0/BOB1/CH1/D0
DIMM 7 - /SYS/PM0/CMP0/BOB1/CH1/D1
DIMM 8 - /SYS/PM0/CMP0/BOB2/CH0/D0
DIMM 9 - /SYS/PM0/CMP0/BOB2/CH0/D1
DIMM 10 - /SYS/PM0/CMP0/BOB2/CH1/D0
DIMM 11 - /SYS/PM0/CMP0/BOB2/CH1/D1
DIMM 12 - /SYS/PM0/CMP0/BOB3/CH0/D0
DIMM 13 - /SYS/PM0/CMP0/BOB3/CH0/D1
DIMM 14 - /SYS/PM0/CMP0/BOB3/CH1/D0
DIMM 15 - /SYS/PM0/CMP0/BOB3/CH1/D1
Sorts & counts unum & DIMM entries. The date range of events will be useful.
##### fma/fmdump-eV.out (Note: Uses server's timezone) #####
The first fmdump-eV entry is from Apr 13 2011 03:09:52.
---- FIRST DATE ---- ---- LAST DATE ---- COUNT DEVICE
Apr 13 2011 03:09:52 thru May 17 2011 15:06:18 64214 MB/CMP0/BR0/CH0
Apr 13 2011 03:13:01 thru May 17 2011 15:06:56 54833 MB/CMP0/BR0: CH0/D1/J0600
Apr 13 2011 03:13:13 thru May 17 2011 15:06:18 5426 MB/CMP0/BR0: CH0/D0/J0500
Apr 13 2011 04:07:22 thru May 16 2011 12:30:29 22 MB/CMP0/BR0: CH0/D0/J0500 CH1/D0/J0700
Apr 13 2011 04:07:22 thru May 17 2011 06:26:46 137 MB/CMP0/BR0
May 09 2011 11:31:48 thru May 17 2011 14:30:23 53 MB/CMP0/BR0: CH0/D1/J0600 CH1/D1/J0800
May 13 2011 11:54:57 thru May 17 2011 14:11:07 12 /pci@400
May 13 2011 11:54:57 thru May 17 2011 14:11:07 15 /pci@400/pci@0
May 13 2011 11:54:57 thru May 17 2011 14:11:07 15 /pci@400/pci@0/pci@8
May 13 2011 11:54:57 thru May 17 2011 14:11:07 15 /pci@400/pci@0/pci@8/scsi@0
##### fma/fmDump-eV.out (Note: Uses TSE's timezone.) #####
The first fmdump-eV entry is from May 17 2011 03:10:11.
---- FIRST DATE ---- ---- LAST DATE ---- COUNT DEVICE
May 17 2011 03:10:11 thru May 17 2011 15:06:18 1785 MB/CMP0/BR0/CH0
May 17 2011 03:14:25 thru May 17 2011 15:06:18 153 MB/CMP0/BR0: CH0/D0/J0500
May 17 2011 03:25:40 thru May 17 2011 15:06:56 1891 MB/CMP0/BR0: CH0/D1/J0600
May 17 2011 05:55:42 thru May 17 2011 06:26:46 3 MB/CMP0/BR0
May 17 2011 08:04:41 thru May 17 2011 15:02:23 108
May 17 2011 13:20:07 thru May 17 2011 14:11:07 8 /pci@400
May 17 2011 13:20:07 thru May 17 2011 14:11:07 10 /pci@400/pci@0
May 17 2011 13:20:07 thru May 17 2011 14:11:07 10 /pci@400/pci@0/pci@8
May 17 2011 13:20:07 thru May 17 2011 14:11:07 10 /pci@400/pci@0/pci@8/scsi@0
May 17 2011 13:20:07 thru May 17 2011 14:30:23 3 MB/CMP0/BR0: CH0/D1/J0600 CH1/D1/J0800
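The sorted & counted tables above are easier to triage when ranked by event count. The sketch below is one way to do that, assuming the "FIRST DATE thru LAST DATE COUNT DEVICE" layout shown in the samples. `top_devices` is a hypothetical name.

```sh
#!/bin/sh
# Print the N devices with the most ereports (default 5) as "count<TAB>device".
# Reads the summarized fmdump-eV table on stdin.
top_devices() {
    limit=${1:-5}
    awk '
        $5 == "thru" && $10 ~ /^[0-9]+$/ {
            dev = ""
            # Device names may contain spaces, so rejoin fields 11 onward.
            for (i = 11; i <= NF; i++) dev = dev (i > 11 ? " " : "") $i
            if (dev == "") dev = "(no device)"
            print $10 "\t" dev
        }
    ' | LC_ALL=C sort -k1,1nr | head -n "$limit"
}

# Demo with lines from the first sample table; prints the two busiest devices.
top_devices 2 <<'EOF'
Apr 13 2011 04:07:22 thru May 17 2011 06:26:46 137 MB/CMP0/BR0
Apr 13 2011 03:13:01 thru May 17 2011 15:06:56 54833 MB/CMP0/BR0: CH0/D1/J0600
Apr 13 2011 03:09:52 thru May 17 2011 15:06:18 64214 MB/CMP0/BR0/CH0
EOF
```

Rows with an empty DEVICE column, as seen in the second sample, are kept & labelled "(no device)" so their counts are not lost.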
##### fma/fmadm-faulty.out #####
--------------- ------------------------------------ -------------- ---------
TIME EVENT-ID MSG-ID SEVERITY
--------------- ------------------------------------ -------------- ---------
May 10 03:23:18 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c SUN4V-8002-42 Critical
Fault class : fault.memory.dimm-ue-imminent 95%
Affects : mem:///unum=MB/CMP0/BR0/CH0/D0/J0500
faulted but still in service
FRU : "MB/CMP0/BR0/CH0/D0/J0500" (hc://:serial=00AD0110111091A63A:part=511-1151-01-Rev-05/motherboard=0/chip=0/branch=0/dram-channel=0/dimm=0) 95%
faulty
Description : A pattern of correctable errors has been observed suggesting the
potential exists that an uncorrectable error may occur.
Refer to http://sun.com/msg/SUN4V-8002-42 for more information.
Apr 04 09:57:49 21c312ab-4c48-ee92-c762-e2680ae35b74 FMD-8000-0W Minor
Fault class : defect.sunos.fmd.nosub
Description : The Solaris Fault Manager received an event from a component to
which no automated diagnosis software is currently subscribed.
Refer to http://sun.com/msg/FMD-8000-0W for more information.
Apr 02 05:17:43 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786 PCIEX-8000-0A Critical
Fault class : fault.io.pciex.device-interr
Affects : dev:////pci@400
faulted but still in service
FRU : "MB" (hc://:product-id=SUNW,T5240:chassis-id=FML1020004:server-id=fwgams3:serial=0328MSL-1002AV01M0:part=540793902/motherboard=0)
faulty
Description : A problem was detected for a PCIEX device.
Refer to http://sun.com/msg/PCIEX-8000-0A for more information.
Apr 09 23:59:05 773efc10-2b1b-4961-809d-86f3594da8d0 SUN4V-8000-E2 Critical
Fault class : fault.memory.bank max 95%
Affects : mem:///unum=MB/CMP0/BR0/CH0/D1/J0600
mem:///unum=MB/CMP0/BR0/CH1/D1/J0800
faulted but still in service
FRU : "MB/CMP0/BR0/CH0/D1/J0600" (hc://:serial=00AD01101110A1A65B:part=511-1151-01-Rev-05/motherboard=0/chip=0/branch=0/dram-channel=0/dimm=1) max 95%
"MB/CMP0/BR0/CH1/D1/J0800" (hc://:serial=00AD0110111031A66B:part=511-1151-01-Rev-05/motherboard=0/chip=0/branch=0/dram-channel=1/dimm=1) 95%
faulty
Description : The number of errors associated with this memory module has
exceeded acceptable levels. Refer to
http://sun.com/msg/SUN4V-8000-E2 for more information.
##### Tx000/showfaults_-v #####
Last POST Run: Thu Apr 28 14:04:43 2011
Post Status: Passed all devices
ID Time FRU Class Fault
1 Apr 28 14:00:36 /SYS/MB Host detected fault MSGID: PCIEX-8000-0A UUID: 6dd87a3b-8b9a-cb7c-f010-af2b3bb16786
2 Apr 28 14:00:36 /SYS/MB Host detected fault MSGID: SUN4V-8000-E2 UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
3 May 10 08:21:57 /SYS/MB/CMP0/BR0/CH0/D0 Host detected fault MSGID: SUN4V-8002-42 UUID: 45b080f8-b0ce-eb40-d6ab-fc16e058ca1c
4 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH0/D0 Host detected fault MSGID: SUN4V-8000-E2 UUID: e6cf8844-2c8c-c0b1-8e21-f273ed3fff76
5 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH0/D0 Host detected fault MSGID: SUN4V-8000-E2 UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
6 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH0/D1 Host detected fault MSGID: SUN4V-8000-E2 UUID: 773efc10-2b1b-4961-809d-86f3594da8d0
7 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH0/D1 Host detected fault MSGID: SUN4V-8002-42 UUID: b3895ac1-e2ad-c58f-f189-f2bf8fb0db53
8 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH0/D1 Host detected fault MSGID: SUN4V-8000-E2 UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
9 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH1/D0 Host detected fault MSGID: SUN4V-8000-E2 UUID: e6cf8844-2c8c-c0b1-8e21-f273ed3fff76
10 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH1/D0 Host detected fault MSGID: SUN4V-8000-E2 UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
11 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH1/D1 Host detected fault MSGID: SUN4V-8000-E2 UUID: 773efc10-2b1b-4961-809d-86f3594da8d0
12 Apr 28 14:00:36 /SYS/MB/CMP0/BR0/CH1/D1 Host detected fault MSGID: SUN4V-8000-E2 UUID: d807ae91-64fc-e7ed-9b00-96b3fb44c2a3
##### sysconfig/last-20-reboot.out #####
reboot system boot Tue May 17 14:17
reboot system down Tue May 17 13:41
reboot system boot Tue May 17 13:25
reboot system down Tue May 17 13:19
reboot system boot Mon May 16 13:51
reboot system down Mon May 16 13:42
reboot system boot Fri May 13 13:51
reboot system down Fri May 13 13:39
reboot system boot Fri May 13 11:58
reboot system down Fri May 13 11:58
reboot system boot Fri May 13 11:47
reboot system down Fri May 13 10:43
reboot system boot Wed May 11 11:47
reboot system down Wed May 11 11:43
reboot system boot Wed May 11 11:43
reboot system down Wed May 11 11:38
reboot system boot Wed May 11 11:38
reboot system down Wed May 11 09:26
reboot system boot Wed May 11 09:25
reboot system down Tue May 10 21:32
The showdate section indicates the difference between the ILOM (SC) clock and the host clock.
##### Tx000/showdate ##### (SC normally in UTC!)
SC Date: Tue May 17 20:02:57 2011
Host Date: Tue May 17 15:04:25 2011
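The offset between the two clocks can be worked out from the two showdate lines, which helps when lining up showlogs entries (SC time) with messages entries (host time). The following is a minimal Python sketch, assuming the date format shown above and an English/C locale; the function name is hypothetical and is not part of explosum.

```python
from datetime import datetime

# Format of the showdate lines, e.g. "Tue May 17 20:02:57 2011"
DATE_FORMAT = "%a %b %d %H:%M:%S %Y"

def clock_offset_seconds(sc_date: str, host_date: str) -> int:
    """Return SC time minus host time, in seconds (hypothetical helper)."""
    sc = datetime.strptime(sc_date, DATE_FORMAT)
    host = datetime.strptime(host_date, DATE_FORMAT)
    return int((sc - host).total_seconds())

offset = clock_offset_seconds("Tue May 17 20:02:57 2011",
                              "Tue May 17 15:04:25 2011")
print(offset)  # 17912 seconds, i.e. 4 h 58 min 32 s
```

In this sample the SC clock is roughly five hours ahead of the host clock, so SC log timestamps need to be shifted by about that much before comparing them with host messages.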
##### Tx000/showlogs_-v #####
Apr 28 13:53:18: Chassis |critical: "Host has been powered off"
Apr 28 13:57:55: Audit |major : "Upgrade Succeeded"
Apr 28 14:00:36: Chassis |major : "Host detected fault, MSGID: FMD-8000-0W"
Apr 28 14:00:36: Chassis |major : "Host detected fault, MSGID: SUN4V-8000-E2"
Apr 28 14:00:36: Chassis |major : "Host detected fault, MSGID: PCIEX-8000-0A"
Apr 28 14:00:36: Chassis |major : "Host detected fault, MSGID: SUN4V-8002-42"
Apr 28 14:00:36: Chassis |major : "Host detected fault, MSGID: SUN4V-8000-E2"
Apr 28 14:00:37: Chassis |major : "Host detected fault, MSGID: SUN4V-8000-E2"
Apr 28 14:00:38: Chassis |major : "Host has been powered on"
Apr 28 14:02:25: Chassis |major : "Hot removal of HDD4"
Apr 28 14:02:26: Chassis |major : "Hot insertion of HDD3"
Apr 28 14:02:28: Chassis |major : "Hot insertion of HDD2"
Apr 28 14:02:29: Chassis |major : "Hot removal of HDD7"
Apr 28 14:02:43: Chassis |major : "Hot insertion of HDD1"
Apr 28 14:02:43: Chassis |major : "Hot removal of HDD6"
Apr 28 14:02:44: Chassis |major : "Hot insertion of HDD0"
Apr 28 14:02:44: Chassis |major : "Hot removal of HDD5"
Apr 28 14:05:32: Chassis |major : "Host is running"
May 10 08:21:59: Chassis |major : "Host detected fault, MSGID: SUN4V-8002-42"
May 11 14:22:35: Chassis |critical: "SP Request to Reset Host due to Watchdog"
May 11 14:22:35: Chassis |major : "Host is running"
May 11 16:44:53: Chassis |critical: "SP Request to Reset Host due to Watchdog"
May 11 16:44:53: Chassis |major : "Host is running"
sc>
##### messages/messages.1 Filtered log of non-external Storage messages #####
Apr 28 08:02:42 fwgams3 xntpd[410]: [ID 866926 daemon.notice] xntpd exiting on signal 15
Apr 28 08:02:48 fwgams3 syslogd: going down on signal 15
Apr 28 08:06:45 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit <--- logs filtered to display reboots & errors
Apr 28 06:37:47 fwgams3 xntpd[432]: [ID 866926 daemon.notice] xntpd exiting on signal 15
Apr 28 06:37:47 fwgams3 rpcbind: [ID 564983 daemon.error] rpcbind terminating on signal.
Apr 28 08:37:54 fwgams3 genunix: [ID 672855 kern.notice] syncing file systems...
Apr 28 08:37:55 fwgams3 genunix: [ID 904073 kern.notice] done
Apr 28 08:38:57 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit
Apr 28 08:49:40 fwgams3 snmpXdmid: [ID 216524 daemon.error] Registration with DMI failed. err = 831.
##### messages/messages Filtered log of non-external Storage messages #####
May 17 12:51:33 fwgams3 xntpd[295]: [ID 990412 daemon.error] can't open /var/ntp/ntp.drift.TEMP: No space left on device
May 17 12:55:51 fwgams3 tictimed[803]: [ID 768879 user.error] [tictimed] Error opening file /var/log/.lwact.16838
May 17 13:20:07 fwgams3 ^Mpanic[cpu70]/thread=2a10267fca0:
May 17 13:20:07 fwgams3 unix: [ID 198415 kern.notice] Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 13:20:07 fwgams3 unix: [ID 100000 kern.notice]
May 17 13:20:07 fwgams3 genunix: [ID 723222 kern.notice] 000002a10267f700 px:px_err_panic+1ac (1959800, 13c7c00, 63, 2a10267f7b0, 0, 0)
May 17 13:20:07 fwgams3 genunix: [ID 672855 kern.notice] syncing file systems...
May 17 13:23:44 fwgams3 genunix: [ID 851671 kern.notice] dump succeeded
May 17 13:24:47 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit
May 17 13:25:39 fwgams3 savecore: [ID 570001 auth.error] reboot after panic: Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 13:25:39 fwgams3 savecore: [ID 219048 auth.error] not enough space in /var/crash/fwgams3 (715 MB avail, 1934 MB needed)
May 17 13:25:40 fwgams3 savecore: [ID 570001 auth.error] reboot after panic: Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 13:25:40 fwgams3 savecore: [ID 219048 auth.error] not enough space in /var/crash/fwgams3 (715 MB avail, 1934 MB needed)
May 17 13:36:32 fwgams3 snmpXdmid: [ID 216524 daemon.error] Registration with DMI failed. err = 831.
May 17 14:11:07 fwgams3 genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
May 17 14:11:07 fwgams3 ^Mpanic[cpu70]/thread=2a1026adca0:
May 17 14:11:07 fwgams3 unix: [ID 198415 kern.notice] Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 14:11:07 fwgams3 unix: [ID 100000 kern.notice]
May 17 14:11:07 fwgams3 genunix: [ID 723222 kern.notice] 000002a1026ad700 px:px_err_panic+1ac (1959800, 13c7c00, 63, 2a1026ad7b0, 0, 0)
May 17 14:11:07 fwgams3 genunix: [ID 672855 kern.notice] syncing file systems...
May 17 14:16:02 fwgams3 genunix: [ID 851671 kern.notice] dump succeeded
May 17 14:17:05 fwgams3 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_142900-15 64-bit
May 17 14:17:56 fwgams3 savecore: [ID 570001 auth.error] reboot after panic: Fatal error has occured in: PCIe fabric.(0x0)(0x63)
May 17 14:28:48 fwgams3 snmpXdmid: [ID 216524 daemon.error] Registration with DMI failed. err = 831.
May 17 14:29:37 fwgams3 iscsi: [ID 286457 kern.notice] NOTICE: iscsi connection(129) unable to connect to target SENDTARGETS_DISCOVERY (errno:146)
The console output is filtered for certain events (panics, reboots, ...) plus a dated entry soon after each one.
##### Tx000/consolehistory_-v #####
OpenBoot 4.30.10, 32544 MB memory available, Serial #89659964.
fwgams3 console login: May 17 13:25:43 fwgams3 iscsi: NOTICE: iscsi connection(45) unable to connect to target SENDTARGETS_DISCOVERY (errno:146)
May 17 13:25:43 fwgams3 iscsi: NOTICE: iscsi discovery failure - SendTargets (172.030.184.069)
May 17 13:25:57 fwgams3 pkid[814]: PKI Service has been started
May 17 13:26:02 fwgams3 pkid[1227]: Reflection PKI Services Manager is already running [process id: 814]
panic[cpu70]/thread=2a1026adca0: Fatal error has occured in: PCIe fabric.(0x0)(0x63)
000002a1026ad700 px:px_err_panic+1ac (1959800, 13c7c00, 63, 2a1026ad7b0, 0, 0)
%l0-3: 0000000000011801 0000000001959800 0000000000000000 0000000000000001
%l4-7: 0000000000000000 0000000001875c00 0000000000000001 0000000000000000
000002a1026ad810 px:px_err_fabric_intr+1b4 (300055f1540, 0, 200, 1, 63, 200)
%l0-3: 0000000000000200 0000000001959e60 0000000001959c00 0000000000000001
%l4-7: 0000000001959e48 0000000001959c00 0000000001959e40 0000000001959c00
000002a1026ad980 px:px_msiq_intr+1e8 (60031c4d7e0, 30003ceb860, 13ba2dc, 0, 1, 300014d5bc8)
%l0-3: 0000060031cc3e60 00000300055ef800 0000030003ceb860 0000000000000000
%l4-7: 0000000000000000 00000000034c0000 000002a1026ada80 0000000000000030
syncing file systems
OpenBoot 4.30.10, 32544 MB memory available, Serial #89659964.
fwgams3 console login: May 17 14:18:03 fwgams3 iscsi: NOTICE: iscsi connection(51) unable to connect to target SENDTARGETS_DISCOVERY (errno:146)
May 17 14:18:03 fwgams3 iscsi: NOTICE: iscsi discovery failure - SendTargets (172.030.184.069)
May 17 14:18:13 fwgams3 pkid[683]: PKI Service has been started
May 17 14:18:20 fwgams3 pkid[1119]: Reflection PKI Services Manager is already running [process id: 683]
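The idea of keeping an event line plus the next dated line can be sketched as below. This is only an illustration: the real filter lives in the compiled es program and its rules are not documented here (the sample above clearly keeps more lines, such as stack frames). The patterns and the function name are assumptions.

```python
import re

# Assumed event markers; the real tool's list is not documented here.
EVENT = re.compile(r"panic\[|OpenBoot")
# Syslog-style timestamp such as "May 17 13:25:43".
DATED = re.compile(r"[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}")

def filter_console(lines):
    """Keep each event line and the first dated line that follows it."""
    kept = []
    want_date = False
    for line in lines:
        if EVENT.search(line):
            kept.append(line)
            want_date = True
        elif want_date and DATED.search(line):
            kept.append(line)
            want_date = False
    return kept

sample = [
    "OpenBoot 4.30.10, 32544 MB memory available, Serial #89659964.",
    "fwgams3 console login: May 17 13:25:43 fwgams3 iscsi: NOTICE: iscsi connection(45) unable to connect",
    "May 17 13:25:57 fwgams3 pkid[814]: PKI Service has been started",
    "panic[cpu70]/thread=2a1026adca0: Fatal error has occured in: PCIe fabric.(0x0)(0x63)",
    "syncing file systems",
]
for line in filter_console(sample):
    print(line)
```

The dated line gives an approximate time for console events that carry no timestamp of their own, such as the OpenBoot banner.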
The analysis section covers a small number of known problems with T3/T4 or T5xx0 servers.
========== Analysis (if any) ============
Bug 7070900. Replace failed DIMM. Full power cycle may stabilize the system.
Bug 7133194. Some DIMMs erroneously have ereports. Please open a defect to SunBugTraq (minor issue).
List of Analyzable Faults:
Sun Alert 1468850.1: T4-4 with FW prior 8.1.5 + fault SUN4V-8000-E2 + failures on BOB0/CH1/D0 & BOB1/CH1/D0, "Alert 1468850.1"
T3-2 with fault ILOM-8000-2V "HH3 T3-2 Risers"
T3-2 with fault SUN4V-8002-PX, "T3-2 Riser2"
7016293 on T3 or T4 with Panic trap 9, "Trap9 7016293"
7108029 on T3 or T4 with Panic Trap 31 + 147440-04 (assumed to be vxio related), "vxio 147440-04"
7142527 on T3 or T4 with FW prior 8.1.5 + fault SUN4V-8002-US (but could be 7172435), "T3 DYNA LEAK"
7172435 on T3 or T4 with FW 8.1.5 or later + fault SUN4V-8002-US, "ILOM 7172435"
7141035 on T3 or T4 with message Send Mondo, "T3 Mondo"
7151759 on T3 or T4 with fault SUN4V-8002-KQ, "T3 c2c"
7146062 on T3 or T4 with FW prior 8.1.5 + fault SPT-8000-DH, "Power Glitch"
7064258 on T3 or T4 with FW prior 8.1.4.d + message Disconnected command timeout ..., "T3 Discon Timeout"
7115336 on T3 or T4 with message PCI panic (0x43) , "T3 PCI 0x43"
6863127 on T5xx0 with message PCI panic (0x41) or 43 + fault PCIEX-8000-3S + prior KJP 142909-17, "T5x PCI 0x41"
DIMM voltage mismatch on T5xx0, "DIMM Volt"
ILOM memory leak on T5xx0 with FW prior 7.2.7.b + fault SUN4V-8002-SP, "ILOM Mem Leak"
1.5V Hynix DIMM failure on T5240 + Unrecoverable Hardware Panic, "Hynix 1.5V"
1.5V Micron DIMM failure on T5240 + Unrecoverable Hardware Panic, "Micron 1.5V"
Warning if T5140 & FW prior 7.3.1.a + missing 511-1604, "PDB FW"
Warning if T5120 & FW prior 7.3.1.a + missing 511-1604, "PDB FW"
7133194 on CMT if 2 DIMMs have ereport at exact same time, "DupEreport"
7043851 on CMT if IPMItool panic due to bmc:do_vc2bmc , "IPMI BMC"
All CMTs with uncertified DIMMs, "Uncertified DIMM"
All CMTs without 147705-01 + fault PCIEX-8000-KP, "PCIE driver"
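Several of the conditions above depend on the firmware being "prior" to a given release, such as 8.1.5 or 7.3.1.a. How es performs this comparison is not stated; the following Python sketch shows one plausible way to order such version strings, assuming dot-separated fields where a trailing letter is a sub-release. Both function names are hypothetical.

```python
def version_key(version: str):
    """Turn '7.3.1.a' into a sortable tuple; numeric fields sort before letters."""
    key = []
    for part in version.split("."):
        if part.isdigit():
            key.append((0, int(part), ""))
        else:
            key.append((1, 0, part))
    return tuple(key)

def is_prior(installed: str, threshold: str) -> bool:
    """True if the installed firmware sorts strictly before the threshold."""
    return version_key(installed) < version_key(threshold)

print(is_prior("7.2.2.b", "7.3.1.a"))  # True
print(is_prior("7.4.2", "7.3.1.a"))    # False
print(is_prior("8.1.4.c", "8.1.4.d"))  # True
```

Comparing field by field avoids the pitfall of plain string comparison, where "7.10" would sort before "7.9".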
Logs
An Explosum/Snapper logging feature places the entries below on a Santa Clara lab server so that critical faults and conditions can be tracked. If a '.' follows the SR number, the entry came from snapper, as in the T4-4 entry below; otherwise it came from explosum.
Tue Apr 16 12:00:00 PDT 2013
3-7072747891 T5220 7.0.9.c S10U6 BEL0818HZZ
3-7073994802 E4900 5.20.14 S10U9 0408HH2215
3-7074228321 T2000 6.3.0 S10U3 0651NNN0ER
3-7074228321 T2000 6.3.0 S10U3 0651NNN0ER
3-7073549901 V440 4.22.33 S9U7
3-7074429541.T4-4 .8.2.2.c 1307BDY72C
3-7019584361 X4102 11.1S5.1 0817ALB1EA SUNOS-8000-KL
3-7007081871 T5140 7.1.7 S10U6 PCIEX-8000-3S SUNOS-8000-1L
3-7068823851 T5140 7.2.2.b S10U6 TFL0801002 FMD-8000-3F
3-7019584361 X4102 11.1S5.1 0817ALB1EA SUNOS-8000-KL
3-7070119133 T2000 6.5.11 S10U4 0824NNN0D2
3-7057487097 T2000 6.6.7 S10U6 0642NNN0G1 SUN4V-8000-8Q
3-7057487097 T2000 6.6.7 S10U6 0642NNN0G1 SUN4V-8000-8Q
3-6994233271 I86PC S10U9 GB8050BM5L SUNOS-8000-FU PCIEX-8000-KP
3-6994233271 I86PC S10U9 CZ3125JL5H SUNOS-8000-FU
3-7054275671 T4-4 8.2.1.b S11S13.4 1246BDY456
3-7075256021 T5120 7.2.7.b S10U6 BEL08046O0
3-7059605741 X4500 S10U5 0746AMT039 ZFS-8000-D3
3-7053197451 T6340 7.2.6 S10U9
3-7072103291 T5240 7.4.2 S10U9 BDL1048111 DupEreport
3-7053197451 T6340 7.2.6 S10U8 08388N0055
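The '.' rule for telling snapper entries from explosum entries can be applied mechanically. The following is a minimal Python sketch, assuming every entry starts with an SR number of the form digits-digits as in the list above; the function name is hypothetical.

```python
import re

# SR number (e.g. "3-7074429541") optionally followed by a '.'
SR_PATTERN = re.compile(r"^(\d+-\d+)(\.?)")

def entry_source(line: str) -> str:
    """Return 'snapper' if a '.' follows the SR number, else 'explosum'."""
    match = SR_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("line does not start with an SR number")
    return "snapper" if match.group(2) else "explosum"

print(entry_source("3-7074429541.T4-4 .8.2.2.c 1307BDY72C"))         # snapper
print(entry_source("3-7072747891 T5220 7.0.9.c S10U6 BEL0818HZZ"))   # explosum
```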
Attachments
This solution has no attachment