Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1021322.1
Update Date:2017-09-19
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1021322.1 :   PCIEX-8000-J5 - PCIEX subsystem problem  


Related Items
  • SPARC T8-1
  • SPARC T8-4
  • SPARC M7-8
  • SPARC T7-4
  • Oracle SuperCluster M7 Hardware
  • SPARC M8-8
  • SPARC T8-2
  • SPARC T7-2
  • Oracle SuperCluster M8 Hardware
  • SPARC M7-16
  • SPARC T7-1
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun PSH

PreviouslyPublishedAs
PCIEX-8000-J5


Applies to:

Sun Microsystems > Boards
SPARC M7-8
SPARC M7-16
SPARC T7-1
SPARC T7-2
All Platforms

Purpose

Provide additional information for message ID: PCIEX-8000-J5

Details

Predictive Self-Healing Article
PCIEX subsystem problem

Type

Fault
fault.io.pciex.device-interr-corr

Severity

Major

Description

Too many recovered internal errors have been detected within the specified PCIEX device. This may degrade into a non-recoverable fault.

Automated Response

One or more device instances may be disabled

Impact

Loss of services provided by the device instances associated with this fault

Suggested Action for System Administrator

It is not always easy to tell whether this error is caused by a hardware or a software issue. Run `fmadm faulty` for additional information identifying the device, or contact Oracle for support.
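As a minimal sketch of the identification step, the helper below pulls the event IDs for one message ID out of `fmadm faulty` output. The function name `fault_event_ids` is hypothetical, and the column layout (TIME, EVENT-ID, MSG-ID, SEVERITY) is an assumption that should be checked against the output on your own system.

```shell
# fault_event_ids MSGID
# Reads `fmadm faulty` output on stdin and prints the EVENT-ID of every
# summary row whose MSG-ID column equals MSGID.
# Assumption: summary rows look like
#   Mon DD HH:MM:SS <event-id> <msg-id> <severity>
# so EVENT-ID is field 4 and MSG-ID is field 5.
# On Solaris 10, /usr/bin/awk may not accept -v; nawk or
# /usr/xpg4/bin/awk can be substituted.
fault_event_ids() {
    awk -v id="$1" '$5 == id { print $4 }'
}

# Typical use on the affected host:
#   fmadm faulty | fault_event_ids PCIEX-8000-J5
```

Reading from stdin rather than calling `fmadm` directly lets the same helper be tried against saved output.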

 

For Solaris 10

  • On first occurrence, if the device showing the issue is a network device using the e1000g driver, verify that the fix for <BUG 15693815> is in place.  <BUG 15693815> is fixed in Patch 147270-01 (SPARC) or 147271-01 (x86).  If the patch is not installed, install it. After installing the Solaris 10 patch, clear the PCIEX-8000-KP and PCIEX-8000-J5 faults using the fmadm(1M) command.  For instructions on clearing the fault, see:

How to clear faults in FMA after component replacement on Sun Fire[TM] servers. (Doc ID 1009467.1)

How To Clear FMA faults from Solaris[TM] and SC (System Controller) on T1000/T2000 T5120/T5220/T5140/T5240/T5440, T3-1/T3-2/T3-4, T4-1/T4-2/T4-4 (Doc ID 1004229.1)

  • If the device is any other PCI card, a known issue exists that can cause good hardware to report this error.  After installing the Solaris 10 patch, clear the PCIEX-8000-KP and PCIEX-8000-J5 faults using the fmadm(1M) command.

See Alert <Document 1369835.1>  Solaris 10 SPARC Kernel Patch 137137-09 May Cause Erroneous PCIEX-8000-KP Reports During PCIE Correctable Events

   If these known issues do not apply, collect an explorer and, if applicable, a snapshot, then open or update a Service Request with Oracle. See the References section for information on explorer and snapshot.

 

For Solaris 11 Express based upon builds snv_87 through snv_170

  • If the device is any PCI card, a known issue exists that can cause good hardware to report this error.  After upgrading Solaris 11, clear the PCIEX-8000-KP and PCIEX-8000-J5 faults using the fmadm(1M) command.

    See Alert <Document 1369835.1>  Solaris 10 SPARC Kernel Patch 137137-09 May Cause Erroneous PCIEX-8000-KP Reports During PCIE Correctable Events

   If this known issue does not apply, collect an explorer and, if applicable, a snapshot, then open or update a Service Request with Oracle. See the References section for information on explorer and snapshot.

 

For Other Solaris Releases

  • The PCI device is faulty.  Collect an explorer and, if applicable, a snapshot, then open or update a Service Request with Oracle. See the References section for information on explorer and snapshot.

 

Notes on determining which Kernel Patch is installed:

The following commands can be useful for determining which kernel patch is currently installed:

  SPARC:      showrev -p | egrep "(137137|147147|147270|147705)-.. O" | cut -d" " -f2
  x86:          showrev -p | egrep "(147148|147271)-.. O" | cut -d" " -f2
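The two one-liners differ only in the patch numbers they match, so they can be wrapped in one helper. This is a sketch under stated assumptions: `matching_patches` is a hypothetical name, it expects a POSIX shell such as ksh or bash rather than the legacy Bourne shell, and it reads `showrev -p` output from stdin so it can also be run against saved output. The matching logic is the same as above: a `<patch>-<rev>` string followed by the `Obsoletes:` field, with the second space-separated field printed.

```shell
# matching_patches PATCH_ID...
# Reads `showrev -p` output on stdin and prints the patch-revision string
# (for example 137137-09) for each listed base patch ID that is present.
matching_patches() {
    # Build an alternation such as 137137|147147 from the arguments.
    ids=$(printf '%s|' "$@")
    ids=${ids%|}
    # " O" anchors on the "Obsoletes:" field that follows the patch ID.
    egrep "(${ids})-.. O" | cut -d" " -f2
}

# Typical use:
#   showrev -p | matching_patches 137137 147147 147270 147705   # SPARC
#   showrev -p | matching_patches 147148 147271                 # x86
```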

 

Useful command for determining error rates and whether the issue matches the following alert:

Solaris 10 SPARC Kernel Patch 137137-09 May Cause Erroneous PCIEX-8000-KP/-J5 Reports During PCIE Correctable Events (Doc ID 1369835.1)

$ fmdump -e -c 'ereport.io.pciex.dl.btlp' -n "detector.device-path=/pci@10,600000*" \
        -t 01/01/13 <PATH_TO_errlog_FILE> | cut -b1-9 | uniq -c \
        | awk '{print $2,$3,"2013",$4":00,"$1}' | egrep -v TIME \
        | sort -t, -rn -k 2 | head

Dec 06 2013 08:00,12 
Dec 03 2013 02:00,4 
Nov 27 2013 00:00,3 
Nov 26 2013 02:00,3 
Nov 22 2013 08:00,3 
Nov 29 2013 08:00,2 
Nov 20 2013 00:00,2 
Nov 19 2013 02:00,2 
Dec 05 2013 21:00,2 
Dec 04 2013 00:00,2


The -c, -n, and -t options, the error log file, and the year hard-coded in the awk step (2013) may need to change for your situation.
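The hourly bucketing in that pipeline can also be sketched as a reusable function. This version swaps the `cut | uniq -c` step for an awk associative array and takes the year as a parameter instead of hard-coding it; the name `hourly_counts` and the assumed input layout (`Mon DD HH:MM:SS.ffff CLASS`, as printed by `fmdump -e`) are illustrative assumptions, not part of the original procedure.

```shell
# hourly_counts YEAR [TOP_N]
# Reads `fmdump -e` output on stdin and prints "Mon DD YEAR HH:00,count"
# for the TOP_N busiest hours (default 10), highest count first.
# On Solaris 10, nawk or /usr/xpg4/bin/awk may be needed for -v.
hourly_counts() {
    year=$1
    top=${2:-10}
    awk -v year="$year" '
        $1 == "TIME" { next }              # skip the fmdump header line
        NF >= 3 {
            hour = substr($3, 1, 2)        # HH from HH:MM:SS.ffff
            count[$1 " " $2 " " year " " hour ":00"]++
        }
        END { for (k in count) print k "," count[k] }
    ' | sort -t, -k2,2nr | head -n "$top"
}

# Typical use:
#   fmdump -e -c 'ereport.io.pciex.dl.btlp' <PATH_TO_errlog_FILE> | hourly_counts 2013
```

Hours with equal counts come out in no particular order, as in the sample output above.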

 

 




Product
Boards

Product_uuid
e368db18-0fd5-11d8-84cb-080020a9ed93

References

<BUG:15693815> - SUNBT7015123-SOLARIS_11 PCIEX-8000-J5, TYPE: FAULT, VER: 1, SEVERITY: MAJOR WITH
<NOTE:1009467.1> - How to clear faults in FMA after component replacement on Sun Fire[TM] servers.
<NOTE:1153444.1> - Oracle Services Tools Bundle (STB) - RDA/Explorer, SNEEP, ACT
<NOTE:1533993.1> - Collect XSCF snapshot(s) by running STB7.3 (or newer ) domain Explorer on SPARC Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers
<NOTE:1004229.1> - How To Clear FMA faults from Solaris[TM] and SC (System Controller) on T1000/T2000 T5120/T5220/T5140/T5240/T5440,T6320,T6340, T3-1/T3-2/T3-4, T4-1/T4-2/T4-4

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.