Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1956704.1
Update Date:2017-05-01
Keywords:

Solution Type  Problem Resolution Sure

Solution  1956704.1 :   T4-1/T4-2 panics with Fatal error has occured in: PCIe fabric.(0x1)(0x41)  


Related Items
  • SPARC T4-1
  • Netra SPARC T4-1 Server
  • Netra SPARC T4-2 Server
  • SPARC T4-2
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T4

In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-10008933441>

Applies to:

SPARC T4-1 - Version All Versions to All Versions [Release All Releases]
Netra SPARC T4-1 Server - Version All Versions to All Versions [Release All Releases]
Netra SPARC T4-2 Server - Version All Versions to All Versions [Release All Releases]
SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

 The system panics frequently with the following message and stack trace:

Dec 14 01:28:24 Madison panic[cpu54]/thread=2a103ee1c60:
Dec 14 01:28:24 Madison unix: [ID 198415 kern.notice] Fatal error has occured in: PCIe fabric.(0x1)(0x41)
Dec 14 01:28:24 Madison unix: [ID 100000 kern.notice]
Dec 14 01:28:24 Madison genunix: [ID 723222 kern.notice] 000002a103ee16a0 px:px_err_panic+1c4 (106f2400, 1, 41, 7bfba800, 1, 106f0530)
Dec 14 01:28:24 Madison genunix: [ID 702911 kern.notice]   %l0-3: 000002a103ee1750 000010015f824900 00000000106f2800 000000000000005f
Dec 14 01:28:24 Madison   %l4-7: 0000000000000000 0000000010508400 ffffffffffffffff 0000000000000000
Dec 14 01:28:25 Madison genunix: [ID 723222 kern.notice] 000002a103ee17b0 px:px_err_fabric_intr+1ac (10015f822000, 1, 0, 1, 41, 4000f3da8b8)
Dec 14 01:28:25 Madison genunix: [ID 702911 kern.notice]   %l0-3: 0000000000000008 000000007bfba970 0000000000000008 0000000000000000
Dec 14 01:28:25 Madison   %l4-7: 0000000000000024 000000007bfba800 0000000000000001 000010015f825858
Dec 14 01:28:25 Madison genunix: [ID 723222 kern.notice] 000002a103ee1930 px:px_msiq_intr+208 (10015d6dadc8, 0, 9, 10015f828c58, 1, 2)
Dec 14 01:28:25 Madison genunix: [ID 702911 kern.notice]   %l0-3: 0000000000000000 00000000411e8000 000004000f3d7d08 000010015f822000
Dec 14 01:28:25 Madison   %l4-7: 000010015f828e18 000004000f3da8b8 000010015f815308 0000000000000031
Dec 14 01:28:25 Madison unix: [ID 100000 kern.notice]
Dec 14 01:28:25 Madison genunix: [ID 672855 kern.notice] syncing file systems...
Dec 14 01:28:27 Madison genunix: [ID 904073 kern.notice]  done
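
To check whether a saved messages file contains this panic signature, a short script along the following lines may help. It is a minimal sketch, not an Oracle tool: the function name find_fabric_panics and the sample list are assumptions made for illustration. The pattern matches the message exactly as the kernel prints it, including the spelling "occured".

```python
import re

# Matches the panic string exactly as logged (the kernel spells it
# "occured") and captures the two hex codes that follow it.
PANIC_RE = re.compile(
    r"Fatal error has occured in: PCIe fabric\."
    r"\((0x[0-9a-fA-F]+)\)\((0x[0-9a-fA-F]+)\)"
)


def find_fabric_panics(lines):
    """Return (line_number, first_code, second_code) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = PANIC_RE.search(line)
        if match:
            hits.append((number, match.group(1), match.group(2)))
    return hits


# Sample input copied from the panic output shown above.
sample = [
    "Dec 14 01:28:24 Madison panic[cpu54]/thread=2a103ee1c60:",
    "Dec 14 01:28:24 Madison unix: [ID 198415 kern.notice] "
    "Fatal error has occured in: PCIe fabric.(0x1)(0x41)",
    "Dec 14 01:28:25 Madison genunix: [ID 672855 kern.notice] "
    "syncing file systems...",
]

for number, first, second in find_fabric_panics(sample):
    print(f"line {number}: PCIe fabric panic, codes {first} {second}")
```

Any iterable of text lines can be passed in, so an open file object for a copy of the system messages file works as well as the in-memory list used here.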

 

The system may also log ereports or raise an FMA alert with the same timestamp as the panic, such as:

ereports:

ereport.io.pci.fabric
ereport.io.pci.sserr
ereport.io.pci.ma
ereport.io.pci.sec-sta
ereport.io.pciex.nonfatal
ereport.io.scsi.cmd.disk.dev.rqs.derr
ereport.io.pci.sec-rserr
ereport.io.pciex.tl.acsv
ereport.io.pciex.rc.nfe-msg
ereport.io.pciex.rc.mue-msg
 
FMA alert:
SUNOS-8000-J0

The alert may point to a device path on the USB PCI bus, such as
/pci@400/pci@1/pci@0/pci@b/pci@0/usb@0,2, or to other devices along the same path.
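
The ereport classes listed above can be checked against collected telemetry with a small filter. The sketch below assumes only that each ereport class appears in the input as a whitespace-separated token (for example, in a saved listing produced by fmdump -e). The function name matching_ereports and the sample lines are hypothetical, and the last sample class is invented to show a non-match.

```python
# Ereport classes named in this document as accompanying the panic.
KNOWN_CLASSES = {
    "ereport.io.pci.fabric",
    "ereport.io.pci.sserr",
    "ereport.io.pci.ma",
    "ereport.io.pci.sec-sta",
    "ereport.io.pciex.nonfatal",
    "ereport.io.scsi.cmd.disk.dev.rqs.derr",
    "ereport.io.pci.sec-rserr",
    "ereport.io.pciex.tl.acsv",
    "ereport.io.pciex.rc.nfe-msg",
    "ereport.io.pciex.rc.mue-msg",
}


def matching_ereports(lines):
    """Return the sorted list of known ereport classes found in the input."""
    found = set()
    for line in lines:
        for token in line.split():
            if token in KNOWN_CLASSES:
                found.add(token)
    return sorted(found)


# Hypothetical sample input; the lines are illustrative, not real telemetry.
sample = [
    "Dec 14 01:28:24 ereport.io.pci.fabric",
    "Dec 14 01:28:24 ereport.io.pciex.rc.nfe-msg",
    "Dec 14 01:28:24 ereport.example.unrelated",
]

print(matching_ereports(sample))
```

A non-empty result only shows that one of the listed classes was logged; whether it shares a timestamp with the panic still has to be confirmed by hand.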
 
 
 

Cause

 Several known issues can cause this kind of panic:

1) A driver problem on Solaris 11. See "T4 system panic after Solaris 11.1 update: Fatal error has occured in: PCIe fabric.(0x1)(0x41)" (Doc ID 1519563.1).

2) A network driver problem on Solaris 10. See "Systems Using e1000g Network Cards with LSO Enabled May Panic" (Doc ID 1307369.1).

3) A problem with the ehci driver. See "Certain Solaris 10 Patches and Solaris 11 SRUs may Cause the EHCI Hardware to Access Freed DMA Addresses, Resulting in PCI Faults" (Doc ID 1642787.1).

 

Solution

 If none of the documents above matches your issue, the cause may be a hardware problem on the USB PCI bus. In that case, open an SR with Oracle Support so that the hardware can be evaluated, and have Explorer output and an ILOM snapshot ready for review.

In the case this document was created from, kernel analysis of two core dumps indicated a possible problem on the USB PCI bus, and the system board had to be replaced.

References

<NOTE:1307369.1> - Systems Using e1000g Network Cards with LSO Enabled May Panic
<NOTE:1519563.1> - T4 system panic after Solaris 11.1 update: Fatal error has occured in: PCIe fabric.(0x1)(0x41)
<NOTE:1642787.1> - Certain Solaris 10 Patches and Solaris 11 SRUs may Cause the EHCI Hardware to Access Freed DMA Addresses, Resulting in PCI Faults

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.