Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2368595.1
Update Date:2018-03-05
Keywords:

Solution Type  Problem Resolution Sure

Solution  2368595.1 :   T4/T5 S11 System faults on the PCI/SCSI bus with a PCIEX-8000-0A and MPTSAS firmware errors  


Related Items
  • SPARC T4-1
  • SPARC T5-4
  • SPARC T5-2
  • SPARC T4-4
  • SPARC T5-8
  • SPARC T4-2
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T5




In this Document
Symptoms
Cause
Solution
References


Applies to:

SPARC T5-8 - Version All Versions to All Versions [Release All Releases]
SPARC T4-1 - Version All Versions to All Versions [Release All Releases]
SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
SPARC T4-4 - Version All Versions to All Versions [Release All Releases]
SPARC T5-2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The system exhibits the following FMA alert:

TIME            EVENT-ID                              MSG-ID         SEVERITY
Dec 25 01:42:22 2b26d19a-2028-4f8a-914b-bee5ae3800eb  PCIEX-8000-0A  Critical
Problem Status    : repaired
   Part_Number   : 33170334+1+1
   Serial_Number : 
  Certainty   : 100%
  Affects     : dev:////pci@300/pci@1/pci@0/pci@2/scsi@0
  Status      : out of service, but associated components no longer faulty
    Status           : repaired
    Location         : "/SYS/MB"
    Part_Number      : 7076601
    Revision         : 03
    Serial_Number    : 465769T+1446UL0G62
Description : A problem was detected for a PCIEX device.

 

This maps to a SCSI controller:

/SYS/MB/SASHBA0   PCIE  scsi-pciex1000,87                 LSI,2308_2 8.0GT/x4   8.0GT/x4        /pci@300/pci@1/pci@0/pci@2/scsi@0
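
As a quick cross-check, the device path in the Affects line of the FMA alert can be compared with the device path in the controller listing. The following is a minimal Python sketch, not part of any Oracle tool. The function names are hypothetical. It assumes that the Affects value is a `dev:` identifier followed by the device path, and that the listing line starts with the location and ends with the device path, as in the excerpts above.

```python
# Hypothetical helpers for illustration only.

def path_from_affects(line):
    """Return the device path from an FMA 'Affects' line."""
    value = line.split(":", 1)[1].strip()      # e.g. dev:////pci@300/...
    if not value.startswith("dev:"):
        raise ValueError("not a dev: identifier")
    # Collapse the run of leading slashes to a single '/'.
    return "/" + value[len("dev:"):].lstrip("/")


def parse_listing(line):
    """Return (location, device_path) from a controller listing line.

    Assumes the first field is the location and the last is the path.
    """
    fields = line.split()
    return fields[0], fields[-1]


affects = "  Affects     : dev:////pci@300/pci@1/pci@0/pci@2/scsi@0"
listing = ("/SYS/MB/SASHBA0   PCIE  scsi-pciex1000,87                 "
           "LSI,2308_2 8.0GT/x4   8.0GT/x4        "
           "/pci@300/pci@1/pci@0/pci@2/scsi@0")

location, dev_path = parse_listing(listing)
if path_from_affects(affects) == dev_path:
    print(location)   # the faulted path belongs to this location
```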


The messages file is flooded with these types of messages:
Dec 25 01:38:25 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:25 HOSTX Reset failedafter fault was detected
Dec 25 01:38:35 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:35 HOSTX MPTSAS Firmware Fault, code: 1500
Dec 25 01:38:37 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:37 HOSTX ioc reset abort passthru
Dec 25 01:38:37 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:37 HOSTX Reset failedafter fault was detected
Dec 25 01:38:47 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:47 HOSTX MPTSAS Firmware Fault, code: 1500
Dec 25 01:38:49 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:49 HOSTX ioc reset abort passthru
Dec 25 01:38:49 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:49 HOSTX Reset failedafter fault was detected
Dec 25 01:38:59 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:38:59 HOSTX MPTSAS Firmware Fault, code: 1500
Dec 25 01:39:01 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:01 HOSTX ioc reset abort passthru
Dec 25 01:39:01 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:01 HOSTX Reset failedafter fault was detected
Dec 25 01:39:11 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:11 HOSTX MPTSAS Firmware Fault, code: 1500
Dec 25 01:39:13 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:13 HOSTX ioc reset abort passthru
Dec 25 01:39:13 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:13 HOSTX Reset failedafter fault was detected
Dec 25 01:39:23 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:23 HOSTX MPTSAS Firmware Fault, code: 1500
Dec 25 01:39:25 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:25 HOSTX ioc reset abort passthru
Dec 25 01:39:25 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:25 HOSTX Reset failedafter fault was detected
Dec 25 01:39:35 HOSTX scsi: [ID 107833 kern.warning] WARNING: /pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):
Dec 25 01:39:35 HOSTX MPTSAS Firmware Fault, code: 1500
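
When the messages file is large, it can help to count how often each firmware fault code is reported per controller instance, since this document applies only to code 1500. The following Python sketch is one way to do that. The function name is hypothetical, and it assumes the two-line record layout shown above: a WARNING line naming the device and instance, followed by a line carrying the message text.

```python
import re
from collections import Counter

# Patterns derived from the log excerpt above; other layouts are not handled.
WARNING_RE = re.compile(r"WARNING: (\S+) \((mpt_sas\d+)\):\s*$")
FAULT_RE = re.compile(r"MPTSAS Firmware Fault, code: (\w+)")


def count_firmware_faults(lines):
    """Count (instance, fault code) pairs in messages-file lines."""
    counts = Counter()
    instance = None
    for line in lines:
        warning = WARNING_RE.search(line)
        if warning:
            instance = warning.group(2)   # controller reporting the next message
            continue
        fault = FAULT_RE.search(line)
        if fault and instance is not None:
            counts[(instance, fault.group(1))] += 1
        instance = None                   # a message line ends the two-line record
    return counts


sample = [
    "Dec 25 01:38:35 HOSTX scsi: [ID 107833 kern.warning] WARNING: "
    "/pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):",
    "Dec 25 01:38:35 HOSTX MPTSAS Firmware Fault, code: 1500",
    "Dec 25 01:38:37 HOSTX scsi: [ID 107833 kern.warning] WARNING: "
    "/pci@300/pci@1/pci@0/pci@2/scsi@0 (mpt_sas0):",
    "Dec 25 01:38:37 HOSTX ioc reset abort passthru",
]
print(count_firmware_faults(sample))
```

In practice the `lines` argument could be an open file object for the system messages file (commonly /var/adm/messages on Solaris). A count under any code other than 1500 points to a different problem that this document does not cover.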

Cause

This is caused by a bug in the MPT driver. The bug is specific to MPTSAS firmware fault code 1500. Many other fault codes exist, but this document covers only code 1500.

Solution

The MPTSAS Firmware Fault code 1500 problem is resolved by updating to Solaris 11 (S11) 11.3.4.5 or later.
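
The version check can be sketched in Python as a numeric comparison against 11.3.4.5. This is an illustrative sketch only: the function names are hypothetical, and it assumes the installed version is already available as a dotted string in the same form. How that string is obtained on a given system is not covered here.

```python
FIXED_IN = (11, 3, 4, 5)   # first version containing the fix, per this document


def parse_version(text):
    """Turn a dotted version string such as '11.3.4.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(installed):
    """Return True if the installed version is at or above the fixed version."""
    return parse_version(installed) >= FIXED_IN


# Illustrative version strings, not taken from a real system.
for version in ("11.3.4.5", "11.3.2.4", "11.3.10.0"):
    print(version, has_fix(version))
```

Comparing tuples of integers rather than raw strings matters here: as text, "11.3.10.0" would sort before "11.3.4.5", whereas numerically it is the later version.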

This bug has not yet been observed on Solaris 10 (S10).

Solaris 10: After a reorganization, the OS team became responsible for the SAS device driver software, but it has insufficient resources and past experience even to keep up with current customer escalations on Solaris 10, which has ended regular support and entered the extended support phase. Unless an occurrence of code 1500 is an escalated "Customer Outage" situation, it is highly unlikely that the Solaris Platform Software Team will find the time or resources to begin learning the MPT device driver and what might be improved.


Doc created from SR 3-16507941691

References

<BUG:16264047> - SYSTEM RESET - CRITICAL FAULT ON PCIE DEVICE

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.