Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-77-1369869.1
Update Date: 2018-01-09
Keywords:

Solution Type  Sun Alert Sure

Solution  1369869.1 :   Healthy Solaris 10 SPARC Systems May Incorrectly Report Hardware Errors (SUNOS-8000-FU) During PCIE Correctable Events  


Related Items
  • Sun Fire V215 Server
  • Sun Fire V245 Server
  • Sun SPARC Enterprise M9000-32 Server
  • Sun Ultra 45 Workstation
  • Sun Blade T6300 Server Module
  • Sun SPARC Enterprise M8000 Server
  • Sun Fire T2000 Server
  • Sun Fire V445 Server
  • Sun SPARC Enterprise T5120 Server
  • Solaris Operating System
  • Sun SPARC Enterprise T5220 Server
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise T5240 Server
  • Sun SPARC Enterprise T5140 Server
  • Sun Netra CP3060 ATCA Blade Server
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
  • _Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved




In this Document
Description
Occurrence
Symptoms
Workaround
Patches
History


Applies to:

Sun Netra CP3060 ATCA Blade Server - Version Not Applicable and later
Sun SPARC Enterprise M4000 Server - Version Not Applicable and later
Sun Fire T2000 Server - Version Not Applicable and later
Sun Fire V445 Server - Version Not Applicable and later
Solaris Operating System - Version 10 10/09 U8 to 10 10/09 U8 [Release 10.0]
Information in this document applies to any platform.
_____________________



Date of Resolved Release: 21-Oct-2011
____________________________________



Description


Incorrect handling of correctable errors on Solaris 10 SPARC systems fitted with a certain model of PCI Express switch may cause the Fault Management Architecture (FMA) to incorrectly report the error SUNOS-8000-FU for events of class ereport.io.pci.sec-rserr. This may result in the unnecessary replacement of healthy hardware.

Occurrence


This issue can occur in the following releases:

SPARC Platform

  • Solaris 10 with patch 125369-10 through 125369-13 or patch 127755-01 and without patch 146855-01
  • Solaris 11 Express based upon builds snv_39 through snv_157
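
The Solaris 10 patch condition above can be checked with a short script. The following is a minimal sketch, not an Oracle-supplied tool: the function name patches_affected is hypothetical, and it assumes that `showrev -p` prints one line per installed patch beginning with "Patch: NNNNNN-RR". It inspects only that field and ignores "Obsoletes:" entries, so a later patch that supersedes 146855 would not be recognized as a fix.

```shell
# Hypothetical helper, not part of Solaris. Reads `showrev -p` output on
# stdin and succeeds (exit 0) when the patch state matches the affected
# range listed above: 125369-10 through 125369-13 or 127755-01 installed,
# and no revision of 146855 installed.
# Assumption: each installed patch appears as "Patch: NNNNNN-RR ...".
# On Solaris 10, substitute nawk or /usr/xpg4/bin/awk for awk if needed.
patches_affected() {
    awk '
        $1 == "Patch:" {
            split($2, p, "-")
            rev = p[2] + 0
            if (p[1] == "125369" && rev >= 10 && rev <= 13) bad = 1
            if (p[1] == "127755" && rev == 1) bad = 1
            if (p[1] == "146855" && rev >= 1) fixed = 1
        }
        END { if (bad && !fixed) exit 0; exit 1 }
    '
}

# Example use on a live system:
#   showrev -p | patches_affected && echo "affected patch level"
```

A positive result only indicates that the patch level falls in the affected range; the platform must also be one of those listed in Note 1.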

Note 1: The following SPARC platforms are impacted by this issue:

  • Ultra 45
  • Sun Fire V445
  • Sun Fire V215, V245
  • Sun Blade T6300
  • Sun Fire T2000
  • Netra CP3060
  • SPARC Enterprise M4000
  • SPARC Enterprise M5000
  • SPARC Enterprise T5120 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5140 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5220 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5240 with Sun External I/O Expansion Unit
  • SPARC Enterprise T5440 with Sun External I/O Expansion Unit
  • SPARC Enterprise M8000 with Sun External I/O Expansion Unit
  • SPARC Enterprise M9000 with Sun External I/O Expansion Unit

Note 2: Solaris 8, Solaris 9, and Solaris on the x86 platform are not impacted by this issue.

Note 3: Solaris 11 Express distributions may include additional bug fixes beyond those in the build from which they were derived. The base build can be determined as follows:

   $ uname -v
   snv_151

If the output instead has the format 151.x.x.x, the installed build is also snv_151.
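
The build check in Note 3 can also be scripted. The following is a minimal sketch, assuming `uname -v` prints either snv_NNN or a dotted string whose first field is the build number, as described above. The function names base_build and build_is_affected are illustrative and are not part of Solaris.

```shell
# Illustrative helpers (hypothetical names, not shipped with Solaris).
# base_build prints the numeric base build for a `uname -v` string,
# accepting either "snv_151" or a dotted form such as "151.0.1.4".
# It prints an empty line when the string does not start with a build number.
base_build() {
    printf '%s\n' "$1" | sed -e 's/^snv_//' -e 's/[^0-9].*$//'
}

# build_is_affected succeeds when the build lies in snv_39 through
# snv_157, the range this document lists as affected.
build_is_affected() {
    b=$(base_build "$1")
    [ -n "$b" ] && [ "$b" -ge 39 ] && [ "$b" -le 157 ]
}

# Example use on a live system:
#   if build_is_affected "$(uname -v)"; then echo "affected build"; fi
```

Strings that do not begin with a build number, such as a Solaris 10 kernel patch string, are reported as not affected by this sketch, so Solaris 10 systems should be checked by patch level instead.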

Symptoms


When patch 125369-13 is installed, or a system is upgraded to a release that includes this patch or to an affected Solaris 11 Express build, FMA may report correctable errors not previously observed on the system.

If the described issue occurs, the following message will be seen on the system console:

    SUNW-MSG-ID: SUNOS-8000-FU, TYPE: Defect, VER: 1, SEVERITY: Major
    EVENT-TIME: Tue Mar  29 21:03 PDT 2011
    PLATFORM: SUNW,SPARC-Enterprise , CSN: -, HOSTNAME: -
    SOURCE: eft, REV: 1.16
    EVENT-ID: af46a1fb-a712-617b-cab3-fc57b79a1dd9
    DESC: The diagnosis engine encountered telemetry for which it was unable to perform a diagnosis.
    Refer to http://sun.com/msg/SUNOS-8000-FU for more information.
    AUTO-RESPONSE: Error reports have been logged for examination by Sun.

    IMPACT: Automated diagnosis and response for these events will not occur.

Use fmadm(1M) and fmdump(1M) for further confirmation or contact Oracle for support.

    # fmadm faulty
    --------------- ------------------------------------  -------------- ---------
    TIME            EVENT-ID                              MSG-ID         SEVERITY
    --------------- ------------------------------------  -------------- ---------
    May 21 04:19:41 cf33eeba-54e0-6e79-b7c3-cf7de492f1d3  SUNOS-8000-FU  Major

    Host        : xyz1
    Platform    : SUNW,SPARC-Enterprise     Chassis_id  : xyz2400L

    Fault class : defect.sunos.eft.undiag.fme

    Description : The diagnosis engine encountered telemetry for which it was unable to perform a diagnosis.
                  Refer to http://sun.com/msg/SUNOS-8000-FU for more information.

    Response    : Error reports have been logged for examination by Sun.

    Impact      : Automated diagnosis and response for these events will not occur.

    Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing (PSH) patches are installed.

    # fmdump -e

    May 21 04:19:36.4888 ereport.io.pci.fabric
    May 21 04:19:36.4885 ereport.io.pci.sec-rserr
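
To confirm that the undiagnosed fault coincides with the ereport class named in this document, the error log can be filtered. The following is a minimal sketch, assuming `fmdump -e` prints the ereport class as the last field of each line, as in the sample above. The function name count_sec_rserr is hypothetical.

```shell
# Hypothetical helper, not part of Solaris. Reads `fmdump -e` output on
# stdin and prints how many ereports of class ereport.io.pci.sec-rserr
# it contains. Assumes the class is the last field of each line.
count_sec_rserr() {
    awk '$NF == "ereport.io.pci.sec-rserr" { n++ } END { print n + 0 }'
}

# Example use on a live system:
#   fmdump -e | count_sec_rserr
```

A non-zero count alongside a SUNOS-8000-FU fault is consistent with this issue, but it does not by itself rule out a genuine hardware fault.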

Workaround


There is no workaround for this issue.

This issue is resolved in the following releases:

SPARC Platform

  • Solaris 10 with patch 146855-01 or later
  • Solaris 11 Express based upon builds snv_158 or later

Note: After installing Solaris 10 patch 146855-01, clear the existing SUNOS-8000-FU faults with the command:

    # fmadm repair <EVENT-ID>

where <EVENT-ID> is taken from the output of the "fmadm faulty" command, as shown in the Symptoms section above.
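
When several faults are outstanding, the event IDs can be collected from the "fmadm faulty" output rather than copied by hand. The following is a minimal sketch, assuming the summary line has the layout shown in the Symptoms section (EVENT-ID, then MSG-ID, then a one-word SEVERITY). The function names are hypothetical, and the script only prints the repair commands for review; it does not run them.

```shell
# Hypothetical helpers, not part of Solaris.
# fu_event_ids reads `fmadm faulty` output on stdin and prints the
# EVENT-ID of every summary line whose MSG-ID is SUNOS-8000-FU.
# Assumes the line ends with: <EVENT-ID> <MSG-ID> <SEVERITY>.
fu_event_ids() {
    awk 'NF >= 3 && $(NF-1) == "SUNOS-8000-FU" { print $(NF-2) }'
}

# print_repair_commands turns those IDs into `fmadm repair` command
# lines for review. Nothing is executed.
print_repair_commands() {
    fu_event_ids | while read id; do
        printf 'fmadm repair %s\n' "$id"
    done
}

# Example use on a live system, after patch 146855-01 is installed:
#   fmadm faulty | print_repair_commands
```

Printing the commands first allows each event ID to be checked against the fault listing before any fault is cleared.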

Patches

<SUNPATCH 146855-01>

History

21-Oct-2011: Date of Resolved Release
29-Dec-2011: Updated Document Title
06-Mar-2012: Updated note in Workaround section
07-Aug-2013: Maintenance check for relevance/currency, no change in content

Internal Comments:

This is fundamentally a PLX PEX8532 Switch bug due to Erratum #59. It wasn't, however, exposed on SPARC Solaris platforms until 6239835 was introduced; thus, it is classified as a Solaris regression.

PLX PEX8532 is an 8-port, 32-lane PCI Express switch manufactured by PLX Technology, and embedded in the Sun/Oracle platforms specified in the SA.

Please send technical questions to the following email:
sunalertpublication_us@oracle.com
and copy the Responsible Engineer/Contributor listed below.

Internal Contributor/Submitter: daniel.ice@oracle.com
Internal Eng Responsible Engineer: daniel.ice@oracle.com
Internal Services Knowledge Engineer: jeff.folla@oracle.com
Internal Eng Business Unit Group: Systems RPE
Internal Escalation ID: 72975798, 73110406, 73340258, 73374798, 2-8011418, 2-8177736, 2-8034193, 2-8064817, 2-8077393, 2-8103991, 2-8194834, 2-8196583, 2-8145695, 3-2711368881, 3-2849580131, 3-3225313050
Internal Resolution Patches: 146855-01


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.