M12-io.pcie.device.se - An error was detected on a PCIe switch, onboard device, or a card in a PCI slot

Asset ID:	1-79-2218456.1
Update Date:	2017-08-14
Solution Type Predictive Self-Healing Sure

Solution 2218456.1 : M12-io.pcie.device.se - An error was detected on a PCIe switch, onboard device, or a card in a PCI slot

Applies to:

Fujitsu SPARC M12-2S
Fujitsu SPARC M12-2
Fujitsu SPARC M12-1
SPARC

Purpose

Provide additional information for message ID: M12-io.pcie.device.se

Fujitsu fault codes:

02002417, 02002419

Details

Type

: Hardware Fault; io.pcie.device.se

Severity

: Major

Description

: Fault due to a serious error detected on a PCIe switch, onboard device, or a card in a PCI slot.

Automated Response

: No immediate action is taken.

Impact

: When the failure is detected on a PCI card, nothing is deconfigured. NOTE: POST/OBP stops using the PCI card when the failure is detected. However, after the domain is powered off and on, the domain starts using the card again.

Indicted Hardware

When the failure is detected on a PCI card, the PCI card is marked for replacement.

For M12-1 systems, when the failure is detected on a PCIe switch or an onboard device, the MBU should be replaced.
For M12-2/M12-2S systems, when the failure is detected on a PCIe switch or an onboard device, the CMUL should be replaced.
For a PCI Box, when the failure is detected on a PCIe switch on a link card, the link card should be replaced.
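
The replacement rules above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name fru_to_replace and the string labels for platforms and fault locations are assumptions made for this example, not identifiers used by the XSCF or FMA.

```python
# Illustrative lookup of the component to replace, based on the rules above.
# The platform and location labels are hypothetical names chosen for this sketch.

def fru_to_replace(platform: str, location: str) -> str:
    """Return the component to replace for a platform and fault location."""
    if location == "pci_card":
        # A fault on a PCI card always indicts the card itself.
        return "PCI card"
    if location == "link_card_switch":
        # PCIe switch on a link card (PCI Box).
        return "link card"
    if location in ("pcie_switch", "onboard_device"):
        if platform == "M12-1":
            return "MBU"
        if platform in ("M12-2", "M12-2S"):
            return "CMUL"
    raise ValueError(f"no rule for platform={platform!r}, location={location!r}")
```

For example, fru_to_replace("M12-2S", "onboard_device") returns "CMUL", while an unknown combination raises ValueError rather than guessing a component.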

When the failure is on a PCI card, the fault information for this fault is not stored in the FMA resource cache, nor is it stored on the XSCF's persistent storage. Instead, the fault information is stored only in the hardware descriptor (HWD) of the domain that the device belongs to.

The HWD itself is cleared of all information about faulty devices when the domain is powered down (this includes platform resets and platform power-downs). The HWD information about this device being faulty is also cleared when a hot-plug operation is performed on the faulty PCIe card from within Solaris running on the domain.

However, even though the fault information is not stored in the FMA resource cache or XSCF persistent storage, the fault occurrence is logged in the relevant error logs and fault logs.
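
The HWD behavior described above can be pictured with a minimal model. This Python sketch is a conceptual illustration only: the class HwdFaultRecord, its method names, and the device labels are hypothetical and do not correspond to any actual XSCF or Solaris interface.

```python
# Conceptual model of how fault marks in a domain's hardware descriptor (HWD)
# are cleared. All names here are hypothetical, chosen for illustration.

class HwdFaultRecord:
    def __init__(self):
        # Devices currently marked faulty in this domain's HWD.
        self.faulty_devices = set()

    def mark_faulty(self, device: str) -> None:
        """Record that a PCI card was detected as faulty."""
        self.faulty_devices.add(device)

    def domain_power_down(self) -> None:
        """A domain power-down (including platform reset or power-down)
        clears all faulty-device information from the HWD."""
        self.faulty_devices.clear()

    def hot_plug(self, device: str) -> None:
        """A hot-plug operation on a faulty card clears only that card's mark."""
        self.faulty_devices.discard(device)
```

In this model, a hot-plug operation removes a single entry, whereas a domain power-down empties the record entirely. The error logs and fault logs are deliberately not modeled, because they are kept separately from the HWD.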

If the fault was detected while running POST, the event is listed under one of the following categories:

- se-lane-degraded: 02002417 The PCIe lane has been degraded and is operating at reduced speed.
- se-transfer-rate-fallback: 02002419 Initialization configured a transfer rate that is slower than expected.
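
The two categories listed above can be kept in a small table when triaging logged events. The following Python sketch is an assumption-laden illustration: the names FAULT_CODES and describe_fault_code are hypothetical, and only the two fault codes given in this article are included.

```python
# Map of the Fujitsu fault codes from this article to their POST categories.
# FAULT_CODES and describe_fault_code are hypothetical names for this sketch.

FAULT_CODES = {
    "02002417": (
        "se-lane-degraded",
        "The PCIe lane has been degraded and is operating at reduced speed.",
    ),
    "02002419": (
        "se-transfer-rate-fallback",
        "Initialization configured a transfer rate that is slower than expected.",
    ),
}

def describe_fault_code(code: str) -> str:
    """Return 'category: description' for a known fault code."""
    try:
        category, description = FAULT_CODES[code]
    except KeyError:
        # Codes outside this article are not covered by the table.
        raise ValueError(f"fault code {code!r} is not covered by this article")
    return f"{category}: {description}"
```

Unknown codes raise ValueError so that a code not documented here is never silently mapped to one of these two categories.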

The status of the lanes and their speed can be confirmed from the output of the prtdiag command on Solaris.

Suggested Action for System Administrator

: The recommended service action for this event is to schedule replacement of the affected component(s) at the earliest convenient opportunity. Although the hardware may still be functioning, it is neither intended nor recommended that the faulted component(s) remain in the system for a prolonged period of time.

Refer to the following document for the latest procedures for displaying event content in preparation for submitting a service request and applying any post-repair actions that may be required.

PSH Procedural Article for Fujitsu M10 Diagnosis (Doc ID 1525156.1)
