Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure
Solution 1924028.1: Fujitsu M10-4/M10-4S: PCI access errors (12bb0000) on both CMUL and CMUU
Created from <SR 3-9519312981>

Applies to:
Fujitsu M10-4S - Version All Versions to All Versions [Release All Releases]
Fujitsu M10-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms
PCI access errors appear on both the CMUL and CMUU.

‘showlogs monitor’ output:

Date: Aug 22 11:06:16 UTC 2014
Code: 40000000-00a20400480400a204-12bb00000000000000000000
Status: Warning
Occurred: Aug 22 11:06:12.605 UTC 2014
FRU: /BB#0/CMUL
Msg: PCI access error
Diagnostic Code:
    00000100 00000000 0000
    00000001 00000000 0000
    00000100 00000000 0000
    00000000 00000000 00000000 00000000
    00000000 00000000 0000

Date: Aug 22 10:48:35 UTC 2014
Code: 40000000-006b0400a20400a204-12bb00000000000000000000
Status: Warning
Occurred: Aug 22 10:48:26.778 UTC 2014
FRU: /BB#0/CMUU,/BB#0/CMUL
Msg: PCI access error
Diagnostic Code:
    00000101 00000000 0000
    00000301 00000000 0000
    00000301 00000000 0000
    00000000 00000000 00000000 00000000
    00000000 00000000 0000

Date: Aug 22 10:16:34 UTC 2014
Code: 40000000-00a204006b0400a204-12bb00000000000000000000
Status: Warning
Occurred: Aug 22 10:16:30.788 UTC 2014
FRU: /BB#0/CMUL,/BB#0/CMUU
Msg: PCI access error
Diagnostic Code:
    00000301 00000000 0000
    00000101 00000000 0000
    00000301 00000000 0000
    00000000 00000000 00000000 00000000
    00000000 00000000 0000
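When many log entries have to be checked, the FRUs named in the PCI access error entries can be collected with a short script. The following is a minimal sketch only. It assumes the log text has been saved with one field per line, as in the entries above, and that the 12bb0000 code always starts the last hyphen-separated field of the Code value, as it does in these samples. The function names are hypothetical and are not part of any XSCF tooling.

```python
# Sample text copied from two of the entries above (Diagnostic Code omitted).
SAMPLE = """\
Date: Aug 22 11:06:16 UTC 2014
Code: 40000000-00a20400480400a204-12bb00000000000000000000
Status: Warning
Occurred: Aug 22 11:06:12.605 UTC 2014
FRU: /BB#0/CMUL
Msg: PCI access error
Date: Aug 22 10:48:35 UTC 2014
Code: 40000000-006b0400a20400a204-12bb00000000000000000000
Status: Warning
Occurred: Aug 22 10:48:26.778 UTC 2014
FRU: /BB#0/CMUU,/BB#0/CMUL
Msg: PCI access error
"""


def parse_entries(text):
    """Split line-per-field log text into dicts; each 'Date:' line starts a new entry."""
    entries = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # lines without a colon (e.g. diagnostic code rows) are skipped
        key = key.strip()
        if key == "Date":
            entries.append({})
        if entries:
            entries[-1][key] = value.strip()
    return entries


def pci_access_frus(entries):
    """Return the set of FRUs named in entries whose last Code field starts with 12bb0000."""
    frus = set()
    for entry in entries:
        last_field = entry.get("Code", "").split("-")[-1]
        if last_field.startswith("12bb0000"):
            frus.update(f.strip() for f in entry.get("FRU", "").split(",") if f.strip())
    return frus


if __name__ == "__main__":
    print(sorted(pci_access_frus(parse_entries(SAMPLE))))
```

If the returned set contains both /BB#0/CMUL and /BB#0/CMUU, the log matches the symptom described in this document.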
Faults appear in the FMA output reporting faults on both CMUL and CMUU. No PCI slot information will be available.

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Aug 22 18:06:10 fcf4298f-99b8-e4dd-d488-d30804e03186  PCIEX-8000-YJ  Major

Problem Status    : solved
Diag Engine       : eft / 1.16
System
    Manufacturer  : unknown
    Name          : ORCL,SPARC64-X
    Part_Number   : unknown
    Serial_Number : PZ01426021
    Host_ID       : 90071189

----------------------------------------
Suspect 1 of 3 :
   Fault class : fault.io.pciex.device-pcie-ce
   Certainty   : 75%
   Affects     : dev:////pci@8100/pci@4/pci@0
   Status      : faulted but still in service

   FRU
     Location         : "/BB0/CMUL"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : 7088706
     Revision         : unknown
     Serial_Number    : PP142602BL
     Chassis
        Manufacturer  : unknown
        Name          : ORCL,SPARC64-X
        Part_Number   : 7088788
        Serial_Number : PZ01426021
     Status           : faulty

----------------------------------------
Suspect 2 of 3 :
   Fault class : fault.io.pciex.bus-linkerr-corr
   Certainty   : 25%
   Affects     : dev:////pci@8100/pci@4
   Status      : faulted but still in service

   FRU
     Location         : "/BB0/CMUL"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : 7088706
     Revision         : unknown
     Serial_Number    : PP142602BL
     Chassis
        Manufacturer  : unknown
        Name          : ORCL,SPARC64-X
        Part_Number   : 7088788
        Serial_Number : PZ01426021
     Status           : faulty

----------------------------------------
Suspect 3 of 3 :
   Fault class : fault.io.pciex.device-pcie-ce
   Certainty   : 75%
   Affects     : dev:////pci@8100/pci@4/pci@0
   Status      : faulted but still in service

   FRU
     Location         : "/BB0/CMUL"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : 7088706
     Revision         : unknown
     Serial_Number    : PP142602BL
     Chassis
        Manufacturer  : unknown
        Name          : ORCL,SPARC64-X
        Part_Number   : 7088788
        Serial_Number : PZ01426021
     Status           : faulty

Description : Too many recovered bus errors have been detected, which indicates
              a problem with the specified bus or with the specified
              transmitting device. This may degrade into an unrecoverable
              fault.

Response    : One or more device instances may be disabled

Impact      : Loss of services provided by the device instances associated with
              this fault

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              If a plug-in card is involved check for badly-seated cards or
              bent pins. Please refer to the associated reference document at
              http://support.oracle.com/msg/PCIEX-8000-YJ for the latest
              service procedures and policies regarding this diagnosis.

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Aug 22 17:16:29 198aa417-5c27-69a5-87af-998bc3fe4936  PCIEX-8000-YJ  Major

Problem Status    : solved
Diag Engine       : eft / 1.16
System
    Manufacturer  : unknown
    Name          : ORCL,SPARC64-X
    Part_Number   : unknown
    Serial_Number : PZ01426021
    Host_ID       : 90071189

----------------------------------------
Suspect 1 of 3 :
   Fault class : fault.io.pciex.device-pcie-ce
   Certainty   : 75%
   Affects     : dev:////pci@8700/pci@4/pci@0
   Status      : faulted but still in service

   FRU
     Location         : "/BB0/CMUU"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : 7088708
     Revision         : unknown
     Serial_Number    : PP142601JR
     Chassis
        Manufacturer  : unknown
        Name          : ORCL,SPARC64-X
        Part_Number   : 7088788
        Serial_Number : PZ01426021
     Status           : faulty

----------------------------------------
Suspect 2 of 3 :
   Fault class : fault.io.pciex.bus-linkerr-corr
   Certainty   : 25%
   Affects     : dev:////pci@8700/pci@4
   Status      : faulted but still in service

   FRU
     Location         : "/BB0/CMUU"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : 7088708
     Revision         : unknown
     Serial_Number    : PP142601JR
     Chassis
        Manufacturer  : unknown
        Name          : ORCL,SPARC64-X
        Part_Number   : 7088788
        Serial_Number : PZ01426021
     Status           : faulty

----------------------------------------
Suspect 3 of 3 :
   Fault class : fault.io.pciex.device-pcie-ce
   Certainty   : 75%
   Affects     : dev:////pci@8700/pci@4/pci@0
   Status      : faulted but still in service

   FRU
     Location         : "/BB0/CMUU"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : 7088708
     Revision         : unknown
     Serial_Number    : PP142601JR
     Chassis
        Manufacturer  : unknown
        Name          : ORCL,SPARC64-X
        Part_Number   : 7088788
        Serial_Number : PZ01426021
     Status           : faulty

Description : Too many recovered bus errors have been detected, which indicates
              a problem with the specified bus or with the specified
              transmitting device. This may degrade into an unrecoverable
              fault.

Response    : One or more device instances may be disabled

Impact      : Loss of services provided by the device instances associated with
              this fault

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              If a plug-in card is involved check for badly-seated cards or
              bent pins. Please refer to the associated reference document at
              http://support.oracle.com/msg/PCIEX-8000-YJ for the latest
              service procedures and policies regarding this diagnosis.

-------------------------------------------------------------------
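To confirm quickly that the FMA faults name both boards, the FRU locations can be counted from saved output. This is a rough sketch under stated assumptions: the text has been captured to a string, and every FRU location appears as a double-quoted value after "Location :", as in the output above. The function name and the abbreviated sample are illustrative only.

```python
import re
from collections import Counter

# Abbreviated sample built from the FRU location lines shown above;
# it is not a complete fault report.
SAMPLE = '''
Suspect 1 of 3 : Fault class : fault.io.pciex.device-pcie-ce
FRU Location : "/BB0/CMUL"
Suspect 2 of 3 : Fault class : fault.io.pciex.bus-linkerr-corr
FRU Location : "/BB0/CMUL"
Suspect 1 of 3 : Fault class : fault.io.pciex.device-pcie-ce
FRU
  Location         : "/BB0/CMUU"
'''


def suspects_per_fru(fma_text):
    """Count how many suspect entries name each quoted FRU location."""
    # Matches 'Location : "..."' whether or not 'FRU' is on the same line.
    return Counter(re.findall(r'Location\s*:\s*"([^"]+)"', fma_text))


if __name__ == "__main__":
    for location, count in sorted(suspects_per_fru(SAMPLE).items()):
        print(location, count)
```

A result that lists both "/BB0/CMUL" and "/BB0/CMUU" corresponds to the symptom described here.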
Changes
Cause
Possible Soft Error Rate Discrimination (SERD) issue.
Possible cable or seating issue on the PCIe cable connection between CMUL and CMUU.
Possible FCO A0335-1: Red phosphorus in the PCI-e cable connecting CMUL to CMUU causes corrosion resulting in a short circuit on the DDC control signal and a system panic (Doc ID 1629497.1).

Solution
3. If items 1 and 2 above are complete, check that XCP is at or above XCP2321 (see Doc ID 2211342.1 for more details).
4. If the patch or upgrade was already installed, or if additional errors are seen after applying the upgrade or patch, replace the parts listed in the XCP error logs.

Date: Aug 22 11:06:16 UTC 2014
Code: 40000000-00a20400480400a204-12bb00000000000000000000
Status: Warning
Occurred: Aug 22 11:06:12.605 UTC 2014
FRU: /BB#0/CMUL
     ~~~~~~~~~~ (*) PCI error has occurred in CMUL. Please replace only CMUL.
Msg: PCI access error

Date: Aug 22 10:48:35 UTC 2014
Code: 40000000-006b0400a20400a204-12bb00000000000000000000
Status: Warning
Occurred: Aug 22 10:48:26.778 UTC 2014
FRU: /BB#0/CMUU,/BB#0/CMUL
     ~~~~~~~~~~ ~~~~~~~~~~
Msg: PCI access error
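The XCP level check in step 3 can be expressed as a simple comparison. The sketch below assumes that XCP levels are four-digit numbers that compare numerically, and that the installed level has already been read from the system and is supplied as a string such as "XCP2321" or "2321". The helper names are hypothetical.

```python
import re

MIN_XCP = 2321  # minimum level named in step 3 (XCP2321)


def xcp_number(label):
    """Extract the four-digit level from a label such as 'XCP2321' or '2321'."""
    match = re.fullmatch(r"(?:XCP)?\s*(\d{4})", label.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised XCP label: {label!r}")
    return int(match.group(1))


def meets_minimum(label, minimum=MIN_XCP):
    """Return True when the given XCP level is at or above the minimum."""
    return xcp_number(label) >= minimum


if __name__ == "__main__":
    for label in ("XCP2240", "XCP2321", "2340"):
        print(label, meets_minimum(label))
```

A level below the minimum points to the upgrade in step 3. A level at or above it, with errors still being logged, points to the part replacement in the following step.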
Apply the FCO fix if it applies. The fix described in FCO A0335-1 (Doc ID 1629497.1) was applied proactively to all systems except the 7 serial numbers listed below. If the error described in the document occurs on one of these systems, apply the fix described in FCO A0335-1 first.

Serial numbers:
PZ01334012
PZ01334013
PZ01337013
PZ01337014
PZ01326001
PZ01326002
PZ01326003
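Checking a chassis serial number against this list is a simple membership test. The following sketch copies the seven serial numbers from the list above. The function name is hypothetical, and matching after trimming whitespace and upper-casing is an assumption about how serial numbers might be typed in.

```python
# The seven serial numbers listed above as not yet covered by the proactive FCO A0335-1 fix.
FCO_A0335_1_PENDING = frozenset({
    "PZ01334012",
    "PZ01334013",
    "PZ01337013",
    "PZ01337014",
    "PZ01326001",
    "PZ01326002",
    "PZ01326003",
})


def needs_fco_first(chassis_serial):
    """Return True when the chassis serial is one of the seven listed systems."""
    return chassis_serial.strip().upper() in FCO_A0335_1_PENDING


if __name__ == "__main__":
    print(needs_fco_first("PZ01334012"))  # serial taken from the list above
    print(needs_fco_first("PZ01426021"))  # chassis serial from the fault example
```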
References
<NOTE:1600364.1> - Fujitsu M10-4/M10-4s: Error: How to Decode the Correct Cable Location for Error 0200242d PCI Express link up failed After Replacing CMUL (with PCIBP) or CMUU.
<NOTE:1617956.1> - I/O SERD threshold values are set too low and may result in PCIEX-8000-J5, PCIEX-8000-YJ and PCIEX-8000-KP faults.
<BUG:20264642> - CONFLICTING DIAGNOSIS BETWEEN FMA AND XCP. FMA = CMUU - XCP = CMUL,CMUU
<NOTE:1629497.1> - FCO A0335-1: Proactive - Scheduled: Red phosphorus in the PCI-e cable connecting CMUL to CMUU causes corrosion resulting in a short circuit on the DDC control signal and a system panic.

Attachments
This solution has no attachment