Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1465634.1
Update Date:2017-10-11
Keywords:

Solution Type  Problem Resolution Sure

Solution  1465634.1 :   M3000/M4000/M5000 - OBP probe-all command execution and subsequent boot causes "Last Trap: Data Access Error" and MBU_A or IOU is degraded  


Related Items
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise M3000 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Mx000


In this Document
Symptoms
Changes
Cause
Solution
References


Applies to:

Sun SPARC Enterprise M3000 Server - Version All Versions to All Versions [Release All Releases]
Sun SPARC Enterprise M4000 Server - Version All Versions to All Versions [Release All Releases]
Sun SPARC Enterprise M5000 Server - Version All Versions to All Versions [Release All Releases]
All Platforms

Symptoms

When 'probe-all' is issued instead of 'probe-scsi-all' at the OBP prompt of an M3000/M4000/M5000, the subsequent boot fails with "Last Trap: Data Access Error".
The failure also degrades the MBU on an M3000, or an IOU on an M4000/M5000, with ereport.chassis.SPARC-Enterprise.asic.pci_sw.fe.
The customer or field engineer may interpret this as a hardware error and replace MBU_A or the IOU unnecessarily.

The M8000 and M9000 do not have this issue.
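
The symptom pattern above can be checked mechanically when triaging captured logs. The following is a minimal Python sketch, not an Oracle tool: the function name and the idea of passing the OBP console capture and the fmdump output as two strings are assumptions made for illustration. The marker strings are the ones quoted in this note.

```python
import re

# Marker strings are taken from the log excerpts in this note.
TRAP_MARKER = "Last Trap: Data Access Error"
EREPORT_CLASS = "ereport.chassis.SPARC-Enterprise.asic.pci_sw.fe"
# Matches "ok probe-all" at the OBP prompt, but not "ok probe-scsi-all".
PROBE_ALL = re.compile(r"\bok\s+probe-all\b")


def matches_probe_all_signature(console_text: str, fmdump_text: str) -> bool:
    """Hypothetical helper: True when probe-all precedes the trap in the
    console capture and the pci_sw.fe ereport class appears in fmdump -e."""
    probe = PROBE_ALL.search(console_text)
    if probe is None:
        return False
    # The trap must appear after the probe-all command, not before it.
    trap_pos = console_text.find(TRAP_MARKER, probe.end())
    return trap_pos != -1 and EREPORT_CLASS in fmdump_text
```

A match only suggests that the degraded status may come from probe-all rather than from a failed part; it does not replace normal hardware diagnosis.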

Changes

Running the OBP command probe-all triggers this issue.

Cause

{0} ok probe-all
{0} ok boot cdrom
Boot device: /pci@0,600000/pci@0/pci@0/scsi@0/disk@4,0:f  File and args:
ERROR: /pci@0,600000/pci@0/pci@0/scsi@0: Last Trap: Data Access Error
%TL:1 %TT:32 %TPC:f0038b1c %TnPC:f0038b20 %TSTATE:446a001600
%PSTATE:16 ( IE:1 PRIV:1 PEF:1 )
DSFSR:4280804b ( FV:1 OW:1 PR:1 E:1 TM:1 ASI:80 NC:1 BERR:1 )
DSFAR:fdb16000 DSFPAR:401080100000 D-TAG:0

OR

{0} ok probe-all
{0} ok boot
ERROR: /pci@0,600000/pci@0/pci@0/scsi@0: Last Trap: Data Access Error
%TL:1 %TT:32 %TPC:f0038b1c %TnPC:f0038b20 %TSTATE:446a001600
%PSTATE:16 ( IE:1 PRIV:1 PEF:1 )
DSFSR:4280804b ( FV:1 OW:1 PR:1 E:1 TM:1 ASI:80 NC:1 BERR:1 )
DSFAR:fdb18000 DSFPAR:401080100000 D-TAG:0


XSCF> showstatus
*   MBU_A Status:Degraded;

OR


XSCF> showstatus
*   IOU#0 Status:Degraded;

XSCF> fmdump -e
TIME                 CLASS
Jun 08 08:59:43.5452 ereport.chassis.SPARC-Enterprise.asic.pci_sw.fe

 

For an M3000

XSCF> showlogs error -v -r -M
Date: Jun 08 10:28:31 SGT 2012     Code: 60002500-b9010000-0300000700320000
    Status: Warning                Occurred: Jun 08 10:28:29.030 SGT 2012
    FRU: /MBU_A
    Msg: MBU error occurred (TT=0x32)
    Diagnostic Code:
        00000000 00000000 00000000
        54543d30 78333200 00000000 00000000
        00000000 00000000 00000000 00000000
    UUID: 3055c68e-fc39-4ddf-a199-1e52f442ce1e MSG-ID: SCF-8001-U5

For M4000/M5000

XSCF> showlogs error -v -r -M
Date: Mar 25 09:01:29 SGT 2013     Code: 60002500-b9010000-0300000700320000
    Status: Warning                Occurred: Mar 25 09:01:27.111 SGT 2013
    FRU: /IOU#0
    Msg: IOU error occurred (TT=0x32)
    Diagnostic Code:
        00000000 00000000 00000000
        54543d30 78333200 00000000 00000000
        00000000 00000000 00000000 00000000
    UUID: 3a12a033-03cb-4960-8a93-1c951e69d733 MSG-ID: SCF-8001-U5
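
The second row of the Diagnostic Code in both showlogs excerpts is plain ASCII, and decoding it gives the same trap type that appears in the Msg field and as %TT:32 in the OBP trap output. A short Python sketch of that decoding follows; the function name is a hypothetical helper, and it assumes the row contains only ASCII bytes and NUL padding, as it does in the excerpts here.

```python
def decode_diagnostic_words(words: str) -> str:
    """Decode space-separated hex words into ASCII text, dropping NUL padding."""
    raw = bytes.fromhex("".join(words.split()))
    return raw.replace(b"\x00", b"").decode("ascii")


# Second row of the Diagnostic Code shown in both showlogs excerpts.
row = "54543d30 78333200 00000000 00000000"
print(decode_diagnostic_words(row))  # TT=0x32
```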

Solution

1. Do not replace hardware because of this issue.

2. Run clearfault to remove the fault from the system. See the instructions below.

- Note that clearfault was introduced in non-service mode only in firmware version 1115 and above.

- If you are on a firmware level below 1115, the clearfault command is not available.

- If you do not already have the latest firmware, perform a firmware upgrade before running clearfault (as of this writing, the latest firmware is 1119).
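
The firmware condition above reduces to a single comparison. As a minimal sketch in Python, assuming the XCP firmware level has already been read from the system and is supplied as an integer such as 1115 (the function and constant names are hypothetical):

```python
# Threshold taken from this note: clearfault is available in non-service
# mode from firmware version 1115 onward.
CLEARFAULT_MIN_XCP = 1115


def clearfault_available(xcp_version: int) -> bool:
    """Return True when the firmware level is 1115 or above."""
    return xcp_version >= CLEARFAULT_MIN_XCP


def next_step(xcp_version: int) -> str:
    """Suggest the next action for a system showing this fault."""
    if clearfault_available(xcp_version):
        return "run clearfault"
    return "upgrade firmware before running clearfault"
```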

 

Procedure to clear the fault

1. From XSCF, power off and power on the affected domain. If you skip this step, the fault is cleared only after a circuit breaker OFF/ON cycle (removing the power cords).
2. With firmware 1115 and above, run clearfault on /MBU_A, /IOU#0, or /IOU#1.
3. The subsequent boot will work, but if probe-all is issued again, the fault reappears.

The examples below use Domain 0; substitute your domain number accordingly.
XSCF> poweroff -d0 -y
DomainIDs to power off:00
Continue? [y|n] :y
00 :Powering off
 
XSCF> poweron -d0
DomainIDs to power on:00
Continue? [y|n] : y
00 :Powering on

Example: M3000
XSCF> clearfault /MBU_A
XSCF> showstatus
No failures found in System Initialization.
XSCF>

Example: M4000/M5000

XSCF> clearfault /IOU#0

XSCF> showstatus
No failures found in System Initialization.
XSCF>
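
For sites that document the recovery per system, the command sequence shown above can be generated from the model and domain number. This is an illustrative Python sketch only: the function name, its parameters, and the choice to return the commands as a list of strings are assumptions, and the command spellings are copied from the examples in this note. It prints nothing to the XSCF and runs no commands.

```python
from typing import List


def recovery_commands(model: str, domain_id: int, iou_index: int = 0) -> List[str]:
    """Hypothetical helper: list the XSCF commands from this note, in order."""
    model = model.upper()
    if model == "M3000":
        fru = "/MBU_A"
    elif model in ("M4000", "M5000"):
        # This note names /IOU#0 and /IOU#1 as possible targets.
        if iou_index not in (0, 1):
            raise ValueError("iou_index must be 0 or 1")
        fru = f"/IOU#{iou_index}"
    else:
        raise ValueError("this note covers only M3000, M4000 and M5000")
    return [
        f"poweroff -d{domain_id} -y",  # power off the affected domain
        f"poweron -d{domain_id}",      # power it back on
        f"clearfault {fru}",           # requires firmware 1115 or above
        "showstatus",                  # confirm no failures are reported
    ]
```

For example, `recovery_commands("M5000", 0)` returns the four commands used in the M4000/M5000 example, ending with `showstatus`.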

References

<NOTE:1007101.1> - Sun SPARC(R)Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers: Fault clearing and LEDs behavior

  Copyright © 2018 Oracle, Inc.  All rights reserved.