Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1018832.1
Update Date: 2017-02-02
Keywords:

Solution Type  Problem Resolution Sure

Solution  1018832.1 :   Sun Fire [TM] SF3800/SF4800/SF4810/SF6800 - E4900/E6900 Server (Serengeti/Amazon): POST fails during IOPOST, marking all I/O Boards (IBs) as bad.


Related Items
  • Sun Fire 4810 Server
  • Sun Fire 3800 Server
  • Sun Fire 6800 Server
  • Sun Fire E6900 Server
  • Sun Fire V1280 Server
  • Sun Fire 4800 Server
  • Sun Fire E4900 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: SF-x8x0/Ex900
  • _Old GCS Categories>Sun Microsystems>Servers>Midrange Servers
  • _Old GCS Categories>Sun Microsystems>Servers>Midrange V and Netra Servers

PreviouslyPublishedAs
230625


Applies to:

Sun Fire V1280 Server - Version Not Applicable and later
Sun Fire 3800 Server - Version Not Applicable and later
Sun Fire 4800 Server - Version Not Applicable and later
Sun Fire 4810 Server - Version Not Applicable and later
Sun Fire 6800 Server - Version Not Applicable and later
All Platforms

Symptoms

All I/O Boards (IBs) are marked as bad during IOPOST. This can be misleading when trying to identify the faulty FRU (Field Replaceable Unit), because the I/O Boards themselves may not be at fault.

 

Cause

In some cases, all I/O Boards (IBs) are marked as bad because the CPU running IOPOST is itself faulty.

The fault in that CPU goes undetected by LPOST (the POST that tests the CPU itself), so the CPU passes its own tests and then goes on to run IOPOST.

Solution



The console log excerpt below illustrates the problem. Note the following about it:

  • SB4/P0 is the processor running IOPOST.
  • SB4/P0 marks IB6/P0 and IB6/P1, the two I/O controllers on IB6, as "Failed".
  • SB4/P0 marks IB8/P0 and IB8/P1, the two I/O controllers on IB8, as "Failed".
  • SB4/P0 is actually the bad CPU. Because the CPU itself is faulty, it cannot test the IBs reliably and wrongly marks the controllers on the IBs as failed.
  • The fault in SB4/P0 goes undetected during its own self test (called LPOST).
  • It is highly unlikely that all of the I/O controllers (IB6/P0, IB6/P1, IB8/P0 and IB8/P1) are bad.
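
The reasoning in the points above can be sketched as a small log check. The following is only an illustrative Python sketch, not an Oracle tool: the function name `suspect_testers`, the regular expressions, and the "two or more boards" threshold are assumptions made for this example, and the patterns are modelled solely on the line formats visible in the console log in this document. It reports any CPU that logged IOPOST errors against two or more different I/O Boards, all of which were then marked "Failed".

```python
import re
from collections import defaultdict

# Matches e.g.: {/N0/SB4/P0} Component under test: /N0/IB6/P0 PCI IOC
UNDER_TEST = re.compile(
    r"^\{(/N\d+/SB\d+/P\d+)\}\s+Component under test:\s+(/N\d+/IB\d+)/P\d+"
)
# Matches e.g.: {/N0/IB6/P0} Failed
FAILED = re.compile(r"^\{(/N\d+/IB\d+)/P\d+\}\s+Failed")


def suspect_testers(log_text):
    """Return {cpu: [boards]} for CPUs that reported errors on 2+ failed I/O Boards."""
    boards_by_cpu = defaultdict(set)  # testing CPU -> boards it reported errors on
    failed_boards = set()             # boards with a controller marked "Failed"
    for line in log_text.splitlines():
        match = UNDER_TEST.match(line)
        if match:
            boards_by_cpu[match.group(1)].add(match.group(2))
            continue
        match = FAILED.match(line)
        if match:
            failed_boards.add(match.group(1))
    suspects = {}
    for cpu, boards in boards_by_cpu.items():
        bad = boards & failed_boards
        # Assumed heuristic: one tester failing several boards points at the tester.
        if len(bad) >= 2:
            suspects[cpu] = sorted(bad)
    return suspects


# A few lines taken from the console log in this document.
SAMPLE = """\
{/N0/SB4/P0} Component under test: /N0/IB6/P0 PCI IOC
{/N0/IB6/P0} Failed
{/N0/IB6/P1} Failed
{/N0/SB4/P0} Component under test: /N0/IB8/P0 PCI IOC
{/N0/IB8/P0} Failed
{/N0/IB8/P1} Failed
"""

if __name__ == "__main__":
    print(suspect_testers(SAMPLE))
```

A hit from this check is only a hint that the testing CPU deserves suspicion before any I/O Board is replaced; it does not prove which FRU is faulty.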

 

Console logs:

{/N0/SB4/P0} ERROR: TEST=PCI IO Controller Functional Tests,SUBTEST=PCI IO
Controller DMA loopback Tests ID=152.2
{/N0/SB4/P0} Component under test: /N0/IB6/P0 PCI IOC
{/N0/SB4/P0}    Data Access Error from address 00000000.08000820. AFSR =
00000002.00000094
{/N0/SB4/P0} Secondary AFAR 00000000.08000820, Secondary AFSR =
00000002.00000094
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 01  63  00000099.80000606  00000000.0001ca48
00000000.0001ca4c
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000044.80001504  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000606  00000000.0001ca48
00000000.0001ca4c
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  34  00000091.80001507  00000000.00014d80
00000000.00014d84
{/N0/SB4/P0} 02  32  00000044.80001504  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000606  00000000.0001ca48
00000000.0001ca4c
{/N0/SB4/P0} AFSR = 00000000.00000000
{/N0/SB4/P0} AFAR = 00000000.08000820
{/N0/SB4/P0} IMMU SFSR = 00000000.00000000
{/N0/SB4/P0} DMMU SFSR = 00000000.00700009
{/N0/SB4/P0} DMMU SFAR = 00000000.08000820
{/N0/SB4/P0} PState = 00000000.00000015
{/N0/SB4/P0} Dispatch Control =00000000.0000103f
{/N0/SB4/P0} Data Cache Unit Control =0000ce00.0000000e
{/N0/SB4/P0} Safari Config. = 0aaa0028.20200006
{/N0/SB4/P0} EState = 00000000.00000000
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} Running PCI IO Controller Basic Tests
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000044.80001503  000007ff.f0007cc0
000007ff.f0007cc4
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0}    (ME) Multiple Errors of the same type occurred
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 02  32  00000044.80001503  000007ff.f0007cc0
000007ff.f0007cc4
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/IB6/P0} Failed <--- !!
{/N0/IB6/P1} Failed <--- !!
Sep 10 11:05:24 he101 Domain-A.SC: Excluded unusable, unlicensed, failed or
disabled board: /N0/IB6
Copying IO prom to Cpu dram
...................................
{/N0/SB4/P0} Running PCI IO Controller Basic Tests
{/N0/SB4/P0} Jumping to memory 00000000.00000020 [00000010]
{/N0/SB4/P0} System PCI IO post code running from memory
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:28
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} Subtest: PCI IO Controller Register Initialization for aid
0x1c
{/N0/SB4/P0} Running PCI IO Controller Functional Tests
{/N0/SB4/P0} Subtest: PCI IO Controller IOMMU  TLB Compare Tests for aid
0x1c
{/N0/SB4/P0} Subtest: PCI IO Controller IOMMU TLB Flush Tests for aid 0x1c
{/N0/SB4/P0} Subtest: PCI IO Controller DMA loopback Tests for aid 0x1c
{/N0/SB4/P0} ERROR: TEST=PCI IO Controller Functional Tests,SUBTEST=PCI IO
Controller DMA loopback Tests ID=152.2
{/N0/SB4/P0} Component under test: /N0/IB8/P0 PCI IOC
{/N0/SB4/P0}    Data Access Error from address 00000000.08000820. AFSR =
00000002.00000094
{/N0/SB4/P0} Secondary AFAR 00000000.08000820, Secondary AFSR =
00000002.00000094
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 01  63  00000099.80000605  00000000.0001c8b4
00000000.0001c8b8
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000044.80001503  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000605  00000000.0001c8b4
00000000.0001c8b8
{/N0/SB4/P0}    (CE) Correctable system data ECC error
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  34  00000091.80001506  00000000.00014d80
00000000.00014d84
{/N0/SB4/P0} 02  32  00000044.80001503  00000000.0000f1e0
00000000.0000f1e4
{/N0/SB4/P0} 01  63  00000099.80000605  00000000.0001c8b4
00000000.0001c8b8
{/N0/SB4/P0} AFSR = 00000000.00000000
{/N0/SB4/P0} AFAR = 00000000.08000820
{/N0/SB4/P0} IMMU SFSR = 00000000.00000000
{/N0/SB4/P0} DMMU SFSR = 00000000.00700009
{/N0/SB4/P0} DMMU SFAR = 00000000.08000820
{/N0/SB4/P0} PState = 00000000.00000015
{/N0/SB4/P0} Dispatch Control =00000000.00000000
{/N0/SB4/P0} Data Cache Unit Control =00000000.0000000c
{/N0/SB4/P0} Safari Config. = 0aaa0028.20200006
{/N0/SB4/P0} EState = 00000000.00000000
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} Running PCI IO Controller Basic Tests
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 02  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0}  tl  tt         tstate                 tpc               tnpc
{/N0/SB4/P0} 03  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 02  32  00000099.80001502  000007ff.f0006a58
000007ff.f0006a5c
{/N0/SB4/P0} 01  32  00000000.80000405  000007ff.f0009bec
000007ff.f0009bf0
{/N0/SB4/P0}    (TO) Time-out from system bus
{/N0/SB4/P0}    (PRIV) Privileged code access error(s)
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P0} @(#) lpost         5.15.2  2003/08/04 10:27
{/N0/SB4/P0} Copyright 2001-2003 Sun Microsystems, Inc.  All rights
reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/IB8/P0} Failed <--- !!
{/N0/IB8/P1} Failed <--- !!
Sep 10 11:05:47 he101 Domain-A.SC: Excluded unusable, unlicensed, failed or
disabled board: /N0/IB8
Sep 10 11:05:47 he101 Domain-A.SC: No usable Io board in domain.
setkeyswitch operation did not complete



Relief/Workaround

Disable the System Board (SB) that contains the CPU on which IOPOST fails, so that IOPOST runs on a different CPU.

This can be done with the "disablecomponent" command from the system controller interface (SC-App). Alternatively, disabling only the processor itself with the "disablecomponent" command is also a valid workaround.
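
As a rough illustration of the two options, the Python sketch below derives the board path from the path of the CPU that ran IOPOST and builds the two candidate command lines. The helper name `workaround_commands` is hypothetical, and the argument form shown (a component path such as /N0/SB4, as printed in the console log) is an assumption; confirm the exact "disablecomponent" syntax against the system controller command reference for the installed firmware before using it.

```python
def workaround_commands(tester):
    """Build candidate workaround commands for the CPU that ran IOPOST.

    tester is a component path as printed in the console log, e.g. '/N0/SB4/P0'.
    The command strings are illustrative; the argument syntax is an assumption.
    """
    parts = tester.strip("/").split("/")
    if len(parts) != 3 or not parts[1].startswith("SB") or not parts[2].startswith("P"):
        raise ValueError(f"expected a path like /N0/SB4/P0, got {tester!r}")
    board = "/" + "/".join(parts[:2])   # '/N0/SB4/P0' -> '/N0/SB4'
    cpu = "/" + "/".join(parts)
    return {
        # Option 1: disable the whole System Board holding the CPU.
        "disable_board": f"disablecomponent {board}",
        # Option 2: disable only the processor itself.
        "disable_cpu": f"disablecomponent {cpu}",
    }


if __name__ == "__main__":
    for name, command in workaround_commands("/N0/SB4/P0").items():
        print(f"{name}: {command}")
```

Disabling only the processor takes less hardware out of the domain than disabling the whole board; the article presents both as valid workarounds.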

 








Product
Sun Fire 6800 Server
Sun Fire 4810 Server
Sun Fire 4800 Server
Sun Fire 3800 Server
Sun Fire E4900
Sun Fire E6900






IOPOST, IB6/P0, Failed, DMA, Functional, Controller


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.