Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1614738.1
Update Date: 2018-05-01

Solution Type: Problem Resolution Sure Solution

1614738.1: [SPARC T4/T5/M5 and M6] FMA I/O retirement: PCI devices can be seen from OBP but disappear when the system boots into Solaris


Related Items
  • SPARC T4-1
  • Sun Flash F20 PCIe Card
  • Flash Accelerator F80 PCIe Card
  • Fujitsu M10-4
  • Fujitsu M10-4S
  • F40 Flash Accelerator Card
  • SPARC T5-4
  • SPARC T5-2
  • SPARC T4-4
  • SPARC M6-32
  • SPARC T4-2
  • SPARC T5-8
  • SPARC M5-32
  • Fujitsu M10-1

Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T5




In this Document
Symptoms
Cause
Solution


Created from <SR 3-8348766731>

Applies to:

SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
Sun Flash F20 PCIe Card - Version All Versions to All Versions [Release All Releases]
F40 Flash Accelerator Card - Version All Versions to All Versions [Release All Releases]
Flash Accelerator F80 PCIe Card - Version All Versions to All Versions [Release All Releases]
SPARC T4-1 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

PCI devices are visible from OBP but disappear when the system boots into Solaris.

Cause

Solaris FMA has a retirement agent for I/O devices. For FMA I/O retirement to work, the Fault Management framework requires an FMA-capable (FMA-aware) device driver. When such a driver reports errors to the Solaris FMA framework, the FMA Diagnosis Engine (DE) may decide to offline, degrade, or permanently retire the device. A permanent retirement persists across OS reboots.

Once FMA has permanently retired a device, that state is stored in the file /etc/devices/retire_store. The state persists across OS reboots until the retired device has been replaced and cleared with the FMA command "fmadm repaired [fmri|label]". The Solaris "prtdiag" command does not display a retired device, but "prtconf -v" displays the device instance with a retired status.

 

Example of a retired QLC device instance collected from "prtconf -v" 

  
< removed output from top >

      SUNW,qlc (retired)
          Hardware properties:
              name='assigned-addresses' type=int items=15
                  value=81030110.00000000.00000100.00000000.00000100.83030114.00000001.00004000.00000000.00004000.82030130.00000000.00140000.00000000.00040000
              name='reg' type=int items=15
                  value=00030100.00000000.00000000.00000000.00000000.01030110.00000000.00000000.00000000.00000100.03030114.00000000.00000000.00000000.00001000
              name='compatible' type=string items=6
                  value='pciex1077,2532.1077.171.2' + 'pciex1077,2532.1077.171' + 'pciex1077,171' + 'pciex1077,2532.2' + 'pciex1077,2532' + 'pciclass,c0400'
          fp (retired)
             disk (retired)
                 Hardware properties:
                       name='compatible' type=string items=1
                          value='ssd'


< removed more output below >
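
On a system with many devices, the retired nodes can be hard to spot in the full "prtconf -v" listing. The following is a minimal sketch, assuming the "(retired)" tag always appears at the end of the node line as in the example above. The function name list_retired_nodes is hypothetical, not a Solaris command.

```shell
# list_retired_nodes: hypothetical helper, not part of Solaris.
# Reads "prtconf -v" output on stdin and prints the name of every
# device node whose line ends with the "(retired)" tag.
list_retired_nodes() {
    # Keep only node lines tagged "(retired)", then strip the leading
    # indentation and the trailing tag so only the node name remains.
    grep '(retired)[[:space:]]*$' |
        sed -e 's/^[[:space:]]*//' \
            -e 's/[[:space:]]*(retired)[[:space:]]*$//'
}
```

Typical use would be: prtconf -v | list_retired_nodes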
 
When a device is in the persistent retire store at Solaris boot, the message
"NOTICE: One or more I/O devices have been retired" is logged in /var/adm/messages.

Example
Dec 19 12:15:02 t5-8-sin06-a genunix: [ID 540533 kern.notice] SunOS Release 5.11 Version 11.1 64-bit
Dec 19 12:15:02 t5-8-sin06-a genunix: [ID 459285 kern.notice] Copyright (c) 1983, 2012, Oracle and/or its affiliates. All rights reserved.
Dec 19 12:15:02 t5-8-sin06-a genunix: [ID 678236 kern.info] Ethernet address = 0:10:e0:35:ab:cd
Dec 19 12:15:02 t5-8-sin06-a genunix: [ID 751201 kern.notice] NOTICE: One or more I/O devices have been retired
Dec 19 12:15:02 t5-8-sin06-a unix: [ID 389951 kern.info] mem = 8388608K (0x200000000)
Dec 19 12:15:02 t5-8-sin06-a unix: [ID 930857 kern.info] avail mem = 7627235328
Dec 19 12:15:02 t5-8-sin06-a rootnex: [ID 466748 kern.info] root nexus = SPARC T5-4
Dec 19 12:15:02 t5-8-sin06-a rootnex: [ID 349649 kern.info] pseudo0 at root
Dec 19 12:15:02 t5-8-sin06-a genunix: [ID 936769 kern.info] pseudo0 is /pseudo
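
To check quickly whether this notice was logged, the message file can be searched for the exact string. This is a sketch only: the function name find_retire_notice is hypothetical, and the default path is the /var/adm/messages location named above.

```shell
# find_retire_notice: hypothetical helper, not part of Solaris.
# Prints every line of the given log file (default /var/adm/messages)
# that contains the I/O retirement notice. Returns non-zero when the
# notice is not present.
find_retire_notice() {
    log="${1:-/var/adm/messages}"
    grep 'NOTICE: One or more I/O devices have been retired' "$log"
}
```

A rotated log can be checked by passing its path as the first argument, for example: find_retire_notice /var/adm/messages.0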
  

 

 

Solution

 

If "fmadm repaired [fmri|label]" or "fmadm acquit [fmri|label]" does not clear the device from the
persistent store, the workaround is to delete the store and reboot Solaris:

STEP 1.) rm /etc/devices/retire_store
STEP 2.) Reboot Solaris (e.g. init 6 or reboot)
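
STEP 1 above can be sketched as a small function that keeps a copy of the store before removing it. The backup copy is an added precaution and is not part of the documented workaround. The function name clear_retire_store and the ".bak" suffix are hypothetical. The reboot in STEP 2 is left to the administrator.

```shell
# clear_retire_store: hypothetical helper, not part of Solaris.
# Saves a copy of the persistent retire store, then deletes the
# original. The backup step is an assumption added for safety.
clear_retire_store() {
    store="${1:-/etc/devices/retire_store}"
    if [ ! -f "$store" ]; then
        echo "no retire store found at $store" >&2
        return 1
    fi
    # Remove the store only if the backup copy succeeded.
    cp -p "$store" "$store.bak" && rm "$store"
}
```

After the function returns successfully, reboot Solaris as described in STEP 2.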

To disable FMA I/O retirement in Solaris:
STEP 1.) Edit /usr/lib/fm/fmd/plugins/io-retire.conf
STEP 2.) Add the following line at the end of the file:

setprop global-disable true

STEP 3.) Reboot Solaris (e.g. init 6 or reboot)
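
The file edit in the steps above can be sketched as a function that appends the line only when it is not already present, so running it twice does not duplicate the setting. The function name disable_io_retire is hypothetical, and the sketch assumes the existing file ends with a newline.

```shell
# disable_io_retire: hypothetical helper, not part of Solaris.
# Appends "setprop global-disable true" to the io-retire plugin
# configuration unless that exact line is already present.
disable_io_retire() {
    conf="${1:-/usr/lib/fm/fmd/plugins/io-retire.conf}"
    # Anchored pattern matches the whole line only.
    if grep '^setprop global-disable true$' "$conf" >/dev/null 2>&1; then
        return 0
    fi
    printf '%s\n' 'setprop global-disable true' >> "$conf"
}
```

The reboot in the final step is still required and is not performed by this function.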

This feature is documented in the following PSARC case:

http://psarc.us.oracle.com/PSARC/2007/290/materials/design1.09.txt

Oracle Explorer

In Oracle Explorer 8.0.2 output, the FMA I/O retirement configuration can be found at the following location:

<explorer>/usr/lib/fm/fmd/plugins/io-retire.conf

The persistent retire store file is not currently collected by Oracle Explorer 8.0.2,
but it will be available in the next release of the tool at the following location:

<explorer>/etc/devices/retire_store

Bug# 18073934 - EXPLORER SHOULD COLLECT FMA IO RETIREMENT FILES

 
 

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.