Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1664253.1
Update Date: 2017-10-11

Solution Type  Problem Resolution Sure

Solution  1664253.1 :   M10 IOX AC power failure causes guest domain PCIe devices to show status "UNK" in "ldm ls-io" output


Related Items
  • Fujitsu M10-4S
  • Fujitsu M10-4
  • Fujitsu M10 PCI Expansion Unit
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Fujitsu M10
  • _Old GCS Categories>Announcements>All Product Lines>Support Systems

In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-8877267748>

Applies to:

Fujitsu M10 PCI Expansion Unit - Version All Versions to All Versions [Release All Releases]
Fujitsu M10-4S - Version All Versions to All Versions [Release All Releases]
Fujitsu M10-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

The domain reports the PCIe devices as follows:

prtconf -vp output shows the PCIe devices as retired:

"retired                 /pci@8200/pci@4/pci@0/pci@0/pci@0/pci@0/pci@0/pci@1/pci@0/pci@11/pci@0  "

ldm ls-io output shows the PCIe devices with status UNK:

"/BB0/PCI7/SLOT1                           PCIE   PCIE3    primary  UNK   "

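Both symptoms can be checked from saved command output. The following is a minimal sketch, assuming the line layouts quoted above (a line containing "retired" followed by a device path, and UNK as the last column of an ldm ls-io row). The function names retired_paths and unk_slots are hypothetical helpers, not part of Solaris or Oracle VM Server for SPARC.

```shell
#!/bin/sh
# Hypothetical helpers for spotting the two symptoms described above.

# Reads `prtconf -vp` output on stdin and prints the device path found on
# every line that mentions "retired" (assumed layout: marker, then path).
retired_paths() {
    awk '/retired/ { for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i }'
}

# Reads `ldm ls-io` output on stdin and prints the NAME column of every
# row whose last column (STATUS) is UNK.
unk_slots() {
    awk '$NF == "UNK" { print $1 }'
}
```

Typical use would be `prtconf -vp | retired_paths` in the affected domain and `ldm ls-io | unk_slots` in the control domain.
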
Changes

An AC power failure occurred on the M10 IOX (PCI Expansion Unit), as recorded in the XSCF monitor log:

XSCF> showlogs monitor -r
Apr 18 17:19:04 localhost Event: SCF:PSU input power recover(/BB#0/PSU#0)
Apr 18 17:14:24 localhost Information: /BB#0/PCI#5/PCIBOX#7006/PSU#0:PCI-Box:PSU RECOVERY
Apr 18 17:14:18 localhost Information: /BB#0/PCI#3/PCIBOX#7004/PSU#0:PCI-Box:PSU RECOVERY
Apr 18 17:14:09 localhost Information: /BB#0/PCI#1/PCIBOX#7003/PSU#0:PCI-Box:PSU RECOVERY
Apr 18 17:10:37 localhost Alarm: /BB#0/PCI#3/PCIBOX#7004/PSU#0:PCI-Box:AC FAIL
Apr 18 17:10:33 localhost Alarm: /BB#0/PCI#1/PCIBOX#7003/PSU#0:PCI-Box:AC FAIL
Apr 18 17:10:29 localhost Alarm: /BB#0/PCI#7/PCIBOX#7008/PSU#0:PCI-Box:AC FAIL
Apr 18 17:10:24 localhost Alarm: /BB#0/PCI#5/PCIBOX#7006/PSU#0:PCI-Box:AC FAIL
Apr 18 17:10:08 localhost Event: SCF:PPARID 0 GID 0000000c state change (Host stopped)
Apr 18 17:10:08 localhost Event: SCF:PPARID 0 GID 00000009 state change (Host stopped)
Apr 18 17:10:08 localhost Event: SCF:PPARID 0 GID 00000009 state change (Solaris rebooting) 

The guest domain cannot be booted without the PCIe devices.
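To see which PCI boxes logged an AC FAIL without a matching PSU RECOVERY entry, the monitor log can be saved to a file and filtered on a workstation. The sketch below assumes the log line layout shown above. The function name boxes_without_recovery is hypothetical, and the script is meant to run against saved output, not inside the XSCF shell.

```shell
#!/bin/sh
# Hypothetical helper: reads saved `showlogs monitor` output on stdin and
# prints each PCI box PSU path that has an "AC FAIL" alarm but no
# "PSU RECOVERY" entry anywhere in the same input.
boxes_without_recovery() {
    awk '
        # Field 6 is "<path>:PCI-Box:..." in the layout shown above.
        /PCI-Box:AC FAIL/      { split($6, part, ":"); failed[part[1]] = 1 }
        /PCI-Box:PSU RECOVERY/ { split($6, part, ":"); recovered[part[1]] = 1 }
        END {
            for (unit in failed)
                if (!(unit in recovered))
                    print unit
        }
    ' | sort
}
```

Run against the excerpt above, this would list only /BB#0/PCI#7/PCIBOX#7008/PSU#0. A missing recovery line may simply mean the saved excerpt is incomplete, so the result is a hint for where to look rather than a diagnosis.
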

Cause

After the IOX AC power failure, the PCIe devices are left in the "UNK" or "retired" state, so the guest domain that uses them will not boot.

Solution

There are two approaches to clearing the "retired" PCIe devices:

1. Where the PCIe devices are recorded in the persistent retire store and the message log indicates "NOTICE: One or more I/O devices have been retired":

- rm /etc/devices/retire_store
- Reboot Solaris (e.g. init 6 or reboot)
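Whether approach 1 applies can be checked before removing anything. This is a minimal sketch, assuming the notice is written to the default Solaris messages file /var/adm/messages. The function name approach1_applies is hypothetical, and both paths can be overridden as arguments.

```shell
#!/bin/sh
# Hypothetical check: succeeds (exit 0) only when the retire store exists
# and the retirement notice appears in the given message log.
#   $1  message log to search  (default: /var/adm/messages, an assumption)
#   $2  retire store path      (default: /etc/devices/retire_store)
approach1_applies() {
    log=${1:-/var/adm/messages}
    store=${2:-/etc/devices/retire_store}
    [ -f "$store" ] &&
        grep -q 'NOTICE: One or more I/O devices have been retired' "$log"
}
```

For example, `approach1_applies && echo "retire store present and notice logged"` reports a match without changing the system. The removal and reboot steps above remain a manual decision.
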

2. Where the PCIe devices show the following:

prtconf -vp output indicates PCIe devices as:

"retired                 /pci@8200/pci@4/pci@0/pci@0/pci@0/pci@0/pci@0/pci@1/pci@0/pci@11/pci@0  "

ldm ls-io output indicates PCIe devices as:

"/BB0/PCI7/SLOT1                           PCIE   PCIE3    primary  UNK   "

and the XSCF snapshot indicates that all I/O is in good status:

  • Clear all FMA fault events in the respective LDOM.
  • Verify by checking the 'prtconf -vp' output and confirm which devices are reported as '(retired)'.
  • If all fault events have been cleared and the underlying hardware is functioning correctly, perform the following steps in the domain to clear any stale retirement logs:
    • # rm /etc/devices/retire_store
    • # fmadm reset io-retire
    • # bootadm update-archive
  • Reboot the host and check that all devices are reported normally in the output of ldm ls-io, prtconf -vp, and fmadm faulty.
  • Then bring the guest domains back online.
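
The three cleanup commands in approach 2 can be wrapped so they are previewed before being run. The sketch below is an illustration only: the names run and clear_stale_retirement and the DRY_RUN variable are hypothetical, and the script prints the commands by default. It does not clear FMA faults or reboot the host; those steps are left to the operator.

```shell
#!/bin/sh
# Hypothetical wrapper around the stale-retirement cleanup steps.
# DRY_RUN defaults to 1, so commands are only printed unless DRY_RUN=0.

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# The three commands from the procedure, in the documented order.
clear_stale_retirement() {
    run rm /etc/devices/retire_store
    run fmadm reset io-retire
    run bootadm update-archive
}
```

Calling `clear_stale_retirement` prints the three commands for review. Calling `DRY_RUN=0 clear_stale_retirement` as root in the affected domain would execute them, after which the reboot and the ldm ls-io, prtconf -vp, and fmadm faulty checks still apply.
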

References

<NOTE:1581135.1> - Oracle VM guest bind failure due to Direct I/O device in an Unknown (UNK) state.

  Copyright © 2018 Oracle, Inc.  All rights reserved.