Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure
Solution 2278551.1 : Solaris 11 Control Domain Panic Triggered by Guest LDom Reboot - Fatal error has occured in: PCIe fabric.(0x1)(0x243)
Created from <SR 3-15081472961>

Applies to:
SPARC M6-32 - Version All Versions and later
Information in this document applies to any platform.

Symptoms
This is an M6-32 control domain running Solaris 11.1 SRU 21.

Jun 8 13:14:07 control-dom01 genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
Jun 8 13:14:07 control-dom01 unix: [ID 836849 kern.notice]
Jun 8 13:14:07 control-dom01 ^Mpanic[cpu0]/thread=2a10009dc60:
Jun 8 13:14:07 control-dom01 unix: [ID 198415 kern.notice] Fatal error has occured in: PCIe fabric.(0x1)(0x243)
Jun 8 13:14:07 control-dom01 unix: [ID 100000 kern.notice]
Jun 8 13:14:07 control-dom01 genunix: [ID 723222 kern.notice] 000002a10009d6a0 px:px_err_panic+1c4 (106f2400, 1, 243, 7bfba800, 1, 106f0530)
Jun 8 13:14:07 control-dom01 genunix: [ID 702911 kern.notice] %l0-3: 000002a10009d750 0000000000000016 00000000106f2800 000000000000005f
Jun 8 13:14:07 control-dom01 %l4-7: 0000000000000000 0000000010508400 ffffffffffffffff 0000000000000000
Jun 8 13:14:07 control-dom01 genunix: [ID 723222 kern.notice] 000002a10009d7b0 px:px_err_fabric_intr+1ac (c4001cda4000, 1, 220, 1, 243, 4000dd2c8a8)
Jun 8 13:14:07 control-dom01 genunix: [ID 702911 kern.notice] %l0-3: 0000000000000220 000000007bfba970 0000000000000000 0000000000000220
Jun 8 13:14:07 control-dom01 %l4-7: 0000000000000001 000000007bfba800 0000000000000001 0000c4001be781d8
Jun 8 13:14:07 control-dom01 genunix: [ID 723222 kern.notice] 000002a10009d930 px:px_msiq_intr+208 (c4001cd84b28, 0, 9, c4001cda62c8, 0, 2)
Jun 8 13:14:07 control-dom01 genunix: [ID 702911 kern.notice] %l0-3: 0000000000000000 0000000038790080 0000c4001be53998 0000c4001cda4000
Jun 8 13:14:07 control-dom01 %l4-7: 0000c4001cda6418 000004000dd2c8a8 0000c4001cd9fb50 0000000000000030
Jun 8 13:14:07 control-dom01 unix: [ID 100000 kern.notice]
Jun 8 13:14:07 control-dom01 genunix: [ID 672855 kern.notice] syncing file systems...
Jun 8 13:14:07 control-dom01 genunix: [ID 904073 kern.notice] done
Jun 8 13:14:12 control-dom01 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Jun 8 13:14:33 control-dom01 genunix: [ID 100000 kern.notice]
Jun 8 13:14:33 control-dom01 genunix: [ID 665016 kern.notice] ^M100% done: 714892 pages dumped,
Jun 8 13:14:33 control-dom01 genunix: [ID 851671 kern.notice] dump succeeded
Jun 8 13:16:54 control-dom01 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version 11.1 64-bit

Jun 08 13:14:07.2903 ereport.io.pci.fabric
Jun 08 13:14:07.3487 ereport.io.pciex.linkbw.down Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pci.fabric Jun 08 13:14:07.2903 ereport.io.pciex.pl.re Jun 08 13:14:07.2903 ereport.io.pciex.dl.btlp Jun 08 13:14:07.2903 ereport.io.pciex.a-nonfatal Jun 08 13:14:07.2903 ereport.io.pciex.tl.uc Jun 08 13:14:07.2903 ereport.io.pciex.tl.uc Jun 08 13:14:07.2903 ereport.io.pciex.rc.ce-msg Jun 08 13:14:07.2903 ereport.io.pciex.rc.mce-msg

Jun 08 2017 13:14:07.290398687 ereport.io.pci.fabric
nvlist version: 0
    class = ereport.io.pci.fabric
    ena = 0x2b4ac89307700001
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        cna_dev = 0x55fd57840000027e
        device-path = /pci@3c0
    (end detector)
    primary = 1
    pcie_adv_rp_status = 0x1
    pcie_adv_rp_command = 0x0
    pcie_adv_rp_ce_src_id = 0x220
    pcie_adv_rp_ue_src_id = 0x0
    __ttl = 0x1
    __tod = 0x5939317f 0x114f21df

Jun 08 2017 13:14:07.348767145 ereport.io.pciex.linkbw.down
nvlist version: 0
    class = ereport.io.pciex.linkbw.down
    ena = 0x2b4b003d24101801
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        cna_dev = 0x55fd57840000027e
        device-path = /pci@3c0/pci@1/pci@0/pci@4
    (end detector)
    source-id = 0x220
    device-id = 0x80ba
    vendor-id = 0x111d
    expected = 0x1
    supported-link-speeds = 0xe
    current-link-speed = 0x1
    current-link-width = 0x4
    prior-link-speed = 0x3
    prior-link-width = 0x8
    target-link-speed = 0x0
    max-link-speed = 0x3
    max-link-width = 0x8
    runtime = 0x0
    __ttl = 0x1
    __tod = 0x5939317f 0x14c9c3a9

....

Jun 08 2017 13:14:07.290398687 ereport.io.pci.fabric
nvlist version: 0
    class = ereport.io.pci.fabric
    ena = 0x2b4ac89307700001
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        device-path = /pci@3c0/pci@1/pci@0/pci@6/network@0,1
    (end detector)
    bdf = 0x7301
    device_id = 0x1528
    vendor_id = 0x8086
    rev_id = 0x1
    dev_type = 0x0
    pcie_off = 0xa0
    pcix_off = 0x0
    aer_off = 0x100
    ecc_ver = 0x0
    func_type = 0x1
    pci_status = 0x10
    pci_command = 0x147
    pcie_status = 0x0
    pcie_command = 0x201f
    pcie_dev_cap = 0x10008cc2
    pcie_link_status = 0x1082
    pcie_dev_ctl2 = 0x5
    pcie_adv_ctl = 0x1e0
    pcie_ue_status = 0x0
    pcie_ue_mask = 0x0
    pcie_ue_sev = 0x462031
    pcie_ue_hdr0 = 0x0
    pcie_ue_hdr1 = 0x0
    pcie_ue_hdr2 = 0x0
    pcie_ue_hdr3 = 0x0
    pcie_ce_status = 0x0
    pcie_ce_mask = 0x0
    pcie_aff_flags = 0x0
    pcie_aff_bdf = 0xffff
    orig_sev = 0x1
    remainder = 0x0
    severity = 0x1
    __ttl = 0x1
    __tod = 0x5939317f 0x114f21df

Jun 08 2017 13:14:07.290398687 ereport.io.pciex.pl.re
nvlist version: 0
    ena = 0x2b4ac89307700001
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        cna_dev = 0x55fd57840000027e
        device-path = /pci@3c0/pci@1/pci@0/pci@4/SUNW,emlxs@0,1
    (end detector)
    class = ereport.io.pciex.pl.re
    dev-status = 0x1
    ce-status = 0x2041
    link-width = 0x4
    link-speed = 0x9c4
    __ttl = 0x1
    __tod = 0x5939317f 0x114f21df

Jun 08 2017 13:14:07.290398687 ereport.io.pciex.dl.btlp
nvlist version: 0
    ena = 0x2b4ac89307700001
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        cna_dev = 0x55fd57840000027e
        device-path = /pci@3c0/pci@1/pci@0/pci@4/SUNW,emlxs@0,1
    (end detector)
    class = ereport.io.pciex.dl.btlp
    dev-status = 0x1
    ce-status = 0x2041
    link-width = 0x4
    link-speed = 0x9c4
    __ttl = 0x1
    __tod = 0x5939317f 0x114f21df

Jun 08 2017 13:14:07.290398687 ereport.io.pciex.a-nonfatal
nvlist version: 0
    ena = 0x2b4ac89307700001
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        cna_dev = 0x55fd57840000027e
        device-path = /pci@3c0/pci@1/pci@0/pci@4/SUNW,emlxs@0,1
    (end detector)
    class = ereport.io.pciex.a-nonfatal
    dev-status = 0x1
    ce-status = 0x2041
    __ttl = 0x1
    __tod = 0x5939317f 0x114f21df

Jun 08 2017 13:14:07.290398687 ereport.io.pciex.tl.uc
nvlist version: 0
    ena = 0x2b4ac89307700001
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        cna_dev = 0x55fd57840000027e
        device-path = /pci@3c0/pci@1/pci@0/pci@4/SUNW,emlxs@0,1
    (end detector)
    class = ereport.io.pciex.tl.uc
    dev-status = 0x1
    ue-status = 0x10000
    ue-severity = 0x62011
    adv-ctl = 0x1f0
    source-id = 0xffff
    source-valid = 1
    __ttl = 0x1
    __tod = 0x5939317f 0x114f21df

Jun 08 2017 13:14:07.290398687 ereport.io.pciex.tl.uc
nvlist version: 0
    ena = 0x2b4ac89307700001
    detector = (embedded nvlist)
    nvlist version: 0
        version = 0x0
        scheme = dev
        cna_dev = 0x55fd578400000062
        device-path = /pci@3c0/pci@1/pci@0/pci@4/SUNW,emlxs@0
    (end detector)
    class = ereport.io.pciex.tl.uc
    dev-status = 0x1
    ue-status = 0x10000
    ue-severity = 0x62011
    adv-ctl = 0x1f0
    source-id = 0xffff
    source-valid = 1
    __ttl = 0x1
    __tod = 0x5939317f 0x114f21df

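Several raw hex fields in the ereports above can be cross-checked by hand. The following is a minimal sketch, not part of the original note, and its helper names (`decode_bdf`, `describe_link`, `decode_tod`) are hypothetical. It assumes that 16-bit IDs such as `bdf` and `source-id` use the standard PCI layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0), that the `*-link-speed` codes follow the PCIe Link Status encoding (1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s), and that `__tod` holds seconds and nanoseconds since the Unix epoch.

```python
from datetime import datetime, timezone

# Assumed PCIe link-speed codes (Link Status register encoding).
LINK_SPEED_GTS = {0x1: 2.5, 0x2: 5.0, 0x3: 8.0}


def decode_bdf(bdf):
    """Split a 16-bit PCI ID into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7


def describe_link(speed_code, width):
    """Render a link as text, for example 'x8 at 8.0 GT/s'."""
    gts = LINK_SPEED_GTS.get(speed_code)
    speed = f"{gts} GT/s" if gts is not None else f"speed code {speed_code:#x}"
    return f"x{width} at {speed}"


def decode_tod(seconds, nanoseconds):
    """Convert the two __tod words to a UTC datetime (microsecond precision)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=nanoseconds // 1000
    )


if __name__ == "__main__":
    print(decode_bdf(0x220))    # source-id from the linkbw.down ereport
    print(decode_bdf(0x7301))   # bdf of .../pci@6/network@0,1
    print("prior:  ", describe_link(0x3, 0x8))
    print("current:", describe_link(0x1, 0x4))
    print(decode_tod(0x5939317F, 0x114F21DF))
```

Under these assumptions, `0x220` decodes to bus 2, device 4, function 0, which is consistent with the `pci@4` detector path, and `0x7301` decodes to device 0, function 1, which is consistent with `network@0,1`. The link below `pci@4` went from x8 at 8.0 GT/s to x4 at 2.5 GT/s; the `link-speed = 0x9c4` value (2500 decimal) in the `pl.re` and `dl.btlp` ereports appears to agree with that. The `__tod` words decode to 11:14:07 UTC on 8 June 2017, so the 13:14:07 timestamps in the log are local time.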
If you see a similar panic, check from the control domain whether any guest LDoms have virtual functions (VFs) assigned, using the following command:

# ldm list-io -l
....
/SYS/IOU0/PCIE1/IOVFC.PF0.VF0 VF pci_1 guestldom01 <<<--- virtual function assigned to guest LDom "guestldom01"

If so, check whether any of these LDoms were rebooted just before the control domain panicked.

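On a system with many I/O devices, picking out the assigned VFs by eye is error-prone. The sketch below is one hedged way to filter saved `ldm list-io -l` output. It assumes the column order NAME, TYPE, BUS, DOMAIN seen in the sample line above, and that the control domain is named `primary`. The function name `guest_vfs` is hypothetical, and every line of `SAMPLE` other than the `guestldom01` VF line is invented for illustration.

```python
def guest_vfs(ldm_output, control_domain="primary"):
    """Map each non-control domain to the VF names assigned to it.

    Assumes whitespace-separated columns NAME, TYPE, BUS, DOMAIN; lines
    with fewer than four fields (for example unassigned VFs or property
    lines printed by -l) are skipped.
    """
    assigned = {}
    for line in ldm_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != "VF":
            continue
        name, domain = fields[0], fields[3]
        if domain != control_domain:
            assigned.setdefault(domain, []).append(name)
    return assigned


# Illustrative input only: the header, PF line, and unassigned VF line
# are made up; the VF0 line is the one shown in this note.
SAMPLE = """\
NAME                            TYPE  BUS    DOMAIN       STATUS
/SYS/IOU0/PCIE1/IOVFC.PF0       PF    pci_1  primary
/SYS/IOU0/PCIE1/IOVFC.PF0.VF0   VF    pci_1  guestldom01
/SYS/IOU0/PCIE1/IOVFC.PF0.VF1   VF    pci_1
"""

if __name__ == "__main__":
    for domain, vfs in guest_vfs(SAMPLE).items():
        print(domain, vfs)
```

To use it on a live system, the output of `ldm list-io -l` could be redirected to a file and read in as text. Any domain the function reports is a candidate to check for a reboot shortly before the panic.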
Changes
Just before the panic, a guest LDom with FC HBA virtual functions assigned was rebooted unexpectedly because of another issue.

Cause
You may be hitting a known issue: in rare situations, PCIe errors caused by stopping and starting guests with VFs can cause the primary domain to panic.
Bug 15906060 : Panic seen on Primary after multiple start/stop of io-domains w/VF's

Solution
Upgrade to Oracle Solaris 11.3.1.5.0 (or greater) on the guest LDom and control domain.
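
When auditing several domains, the "11.3.1.5.0 or greater" condition can be checked mechanically. This is a minimal sketch that assumes you already have each domain's installed release as a dotted numeric string; how that string is obtained is not shown here. The names `parse_version` and `has_fix`, and the sample versions other than 11.3.1.5.0, are hypothetical.

```python
def parse_version(text):
    """Turn a dotted release string such as '11.3.1.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum release named in the Solution section of this note.
FIXED_IN = parse_version("11.3.1.5.0")


def has_fix(installed):
    """Return True if the installed release is at or above the fixed release."""
    return parse_version(installed) >= FIXED_IN


if __name__ == "__main__":
    # Hypothetical example values, apart from the fixed release itself.
    for release in ("11.1.21.4.1", "11.3.1.5.0", "11.3.36.10.0"):
        print(release, has_fix(release))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "11.3.10" as older than "11.3.9". Both the guest LDom and the control domain need to pass the check.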

References
<BUG:15906060> - PANIC SEEN ON PRIMARY AFTER MULTIPLE START/STOP OF IO-DOMAINS W/VF'S
<BUG:21352084> - ROOT DOMAIN PANIC DUE TO FATAL ERROR OCCURED IN PCIE FABRIC

Attachments
This solution has no attachment