Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
1596055.1 : Sun Blade 6000 System: FC HBA or PCI card not detected - power off
Created from <SR 3-7969557323>

Applies to:
Qlogic FC HBA - Version All Versions to All Versions [Release All Releases]
Sun Blade 6000 System - Version All Versions to All Versions [Release All Releases]
Emulex FC HBA - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

The objective of this SR is to avoid repeated replacement of the same PCI card when the problem could be on the server side, in the PCI slot.

Symptoms
This is a Sun Blade 6000 server. After a server reboot, the PCI FC HBA is not detected in prtdiag -v output or in the messages files. The FC HBA is not detected from ILOM either. Resetting the SP on the Blade X6220 did not help. The same FC HBA is not recognized in blade PCI slot 0, but it is recognized in blade PCI slot 1.
When the HBA is detected in PCI slot 1, this can be seen in the following output:
# prtdiag -v
System Configuration: Sun Microsystems Sun Blade X6220 Server Module
....
==== Upgradeable Slots ====================================
In the /var/adm/messages file, after the reboot, qlc and fp instances are created (in the example below: qlc4, qlc5, fp4 and fp5):
Oct 18 13:27:09 server1 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_120012-14 64-bit
....
Oct 18 13:27:26 server1 qlc: [ID 139792 kern.info] Qlogic qlc(4) FCA Driver v20070604-2.21
Oct 18 13:27:26 server1 pcplusmp: [ID 637496 kern.info] pcplusmp: pciex1077,2432 (qlc) instance 4 vector 0x3b ioapic 0xff intin 0xff is bound to cpu 2
Oct 18 13:27:26 server1 qlc: [ID 657001 kern.info] Qlogic qlc(4) WWPN=2100001b321479f6 : WWNN=2000001b321479f6
Oct 18 13:27:28 server1 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(4): Link ONLINE
Oct 18 13:27:28 server1 qlc: [ID 694252 kern.info] NOTICE: qlc(4): Firmware version 4.0.27
Oct 18 13:27:28 server1 pcie_pci: [ID 586369 kern.info] PCIE-device: pci1077,13d@0, qlc4
Oct 18 13:27:28 server1 genunix: [ID 936769 kern.info] qlc4 is /pci@0,0/pci10de,5d@d/pci1077,13d@0
Oct 18 13:27:28 server1 qlc: [ID 139792 kern.info] Qlogic qlc(5) FCA Driver v20070604-2.21
Oct 18 13:27:28 server1 pcplusmp: [ID 398438 kern.info] pcplusmp: pciex1077,2432 (qlc) instance #5 vector 0x3c ioapic 0xff intin 0xff is bound to cpu 2
Oct 18 13:27:28 server1 qlc: [ID 657001 kern.info] Qlogic qlc(5) WWPN=2101001b323479f6 : WWNN=2001001b323479f6
Oct 18 13:27:28 server1 unix: [ID 950921 kern.info] cpu2: x86 (chipid 0x1 AuthenticAMD family 15 model 65 step 3 clock 2000 MHz)
Oct 18 13:27:28 server1 unix: [ID 950921 kern.info] cpu2: Dual-Core AMD Opteron(tm) Processor 2212
Oct 18 13:27:28 server1 unix: [ID 557827 kern.info] cpu2 initialization complete - online
Oct 18 13:27:29 server1 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(5): Link ONLINE
Oct 18 13:27:29 server1 qlc: [ID 694252 kern.info] NOTICE: qlc(5): Firmware version 4.0.27
Oct 18 13:27:29 server1 pcie_pci: [ID 586369 kern.info] PCIE-device: pci1077,13d@0,1, qlc5
Oct 18 13:27:29 server1 genunix: [ID 936769 kern.info] qlc5 is /pci@0,0/pci10de,5d@d/pci1077,13d@0,1
Oct 18 13:27:29 server1 genunix: [ID 936769 kern.info] fp4 is /pci@0,0/pci10de,5d@d/pci1077,13d@0,1/fp@0,0
Oct 18 13:27:29 server1 genunix: [ID 936769 kern.info] fp5 is /pci@0,0/pci10de,5d@d/pci1077,13d@0/fp@0,0
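To check quickly whether any qlc instance attached after a reboot, the messages file can be filtered for the driver banner lines shown in the excerpt above. The following is only a minimal sketch: the function name list_qlc_attached is hypothetical, and it assumes a POSIX shell with POSIX grep, sed and sort (on Solaris 10 these may need to come from /usr/xpg4/bin) and log lines shaped like the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solaris): print the instance number of
# every qlc instance that logged an "FCA Driver" banner, one per line.
# Assumes log lines shaped like the excerpt above, for example:
#   ... Qlogic qlc(4) FCA Driver v20070604-2.21
list_qlc_attached() {
    # Default to the standard Solaris messages file if no path is given.
    msgfile=${1:-/var/adm/messages}
    # Keep only the driver banner lines, extract N from "qlc(N)",
    # then sort numerically and drop duplicates.
    grep 'qlc([0-9][0-9]*) FCA Driver' "$msgfile" |
        sed 's/.*qlc(\([0-9][0-9]*\)) FCA Driver.*/\1/' |
        sort -un
}
```

Called as `list_qlc_attached /var/adm/messages`, it prints one instance number per line. Empty output after a reboot would be consistent with the symptom described here, but it does not by itself show whether the HBA or the slot is at fault.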
Cause
1. The customer may be hitting BUG 15818481 - SUNBT7201015 X6220 blades failing to poweron pci slot after failed mosfet chip.
Check also Internal Doc 2082399.1 - FCO A0359-1: Expired: Conditional: One or both PEMs are not visible at BIOS or OS level after power outage in Sun Blade X6270 and Sun Blade X6270M2 Server Modules.
Or, when the installed OS is Windows, this could be related to a BIOS firmware issue: Bug 23585155 : x6270M2 only detects ONE of TWO Qlogic PEM's on latest firmware with Win2008R2.
or
2. Some component on the server is disabled. For example, on another blade server (a T4-1B), disabling PCI SWITCH1 makes the server boot with one FC HBA missing:
-> set /SYS/MB/PCI-SWITCH1 component_state=Disabled
Set 'component_state' to 'Disabled'
This disables the following PCI/PEM card:
/SYS/MB/PCI-EM1 PCIE SUNW,emlxs-pciex10df,fc40 LPem12002E-S 5.0GTx4
/pci@400/pci@2/pci@0/pci@4/pci@0/pci@3/SUNW,emlxs@0
/SYS/MB/PCI-EM1 PCIE SUNW,emlxs-pciex10df,fc40 LPem12002E-S 5.0GTx4
/pci@400/pci@2/pci@0/pci@4/pci@0/pci@3/SUNW,emlxs@0,1
and the other PCI card remains operational:
emlxs0 is /pci@400/pci@1/pci@0/pci@4/pci@0/pci@3/SUNW,emlxs@0
emlxs1 is /pci@400/pci@1/pci@0/pci@4/pci@0/pci@3/SUNW,emlxs@0,1
The following entries are then no longer seen in 'probe-scsi-all' output:
{0} ok probe-scsi-all
/pci@400/pci@2/pci@0/pci@4/pci@0/pci@3/SUNW,emlxs@0

Solution
1. Physically check the status of the three LEDs on the FC HBA ports. If all three LEDs are OFF, the card has no power. Either the HBA is bad, or the server has a PCI or other issue (most probably BUG 15818481, or some internal device is disabled).
In our experience, when the three LEDs on the FC HBA ports are OFF, it is because the PCI card is not being powered on due to a problem on the Sun Blade server module. See other SRs as examples: 3-7687426561 (Solaris), 3-7965059611 (Linux), 3-7980175761 (Linux).
If nobody has swapped both PEMs, then: if the PEM module in slot 0 doesn't get any power and/or the fault is NOT walking with the PEM ( A1 )
2. Escalate this to the Sun Blade 6000 platform team. If more analysis is required, provide a full snapshot from the blade where the PCI FC HBA card is plugged in. NOTE: A full snapshot resets the blade, so this action requires a maintenance window.
3. Another troubleshooting step is to test the FC HBA in a different system or in another PCI slot, although this may not be possible at many customer sites. Alternatively, swap the PEM (PCI FC HBA) in slot 0 with the PEM (PCI card) in slot 1 and check which FC HBA or PCI card is then visible at OS level.
4. If the FC HBA has been replaced and the problem persists, then the problem is in the PCI slot on the server side, most probably the HW problem described in BUG 15818481.
If BUG 15818481 is being hit, the solution is to replace the Sun Blade X6220 Server Module.
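The reasoning behind the PEM swap in step 3 can be written down as a small decision table. This is a sketch only: the function interpret_pem_swap is hypothetical, it assumes the starting point described in Symptoms (the PEM in slot 0 not detected, the PEM in slot 1 detected), and the "both visible" and "neither visible" outcomes are assumptions added for completeness rather than cases covered by this document.

```shell
#!/bin/sh
# Hypothetical helper: interpret the result of swapping the PEMs between
# slot 0 and slot 1. Assumed starting point: before the swap, the PEM in
# slot 0 was not detected and the PEM in slot 1 was detected.
# Arguments describe what is visible at OS level AFTER the swap:
#   $1 = card in slot 0 visible ("yes" or "no")
#   $2 = card in slot 1 visible ("yes" or "no")
interpret_pem_swap() {
    case "$1,$2" in
        # The known-good PEM now sits in slot 0 and has disappeared.
        no,yes)  echo "fault stays with slot 0: suspect the server module" ;;
        # The suspect PEM now sits in slot 1 and is still missing.
        yes,no)  echo "fault follows the PEM: suspect the FC HBA" ;;
        # Assumed outcomes, not described in this document.
        yes,yes) echo "both visible: fault not reproduced" ;;
        no,no)   echo "neither visible: inconclusive, escalate" ;;
        *)       echo "usage: interpret_pem_swap yes|no yes|no" >&2
                 return 2 ;;
    esac
}
```

For example, `interpret_pem_swap no yes` reports that the fault stays with slot 0. That is the case in which repeated HBA replacement would not help and the server module becomes the suspect.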
References
<BUG:15818481> - SUNBT7201015 X6220 BLADES FAILING TO POWERON PCI SLOT AFTER FAILED MOSFET CHIP
<NOTE:1363756.1> - Sun Blade X6270M2 Server Module Current Product Issues
<NOTE:2082399.1> - FCO A0359-1: Expired: Conditional: One or both PEMs are not visible at BIOS or OS level after power outage in Sun Blade X6270 and Sun Blade X6270M2 Server Modules.
<BUG:23585155> - X6270M2 ONLY DETECTS ONE OF TWO QLOGIC PEM'S ON LATEST FIRMWARE WITH WIN2008R2

Attachments
This solution has no attachment