Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

1926283.1 : FC HBA Replacement Error - cfgadm: Hardware specific failure: configure failed
Created from <SR 3-9567926061>

Applies to:
Sun SPARC Enterprise M5000 Server - Version All Versions and later
Qlogic FC HBA - Version Not Applicable and later
Emulex FC HBA - Version Not Applicable and later
Sun Storage FC HBA - Version Not Applicable and later
Solaris Operating System - Version 8.0 and later
Information in this document applies to any platform.

Symptoms

The M5000 server server01, running Solaris 10, had two FC HBAs installed. The HBA at c2 had to be replaced because of a hardware problem.

FOUND PATH TO 2 LEADVILLE HBA/CNA PORTS IN EXPLORER
C#  INST#  PORT WWN          MODEL            FCODE  STATUS     DEVICE PATH
--  -----  --------          -----            -----  ------     -----------
c0  qlc0   210000e08b94123a  SG-XPCIE1FC-QF4  2.01   CONNECTED  /pci@1,700000/SUNW,qlc@0
c2  qlc1   210000e08b94567b  SG-XPCIE1FC-QF4  2.01   CONNECTED  /pci@2,600000/SUNW,qlc@0   <<- to replace, located in iou#0-pci#3
# cfgadm -al
iou#0-pci#0    unknown      empty        unconfigured unknown
iou#0-pci#1    etherne/hp   connected    configured   ok
iou#0-pci#2    fibre/hp     connected    configured   ok
iou#0-pci#3    fibre/hp     connected    configured   ok        <<-- OK
iou#0-pci#4    unknown      empty        unconfigured unknown
# cfgadm -c unconfigure iou#0-pci#3
iou#0-pci#0    unknown      empty        unconfigured unknown
iou#0-pci#1    etherne/hp   connected    configured   ok
iou#0-pci#2    fibre/hp     connected    configured   ok
iou#0-pci#3    unknown      connected    unconfigured unknown   <<-- now unconfigured
iou#0-pci#4    unknown      empty        unconfigured unknown
Notice. After the unconfigure step, "cfgadm -c disconnect iou#0-pci#3" was not run on the PCI device before the card was removed. This could be the reason for the FMA fault.
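The check that was skipped here can be sketched in shell. This is a minimal illustration, not part of the original procedure: the function names slot_state and safe_to_pull are hypothetical, the functions only parse text in the "cfgadm -al" layout shown above, and the assumption that a disconnected slot reports "disconnected" in the receptacle column should be confirmed on the target system.

```shell
# slot_state AP_ID
# Read `cfgadm -al`-style text on stdin and print the receptacle and
# occupant columns (fields 3 and 4) for the given attachment point.
slot_state() {
    awk -v ap="$1" '$1 == ap { print $3, $4 }'
}

# safe_to_pull AP_ID
# Succeed only when the slot is both disconnected and unconfigured.
# Assumption: "disconnected" is the receptacle state after
# `cfgadm -c disconnect`; verify this on the actual platform.
safe_to_pull() {
    [ "$(slot_state "$1")" = "disconnected unconfigured" ]
}
```

A possible use on a live system would be "cfgadm -al | safe_to_pull 'iou#0-pci#3'" before physically removing the cassette. With the listing above (connected, unconfigured), the check fails, which matches the situation described in the Notice.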
Sep 9 13:52:47 server01 pcie: [ID 126225 kern.notice] NOTICE: pciehpc (px2): card is removed from the slot iou#0-pci#3
Sep 9 13:52:48 server01 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: SUNOS-8000-FU, TYPE: Defect, VER: 1, SEVERITY: Major
Sep 9 13:52:48 server01 EVENT-TIME: Tue Sep 9 13:52:48 MEST 2014
Sep 9 13:52:48 server01 PLATFORM: SUNW,SPARC-Enterprise, CSN: BCF1018012, HOSTNAME: server01
Sep 9 13:52:48 server01 SOURCE: eft, REV: 1.16
Sep 9 13:52:48 server01 EVENT-ID: af575bdb-e6f5-6cf2-9123-c4127e0ff4ba
Sep 9 13:52:48 server01 DESC: The diagnosis engine encountered telemetry for which it was unable to perform a diagnosis. Refer to http://sun.com/msg/SUNOS-8000-FU for more information.
Sep 9 13:52:48 server01 AUTO-RESPONSE: Error reports have been logged for examination by Sun.
Sep 9 13:52:48 server01 IMPACT: Automated diagnosis and response for these events will not occur.
Sep 9 13:52:48 server01 REC-ACTION: Ensure that the latest Solaris Kernel and Predictive Self-Healing (PSH) patches are installed.
bash-3.2$ more fmadm-faulty.out
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Sep 09 13:52:48 af575bdb-e6f5-6cf2-9123-c4127e0ff4ba  SUNOS-8000-FU  Major

Host        : server01
Platform    : SUNW,SPARC-Enterprise     Chassis_id  : BCF1018012
Product_sn  :

Fault class : defect.sunos.eft.undiag.fme
FRU         : None
                  faulty

Description : The diagnosis engine encountered telemetry for which it was unable to perform a diagnosis. Refer to http://sun.com/msg/SUNOS-8000-FU for more information.

Response    : Error reports have been logged for examination by Sun.

Impact      : Automated diagnosis and response for these events will not occur.

Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing (PSH) patches are installed.

bash-3.2$
bash-3.2$ more fmdump-e.out
...
Sep 09 13:52:47.9385 ereport.io.fire.pec.fcp
Sep 09 13:52:47.9385 ereport.io.fire.pec.te
Sep 09 13:52:47.9385 ereport.io.fire.pec.ldn
Sep 09 13:52:47.9385 ereport.io.pci.sserr
Sep 09 13:52:47.9385 ereport.io.pciex.pl.te
Sep 09 13:52:47.9385 ereport.io.pciex.tl.fcp
Sep 09 13:52:47.9385 ereport.io.pci.sserr
Sep 09 13:52:47.9385 ereport.io.pciex.pl.te
Sep 09 13:52:47.9385 ereport.io.pciex.tl.fcp
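When the error log is longer than this excerpt, a per-class count can make the burst of ereports easier to read. The following is a minimal sketch; the function name ereport_summary is hypothetical, and it assumes only that each record carries a token beginning with "ereport.", as in the listing above.

```shell
# ereport_summary
# Read `fmdump -e`-style lines on stdin and print "<count> <class>"
# for each ereport class, highest count first.
ereport_summary() {
    awk '{
            # Count any whitespace-separated token that looks like a class name.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^ereport\./) n[$i]++
         }
         END { for (c in n) print n[c], c }' |
    LC_ALL=C sort -k1,1nr -k2,2
}
```

It could be run as "ereport_summary < fmdump-e.out" against the Explorer file shown above.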
bash-3.2$ more fmdump-eV.out
...
Sep 09 2014 13:52:47.938510300 ereport.io.fire.pec.fcp
nvlist version: 0
        class = ereport.io.fire.pec.fcp
        ena = 0x22b5bd1722002c01
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@2,600000
        (end detector)

        primary = 1
        tlu-uele = 0x1fffff
        tlu-uie = 0x1fffff001fffff
        tlu-uis = 0x100002000
        tlu-uess = 0x100002000
        __ttl = 0x1
        __tod = 0x540eea0f 0x37f087dc
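The detector device-path in this ereport (/pci@2,600000) is the parent of the device path of the HBA that was removed (/pci@2,600000/SUNW,qlc@0), which ties the fault to the hot-removal. A minimal sketch of that cross-check follows; the function names ereport_device_paths and hba_under_path are hypothetical, and the parsing assumes the "device-path = ..." layout shown above.

```shell
# ereport_device_paths
# Read `fmdump -eV`-style text on stdin and print each distinct
# detector device-path once.
ereport_device_paths() {
    awk '$1 == "device-path" && $2 == "=" { print $3 }' | LC_ALL=C sort -u
}

# hba_under_path ROOT_PATH HBA_PATH
# Succeed when HBA_PATH sits below ROOT_PATH in the device tree.
hba_under_path() {
    case "$2" in
        "$1"/*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, "hba_under_path /pci@2,600000 /pci@2,600000/SUNW,qlc@0" succeeds, while the same test against the c0 HBA path /pci@1,700000/SUNW,qlc@0 fails.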
Sep 9 13:59:30 server01 pcie: [ID 661617 kern.notice] NOTICE: pciehpc (px2): card is inserted in the slot iou#0-pci#3
# cfgadm -c configure iou#0-pci#3
cfgadm: Hardware specific failure: configure failed
Sep 9 14:11:04 server01 genunix: [ID 396655 kern.warning] WARNING: (px2): failed to probe the Connection iou#0-pci#3
# cfgadm -f -c configure iou#0-pci#3
cfgadm: Hardware specific failure: configure failed

The new PCI card therefore fails to configure, even with the force option.

- showhardconf.out
    IOU#0 Status:Normal; Ver:0101h; Serial:BF08412AAA ;
        + FRU-Part-Number:CF00541-2240 03 /541-2240-03 ;
        + Type:1;
        DDC_A#0 Status:Normal;
        DDCR Status:Normal;
            DDC_B#0 Status:Normal;
        PCI#1 Name_Property:network; Card_Type:Other;
        PCI#2 Name_Property:SUNW,qlc; Card_Type:Other;
        PCI#3 Name_Property:; Card_Type:Other;     <<< new HBA should be "Name_Property:SUNW,qlc"
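An empty Name_Property, as on PCI#3 above, can be picked out of a saved showhardconf listing with a short filter. This is a minimal sketch; the function name empty_name_slots is hypothetical, it relies only on the "PCI#n Name_Property:...;" layout shown above, and it does not track which IOU a slot belongs to.

```shell
# empty_name_slots
# Read `showhardconf`-style text on stdin and print the PCI slot labels
# whose Name_Property field is empty.
empty_name_slots() {
    awk '$1 ~ /^PCI#[0-9]+$/ && $2 == "Name_Property:;" { print $1 }'
}
```

It could be run as "empty_name_slots < showhardconf.out"; on a system with more than one IOU the matching lines would need to be read in the context of their IOU header.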
At this point, before rebooting the server, try the workaround described under Solution.

Cause

Overall, it appears the issue is due to an incomplete procedure during the active replacement of the PCI cassette.

The active replacement of a PCI cassette requires the following steps:

# cfgadm -c configure Ap_Id
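Only the final configure command is listed at this point. Based on the Notice in the Symptoms section and on the commands used in the Solution, the complete sequence is presumably unconfigure, disconnect, physical swap, then configure. The sketch below prints that sequence for review instead of running it; the function name hotswap_plan is hypothetical, and the order is an assumption that should be checked against the platform service manual.

```shell
# hotswap_plan AP_ID
# Print (do not execute) the assumed command sequence for actively
# replacing the PCI cassette at the given attachment point.
hotswap_plan() {
    ap_id=$1
    printf 'cfgadm -c unconfigure %s\n' "$ap_id"
    printf 'cfgadm -c disconnect %s\n' "$ap_id"
    printf '# replace the PCI cassette now\n'
    printf 'cfgadm -c configure %s\n' "$ap_id"
}
```

Printing the plan first, for example with "hotswap_plan 'iou#0-pci#3'", makes it harder to skip the disconnect step that was missed in this case.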
Solution

A workaround that avoids a server reboot is to configure the FC HBA in another free PCIe slot on the M5000 domain. In this case, PCIe slot 4 was used.

1. Repair the FMA fault
# fmadm repair af575bdb-e6f5-6cf2-9123-c4127e0ff4ba

2. Unconfigure and disconnect the FC HBA
# cfgadm -c unconfigure iou#0-pci#3
# cfgadm -c disconnect iou#0-pci#3

3. Move the cassette with the FC HBA from iou#0-pci#3 to iou#0-pci#4
# cfgadm -c configure iou#0-pci#4
Note. After the cassette was inserted in PCIe slot 4, the XSCF did not initially recognize the Name_Property, and the configure command took a long time (some minutes), but it completed without error.

root@server01:/root# luxadm -e port
/devices/pci@1,700000/SUNW,qlc@0/fp@0,0:devctl                 CONNECTED
/devices/pci@3,700000/fibre-channel@0/fp@0,0:devctl            NOT CONNECTED
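The port listing can be reduced to a one-word status per port, which is convenient when comparing the state before and after the move. This is a minimal sketch; the function name port_status is hypothetical, and it assumes each line ends in either "CONNECTED" or "NOT CONNECTED", as in the output above.

```shell
# port_status
# Read `luxadm -e port`-style lines on stdin and print "up <path>" for
# connected ports and "down <path>" for ports that are not connected.
port_status() {
    awk '$NF == "CONNECTED" {
            if ($(NF - 1) == "NOT") print "down", $1
            else                    print "up", $1
         }'
}
```

On a live system this could be run as "luxadm -e port | port_status". For the output above it reports the pci@1,700000 port as up and the pci@3,700000 port as down.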
Note. If the above steps do not work, a server reboot would be required to recover from this state.

References

<NOTE:1399644.1> - How to Locate FC HBA Manual to Get Oracle Fibre Channel (FC) HBA Port LED Patterns and Other HBA information
<NOTE:1012980.1> - Sun SPARC(R) Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers: General troubleshooting running the snapshot command