Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2125074.1 : Oracle ZFS Storage Appliance: QLT port flapping at the end of resilver causes VMWARE clients to disconnect

Created from <SR 3-11217856511>

Applies to:
Oracle ZFS Storage ZS3-4 - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
Oracle ZFS Storage ZS3-2 - Version All Versions and later
Sun ZFS Storage 7120 - Version All Versions and later
Sun ZFS Storage 7320 - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms
At the end of a resilver, customers notice Fibre Channel connectivity issues that cause their virtual machines to go down. Alerts similar to the following are seen:

2015-8-18 09:49:40 8840dbae-3b03-ee6b-eb2b-b30e6c7b645e The ZFS pool 'pool-0' has finished resilvering. Minor alert
2015-8-18 09:49:36 d184f76b-950c-e672-eaac-dc0f2514bea1 Fibre Channel connectivity via port 21:00:00:24:ff:3b:73:7e (PCIe 0: Port 1) has been lost. Major alert
2015-8-18 09:46:45 52a048dd-cae3-489a-a5a3-ab7c5be1a17b Fibre Channel connectivity via port 21:00:00:24:ff:3e:19:c7 (PCIe 5: Port 2) has been lost.
2015-8-12 03:54:34 3cf97282-9ee8-4201-d327-a8db85981225 Fibre Channel connectivity via port 21:00:00:24:ff:3e:18:c4 (PCIe 0: Port 1) has been established. Minor alert
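
The alert pattern above (Fibre Channel "has been lost" alerts logged close to a "finished resilvering" alert) can be checked mechanically once the alerts have been transcribed. The following is a minimal sketch, not an appliance tool: the function name `fc_losses_near_resilver`, the `(timestamp, message)` tuple layout, and the 5-minute window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def fc_losses_near_resilver(alerts, window=timedelta(minutes=5)):
    """Return FC 'has been lost' alerts within `window` of a resilver completion.

    `alerts` is a list of (datetime, message) tuples transcribed by hand
    from the alert log. The 5-minute default window is an assumption,
    not a value taken from the appliance documentation.
    """
    finished = [ts for ts, msg in alerts if "finished resilvering" in msg]
    return [
        (ts, msg)
        for ts, msg in alerts
        if "Fibre Channel connectivity" in msg
        and "has been lost" in msg
        and any(abs(done - ts) <= window for done in finished)
    ]

# Alerts transcribed from the Symptoms section.
alerts = [
    (datetime(2015, 8, 18, 9, 49, 40),
     "The ZFS pool 'pool-0' has finished resilvering."),
    (datetime(2015, 8, 18, 9, 49, 36),
     "Fibre Channel connectivity via port 21:00:00:24:ff:3b:73:7e "
     "(PCIe 0: Port 1) has been lost."),
    (datetime(2015, 8, 18, 9, 46, 45),
     "Fibre Channel connectivity via port 21:00:00:24:ff:3e:19:c7 "
     "(PCIe 5: Port 2) has been lost."),
    (datetime(2015, 8, 12, 3, 54, 34),
     "Fibre Channel connectivity via port 21:00:00:24:ff:3e:18:c4 "
     "(PCIe 0: Port 1) has been established."),
]

suspect = fc_losses_near_resilver(alerts)
for ts, msg in suspect:
    print(ts, msg)
```

With the alerts listed above, both "has been lost" alerts fall within the window, while the earlier "has been established" alert is ignored. A match only suggests the timing correlation described in this document; it does not by itself confirm the bug.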

Cause
At the end of a resilver, during the final cleanup, the Fibre Channel ports flap very briefly, which causes VMware clients to disconnect.

Solution
This is a known issue and a code fix is available. The alert and log patterns shown further down can be used to confirm that you have run into this bug.
The fix for this issue is available in Appliance Firmware Release OS8.6.0 / 2013.1.6.0 (or later).
The fix for Bug 22599649 changed how the Fibre Channel ports respond to the hosts, from busy to queue full, which increases the time before the clients time out. The attached workflow collects the qlt firmware dump and qlt logs, which can help verify whether this bug was hit.

Alerts will show FC ports going down/up after a resilver:

Tue Aug 18 16:46:45 2015
nvlist version: 0
    class = alert.ak.appliance.nas.fc.port.down
    source = svc:/appliance/kit/akd:default
    slot_label = PCIe 5
    port_name = Port 2
    port_wwn = 21:00:00:24:ff:3e:19:c7
    uuid = 52a048dd-cae3-489a-a5a3-ab7c5be1a17b
    link =
Tue Aug 18 16:49:36 2015
Tue Aug 18 16:49:40 2015
Tue Aug 18 16:49:51 2015
Tue Aug 18 16:49:52 2015

Similar link up/down events appear in the qlt trace:

Feb 11 22:44:14 ATL-ZFS-1 qlt: [ID 882656 kern.notice] NOTICE: qlt0: LINK DOWN, pid(EF), topgy(2h) speed(8h)
Feb 11 22:44:14 ATL-ZFS-1 fct: [ID 580862 kern.notice] NOTICE: qlt0,0 LINK DOWN, portid ef, topology Private Loop,speed 8G
Feb 11 22:47:14 ATL-ZFS-1 fct: [ID 469330 kern.notice] NOTICE: qlt0,0 LINK UP, portid ef, topology Private Loop, speed 8G
Feb 11 23:28:11 ATL-ZFS-1 qlt: [ID 882656 kern.notice] NOTICE: qlt1: LINK DOWN, pid(EF), topgy(2h) speed(8h)
Feb 11 23:28:11 ATL-ZFS-1 fct: [ID 580862 kern.notice] NOTICE: qlt1,0 LINK DOWN, portid ef, topology Private Loop,speed 8G
Feb 11 23:31:03 ATL-ZFS-1 fct: [ID 469330 kern.notice] NOTICE: qlt1,0 LINK UP, portid ef, topology Private Loop, speed 8G
Feb 12 00:50:07 ATL-ZFS-1 qlt: [ID 882656 kern.notice] NOTICE: qlt0: LINK DOWN, pid(EF), topgy(2h) speed(8h)
Feb 12 00:50:07 ATL-ZFS-1 fct: [ID 580862 kern.notice] NOTICE: qlt0,0 LINK DOWN, portid ef, topology Private Loop,speed 8G
Feb 12 00:50:49 ATL-ZFS-1 fct: [ID 469330 kern.notice] NOTICE: qlt0,0 LINK UP, portid ef, topology Private Loop, speed 8G
Feb 12 00:53:12 ATL-ZFS-1 qlt: [ID 882656 kern.notice] NOTICE: qlt0: LINK DOWN, pid(EF), topgy(2h) speed(8h)
Feb 12 00:53:12 ATL-ZFS-1 fct: [ID 580862 kern.notice] NOTICE: qlt0,0 LINK DOWN, portid ef, topology Private Loop,speed 8G
Feb 12 00:55:14 ATL-ZFS-1 fct: [ID 469330 kern.notice] NOTICE: qlt0,0 LINK UP, portid ef, topology Private Loop, speed 8G
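
When the log is long, the LINK DOWN / LINK UP pairs in the trace above can be extracted with a short script. The sketch below is illustrative only and is not the attached workflow: the function name `find_flaps` is hypothetical, the regular expression is written against the `fct` notices exactly as they appear in this document, and the `year` parameter is an assumption needed because syslog timestamps carry no year.

```python
import re
from datetime import datetime

# Matches the fct LINK DOWN / LINK UP notices as they appear in this document.
# The qlt-driver lines ("qlt0: LINK DOWN, pid(EF) ...") are deliberately skipped,
# because each of them is mirrored by an fct line with the same timestamp.
FCT_EVENT = re.compile(
    r"(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ fct: .*?"
    r"NOTICE: (?P<port>qlt\d+,\d+) LINK (?P<state>DOWN|UP)"
)

def find_flaps(text, year=2000):
    """Pair each fct LINK DOWN with the next LINK UP on the same port.

    Returns a list of (port, down_time, seconds_down) tuples. `year` is a
    placeholder: syslog lines have no year, so supply the real one if known.
    """
    down_since = {}
    flaps = []
    for m in FCT_EVENT.finditer(text):
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        port = m["port"]
        if m["state"] == "DOWN":
            # Keep the earliest DOWN if several arrive before an UP.
            down_since.setdefault(port, ts)
        elif port in down_since:
            start = down_since.pop(port)
            flaps.append((port, start, (ts - start).total_seconds()))
    return flaps

sample = (
    "Feb 11 22:44:14 ATL-ZFS-1 qlt: [ID 882656 kern.notice] NOTICE: qlt0: LINK DOWN, pid(EF), topgy(2h) speed(8h)\n"
    "Feb 11 22:44:14 ATL-ZFS-1 fct: [ID 580862 kern.notice] NOTICE: qlt0,0 LINK DOWN, portid ef, topology Private Loop,speed 8G\n"
    "Feb 11 22:47:14 ATL-ZFS-1 fct: [ID 469330 kern.notice] NOTICE: qlt0,0 LINK UP, portid ef, topology Private Loop, speed 8G\n"
    "Feb 11 23:28:11 ATL-ZFS-1 fct: [ID 580862 kern.notice] NOTICE: qlt1,0 LINK DOWN, portid ef, topology Private Loop,speed 8G\n"
    "Feb 11 23:31:03 ATL-ZFS-1 fct: [ID 469330 kern.notice] NOTICE: qlt1,0 LINK UP, portid ef, topology Private Loop, speed 8G\n"
)

flaps = find_flaps(sample)
for port, start, seconds in flaps:
    print(f"{port} down at {start:%b %d %H:%M:%S} for {seconds:.0f}s")
```

For the first two flaps in the trace, this reports qlt0,0 down for 180 seconds (22:44:14 to 22:47:14) and qlt1,0 down for 172 seconds (23:28:11 to 23:31:03). Comparing the down times against the resilver completion time is still a manual step.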

References
<BUG:22080255> - ZFSSA BACKEND "RESILVERING" CAUSES FC FRONT END PORT INACCESIBLE TO REMOTE HOST
<BUG:21787694> - BACKPORT BUG 22518671 TO AK-2013-REL
<BUG:22599649> - QLT PORT FLAPPING AFTER RESILVERING COMPLETION DUE TO EXCHG NOT BEING TERMINATED
<BUG:20639544> - ADD QLT LOGS IN AK BUNDLE
<BUG:21071219> - COMSTAR FRAMEWORK SHOULD ALLOW A CMD TO ABORT PRIOR TO ZFS I/O COMPLETION

Attachments
This solution has no attachment