Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution
2003660.1: Oracle ZFS Storage Appliance: Solaris client ZFS pool (constructed from FC LUNs exported from ZFS-SA) becomes suspended due to appliance takeover.
Created from <SR 3-10406486941>

Applies to:
Sun Storage 7410 Unified Storage System - Version All Versions to All Versions [Release All Releases]
Sun Storage 7310 Unified Storage System - Version All Versions to All Versions [Release All Releases]
Sun Storage 7210 Unified Storage System - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS3-2 - Version All Versions to All Versions [Release All Releases]
Sun ZFS Storage 7320 - Version All Versions to All Versions [Release All Releases]
7000 Appliance OS (Fishworks)

Symptoms
A Solaris 11.2 client with STMS/MPxIO configured reports its zpool as 'suspended' when a takeover occurs on the ZFS appliance. The Fibre Channel (FC) LUNs are mirrored by ZFS on the Solaris client.

# zpool status data01
  pool: data01
 state: SUSPENDED
status: One or more devices are unavailable in response to IO failures.
        The pool is suspended.
action: Make sure the affected devices are connected, then run 'zpool clear' or
        'fmadm repaired'.
        Run 'zpool status -v' to see device specific details.
   see: http://support.oracle.com/msg/ZFS-8000-HC
  scan: resilvered 5.57M in 0h0m with 0 errors on Tue Mar 24 17:28:47 2015
config:

        NAME                                       STATE      READ WRITE CKSUM
        data01                                     SUSPENDED     0   110     0
          mirror-0                                 ONLINE        0   130     0
            c0t600144F0B97C139B00005510F3350002d0  ONLINE        0   140     0
            c0t600144F0D232395600005510F2A90001d0  ONLINE        0   138     0

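For scripted monitoring, the pool state can be extracted from the `zpool status` output shown above. The following is a minimal sketch only: the function name `pool_state` is hypothetical, and it assumes the multi-line `state:` field layout that `zpool status` prints in this note.

```shell
# pool_state: hypothetical helper (not part of Solaris).
# Reads 'zpool status <pool>' output on stdin and prints the value of the
# first 'state:' field, for example SUSPENDED or ONLINE.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Example use on a live client (requires the zpool command):
#   zpool status data01 | pool_state
```

A monitoring script could compare the printed value against `SUSPENDED` and raise an alert before running 'zpool clear' as the status text advises.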
The ZFS appliance shows a short takeover/failback time.
The ZFS-SA exported FC LUNs are configured correctly, with a target group and a host group configured.

Mar 24 17:19:13 ZFS-8000-NX fault.fs.zfs.vdev.probe_failure 600144f0b97c139b00005510f3350002 <<--
Mar 24 17:19:13 ZFS-8000-FD fault.fs.zfs.vdev.io 600144f0b97c139b00005510f3350002
Mar 24 17:19:14 ZFS-8000-NX fault.fs.zfs.vdev.probe_failure n600144f0d232395600005510f2a90001 <<--
Mar 24 17:19:15 ZFS-8000-FD fault.fs.zfs.vdev.io n600144f0d232395600005510f2a90001
Mar 24 17:31:27 ZFS-8000-8A fault.fs.zfs.object.corrupt_data pool_name=data01
Mar 24 17:31:29 ZFS-8000-HC fault.fs.zfs.io_failure_wait pool_name=data01 <<-- suspended I/O

The 'rm.ak' and 'debug.sys' logs show:

Tue Mar 24 06:19:09 2015: takeover completed in 4.107s
Mar 24 06:19:10 BRSUA2-SAN-HEAD02 fct: [ID 469330 kern.notice] NOTICE: qlt0,0 LINK UP, portid ef, topology Private Loop, speed 8G.
Tue Mar 24 06:27:58 2015: ak_rm_fail_back phase 1 complete in 2.997s
Tue Mar 24 06:28:03 2015: ak_rm_fail_back phase 2 complete in 4.706s
Mar 24 06:28:04 brsua2-san-head01 fct: [ID 469330 kern.notice] NOTICE: qlt0,0 LINK UP, portid ef, topology Private Loop, speed 8G.

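The "topology Private Loop" text in the fct LINK UP messages is the useful indicator in these logs. As a rough sketch, such lines can be filtered out of saved log text as follows. The function name `loop_links` is hypothetical, and the match assumes the message wording shown in this note.

```shell
# loop_links: hypothetical helper (not an appliance or Solaris command).
# Reads log text on stdin and prints only the "LINK UP" lines that report
# a loop topology, such as "topology Private Loop".
loop_links() {
    grep 'LINK UP' | grep -i 'topology.*loop'
}

# Example use against a saved copy of the log (the file name is an assumption):
#   loop_links < debug.sys
```

Any line printed suggests that the port negotiated a loop topology rather than point-to-point or fabric.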

Changes
FC is directly connected to the ZFS appliance without an FC switch.
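
Cause
Connectivity options: Point-to-Point (FC-P2P) and switch attach (FC-SW) connectivity are supported unless specifically noted otherwise. No support is provided for arbitrated loop (FC-AL) connectivity.
The FC links in the logs above came up with 'topology Private Loop', which is an arbitrated loop (FC-AL) topology.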

Solution
FC direct connection supportability is described under Connectivity options: Point-to-Point (FC-P2P) and switch attach (FC-SW) connectivity are supported unless specifically noted otherwise. No support is provided for arbitrated loop (FC-AL) connectivity.
The 16Gb QLogic FC HBA documentation indicates no support for FC-AL connections at 16 Gb. Topologies supported: FC-SW switched fabric (N_Port), FC-AL arbitrated loop (not supported at 16 Gb) (NL_Port), and point-to-point (N_Port).
http://docs.oracle.com/cd/E24651_01/html/E24460/z40003111016271.html#scrolltoc
In this case, the Solaris initiator should be forced to use Fibre Channel Point-to-Point (FC-P2P). Set connection-options=1 in /kernel/drv/qlc.conf.
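To confirm what is currently configured, the active `connection-options` entry can be read back from the driver configuration file. The sketch below is illustrative only: the function name `qlc_connection_option` is hypothetical, and it assumes the usual `name=value;` entry syntax of qlc.conf, where the documented values are commonly 0 (loop only), 1 (point-to-point only) and 2 (loop preferred, otherwise point-to-point).

```shell
# qlc_connection_option: hypothetical helper (not shipped with Solaris).
# Prints the value of the last uncommented connection-options entry in the
# given file; defaults to /kernel/drv/qlc.conf when no argument is passed.
qlc_connection_option() {
    conf=${1:-/kernel/drv/qlc.conf}
    # Lines starting with '#' are comments and do not match the pattern.
    sed -n 's/^[[:space:]]*connection-options[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*;.*/\1/p' "$conf" | tail -n 1
}

# Example use:
#   qlc_connection_option            # reads /kernel/drv/qlc.conf
#   qlc_connection_option ./qlc.conf # reads a local copy
```

A printed value of 1 would correspond to the point-to-point setting recommended here.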
An I/O error should be issued only after an appropriate timeout, one long enough to cover port flaps.
Update the Solaris client to at least SRU 11.2.9.5.0. The workaround and best practice is to use FC switches.

References
<BUG:20802234> - LUNS PRESENTED TO SOLARIS CLIENT BECOME SUSPENDED DURING ZFS APPLIANCE TAKEOVER
<NOTE:1434184.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Fibre-Channel Problems
<NOTE:1672221.1> - Oracle Solaris 11.2 Support Repository Updates (SRU) Index
http://www.oracle.com/technetwork/server-storage/sun-unified-storage/documentation/o12-019-fclun-7000-rs-1559284.pdf
<NOTE:1402545.1> - Sun Storage 7000 Unified Storage System: How to Troubleshoot Cluster Problems
<BUG:18969626> - I/O STOPS WHEN OTHER PATH PULLED OUT AND INSERTED AFTER A PATH IS DEGRADED.