FC - Drives Connected to a FC HBA Can No Longer Be Seen on the Backup Application

Asset ID:	1-72-1452689.1
Update Date:	2017-10-30
Keywords:

Solution Type: Problem Resolution Sure

Solution 1452689.1 : FC - Drives Connected to a FC HBA Can No Longer Be Seen on the Backup Application

Applies to:

Sun StorageTek SL48 Tape Library - Version Not Applicable to Not Applicable [Release N/A]
Sun StorageTek SL24 Tape Autoloader - Version Not Applicable to Not Applicable [Release N/A]
Solaris Operating System - Version 10 3/05 to 10 8/11 U10 [Release 10.0]
Sun StorageTek SL500 Modular Library System - Version Not Applicable to Not Applicable [Release N/A]
Information in this document applies to any platform.

Symptoms

NOTE: In this case the configuration was an SL500 library with IBM LTO drives. Other drive types and other direct-attached libraries (SL24, SL48, etc.) can be assumed to exhibit the same behavior.

Drives are no longer visible to the backup application. Link disconnection messages are logged on the Solaris host.

SAMPLE:

Apr 18 11:05:22 sia scsi: [ID 243001 kern.warning] WARNING: /ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0 (fcp3):
Apr 18 11:05:22 sia PLOGI to D_ID=0x50226 failed: State:ELS is in Progress, Reason:Undefined. Giving up
Apr 18 11:21:17 sia emlxs: [ID 349649 kern.info] [ 5.0606]emlxs0: NOTICE: 730: Link reset. (Resetting link...) ----------> The FC link going down and up is an indicator of this problem.
Apr 18 11:21:17 sia emlxs: [ID 349649 kern.info] [ 5.0334]emlxs0: NOTICE: 710: Link down.
Apr 18 11:21:17 sia emlxs: [ID 349649 kern.info] [ 5.0646]emlxs0: NOTICE: 730: Link reset.
Apr 18 11:21:19 sia emlxs: [ID 349649 kern.info] [ 5.055E]emlxs0: NOTICE: 720: Link up. (2Gb, fabric, initiator)
Apr 18 11:21:44 sia emlxs: [ID 349649 kern.info] [ 1.037F]emlxs0: NOTICE: 612: Node missing. (FCP2 device (did=050326) missing. Flushing...)
Apr 18 11:21:44 sia last message repeated 11 times
Apr 18 11:23:30 sia scsi: [ID 583861 kern.info] st17 at fp3: unit-address w500104f0008d832b,0: 50226
Apr 18 11:23:30 sia genunix: [ID 936769 kern.info] st17 is /ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0/st@w500104f0008d832b,0
Apr 18 11:23:31 sia scsi: [ID 365881 kern.info] /ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0/st@w500104f0008d832b,0 (st17):
Apr 18 11:23:31 sia <>
Apr 18 11:23:31 sia genunix: [ID 408114 kern.info] /ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0/st@w500104f0008d832b,0 (st17) online ---------> This path shows the PWWN (500104f0008d832b) of a drive the customer could no longer see.
Apr 18 11:23:45 sia scsi: [ID 243001 kern.info] /ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0 (fcp3):
Apr 18 11:23:45 sia offlining lun=0 (trace=0), target=50326 (trace=2800004)
Apr 18 11:23:45 sia scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0/st@w500104f0008d832e,0 (st16): --------> This path shows the PWWN (500104f0008d832e) of a drive the customer could no longer see.
Apr 18 11:23:45 sia transport rejected
Apr 18 11:24:38 sia scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1e,700000/pci@1/scsi@2,1 (glm1):
Apr 18 11:24:38 sia invalid reselection (6.0)
Apr 18 11:24:38 sia genunix: [ID 408822 kern.info] NOTICE: glm1: fault detected in device; service still available
Apr 18 11:24:38 sia genunix: [ID 611667 kern.info] NOTICE: glm1: invalid reselection (6.0)
Apr 18 11:24:38 sia scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1e,700000/pci@1/scsi@2,1 (glm1)
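
The link events in a sample like the one above can be counted with a small filter. The following is a minimal sketch, not part of the original solution: the function name link_events is hypothetical, and the patterns match only the emlxs NOTICE codes that appear in this sample (710, 720, 730, 612). On Solaris 10, nawk or /usr/xpg4/bin/awk may be needed in place of awk.

```shell
#!/bin/sh
# Count emlxs link events in syslog-format text read from stdin.
# The NOTICE codes below are the ones seen in the sample above;
# other driver versions may word these messages differently.
link_events() {
    awk '
        /emlxs[0-9]+: NOTICE: 710: Link down/    { down++ }
        /emlxs[0-9]+: NOTICE: 720: Link up/      { up++ }
        /emlxs[0-9]+: NOTICE: 730: Link reset/   { reset++ }
        /emlxs[0-9]+: NOTICE: 612: Node missing/ { missing++ }
        END {
            printf "link_down=%d link_up=%d link_reset=%d node_missing=%d\n",
                   down, up, reset, missing
        }
    '
}

# Usage on a live host (assumes the default Solaris log location):
#   link_events < /var/adm/messages
```

Repeated link down/reset/up cycles followed by "Node missing" entries would be consistent with the symptom described here.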

ALSO:

The affected drives show as "connected configured failed" in the cfgadm -al output (the sample below is in the verbose cfgadm -alv layout):

c9::500104f0008d832b           connected    configured   failed     IBM ULTRIUM-TD5
unavailable tape         n        /devices/ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0:fc::500104f0008d832b
c9::500104f0008d832e           connected    configured   failed     IBM ULTRIUM-TD5
unavailable tape         n        /devices/ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0:fc::500104f0008d832e

Also note in the cfgadm -alv output that:

1. Drives on the same channel (c9 in this case) may be fine:

c9::500104f0008d8328           connected    configured   unknown    IBM ULTRIUM-TD5
unavailable tape         n        /devices/ssm@0,0/pci@1f,600000/SUNW,emlxs@2/fp@0,0:fc::500104f0008d8328

2. Other devices connected to the host, such as disk arrays, may also show this "configured failed" status:

c6::500601633b202e14           connected    configured   failed     DGC LUNZ
unavailable disk         n        /devices/ssm@0,0/pci@1e,600000/SUNW,qlc@1/fp@0,0:fc::500601633b202e14
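
To pick the failed attachment points out of a long cfgadm listing, a filter along these lines may help. This is a sketch only: failed_ap_ids is a hypothetical name, and the field positions are an assumption based on the Condition column being the fifth field in cfgadm -al output and the fourth field in the verbose layout shown above.

```shell
#!/bin/sh
# Print the Ap_Id of every FC device whose Condition is "failed".
# Reads cfgadm -al or cfgadm -alv output from stdin.
# Assumption: Condition is field 5 (-al) or field 4 (-alv).
failed_ap_ids() {
    awk '$1 ~ /^c[0-9]+::/ && ($4 == "failed" || $5 == "failed") { print $1 }'
}

# Usage on a live host:
#   cfgadm -al | failed_ap_ids
```

The WWN after the "::" in each printed Ap_Id can then be compared with the PWWNs in the st device paths in the system log.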

Changes

A SAN switch or the library was rebooted, or cables were pulled.

Cause

- Bug 6793438, fixed in Patch-ID 144188-02.

- With Sun/Emulex FC HBA drivers and direct-attached tape libraries and drives, the link may not recover after routine administration such as a library reboot or a cable pull. A SAN switch reboot is thought to cause the same condition.
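
Whether the fix is already present can be checked from the installed patch list. The sketch below assumes that showrev -p prints one line per patch beginning with "Patch: <id>-<rev>"; the function names installed_rev and has_fix are hypothetical. It does not account for a later patch that might obsolete 144188, and it needs an awk that supports -v (nawk or /usr/xpg4/bin/awk on Solaris 10).

```shell
#!/bin/sh
# Print the highest installed revision of patch base $1, as a plain number.
# Reads "showrev -p" style output from stdin; prints nothing if not found.
installed_rev() {
    awk -v base="$1" '
        $1 == "Patch:" && split($2, p, "-") == 2 && p[1] == base {
            found = 1
            if (p[2] + 0 > max) max = p[2] + 0
        }
        END { if (found) printf "%d\n", max }
    '
}

# Succeed if patch 144188 is installed at revision 02 or later.
has_fix() {
    rev=$(installed_rev 144188)
    [ -n "$rev" ] && [ "$rev" -ge 2 ]
}

# Usage on a live host:
#   showrev -p | has_fix && echo "144188-02 or later is installed"
```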

Solution

1. Reinitialize the HBA port:

luxadm -e offline pathname (example: /devices/ssm@0,0/pci@1f,600000/SUNW,emlxs@2)

luxadm -e online pathname (example: /devices/ssm@0,0/pci@1f,600000/SUNW,emlxs@2)

2. Pull the fiber cables from the HBA and leave them unplugged for 2 minutes, then reconnect them.

3. Reboot the library and the host.
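
Step 1 above can be wrapped in a small helper that prints the commands by default and runs them only on request. This is a sketch under assumptions: reinit_port, RUN and PAUSE are hypothetical names, the 10-second pause between offline and online is an arbitrary choice, and the final cfgadm -al is added only to show the resulting device state.

```shell
#!/bin/sh
# Run a command when RUN=1; otherwise just print it (dry run).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Offline and then online one HBA port, then list attachment points.
# $1 is the HBA port pathname, as in step 1 of the solution.
reinit_port() {
    path=$1
    if [ -z "$path" ]; then
        echo "usage: reinit_port <hba-port-pathname>" >&2
        return 2
    fi
    run luxadm -e offline "$path" || return 1
    run sleep "${PAUSE:-10}"      # assumed settle time, not from the source
    run luxadm -e online "$path" || return 1
    run cfgadm -al                # check for remaining "failed" entries
}

# Dry run:   reinit_port /devices/ssm@0,0/pci@1f,600000/SUNW,emlxs@2
# Execute:   RUN=1 reinit_port /devices/ssm@0,0/pci@1f,600000/SUNW,emlxs@2
```

Taking the port offline interrupts all I/O through that HBA port, so the dry run is the default.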

Attachments

This solution has no attachment