Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1487742.1
Update Date: 2016-09-06

Solution Type  Problem Resolution Sure

Solution 1487742.1: QLogic FC HBA Link Bouncing Online and Offline at 45-48 Second Intervals


Related Items
  • Sun SPARC Enterprise M4000 Server
  • Qlogic FC HBA
Related Categories
  • PLA-Support>Sun Systems>DISK>HBA>SN-DK: FC HBA

Applies to:

Sun SPARC Enterprise M4000 Server - Version All Versions to All Versions [Release All Releases]
Qlogic FC HBA - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

A single qlc HBA instance is observed cycling between OFFLINE and ONLINE repeatedly at consistent intervals. Each time the link goes OFFLINE, it returns ONLINE 45 to 48 seconds later, and in this example it drops OFFLINE again within a few seconds. The most common gap is 46 seconds, but gaps of 45, 47, or 48 seconds also occur. In the following example, the qlc1 port bounces repeatedly:

Jun 11 21:20:54 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:21:40 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 46 second gap
Jun 11 21:21:44 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:22:31 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 47 second gap
Jun 11 21:22:35 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:23:21 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 46 second gap
Jun 11 21:23:26 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:24:13 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 47 second gap
Jun 11 21:24:18 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:25:03 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 45 second gap
Jun 11 21:25:09 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:25:55 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 46 second gap
Jun 11 21:25:59 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:26:45 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 46 second gap
Jun 11 21:26:52 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:27:38 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE   <-- 46 second gap
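
The gap arithmetic above can be reproduced with a short script. The following is a minimal Python sketch, not an Oracle-supplied tool: the function name offline_gaps, the regular expression, and the year parameter are assumptions made for illustration (syslog timestamps carry no year, so one has to be supplied, and a pair that spans a year boundary would be miscalculated).

```python
import re
from datetime import datetime

# Matches qlc link-state lines such as:
# Jun 11 21:20:54 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
LINK_RE = re.compile(
    r"(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ qlc: .*"
    r"qlc\((?P<inst>\d+)\): Link (?P<state>ONLINE|OFFLINE)"
)

def offline_gaps(lines, year=2000):
    """Return {qlc instance: [seconds between each OFFLINE and the next ONLINE]}.

    The year is an assumption because syslog lines omit it; a leap year is
    used as the default so that a Feb 29 timestamp still parses.
    """
    went_offline = {}  # instance -> time of the most recent OFFLINE
    gaps = {}          # instance -> list of OFFLINE-to-ONLINE gaps in seconds
    for line in lines:
        m = LINK_RE.search(line)
        if not m:
            continue  # not a qlc link-state message
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        inst = int(m["inst"])
        if m["state"] == "OFFLINE":
            went_offline[inst] = ts
        elif inst in went_offline:
            delta = ts - went_offline.pop(inst)
            gaps.setdefault(inst, []).append(int(delta.total_seconds()))
    return gaps
```

One possible use is to pass the lines of the system messages file (typically /var/adm/messages on Solaris) to offline_gaps() and inspect the list returned for each instance. For the excerpt above, every gap reported for instance 1 falls in the 45-48 second range.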

This may continue indefinitely, or the port may eventually go offline and stay offline. That results in an "OFFLINE timeout" condition, which may cause the LUNs on that path to be offlined and the multipathing software to mark the path as degraded. In the example below, the timeout is logged 90 seconds after the final Link OFFLINE message. This example uses native Solaris multipathing (MPxIO); multipathing software from third-party vendors may log different messages after the "OFFLINE timeout":

Jun 11 21:35:26 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:36:12 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link ONLINE
Jun 11 21:36:17 host04 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Link OFFLINE
Jun 11 21:37:47 host04 fctl: [ID 517869 kern.warning] WARNING: fp(0)::OFFLINE timeout
Jun 11 21:38:07 host04 scsi: [ID 243001 kern.info] /pci@8,600000/SUNW,qlc@1/fp@0,0 (fcp0):
Jun 11 21:38:07 host04 offlining lun=d (trace=0), target=750700 (trace=2800004)
Jun 11 21:38:07 host04 scsi: [ID 243001 kern.info] /pci@8,600000/SUNW,qlc@1/fp@0,0 (fcp0):
Jun 11 21:38:07 host04 offlining lun=c (trace=0), target=750700 (trace=2800004)
Jun 11 21:38:07 host04 scsi: [ID 243001 kern.info] /pci@8,600000/SUNW,qlc@1/fp@0,0 (fcp0):
Jun 11 21:38:07 host04 offlining lun=b (trace=0), target=750700 (trace=2800004)
Jun 11 21:38:14 host04 genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g60060e8005448e000000448e0000b163 (ssd35) multipath status: degraded, 
   path /pci@8,600000/SUNW,qlc@1/fp@0,0 (fp0) to target address: w50060e8005448e5c,d is offline Load balancing: round-robin
Jun 11 21:38:14 host04 genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g60060e8005448e000000448e0000b162 (ssd36) multipath status: degraded, 
   path /pci@8,600000/SUNW,qlc@1/fp@0,0 (fp0) to target address: w50060e8005448e5c,c is offline Load balancing: round-robin
Jun 11 21:38:14 host04 genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g60060e8005448e000000448e0000b161 (ssd37) multipath status: degraded, 
   path /pci@8,600000/SUNW,qlc@1/fp@0,0 (fp0) to target address: w50060e8005448e5c,b is offline Load balancing: round-robin

 

Cause

A link can also cycle up and down for other reasons (cable, switch port, driver bug, and so on), but this particular signature, with the link bouncing at 45-48 second intervals, is caused by a bad HBA port. In these cases, the HBA should be replaced.
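
The signature described above can be expressed as a simple check on a list of OFFLINE-to-ONLINE gaps, in seconds, for one qlc instance. This is a minimal Python sketch under stated assumptions: the function name matches_bad_port_signature and the min_cycles threshold are hypothetical choices for illustration and are not defined by this note. Only the 45-48 second window comes from the text.

```python
def matches_bad_port_signature(gaps, low=45, high=48, min_cycles=3):
    """Return True if the gaps look like the 45-48 second bounce signature.

    gaps       -- OFFLINE-to-ONLINE gaps in seconds for one qlc instance
    low, high  -- inclusive window taken from this note (45-48 seconds)
    min_cycles -- hypothetical minimum number of bounces required before
                  the pattern is treated as repeating (an assumption)
    """
    if len(gaps) < min_cycles:
        return False  # too few bounces to call it a repeating pattern
    return all(low <= gap <= high for gap in gaps)
```

A result of True only indicates that the log is consistent with this signature. Gaps outside the window point toward the other causes listed above, such as the cable, the switch port, or a driver bug.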

This issue was originally logged as Sun CR 7091115. The CR number is recorded here because a My Oracle Support (MOS) bug prevents it from being added under the "References" section.

Solution

Replace the HBA that contains the affected qlc instance. To identify the physical card and its slot location, see Note 1282491.1 in the References section.

References

<NOTE:1282491.1> - How to Identify Oracle[TM] Branded Fibre Channel (FC) HBA, CNA/FCoE and Universal 16GB HBA Cards and Their Slot Locations
<BUG:15742033> - SUNBT7091115 4GB QLOGIC HBA PORT BOUNCING OFFLINE/ONLINE AT 45-46 SECOND INTERVA

  Copyright © 2018 Oracle, Inc.  All rights reserved.