Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-77-1020216.1
Update Date: 2012-07-26
Keywords:

Solution Type  Sun Alert Sure

Solution  1020216.1 :   Issue With Brocade Firmware May Cause a Switch Panic  


Related Items
  • Brocade 300 Switch
  • Brocade 5100 Switch
  • Brocade DCX Backbone
  • Brocade DCX-4S Backbone
  • Brocade 48000 Director
  • Brocade 5300 Switch
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: Sun Alert
  • _Old GCS Categories>Sun Microsystems>Sun Alert>Release Phase>Resolved

PreviouslyPublishedAs
254408


Bug Id
<BUG: 15542634>

Product
Brocade 300 Switch
Brocade 48000 Director
Brocade DCX Backbone
Brocade DCX-4S Backbone
Brocade 5100 Switch
Brocade 5300 Switch

Date of Workaround Release
11-Mar-2009

Date of Resolved Release
19-Mar-2009

Issue With Brocade Firmware May Cause a Switch Panic

1. Impact

Under certain (rare) conditions, a firmware issue with the Brocade 8Gb switch or director will cause the switch to panic and reboot.  This will disrupt I/O passing through the switch, with indeterminate and unpredictable effects on hosts and target devices.

2. Contributing Factors

This issue can occur on the following platforms:
  • Brocade 300 Switch
  • Brocade 5100 Switch
  • Brocade 5300 Switch
  • Brocade DCX Backbone Director
  • Brocade DCX-4S Backbone Director
  • Brocade 48000 Director (with 8Gbit/sec port blades installed)
when running a Brocade FOS version earlier than 6.1.0h, 6.1.1d, or 6.2.0c (please see the "Resolution" section below).
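
One way to read the version condition above is as a comparison within each release train. The following is a minimal sketch, not a Brocade or Sun tool. It assumes the version string has the form MAJOR.MINOR.MAINT plus an optional single patch letter (for example "v6.1.0g"), and it only classifies the three release trains named in this alert; anything else is reported as "unknown" rather than guessed. The function name fos_status is hypothetical.

```shell
#!/bin/sh
# Classify a FOS version string against the fixed releases named in this alert.
# Prints "fixed", "affected", or "unknown" (release train not listed here).
# Assumption: versions look like [v]MAJOR.MINOR.MAINT[letter], e.g. v6.1.0g.
fos_status() {
    ver=${1#v}                          # drop an optional leading "v"
    case $ver in
        6.1.0*) train=6.1.0; fixed_letter=h ;;
        6.1.1*) train=6.1.1; fixed_letter=d ;;
        6.2.0*) train=6.2.0; fixed_letter=c ;;
        *) echo unknown; return 0 ;;
    esac
    letter=${ver#"$train"}              # patch letter, empty if there is none
    alphabet=abcdefghijklmnopqrstuvwxyz
    before=${alphabet%%"$fixed_letter"*}  # letters that precede the fixed one
    case $letter in
        '') echo affected ;;            # base release predates every patch letter
        [a-z])
            case $before in
                *"$letter"*) echo affected ;;
                *) echo fixed ;;
            esac ;;
        *) echo unknown ;;              # unexpected suffix, do not guess
    esac
}
```

For example, `fos_status v6.1.0g` prints "affected" and `fos_status 6.2.0c` prints "fixed". A result of "unknown" should be resolved by consulting the "Resolution" section or Sun Services.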

This condition occurs when the switch's control processor encounters a parity error and raises an interrupt to deal with the error. It successfully handles the parity error but never clears the interrupt, resulting in excessively high CPU usage. If this condition exists when the "hafailover" command is executed, the switch will panic.

The panic is triggered when "hafailover" is attempted on the switch or director, either while the director is recovering from an error condition or during a firmware upgrade. In many cases, the switch administrator will expect the firmware upgrade to be non-disruptive and will execute it on a live switch.

3. Symptoms

The switch administrator can log into the switch as the "root" user and execute the "top" command to see if the switch might be affected by the error condition. The best approach is to gather several 3-second samples of CPU usage data over a period of approximately 1 minute. This can be accomplished using the following script at the command line:
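# Take four sets of "top" samples, pausing 15 seconds between sets.
for i in 0 1 2 3
do
    top -n 3
    echo "Cycle $i complete"
    # Skip the final sleep after the last cycle.
    [ "$i" = 3 ] && break
    sleep 15
done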
Example output from the above:
top - 00:19:17 up 14 days,  7:15,  1 user,  load average: 0.19, 0.08, 0.02
Tasks:  77 total,   1 running,  76 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    503944k total,   265516k used,   238428k free,    20704k buffers
Swap:        0k total,        0k used,        0k free,   120296k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
23831 root      16   0  2172 1016  832 R  0.7  0.2   0:00.10 top
    1 root      16   0  1696  592  524 S  0.0  0.1   0:00.36 init
    2 root      34  19     0    0    0 S  0.0  0.0   0:00.25 ksoftirqd/0
    3 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/0
Examine the "%hi" value in the "Cpu(s)" summary line and the "%CPU" value of the "ksoftirqd" process.  If either value is 40% or higher, the switch is likely experiencing this issue.
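
The threshold check described above can also be applied mechanically to captured output. The following is a minimal sketch, assuming the "top" output has the same layout as the example (a "Cpu(s):" summary line containing a field such as "0.0%hi," and the %CPU value in the 9th column of each process line). The function name check_top_sample is hypothetical and the script is not the one provided by Sun Services.

```shell
#!/bin/sh
# Read "top" output on standard input and apply the 40% threshold from this
# alert. Prints the highest values seen. Exit status: 0 = below the threshold,
# 1 = "%hi" or ksoftirqd "%CPU" reached 40% or more in at least one sample.
# Assumption: the output layout matches the example shown in this alert.
check_top_sample() {
    awk -v limit=40 '
        /^Cpu\(s\):/ {
            # Locate the hardware-interrupt field, e.g. "0.0%hi,"
            for (i = 2; i <= NF; i++) {
                if ($i ~ /%hi/) {
                    hi = $i + 0         # awk keeps the leading number only
                    if (hi > max_hi) max_hi = hi
                }
            }
        }
        $NF ~ /^ksoftirqd/ {
            cpu = $9 + 0                # %CPU column of the process line
            if (cpu > max_ks) max_ks = cpu
        }
        END {
            printf "max %%hi=%.1f max ksoftirqd %%CPU=%.1f\n", max_hi, max_ks
            exit ((max_hi >= limit || max_ks >= limit) ? 1 : 0)
        }
    '
}
```

If the "top" build on the switch supports batch mode (an assumption to verify on the switch), the output can be piped directly, for example `top -b -n 3 | check_top_sample`. Otherwise, the output can be saved to a file on another host and passed in on standard input.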

If the administrator upgrades a switch that is experiencing this issue, the switch will most likely panic after the firmware upgrade has been applied.

4. Workaround

Immediately before upgrading to a firmware version that resolves this issue, the administrator must determine whether the switch is currently experiencing the issue, using the check described in the "Symptoms" section.  Although the issue should be a rare occurrence, any delay between verifying that it is safe to proceed and performing the firmware upgrade leaves a window in which the issue could recur.

If the error condition is present, Sun Services can provide a shell script that can be used to clear the error condition before attempting the upgrade. The administrator must contact Sun Services so they may verify the issue and provide guidance in using the script.

5. Resolution

This issue is addressed in the following releases:
  • Brocade FOS 6.1.0h (as delivered in patch 138148-06 or later)
  • Brocade FOS 6.1.1d (as delivered in patch 140188-02 or later)
  • Brocade FOS 6.2.0c (as delivered in patch 140810-02 or later) recommended for all 8Gb-capable platforms
Note: It is highly recommended to upgrade any 8Gb-capable switches or directors to one of the above FOS versions as soon as possible.

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2009 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.

Modification History
19-Mar-2009: Update Resolution section; issue is Resolved


References

<SUNPATCH: 138148-06>
<SUNPATCH: 140188-02>
<SUNPATCH: 140810-02>

Internal Comments
Please send technical questions to the following email:
sunalert-tech-questions@sun.com
and CC the following persons:
Internal Contributor/Submitter
Internal Eng Responsible Engineer
Internal Services Knowledge Engineer

DCX
  • SG-XSWBRODCX-ZP

DCX-4S
  • SG-XSWBRODCX4-ZP

Brocade 48000 with 8Gb I/O blades: SG-XSWBRO48K-ZP-Z, with any of the following installed:
  • SG-XSWBRO8GB-MOD16
  • SG-XSWBRO8GB-MOD32
  • SG-XSWBRO8GB-MOD48
  • SG-XSWBRO8GB-M16W8
  • SG-XSWBRO8GB-M32W8
  • SG-XSWBRO8GB-M48W8

Brocade 300
  • SG-XSWBRO300-8P4G
  • SG-XSWBRO300-8PNE
  • SG-XSWBRO300-8P8G

Brocade 5100
  • SG-XSWBRO5100-4EB
  • SG-XSWBRO5100-4NS
  • SG-XSWBRO5100-8EB
  • SG-XSWBRO5100-8NS

Brocade 5300
  • SG-XSWBRO5300-4EB
  • SG-XSWBRO5300-4NS
  • SG-XSWBRO5300-8EB
  • SG-XSWBRO5300-8NS

Internal Contributor/submitter
scott.thurston@sun.com

Internal Eng Responsible Engineer
scott.thurston@sun.com

Internal Services Knowledge Engineer
david.mariotto@sun.com

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Resolution Patches
138148-06, 140188-02, 140810-02

Internal Sun Alert & FAB Admin Info
06-Mar-2009, david m: draft created, send to Brocade for review
11-Mar-2009, david m: approved by Brocade, completed 24hr review, send to publish
19-Mar-2009, david m: patch released, republish Resolved

  Copyright © 2018 Oracle, Inc.  All rights reserved.