Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1533591.1
Update Date:2013-08-14
Keywords:

Solution Type  Problem Resolution Sure

Solution  1533591.1 :   Brocade - [CDR-1011] SilkWorm48000, S5,P-1(45): Link Timeout On Internal Port  


Related Items
  • Brocade 48000 Director
  • Brocade DCX 8510 Backbone/Director
  • Brocade DCX Backbone
  • Brocade DCX-4S Backbone
Related Categories
  • PLA-Support>Sun Systems>DISK>Switch>SN-DK: Brocade Switch




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-6864809181>

Applies to:

Brocade DCX Backbone - Version All Versions and later
Brocade 48000 Director - Version All Versions and later
Brocade DCX-4S Backbone - Version All Versions and later
Brocade DCX 8510 Backbone/Director - Version All Versions and later
Information in this document applies to any platform.

Symptoms


The customer is experiencing performance issues, with high disk latency reported from multiple storage arrays, and is concerned that the Brocade 48000 switches are the cause.

Many errors like the following are logged, all of them against S5,P-1(45):

errdump -a       :
Fabric OS: v6.4.2a

2013/02/22-04:33:38, [CDR-1011], 5014, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1801620569 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/22-04:39:17, [CDR-1011], 5015, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1801638457 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/22-04:44:33, [CDR-1011], 5016, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1801650354 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/22-04:50:52, [CDR-1011], 5017, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1801671069 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.

...
2013/02/27-07:13:30, [CDR-1011], 6030, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1852753318 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/27-07:49:52, [CDR-1011], 6031, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1853040329 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/27-07:57:56, [CDR-1011], 6032, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1853069297 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/27-08:11:45, [CDR-1011], 6034, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1853148510 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/27-08:18:35, [CDR-1011], 6036, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1853185839 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
2013/02/27-08:24:24, [CDR-1011], 6037, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1853207077 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.
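As a rough illustration (not part of the original Brocade or Oracle procedure), the following Python sketch shows one way to confirm that all CDR-1011 events in a saved errdump point at the same backend port. The function name and the regular expression are assumptions derived from the log lines above.

```python
import re
from collections import Counter

# Assumed pattern for the "S<slot>,P<port>(<blade port>)" part of a
# CDR-1011 line, e.g. "S5,P-1(45)". Derived from the sample log only.
CDR_1011 = re.compile(
    r"\[CDR-1011\].*?S(?P<slot>\d+),P(?P<port>-?\d+)\((?P<blade_port>\d+)\)"
)

def count_cdr_1011(lines):
    """Count CDR-1011 events per (slot, blade port) pair."""
    counts = Counter()
    for line in lines:
        match = CDR_1011.search(line)
        if match:
            key = (int(match.group("slot")), int(match.group("blade_port")))
            counts[key] += 1
    return counts

sample = [
    "2013/02/22-04:33:38, [CDR-1011], 5014, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1801620569 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.",
    "2013/02/22-04:39:17, [CDR-1011], 5015, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1801638457 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.",
    "placeholder line that is not a CDR-1011 event",  # made up, to show it is skipped
]

counts = count_cdr_1011(sample)
for (slot, blade_port), total in counts.items():
    print(f"slot {slot}, backend port {blade_port}: {total} event(s)")
```

A single key in the result, as here, suggests that one backend link is affected rather than several.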

 

No other errors are observed. CP0 in slot 5 is currently the active CP:

firmwareshow -v       :
Slot Name       Appl     Primary/Secondary Versions               Status
--------------------------------------------------------------------------
 5  CP0        FOS      v6.4.2a                                  ACTIVE *
                        v6.4.2a                                  
 6  CP1        FOS      v6.4.2a                                  STANDBY
                        v6.4.2a
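When working through a saved copy of this output, the active CP can also be picked out programmatically. The Python sketch below assumes the column layout shown above; the function name is hypothetical and other FOS versions may format the table differently.

```python
def active_cp(firmwareshow_output):
    """Return (slot, name) of the CP row marked ACTIVE, or None if absent."""
    for line in firmwareshow_output.splitlines():
        fields = line.split()
        # A CP row starts with a numeric slot and has a status column;
        # continuation rows (secondary version only) have a single field.
        if len(fields) >= 5 and fields[0].isdigit() and "ACTIVE" in fields:
            return int(fields[0]), fields[1]
    return None

sample_output = """\
Slot Name       Appl     Primary/Secondary Versions               Status
--------------------------------------------------------------------------
 5  CP0        FOS      v6.4.2a                                  ACTIVE *
                        v6.4.2a
 6  CP1        FOS      v6.4.2a                                  STANDBY
                        v6.4.2a
"""

print(active_cp(sample_output))  # (5, 'CP0')
```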

 

Cause

In this case, the SilkWorm 48000 has a stuck path between backend port 45 on the CP in slot 5 and a backend port on one of the port blades.

This means that backend port 45 in slot 5 has discarded a frame (a supportsave is required to determine which other backend port it is connected to).

With FOS 6.4.2a, these events are now reported in errdump, whereas previously they were reported in the RASlog instead.

 

Solution

This error is documented in the "Fabric OS Message Reference, Fabric OS v7.0.1" (53-1002448-01):

------------------------------------
CDR-1011
Message <timestamp>, [CDR-1011], <sequence-number>,, WARNING, <system-name>, S<slot number>,P<port number>(<blade port number>): Link Timeout on internal port ftx=<frame transmitted> tov=<real timeout value>(><expected timeout value>) vc_no=<vc number> crd(s)lost=<Credit(s) lost> complete_loss:<complete credit loss>.

Probable Cause
Indicates that one or more credits have been lost on a back-end port, and there is no traffic on that port for two seconds.

Recommended Action
Turn on the back-end credit recovery to reset the link and recover the lost credits.
If credit recovery has already been turned on, the link will be reset to recover the credits and no action is required.

Severity
WARNING
------------------------------------
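To make the message template above easier to read against a real event, here is a minimal Python sketch that splits one CDR-1011 line into the documented fields. The field names in the regular expression are assumptions chosen to mirror the template placeholders; they are not Brocade identifiers.

```python
import re

# Assumed regular expression mirroring the documented message template.
LINK_TIMEOUT = re.compile(
    r"S(?P<slot>\d+),P(?P<port>-?\d+)\((?P<blade_port>\d+)\): "
    r"Link Timeout on internal port\s+"
    r"ftx=(?P<ftx>\d+)\s+tov=(?P<tov>\d+)\s*\(>(?P<expected_tov>\d+)\)\s+"
    r"vc_no=(?P<vc_no>\d+)\s+crd\(s\)lost=(?P<credits_lost>\d+)\s+"
    r"complete_loss:(?P<complete_loss>\d+)"
)

def parse_link_timeout(line):
    """Return the template fields of a CDR-1011 line as ints, or None."""
    match = LINK_TIMEOUT.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}

event = (
    "2013/02/22-04:33:38, [CDR-1011], 5014, SLOT 5 | CHASSIS, WARNING, "
    "SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  "
    "ftx=1801620569 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1."
)

fields = parse_link_timeout(event)
# The real timeout exceeded the expected value and credits were lost.
print(fields["tov"] > fields["expected_tov"], fields["credits_lost"])  # True 2
```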

Please run the following command:

bottleneckmon --cfgcredittools -intport -recover onLrOnly

 

The purpose of the command "bottleneckmon --cfgcredittools -intport -recover onLrOnly" is to reset the backend ports that are reporting these events (a fuller description is available in the Brocade documentation, "Fabric OS Command Reference, Fabric OS v7.0.1", 53-1002447-01).

Leaving this setting enabled is harmless. Without it, a backend port that is stuck and discarding frames can lead to performance degradation, which appears to be the case here.


There is a good explanation in the Brocade community thread "Errors which are documented in FOS 7.x but not in FOS 6.4.x".

 

If messages pointing to lost credits still appear after that, please collect a new supportsave and open an SR with Oracle Support.
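One way to check for events after the change is to filter a fresh errdump by timestamp. The Python sketch below is only an illustration: the function name is hypothetical, and the cutoff time stands in for whenever credit recovery was actually enabled on the switch.

```python
from datetime import datetime

def cdr_1011_after(lines, cutoff):
    """Return the CDR-1011 lines whose timestamp is later than cutoff."""
    hits = []
    for line in lines:
        if "[CDR-1011]" not in line:
            continue
        # The timestamp is the first comma-separated field of the line.
        stamp = line.split(",", 1)[0].strip()
        when = datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S")
        if when > cutoff:
            hits.append(line)
    return hits

log_lines = [
    "2013/02/27-07:57:56, [CDR-1011], 6032, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1853069297 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.",
    "2013/02/27-08:11:45, [CDR-1011], 6034, SLOT 5 | CHASSIS, WARNING, SilkWorm48000,  S5,P-1(45): Link Timeout on internal port  ftx=1853148510 tov=2000 (>1000) vc_no=16 crd(s)lost=2 complete_loss:1.",
]

# Hypothetical time at which credit recovery was enabled.
enabled_at = datetime(2013, 2, 27, 8, 0, 0)

remaining = cdr_1011_after(log_lines, enabled_at)
print(len(remaining))  # 1
```

A non-empty result would be the cue to collect the new supportsave.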


A more detailed description can be found on the external IBM blog "seb's sanblog".

References

<NOTE:1017730.1> - Brocade switch segmented from the fabric after replacement

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.