Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Solution Type: Problem Resolution Sure
Solution 1018903.1: Brocade Silkworm Switches: Host Reboots Can Cause Brocade Marginal/Warning/DOWN Healthy/OK Errors

PreviouslyPublishedAs 230744

Symptoms

Brocade Silkworm switches can log "Marginal/Warning/DOWN" "HEALTHY/OK" error messages when a host reboots.

Resolution

Modify the switch policy for "FaultyPorts" either at the command line interface (CLI) or in the Web graphical user interface (GUI). You can increase the "Down" and "Marginal" settings, or disable a setting by entering "0".

    # help switchstatuspolicyset
    # switchstatuspolicyset
    The minimum number of
      FaultyPorts contributing to DOWN status: (0..64) [2] 3
      FaultyPorts contributing to MARGINAL status: (0..64) [1] 2
      MissingSFPs contributing to DOWN status: (0..64) [0]
    hit return until..
    Policy parameter set has been changed

The changes take place immediately. There is no need to reboot the switch.

Additional Information

During the "non-healthy" state, the switch will appear orange in color if viewed using the Web GUI and may send an SNMP trap if configured to do so.

Brocade Silkworm switches have an error policy mechanism that logs error

    >> Switch: 1, Warning FW-STATUS_SWITCH, 3, Switch status changed from
    >> HEALTHY/OK to Marginal/Warning ( --- 1 faulty port;)

When the port recovers and is deemed non-faulty, the counter is lowered

    >> Switch: 1, Warning FW-STATUS_SWITCH, 3, Switch status changed from
    >> Marginal/Warning to HEALTHY/OK

Online documentation within the CLI details how to change the threshold or

    # help switchstatuspolicyset

The customer can increase the default values to avoid this scenario. This

Obviously, it should not be taken for granted that a rebooting host caused

For example, the following output is from a 12000 switch.

    The current overall switch status policy parameters:
                     Down    Marginal
    ----------------------------------
    FaultyPorts       2       1
    MissingSFPs       0       0
    PowerSupplies     2       1
    Temperatures      2       1
    Fans              2       1
    PortStatus        0       0
    ISLStatus         0       0
    CP                0       1
    WWN               0       1
    Blade             0       1

During the "non-healthy" state, it is possible to identify the cause of the

A problem can be that the marginal / healthy transitions are brief and that

To identify the cause of the problem, you can cross-match the dates of the

Depending on the condition, there may be other messages recorded such as BL-nnnn or PORT-nnnn which will identify the port in question. Additionally, you can use 'fabstateshow' and match fabric changes to the date & time of the errdump entry.

Sample errdump output:

    2007/05/03-14:47:21, [FW-1437], 150,, WARNING, A, Switch status change contributing factor Faulty ports: 1 faulty ports.

Corresponding date & time entries in sample 'fabstateshow' output:

Here you can see that port 39 is likely to be a contributory cause of the status change and error log entry.

Additionally, you can use Fabric Watch (which has a finer granularity of

1) Identify if Fabric Watch is licensed on this switch:

    # licenseshow
    Fabric Watch License

2) Use fwshow or the GUI to identify current thresholds:

    # fwshow
    1 : Show class thresholds
    3 : Port class

3) Use fwconfigure to modify port class to custom link loss settings:

    # fwconfigure
    3 : Port class
    1 : Link loss
    4 : Advanced configuration
    6 : change custom low [1]
    7 : change custom high [0]
    3 : change threshold boundary level [2] custom
    9 : apply threshold boundary changes
    11 : change threshold alarm level [2] custom
    14 : change below alarm [1]
    15 : change above alarm [1]
    16 : change inBetween alarm [1]
    17 : apply threshold alarm changes
    ^C

When the next "Marginal/Warning HEALTHY/OK" error entry occurs,

For example:

    WARNING FW-ABOVE2, 3, portLink006, Port #006 Link Failures is above high boundary. current value : 1 Error(s)/minute. (faulty)

If there is no such accompanying error, this suggests that a

    # fwconfigure
    3 : Port class
    1 : Link loss
    4 : Advanced configuration
    3 : change threshold boundary level [1] default
    11 : change threshold alarm level [1] default
    9 : apply threshold boundary changes
    ^C

Refer to Brocade documentation for guidelines for Fabric Watch settings.

Setting switch status policy values and Fabric Watch definitions should be

For additional information refer to:
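
As a rough illustration of the FaultyPorts policy discussed in the Resolution, the following Python sketch models how a faulty-port count might map to a switch status. It is not Brocade code. The function name `contributor_state` is hypothetical, and the sketch assumes that a state is reached once the count is greater than or equal to its threshold and that a threshold of 0 disables that check, which is consistent with the "1 faulty port" log entry and the default values of 2 (Down) and 1 (Marginal) shown earlier.

```python
def contributor_state(count: int, down: int, marginal: int) -> str:
    """Return the status a single contributor (e.g. FaultyPorts) would imply.

    Assumption: a state applies when count >= threshold; 0 disables the check.
    """
    if down > 0 and count >= down:
        return "DOWN"
    if marginal > 0 and count >= marginal:
        return "MARGINAL"
    return "HEALTHY"


# Default FaultyPorts thresholds from the policy table: Down=2, Marginal=1.
default_policy = (2, 1)
# Values entered in the switchstatuspolicyset example: Down=3, Marginal=2.
raised_policy = (3, 2)

for faulty in range(4):
    print(
        faulty,
        contributor_state(faulty, *default_policy),
        contributor_state(faulty, *raised_policy),
    )
```

Under these assumptions, a single faulty port (for example, one host rebooting) yields a marginal status with the default thresholds but leaves the switch healthy once the thresholds are raised.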

Product

SAN Brocade 3800 2 GB 16-Port Switch
Brocade SilkWorm 3250 Fabric Switch
Brocade SilkWorm 3850 Fabric Switch
Brocade SilkWorm 3250 Switch
Brocade SilkWorm 24000 Director

Internal Comments

Brocade Silkworm Switches: Host Reboots Can Cause Brocade Marginal/Warning/DOWN Healthy/OK Errors

The behaviour of some Qlogic HBAs during boot/reboot is alluded to in FAB < Solution: 201068 >; see the root cause paragraph in that document.

brocade, warning, healthy, down, policy

Previously Published As 76978

Change History

Date: 2009-12-01
User Name: 84789
Action: Reviewed
Comment: Reviewed

Attachments

This solution has no attachment