Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1942533.1
Update Date:2018-05-17

Solution Type  Problem Resolution Sure

Solution  1942533.1 :   M-Series Servers: XSCF watchdog timeout without auto negotiation on Ethernet port  


Related Items
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M9000-32 Server
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise M9000-64 Server
  • Sun SPARC Enterprise M8000 Server
  • Sun SPARC Enterprise M3000 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Mx000




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-9802791551>

Applies to:

Sun SPARC Enterprise M5000 Server - Version All Versions to All Versions [Release All Releases]
Sun SPARC Enterprise M8000 Server - Version All Versions to All Versions [Release All Releases]
Sun SPARC Enterprise M9000-32 Server - Version All Versions to All Versions [Release All Releases]
Sun SPARC Enterprise M9000-64 Server - Version All Versions to All Versions [Release All Releases]
Sun SPARC Enterprise M3000 Server - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

An XSCF watchdog timeout (WDT) event may occur if the XSCF's network port is attached to a link partner
that forces the link speed instead of auto-negotiating it. Possible consequences include:

XSCF watchdog timeouts

- as seen in the output of " XSCF> showlogs error -v "

Date: Oct 29 13:54:56 CET 2014 Code: 60000000-c201faff-011d001200000000
  Status: Warning Occurred: Oct 29 13:54:51.455 CET 2014
  FRU: /XSCFU,/FIRMWARE
  Msg: XSCF watchdog timeout
  Diagnostic Code:
  00000000 00000000 00000000
  57617463 68646f67 2054696d 656f7574
  00000000 00000000 00000000 00000000
  UUID: 2378db46-7b2e-471f-8df9-8dbfebb7a5a2 MSG-ID: SCF-8006-YS
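
The hexadecimal words in the Diagnostic Code field can be read as ASCII text. The following is a minimal Python sketch, not an Oracle tool, that decodes one such line; the function name decode_diagnostic_words is made up for this illustration.

```python
def decode_diagnostic_words(line: str) -> str:
    """Decode space-separated hex words into printable ASCII.

    Hypothetical helper for illustration; bytes outside the printable
    ASCII range are shown as '.'.
    """
    raw = bytes.fromhex("".join(line.split()))
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Second line of the Diagnostic Code shown in the log entry above.
print(decode_diagnostic_words("57617463 68646f67 2054696d 656f7574"))
# prints "Watchdog Timeout"
```

The all-zero words in the other Diagnostic Code lines carry no text and decode to dots.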


- as seen in the output of " XSCF> showlogs monitor "

Oct 29 13:54:58 xscfhostname Warning: /XSCFU,/FIRMWARE:SCF:XSCF watchdog timeout


and possible degradation of the XSCF unit

- as seen in the output of " XSCF> showstatus "

*   XSCFU Status:Degraded;

Cause

XSCF-LAN requires auto-negotiation to be enabled on the link partner (switch/router) port to which the XSCF LAN port is connected.

If auto-negotiation is not enabled on the corresponding port on the link partner, you may observe packet errors on the affected
LAN interface, such as the TX errors and carrier errors in the following output of the command " XSCF> shownetwork -a ":

xscf#0-lan#0
          Link encap:Ethernet  HWaddr 00:21:28:44:84:F6  
          inet addr:x.x.x.x  Bcast:x.x.x.x  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2289 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3311 errors:45 dropped:0 overruns:0 carrier:45
          collisions:0 txqueuelen:1000
          RX bytes:236912 (231.3 KiB)  TX bytes:2904851 (2.7 MiB)
          Base address:0xe000
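
When several systems have to be checked, the counters can be extracted from saved "shownetwork -a" output with a short script. The following Python sketch assumes the output has been captured to text in the format shown above; the names SAMPLE, parse_counters and has_tx_errors are illustrative and not part of any XSCF interface.

```python
import re

# Excerpt of the "shownetwork -a" output shown above.
SAMPLE = """\
xscf#0-lan#0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2289 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3311 errors:45 dropped:0 overruns:0 carrier:45
          collisions:0 txqueuelen:1000
          RX bytes:236912 (231.3 KiB)  TX bytes:2904851 (2.7 MiB)
"""


def parse_counters(text: str) -> dict:
    """Return {'RX': {...}, 'TX': {...}} from the packet counter lines."""
    result = {}
    for line in text.splitlines():
        # Only the "RX packets:" / "TX packets:" lines carry error counters.
        m = re.match(r"\s*(RX|TX) (packets:\d+.*)", line)
        if m:
            pairs = re.findall(r"(\w+):(\d+)", m.group(2))
            result[m.group(1)] = {name: int(value) for name, value in pairs}
    return result


def has_tx_errors(counters: dict) -> bool:
    """True if the TX line reports errors or carrier errors."""
    tx = counters.get("TX", {})
    return tx.get("errors", 0) > 0 or tx.get("carrier", 0) > 0


counters = parse_counters(SAMPLE)
print(counters["TX"]["errors"], counters["TX"]["carrier"], has_tx_errors(counters))
```

Non-zero counters alone do not prove a negotiation mismatch; they are one indicator to be read together with the link settings in the messages file.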


The messages file of the XSCF internal operating system shows that the LAN interface in use has come up
at 100BT in Half Duplex mode:

Oct 29 12:53:55 (none) kernel: eth0: PHY is Intel LXT972A (1378e2)
Oct 29 12:53:58 (none) kernel: eth0: Half Duplex
Oct 29 12:53:58 (none) kernel: eth0: Speed 100BT
Oct 29 12:53:58 (none) kernel: eth0: Link is up

This output is found in the XSCF Snapshot within the directory: <snapshot_name>/spos_logs/@var@log@messages  
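
The link settings can also be pulled out of the messages file with a script. The following Python sketch scans kernel lines of the form shown above. The "Full Duplex" wording is an assumption, since only the "Half Duplex" message appears in this document, and the names LINK_RE and link_settings are illustrative.

```python
import re

# Matches e.g. "kernel: eth0: Half Duplex" or "kernel: eth0: Speed 100BT".
# "Full Duplex" is assumed by analogy; it is not shown in this document.
LINK_RE = re.compile(r"kernel: (eth\d+): (Half Duplex|Full Duplex|Speed \S+)")


def link_settings(lines) -> dict:
    """Return the last reported duplex and speed per interface."""
    settings = {}
    for line in lines:
        m = LINK_RE.search(line)
        if not m:
            continue
        iface, value = m.groups()
        entry = settings.setdefault(iface, {})
        if value.startswith("Speed "):
            entry["speed"] = value.split(" ", 1)[1]
        else:
            entry["duplex"] = value.split()[0].lower()
    return settings


sample = [
    "Oct 29 12:53:55 (none) kernel: eth0: PHY is Intel LXT972A (1378e2)",
    "Oct 29 12:53:58 (none) kernel: eth0: Half Duplex",
    "Oct 29 12:53:58 (none) kernel: eth0: Speed 100BT",
    "Oct 29 12:53:58 (none) kernel: eth0: Link is up",
]
print(link_settings(sample))
```

To run it against a snapshot, pass an open file object for the messages file instead of the sample list.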

Solution

If the above conditions are met:

1. Enable auto-negotiation on the corresponding port on the link partner.

2. Reset the XSCF by executing the command "XSCF> rebootxscf -y".

3. If the XSCF unit has been marked as degraded:

       - If the XCP firmware is 1115 or later, clear the status of the XSCF unit by executing the command:
         "XSCF> clearfault /xscfu"

         The command reports that the "FRU will be marked to clear fault on next circuit breaker off and on."
         On the M3000, M4000 and M5000 servers, this means scheduling downtime for the domain(s), powering off the
         server, and removing the power cords from the power supplies.

         The M8000 and M9000 servers have a power circuit breaker. Here too, switching it off and on requires downtime for the domains.

       - If the XCP firmware is below 1115, open a Service Request with Oracle Global Systems Support.
         The Service Engineer will clear the status of the XSCF unit in a Shared Shell or WebEx session.
         In this case as well, the status is cleared only at the next circuit breaker off and on.
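
The decision flow of the steps above can be summarized in a short Python sketch. It is only an illustration of the logic in this document: the function name recovery_steps is hypothetical, and representing the XCP firmware level as a plain integer such as 1115 is an assumption.

```python
from typing import List


def recovery_steps(xcp_version: int, degraded: bool) -> List[str]:
    """List the actions from this document for a given XCP level and XSCFU status.

    xcp_version is assumed to be the XCP level as an integer, e.g. 1115.
    """
    steps = [
        "Enable auto-negotiation on the link partner port",
        "XSCF> rebootxscf -y",
    ]
    if degraded:
        if xcp_version >= 1115:
            steps.append("XSCF> clearfault /xscfu")
        else:
            steps.append("Open a Service Request to have the XSCFU status cleared")
        # In both cases the status is cleared at the next circuit breaker off and on.
        steps.append("Schedule domain downtime and power-cycle at the circuit breaker")
    return steps


for step in recovery_steps(1115, degraded=True):
    print(step)
```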


If further watchdog timeouts occur after executing the above commands, open a Service Request with Oracle Global Systems
Support for further investigation.
The Support Engineer will request an XSCF snapshot in order to investigate the issue. The following
document explains how to generate the XSCF snapshot:

Gathering diagnostic data for SPARC Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers (Doc ID 1008229.1)


For further details, see:
https://stbeehive.oracle.com/teamcollab/wiki/OPL+-+WatchDog+Timeout:Known+SW+causes#Improper+network+configuration

 

References

<NOTE:1008229.1> - Gathering diagnostic data for SPARC Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.