Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2384922.1 : Eagle5 STP - E5ENETB IPSG Card rebooted with Obit: Module ath_vxw.c Line 3314 Class 0001
Created from <SR 3-17099951941>

Applies to:
Oracle Communications EAGLE (Hardware) - Version EAGLE 46.5 and later
Information in this document applies to any platform.

Symptoms
E5-ENETB IPSG Card rebooted. After the reload the card was fully functional.

    ****18-03-25 02:28:37****
    0223.0096    CARD 1203 IPSG    Card has been reloaded
    ****18-03-25 02:27:31****
    0218.0014    CARD 1203 IPSG    Card is present
                 ASSY SN: 10212325147
    ****18-03-25 02:22:30****
    0131.0013 ** CARD 1203 IPSG    Card is isolated from the system
                 ASSY SN: 10212325147

Upon reload the card generated Obit ath_vxw.c Line 3314 Class 0001 on the active MASP:

    STH: Received a BOOT APPL-Obituary reply for restart
    Card 1203  Module ath_vxw.c  Line 3314  Class 0001
    Register Dump :
        EFL=00000000    CS =0000        EIP=00000000    SS =0000
        EAX=00000000    ECX=00000000    EDX=00000000    EBX=00000000
        ESP=00000000    EBP=00000000    ESI=00000000    EDI=00000000
        DS =0000        ES =0000        FS =0000        GS =0000
    Stack Dump :
        [SP+1E]=0000    [SP+16]=0000    [SP+0E]=0000    [SP+06]=0000
        [SP+1C]=0000    [SP+14]=0000    [SP+0C]=0000    [SP+04]=0000
        [SP+1A]=0000    [SP+12]=0000    [SP+0A]=0000    [SP+02]=0000
        [SP+18]=0000    [SP+10]=0000    [SP+08]=0000    [SP+00]=0000
    User Data Dump :
        30 78 66 66 66 66 66 66 66 66 20 41 50 50 4c 20    0xffffffff.APPL.
        57 61 74 63 68 64 6f 67 20 74 69 6d 65 6f 75 74    Watchdog.timeout
        20 72 65 73 65 74                                  .reset
    Report Date:18-03-25  Time:02:27:31

Changes
Cause
Start by analyzing the logs and searching for other symptoms in the node. A single malfunction can have multiple causes: internal (for example, bouncing DPCs) or external (for example, a fault on a switch port that caused the card to reboot as a recovery mechanism, or a router that is causing heavy retransmissions).

Obit ath_vxw.c Class 0001 is related to the Application Trouble Handler. If it keeps repeating, it indicates a hardware fault.

The user data dump section shows 0xffffffff.APPL.Watchdog.timeout.reset. E5 cards use three types of watchdog mechanisms: hardware watchdogs, low priority starvation, and sanity. In this case the dump points to the hardware watchdog. Because the hardware reset the system without any software involvement, no post-failure data is available. These obits are typically difficult to debug due to the lack of post-mortem data.

Proceed with gathering more data.
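The text column of the user data dump is just the ASCII rendering of the hex bytes, with spaces shown as dots. As a minimal sketch (the function name `decode_user_data` is hypothetical and not part of any EAGLE tooling), the bytes from this obit can be decoded offline to confirm the watchdog message:

```python
def decode_user_data(hex_dump: str) -> str:
    """Decode space-separated hex byte pairs into ASCII text.

    Non-printable bytes are shown as '.', printable bytes (including
    spaces) are kept as-is.
    """
    raw = bytes.fromhex(hex_dump)  # whitespace between byte pairs is ignored
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Hex bytes copied from the User Data Dump of the obit in this document.
USER_DATA = (
    "30 78 66 66 66 66 66 66 66 66 20 41 50 50 4c 20 "
    "57 61 74 63 68 64 6f 67 20 74 69 6d 65 6f 75 74 "
    "20 72 65 73 65 74"
)

print(decode_user_data(USER_DATA))  # -> 0xffffffff APPL Watchdog timeout reset
```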
To check whether any messages are being discarded by the card:

    rept-stat-mfc:mode=stats:service=vsccp:sample=tot24h
    rept-stat-mfc:mode=stats:service=mtp3:sample=tot24h
    rtrv-trbl:loc=<active MASP 1113 or 1115>
    rtrv-obit:loc=<active MASP 1113 or 1115>
    rtrv-log:mode=full:dir=bkwd:num=500:outgrp=sys:slog=act
    rtrv-log:mode=full:dir=bkwd:num=500:outgrp=card:slog=act
    rept-stat-rtd
    rept-stat-imt:mode=full
    rept-imt-lvl1:sloc=1201:eloc=1115:r=summary
    rept-stat-mux
    rept-stat-db:display=all
    rept-stat-ddb:display=all
    rept-stat-card:loc=<card location>:mode=full
    rtrv-card:loc=<card location>

Solution
While gathering all the logs, continue to monitor the card. If all the other logs are clear, continue monitoring for an extended period of time agreed with the customer.

If a second reboot takes place, consider a hard reset by re-seating the card.

If a third reboot takes place, replace the board with a spare as soon as possible. Monitor the behavior after the replacement to confirm normal functionality.
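The escalation steps above can be sketched as a small helper that counts reloads for a card in captured log output and maps the count to the recommended action. This is only an illustration: the names `count_reloads` and `recommended_action` are hypothetical, and the regular expression assumes log lines shaped like the excerpt in Symptoms, which may not cover every EAGLE output format.

```python
import re

# Assumption: reload events look like the Symptoms excerpt, e.g.
# "0223.0096 CARD 1203 IPSG Card has been reloaded"
RELOAD_RE = re.compile(r"CARD\s+(\d+)\s+\S+\s+Card has been reloaded")


def count_reloads(log_text: str, card_loc: str) -> int:
    """Count 'Card has been reloaded' events for one card location."""
    return sum(1 for m in RELOAD_RE.finditer(log_text) if m.group(1) == card_loc)


def recommended_action(reboot_count: int) -> str:
    """Map the number of observed reboots to the step described in Solution."""
    if reboot_count <= 0:
        return "no reboot observed"
    if reboot_count == 1:
        return "gather logs and keep monitoring"
    if reboot_count == 2:
        return "consider a hard reset by re-seating the card"
    return "replace the board with a spare"


# Sample input built from the log lines quoted in this document.
SAMPLE_LOG = """\
18-03-25 02:28:37 0223.0096 CARD 1203 IPSG Card has been reloaded
18-03-25 02:27:31 0218.0014 CARD 1203 IPSG Card is present
18-03-25 02:22:30 0131.0013 ** CARD 1203 IPSG Card is isolated from the system
"""

reboots = count_reloads(SAMPLE_LOG, "1203")
print(reboots, recommended_action(reboots))  # -> 1 gather logs and keep monitoring
```

The reboot thresholds mirror the text; the monitoring period itself is not encoded because it is agreed with the customer case by case.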
Attachments
This solution has no attachment