Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2267515.1
Update Date: 2017-05-19
Keywords:

Solution Type  Problem Resolution Sure

Solution  2267515.1 :   Diameter Signaling Router (DSR): Event "31226 - HA Availability Status Degraded" & Event "5001 - IPFE Backend Unavailable"  


Related Items
  • Oracle Communications Diameter Signaling Router (DSR)
Related Categories
  • PLA-Support>Sun Systems>CommsGBU>Global Signaling Solutions>SN-SND: Tekelec DSR

In this Document
Symptoms
Cause
Solution


Created from <SR 3-14619030561>

Applies to:

Oracle Communications Diameter Signaling Router (DSR) - Version DSR 7.0.1 and later
Information in this document applies to any platform.

Symptoms

Event "31226 - HA Availability Status Degraded" occurs several times a day:

TIMESTAMP : 2017-04-18 13:11:18.825 EDT
NETWORK_ELEMENT : SOAM
SERVER : ipfe
SEQ_NUM : 265
EVENT_NUMBER : 31226
SEVERITY : MAJOR
PROCESS : cmha
TYPE : HA
INSTANCE :
NAME : HA Availability Status Degraded
DESCRIPTION : The high availability status is degraded due to raised alarms
ERR_INFO : GN_WARNING/WRN condition may require attention if persists ^^ [31903:DbResourceStates.cxx:1203]
SECS : 1492535478
USECS : 825000
CISECS : 1492535478
CIUSECS : 878000
ID : 1

 

Cause

The description of event 31226 itself states that the high availability status is degraded due to raised alarms.
The event history from the active SOAM showed that every time event 31226 was raised, an Event "5001 - IPFE Backend Unavailable" had been raised just before it:

TIMESTAMP : 2017-04-18 13:11:18.809 EDT
NETWORK_ELEMENT : SOAM
SERVER : ipfe
SEQ_NUM : 264
EVENT_NUMBER : 5001
SEVERITY : MINOR
PROCESS : ipfe
TYPE : IPFE
INSTANCE : ipfe: 1.2.3.4
NAME : IPFE Backend Unavailable
DESCRIPTION : A backend server has indicated through the monitoring protocol that it is dead.
ERR_INFO : GN_DOWN/WRN ^^ [34266:IpfeBackendMonitor.C:194]
SECS : 1492535478
USECS : 809000
CISECS : 1492535478
CIUSECS : 878000
ID : 1
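
The correlation described above (each 31226 preceded by a 5001 on the same server) can be checked mechanically on an exported event history. The following is a minimal sketch, not an Oracle tool: the function names `parse_events`, `event_time_us`, and `find_preceding` are hypothetical, the one-second window is an arbitrary assumption, and the parser assumes every record starts with a TIMESTAMP line in the "KEY : VALUE" layout shown in this note.

```python
# Abbreviated copies of the two event records quoted in this note.
SAMPLE = """\
TIMESTAMP : 2017-04-18 13:11:18.809 EDT
SERVER : ipfe
SEQ_NUM : 264
EVENT_NUMBER : 5001
SECS : 1492535478
USECS : 809000

TIMESTAMP : 2017-04-18 13:11:18.825 EDT
SERVER : ipfe
SEQ_NUM : 265
EVENT_NUMBER : 31226
INSTANCE :
SECS : 1492535478
USECS : 825000
"""


def parse_events(text):
    """Turn 'KEY : VALUE' lines into one dict per event record.

    Assumption: each record begins with a TIMESTAMP line.
    """
    events, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.partition(" :")
        if not sep:
            continue  # blank line or a line without a 'KEY :' prefix
        key = key.strip()
        if key == "TIMESTAMP" and current:
            events.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        events.append(current)
    return events


def event_time_us(event):
    """Event time in microseconds since the epoch, from SECS and USECS."""
    return int(event["SECS"]) * 1_000_000 + int(event["USECS"])


def find_preceding(events, effect="31226", cause="5001", window_us=1_000_000):
    """Pair each `effect` event with the latest `cause` event raised on the
    same SERVER at most `window_us` microseconds earlier.

    Returns (cause SEQ_NUM, effect SEQ_NUM, gap in microseconds) tuples.
    """
    pairs = []
    for e in events:
        if e.get("EVENT_NUMBER") != effect:
            continue
        candidates = [
            c for c in events
            if c.get("EVENT_NUMBER") == cause
            and c.get("SERVER") == e.get("SERVER")
            and 0 <= event_time_us(e) - event_time_us(c) <= window_us
        ]
        if candidates:
            latest = max(candidates, key=event_time_us)
            gap = event_time_us(e) - event_time_us(latest)
            pairs.append((latest["SEQ_NUM"], e["SEQ_NUM"], gap))
    return pairs


if __name__ == "__main__":
    for cause_seq, effect_seq, gap_us in find_preceding(parse_events(SAMPLE)):
        print(f"5001 (seq {cause_seq}) precedes 31226 (seq {effect_seq}) by {gap_us} us")
```

For the two records quoted in this note, the 5001 event (SEQ_NUM 264) precedes the 31226 event (SEQ_NUM 265) by 16 ms, which is consistent with the degraded HA status being a consequence of the backend-unavailable alarm.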

Ping tests between the MP and the IPFE did not show any packet drops or latency.

A tcpdump capture was taken simultaneously on the MP and the IPFE using the following command:

sudo tcpdump -s 0 -i xsi2 port 9675 -w /var/TKLC/db/filemgmt/trace1.pcap


The tcpdump showed that the MP sometimes sent its heartbeat to the IPFE destination IP 1.1.1.1 with destination MAC address x.x.x.x.x.x, and sometimes sent it to the same IP with destination MAC address y.y.y.y.y.y.

When the heartbeat was sent to destination MAC address y.y.y.y.y.y, the IPFE sent an RST to reset the connection, which generated the alarms.

To identify which device owned MAC address y.y.y.y.y.y, a second tcpdump restricted to ARP messages was collected simultaneously on both the MP and the IPFE:

sudo tcpdump -i any -p arp -w /var/TKLC/db/filemgmt/trace2.pcap

This capture showed that MAC address y.y.y.y.y.y was claiming IP 1.1.1.1, the same IP as the IPFE, which indicated an IP conflict.
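
The duplicate-address check described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the ARP capture has been printed as text (for example with `tcpdump -n -r trace2.pcap`) and that ARP replies appear in the form "Reply <ip> is-at <mac>", which may vary between tcpdump versions. The function name `find_ip_conflicts`, the sample lines, and the MAC addresses (taken from the documentation range 00:00:5e:00:53:xx) are hypothetical.

```python
import re
from collections import defaultdict

# Matches the "Reply <ip> is-at <mac>" part of a printed ARP reply.
ARP_REPLY = re.compile(
    r"[Rr]eply\s+(\d{1,3}(?:\.\d{1,3}){3})\s+is-at\s+"
    r"((?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})"
)

# Hypothetical sample lines; real capture output will differ.
SAMPLE_LINES = [
    "13:11:18.700000 ARP, Reply 1.1.1.1 is-at 00:00:5e:00:53:01, length 28",
    "13:11:18.800000 ARP, Reply 1.1.1.1 is-at 00:00:5E:00:53:02, length 46",
    "13:11:19.000000 ARP, Reply 1.1.1.2 is-at 00:00:5e:00:53:03, length 28",
    "13:11:19.100000 ARP, Request who-has 1.1.1.1 tell 1.1.1.2, length 28",
]


def find_ip_conflicts(lines):
    """Return {ip: sorted list of MACs} for every IP that was announced
    by more than one MAC address in the given ARP reply lines."""
    macs_by_ip = defaultdict(set)
    for line in lines:
        match = ARP_REPLY.search(line)
        if match:
            ip, mac = match.groups()
            macs_by_ip[ip].add(mac.lower())  # normalize case before comparing
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


if __name__ == "__main__":
    for ip, macs in find_ip_conflicts(SAMPLE_LINES).items():
        print(f"{ip} is claimed by multiple MACs: {', '.join(macs)}")
```

An IP announced by more than one MAC is only a hint, since some setups move an address between interfaces legitimately, so each reported MAC still has to be traced to its owning device, as was done here.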

The customer identified that this IP and MAC address were assigned to a Cisco 4948 aggregation switch.

They assigned a new IP to the Cisco 4948 switch, which resolved both alarms.

Solution

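Resolve the IP conflict so that the IPFE's IP address is not used by any other device. In this case, assigning a new IP to the conflicting Cisco 4948 switch cleared both alarms.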
 


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.