Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2326439.1
Update Date: 2018-01-08

Solution Type  Problem Resolution Sure

Solution  2326439.1 :   DSR Critical Event 25500 No DA-MP Leader Detected  


Related Items
  • Oracle Communications Diameter Signaling Router (DSR)
Related Categories
  • PLA-Support>Sun Systems>CommsGBU>Global Signaling Solutions>SN-SND: Tekelec DSR

In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-15988867631>

Applies to:

Oracle Communications Diameter Signaling Router (DSR) - Version DSR 7.2.0 and later
Tekelec

Symptoms

A RouteList has two RouteGroups. All connections in one RouteGroup go down within a very short time (a few milliseconds apart) while the other RouteGroup remains in service.

Events for Connection Unavailable and Non-Preferred Route Group In Use appear in the system events. Note the very short span in which the alarms appear.

TIMESTAMP EVENT_NUMBER DESCRIPTION

YYYY-MM-DD 18:29:24.135 22101 FsmOpStateUnavailable
YYYY-MM-DD 18:29:24.135 22104 SctpPathMismatch SCTP
YYYY-MM-DD 18:29:24.190 22101 FsmOpStateUnavailable
YYYY-MM-DD 18:29:24.190 22104 SctpPathMismatch SCTP
YYYY-MM-DD 18:29:27.884 22055 Non-Preferred Route Group In Use

After this, the following alarms, including a thread watchdog failure, appear on the MP:

31003 MINOR dsr SW EXGSTACK_Process Thread Watchdog Failure Thread watchdog timed out
25500 CRITICAL dsroam DIAM dsrSO No DA-MP Leader Detected

Changes

 

Cause

When links within the same RouteGroup are disabled simultaneously (i.e. within a few milliseconds), the Diameter traffic during that period is stored as pending transactions with a transient Peer List.

When the pending transactions containing Diameter messages are reprocessed upon time-out, the transient Peer List contains stale data because all the peers in the list have gone down.

While trying to reroute the message, this stale Peer List causes an endless loop, and as a result the reroute thread misses its heartbeat with the WatchDog process.
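The failure mode can be illustrated with a simplified model. The sketch below is not DSR code: `reroute_bounded`, the peer names, and the `is_up` callback are hypothetical. It shows the general hazard: a reroute loop that keeps cycling through a peer list until it finds an available peer never terminates when every peer in the list is down, while remembering which peers have been tried bounds the loop.

```python
from itertools import cycle

def reroute_bounded(peers, is_up):
    """Return the first available peer, or None once every peer has been tried.

    Cycling over `peers` without the `tried` guard would spin forever when
    no peer is up, which is the kind of hang a stale peer list can cause.
    """
    tried = set()
    for peer in cycle(peers):
        if peer in tried:
            return None  # whole list visited once: give up instead of looping
        tried.add(peer)
        if is_up(peer):
            return peer
    return None  # reached only when the peer list is empty

# Hypothetical stale list: both peers went down before the retry.
stale_peers = ["peerA", "peerB"]
print(reroute_bounded(stale_peers, lambda peer: False))
```

In this model a `None` result lets the caller fail the transaction or fall back to another Route Group, so the thread keeps servicing its watchdog heartbeat.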

Solution

This issue is described in BUG 26978034 - DRLReroute thread hangs while processing stale RouteGroup data.

The bug is fixed in DSR 8.2.

The workaround is to keep a single Route Group under the Route Lists for connections that fluctuate at high frequency (i.e. all links go down in a very short span of time).

 

References

<BUG:26978034> - DRLREROUTE THREAD HANGS WHILE PROCESSING STALE ROUTEGROUP DATA

  Copyright © 2018 Oracle, Inc.  All rights reserved.