Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2197972.1
Update Date: 2016-11-16
Keywords:

Solution Type  Problem Resolution Sure

Solution  2197972.1 :   Diameter Signaling Router (DSR) - Connections Not Coming UP Due to Continuous Restarts of the DSR Process on MP Server  


Related Items
  • Oracle Communications Diameter Signaling Router (DSR)
Related Categories
  • PLA-Support>Sun Systems>CommsGBU>Global Signaling Solutions>SN-SND: Tekelec DSR




In this Document
Symptoms
Changes
Cause
Solution


Created from <SR 3-13526124521>

Applies to:

Oracle Communications Diameter Signaling Router (DSR) - Version DSR 4.0 and later
Information in this document applies to any platform.

Symptoms

After the dsr process restarts, none of the connections defined on the impacted MPs come up.
The following symptoms have been observed on the system:

  • In the “Main Menu -> Diameter -> Maintenance -> Connection” SOAM menu, the “MP Server Hostname” column shows "Unknown" (Unk) entries. The "Operational Status", "CPL" and "Operational Reason" columns show the same entries.
  • The system raises alarm 31201 from the impacted MPs: "Process Not Running (EXGSTACK_PROCESS). A managed process cannot be started or has unexpectedly terminated"

Changes

 

Cause

The dsr process crashes continuously on the impacted MP servers, which is why the connections are not established. The dsr process logs show that the process was unable to compile one specific Mediation Rule Set/Rule.

In the dsr process logs (tt dsr), just before the process restarts (PROGRAM STARTED), the following entries related to a mediation rule set appear:

1025:224300.750 DBC1 processing ADD change in CAPM rule table: capm_rule (ruleset: CAPM-F-348ff580-e4b7-bb68-9318-000069623e55) [2465/CapmDbcaResponder.C:248]
1025:224300.750 DBC0 libflexroute: DBG: build_rule(): new flexroute rule parsed (id:3097 priority:98 precedence:1 description: data: @msg.avp["Destination-Host"][1].data=~"(.*)epc.mncxxx.mccxxx.3gppnetwork.org" && @msg.avp["Origin-Host"][1].data=="xxx.epc.mncxxx.mccxxx.3gppnetwork.org" -> dma_save_avp("Session-Id",1),dma_subst_avp_value("Session-Id",1,"/^xxx.epc.mncxxx.mccxxx.3gppnetwork.org(.*)/\"hssxxx.node.epc.mncxxx.mccxxx.3gppnetwork.org\"+\\1/"),dma_change_avp_value("Origin-Host",1,"hssxxx.node.epc.mncxxx.mccxxx.3gppnetwork.org"),dma_change_avp_value("Origin-Realm",1,"epc.mncxxx.mccxxx.3gppnetwork.org"),dma_change_avp_value("Destination-Realm",1,"epc.mncxxx.mccxxx.3gppnetwork.org")||)
  [2465/flexroute_db.c:1760]
1025:224300.750 DBC0 libflexroute: DBG: cache_map_add(): Value cache index is found: 0
  [2465/value_cache.c:46]
1025:224300.750 DBC0 libflexroute: DBG: cache_map_add(): Value cache index is found: 1
  [2465/value_cache.c:46]
1025:224303.528 TR-V PROGRAM STARTED -- dsr (pid: 2798) [2798/ProcUtil.cxx:416]
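
When the trace is long, the rule set that was being loaded right before each restart can be picked out mechanically. The following is only a sketch: it assumes the trace has been saved as text lines in the format shown above, and the function name rulesets_before_restarts is hypothetical (it is not part of any DSR tool).

```python
import re

# Pattern for the DBC line that names the rule set being loaded; the format is
# taken from the log excerpt above and is an assumption for other releases.
RULESET_RE = re.compile(r"capm_rule \(ruleset: (CAPM-F-[0-9a-fA-F-]+)\)")
RESTART_MARKER = "PROGRAM STARTED -- dsr"


def rulesets_before_restarts(lines):
    """Return, for each dsr restart marker, the rule set named last before it."""
    suspects = []
    last_seen = None
    for line in lines:
        match = RULESET_RE.search(line)
        if match:
            # Remember the most recent rule set the process started to load.
            last_seen = match.group(1)
        elif RESTART_MARKER in line:
            # The process restarted: the last rule set seen is the suspect.
            if last_seen is not None:
                suspects.append(last_seen)
            last_seen = None
    return suspects
```

A rule set that shows up as the suspect before every restart is the one to investigate.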

Moreover, each dsr process crash creates an abterm file under the /var/TKLC/rundb/run/proc/dsr folder. The abterm files contain the following entries:

#4
#5 0x00007fe6947dc713 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /usr/lib64/libtcmalloc.so.4
#6 0x00007fe6947dc7b6 in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) () from /usr/lib64/libtcmalloc.so.4
#7 0x00007fe6947eb41f in tc_delete () from /usr/lib64/libtcmalloc.so.4
#8 0x00007fe68de3c864 in ?? () from /usr/lib64/libre2.so.0
#9 0x00007fe68de3cbb4 in ?? () from /usr/lib64/libre2.so.0
#10 0x00007fe68de3bd79 in ?? () from /usr/lib64/libre2.so.0
#11 0x00007fe68de126f2 in ?? () from /usr/lib64/libre2.so.0
#12 0x00007fe68de332d2 in re2::RE2::Init(re2::StringPiece const&, re2::RE2::Options const&) () from /usr/lib64/libre2.so.0
#13 0x00007fe68de33d1b in re2::RE2::RE2(re2::StringPiece const&, re2::RE2::Options const&) () from /usr/lib64/libre2.so.0
#14 0x00007fe68fd2b609 in re2w_compile () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#15 0x00007fe68fd01de5 in new_shm_regex () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#16 0x00007fe68fd03fb5 in new_shm_regex_subst () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#17 0x00007fe68ff6f08c in fix_subst_re(f_var*, int, frcomctx_t*) () from /usr/TKLC/dpi/lib/libdpiDml.so
#18 0x00007fe68fcfc128 in _parse_f_var_func () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#19 0x00007fe68fcfcd61 in parse_f_var_right () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#20 0x00007fe68fcfb842 in _parse_f_var () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#21 0x00007fe68fcfd52d in parse_f_var_pair () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#22 0x00007fe68fcfd9fd in parse_f_var_pair_list () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#23 0x00007fe68fcfff64 in set_action_rule () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#24 0x00007fe68fd0e041 in build_rule () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#25 0x00007fe68fd117ce in f_add_rule () from /usr/TKLC/capm/prod/lib/libflexroute.so.5
#26 0x00007fe69262a4ef in CapmDbcaResponder::notifyTable(CmList&, CmString, GnRepTblChg::ChgType) () from /usr/TKLC/dpi/lib/libdpiDbca.so
#27 0x00007fe6930fc483 in DbRecordHandler::processAddRecords(CmMap<CmString, DbParseRecords::DbRecordSet*>&) () from /usr/TKLC/awpcommon/lib/libstackDbChangeAgent.so
#28 0x00007fe6930fc922 in DbRecordHandler::notify(GnRepTblChg::ChgType) () from /usr/TKLC/awpcommon/lib/libstackDbChangeAgent.so
#29 0x00007fe693101d50 in DbParseRecords::notifyRecords() () from /usr/TKLC/awpcommon/lib/libstackDbChangeAgent.so
#30 0x00007fe6930ff67f in TableMonitor::initializeDb() () from /usr/TKLC/awpcommon/lib/libstackDbChangeAgent.so
#31 0x00007fe69310005f in TableMonitor::initialize() () from /usr/TKLC/awpcommon/lib/libstackDbChangeAgent.so
#32 0x00007fe6930f8226 in DbChangeAgent::initialize() () from /usr/TKLC/awpcommon/lib/libstackDbChangeAgent.so
#33 0x00007fe6939c8355 in ExgStackManager::init(CmString) () from /usr/TKLC/awpcommon/lib/libstackBase.so
#34 0x000000000040b438 in main ()
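
The backtrace above shows the crash occurring while libflexroute builds a rule (build_rule) and compiles its regular expression through RE2 (re2::RE2::Init). To check whether another abterm file shows the same pattern, a comparison along these lines may help. It is a minimal sketch that assumes the backtrace text has the frame layout shown above; the helper name matches_regex_compile_crash is hypothetical.

```python
# Function names taken from the backtrace above. A crash is treated as matching
# when the regex compile path is reached from a mediation rule build.
SIGNATURE_FRAMES = ("build_rule", "new_shm_regex", "re2::RE2::Init")


def matches_regex_compile_crash(backtrace_text):
    """True if all signature frames appear, ordered from outer to inner call."""
    positions = [backtrace_text.find(name) for name in SIGNATURE_FRAMES]
    if any(pos < 0 for pos in positions):
        return False
    # Innermost frames are printed first, so the outer caller (build_rule)
    # must appear later in the text than the frames it calls.
    return positions[0] > positions[1] > positions[2]
```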

To obtain the advanced dsr process logs, raise the log level with the following command:

tr.setmask 0-10 dsr.DBC

Once the impacted Rule Set is identified, revert the dsr process log level:

tr.setmask none dsr.DBC

Solution

  1. Identify the "faulty" Mediation Rule Set/Rule. The rule set name can be identified by its def_id in the capm_def table (in the above example it is "348ff580-e4b7-bb68-9318-000069623e55", which can be picked out from the dsr process logs):
    $ igrep -p 348ff580-e4b7-bb68-9318-000069623e55 capm_def
    id def_id name conditions_grouping complex_expression flags help_title help_text prov_data capm_task_state action_error_handling flexroute_name birthTime feature
    31 348ff580-e4b7-bb68-9318-000069623e55 Rule_Set_1 and A AND B 3 1 Test Ignore CAPM-F-348ff580-e4b7-bb68-9318-000069623e55 06/15/2016 00:24:06.000 mediation
     
  2. Go to "Main Menu -> Diameter -> Mediation -> Rule Sets" SOAM menu and select the rule set identified previously in step 1.
  3. Delete all rules related to the identified rule set by clicking on the "Delete All Rules" button. The dsr process will then be able to start and the related connections will be reestablished.
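
The lookup in step 1 can be scripted once the rule set name is known from the logs. The sketch below rests on one assumption drawn only from the single example in this note: that the name logged by the dsr process is the def_id prefixed with "CAPM-F-". The helper names are hypothetical, and the function only builds the igrep command line shown in step 1 without running it.

```python
RULESET_PREFIX = "CAPM-F-"  # assumed prefix, inferred from the example above


def def_id_from_ruleset(ruleset_name):
    """Strip the assumed 'CAPM-F-' prefix to recover the capm_def def_id."""
    if not ruleset_name.startswith(RULESET_PREFIX):
        raise ValueError(f"unexpected rule set name: {ruleset_name!r}")
    return ruleset_name[len(RULESET_PREFIX):]


def igrep_command(ruleset_name):
    """Build the capm_def lookup command used in step 1."""
    return f"igrep -p {def_id_from_ruleset(ruleset_name)} capm_def"
```

For the rule set in this note, igrep_command("CAPM-F-348ff580-e4b7-bb68-9318-000069623e55") yields the same command as in step 1.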

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.