Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2128317.1: Primary and Secondary CMP (DR) displays Active-Active Condition (Split Brain)
Created from <SR 3-12476581381>

Applies to:
Oracle Communications Policy Management - Version POLICY 12.0.0 and later
Tekelec

Symptoms
The Primary and Secondary (DR) CMP display an Active-Active condition and an alarm is raised; the alarm text (31107) is quoted under Cause.

Cause
There is no actual split brain. The false detection is caused by a bug that is exposed when a failover is done between DR-CMP nodes, and it is more likely to be seen during an upgrade. The bug is in the COMCOL inetmerge process, which is responsible for collecting alarms from the DR-CMP and updating them on the Active CMP. In the reported case this process was in a stuck condition, and in that state the DR-CMP inetmerge paused sending alarm updates to the Primary CMP.
Due to this bug, inetmerge connections between the CMP and DR-CMP clusters can exhibit several problems:

1. The audit that synchronizes Log tables between the CMPs fails to complete, so the Log tables on the CMPs do not remain synchronized.
2. Problems are exposed when a failover between CMP nodes occurs. The CMP failover triggers defective logic that falsely detects a state mismatch on the CMP servers. This logic then attempts to correct the mismatch but instead causes inetmerge to behave as if a split brain were taking place. This is observed as the following alarm (31107):

    * 08/25/2015 04:50:52.191 230 inetmerge DB Merge From Child Failure 01cmp01a
    GN_ACTACT: Active-Active conflict detected with peer [17044:MergeReceiver.cxx:1783]

Once this alarm is present, inetmerge pauses the collection of stateful tables (including the list of Active Alarms) to the Active CMP at the Primary site until the perceived split brain is resolved. Because there is no actual split brain, all stateful tables on the Active CMP stay frozen with respect to the CMP server. In other words, any alarms that happened to be present in the Alarm table on the CMP remain stuck until the false split brain is cleared by manual intervention.

Example trace entries:

    0406:195950.962 TR-V SenderLink[05cmp01a]: Detected Remote HA Change to Active [10150/MergeSender.cxx:1256]
    0406:195953.610 TR-V ===[STATE PendingStandby]=== anyPeerAvail=1,peerActive=1, parentAckList=1 [10150/IdbMerge.cxx:964]
    0406:195956.012 TR-V SenderLink[05cmp01a]: Detected Remote HA Change to Active [10150/MergeSender.cxx:1256]
    0406:195958.628 TR-V ===[STATE PendingStandby]=== anyPeerAvail=1,peerActive=1, parentAckList=1 [10150/IdbMerge.cxx:964]
    0406:200001.012 TR-V SenderLink[05cmp01a]: Detected Remote HA Change to Active [10150/MergeSender.cxx:1256]
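
The alarm and trace signatures quoted above can be searched for in saved log text. The following is a minimal Python sketch, not an Oracle-supplied tool: the function names `scan_log` and `matches_note_signature`, the `min_repeats` threshold, and the idea of feeding it exported log lines are all assumptions made for illustration. It only matches the strings quoted in this note and cannot tell a false split brain from a real one.

```python
import re

# Alarm detail text quoted in this note for alarm 31107.
ACTACT_RE = re.compile(r"GN_ACTACT: Active-Active conflict detected with peer")
# The two trace entries that alternate in the example trace output.
HA_CHANGE_RE = re.compile(
    r"SenderLink\[(?P<peer>[^\]]+)\]: Detected Remote HA Change to Active"
)
PENDING_RE = re.compile(r"===\[STATE PendingStandby\]===")


def scan_log(lines):
    """Count the three signatures in an iterable of log lines.

    Returns (counts, peers) where peers is the set of SenderLink names seen.
    """
    counts = {"actact_alarm": 0, "remote_ha_change": 0, "pending_standby": 0}
    peers = set()
    for line in lines:
        if ACTACT_RE.search(line):
            counts["actact_alarm"] += 1
        match = HA_CHANGE_RE.search(line)
        if match:
            counts["remote_ha_change"] += 1
            peers.add(match.group("peer"))
        if PENDING_RE.search(line):
            counts["pending_standby"] += 1
    return counts, peers


def matches_note_signature(counts, min_repeats=2):
    """Hypothetical heuristic: the alarm text is present, or both trace
    entries repeat at least min_repeats times (an arbitrary threshold)."""
    repeating = (
        counts["remote_ha_change"] >= min_repeats
        and counts["pending_standby"] >= min_repeats
    )
    return counts["actact_alarm"] > 0 or repeating


# Sample input: the alarm detail line and trace entries quoted in this note.
SAMPLE = [
    "GN_ACTACT: Active-Active conflict detected with peer [17044:MergeReceiver.cxx:1783]",
    "0406:195950.962 TR-V SenderLink[05cmp01a]: Detected Remote HA Change to Active [10150/MergeSender.cxx:1256]",
    "0406:195953.610 TR-V ===[STATE PendingStandby]=== anyPeerAvail=1,peerActive=1, parentAckList=1 [10150/IdbMerge.cxx:964]",
    "0406:195956.012 TR-V SenderLink[05cmp01a]: Detected Remote HA Change to Active [10150/MergeSender.cxx:1256]",
    "0406:195958.628 TR-V ===[STATE PendingStandby]=== anyPeerAvail=1,peerActive=1, parentAckList=1 [10150/IdbMerge.cxx:964]",
    "0406:200001.012 TR-V SenderLink[05cmp01a]: Detected Remote HA Change to Active [10150/MergeSender.cxx:1256]",
]

if __name__ == "__main__":
    counts, peers = scan_log(SAMPLE)
    print(counts, sorted(peers), matches_note_signature(counts))
```

A match only indicates that the log text resembles the entries in this note; whether the Active-Active condition is false still has to be confirmed on the system.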
Solution
This issue is fixed in the following releases:
12.1.1.0.0
12.2.0.0.0

References
<BUG:22007430> - CMP01B HAS A STUCK MYSQL SYNC ALARM
<BUG:20686427> - [LRGSYS] SEEING MANY "OLDER VERSION OF INETMERGE ON DRNO" LOGS
<BUG:21899519> - 31106 AND 31107 ALARMS NOT GETTING AUTO CLEARED
<BUG:23066369> - ACTIVE-ACTIVE CONDITION BETWEEN CMP'S
<BUG:21697864> - CMP01B HAS A STUCK MYSQL SYNC ALARM

Attachments
This solution has no attachment
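
The fixed releases listed under Solution (12.1.1.0.0 and 12.2.0.0.0) can be compared against an installed release string. The sketch below is illustrative only and rests on assumptions this note does not state: that release strings are dotted integers, that releases order numerically, and that every release at or above 12.1.1.0.0 carries the fix. The names `parse_release`, `has_fix`, and `FIXED_IN` are hypothetical.

```python
# Earliest fixed release listed under Solution. Treating every later release
# as also fixed is an assumption, not a statement from this note.
FIXED_IN = (12, 1, 1, 0, 0)


def parse_release(text, width=5):
    """Turn a dotted release string such as '12.0.0' into a tuple of ints,
    padded with zeros to `width` fields so short forms compare correctly."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def has_fix(release):
    """Return True if the release is at or above the earliest fixed release."""
    return parse_release(release) >= FIXED_IN


if __name__ == "__main__":
    for rel in ("12.0.0", "12.1.1.0.0", "12.2.0.0.0"):
        print(rel, has_fix(rel))
```

Padding matters because Python compares tuples element by element and treats a shorter tuple as smaller, so an unpadded `(12, 1, 1)` would sort below `(12, 1, 1, 0, 0)`.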