Asset ID: | 1-77-2194261.1 |
Update Date: | 2016-12-25 |
Keywords: | |
Solution Type: Sun Alert Sure Solution
2194261.1: Potential System Impact For Oracle (DSR) Products From Leap Second On December 31, 2016 At 23:59:60 UTC
Related Items |
- Tekelec HLR Router
- Oracle Communications Diameter Signaling Router (DSR)
|
Related Categories |
- PLA-Support>Sun Systems>CommsGBU>Global Signaling Solutions>SN-SND: Tekelec DSR
|
Applies to:
Tekelec HLR Router - Version HLRR 4.0 and later
Oracle Communications Diameter Signaling Router (DSR) - Version DSR 5.0 and later
Tekelec
Description
A leap second is a one-second adjustment that is occasionally applied to Coordinated Universal Time (UTC) in order to keep its time of day close to the mean solar time. Without such a correction, time reckoned by Earth's rotation drifts away from atomic time because of irregularities in the Earth's rate of rotation. The purpose of a leap second is to compensate for this drift, by scheduling days with 86401 or 86399 seconds. Insertion of each UTC leap second is usually decided about six months in advance by the International Earth Rotation and Reference Systems Service (IERS).
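The day-length arithmetic above can be made concrete with a short sketch. The leap-day table is hard-coded for illustration only (a real system would take it from IERS Bulletin C or from the operating system's time data), and `utc_day_length` is a hypothetical helper name, not part of any Oracle product.

```python
from datetime import date

SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds in an ordinary UTC day

# Illustrative, hard-coded table: UTC days that end with an inserted
# (positive) leap second. 2016-12-31 is the event covered by this document.
POSITIVE_LEAP_DAYS = {date(2012, 6, 30), date(2015, 6, 30), date(2016, 12, 31)}

# Days shortened by a removed (negative) leap second. The mechanism exists,
# but no such day has been scheduled so far.
NEGATIVE_LEAP_DAYS = set()


def utc_day_length(day):
    """Return the number of seconds in the given UTC day."""
    if day in POSITIVE_LEAP_DAYS:
        return SECONDS_PER_DAY + 1  # 23:59:60 exists on this day
    if day in NEGATIVE_LEAP_DAYS:
        return SECONDS_PER_DAY - 1  # 23:59:59 would be skipped
    return SECONDS_PER_DAY


print(utc_day_length(date(2016, 12, 30)))  # 86400
print(utc_day_length(date(2016, 12, 31)))  # 86401
```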
Occurrence
A leap second will again be inserted at the end of December 31, 2016, at 23:59:60 UTC. This event can cause application-level and OS-level malfunctions on systems with vulnerable operating systems.
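Because the event is specified in UTC, it falls at a different local time at each site. The following sketch converts the event to a site's local time and derives a window for the SDS recommendation given later in this document (disable provisioning for the leap second, re-enable 2 hours after). The function name, the 5-minute `lead_time` margin and the UTC-5 example offset are assumptions made for illustration, not Oracle guidance.

```python
from datetime import datetime, timedelta, timezone

# First instant after the inserted second. Python's datetime cannot
# represent 23:59:60, so the end of the leap second is used as the anchor.
LEAP_SECOND_END_UTC = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# From this document: re-enable SDS provisioning 2 hours after the event.
REENABLE_DELAY = timedelta(hours=2)


def provisioning_window(utc_offset_hours, lead_time=timedelta(minutes=5)):
    """Return (disable_at, reenable_at) in the site's local time.

    lead_time is a hypothetical safety margin before the event.
    """
    local = timezone(timedelta(hours=utc_offset_hours))
    disable_at = (LEAP_SECOND_END_UTC - lead_time).astimezone(local)
    reenable_at = (LEAP_SECOND_END_UTC + REENABLE_DELAY).astimezone(local)
    return disable_at, reenable_at


# Example site at UTC-5 (assumed offset, no daylight saving in effect).
disable_at, reenable_at = provisioning_window(utc_offset_hours=-5)
print(disable_at.isoformat())   # 2016-12-31T18:55:00-05:00
print(reenable_at.isoformat())  # 2016-12-31T21:00:00-05:00
```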
Symptoms
The following are the vulnerability statements and recommendations for the DSR products.
Product in DSR Family | Vulnerability Statement | Recommendation/Notes |
DSR 7.1.x to Latest GA Release (Premier Support) |
All releases from DSR 7.1 onward, including the latest GA release (currently 7.4), for DSR, IDIH and SDS |
No Impact |
N/A |
DSR 7.0.x GA (Premier Support) |
DSR 7.0.0.0.0-70.22.0 IDIH 7.0.0.0.0-70.22.0 |
No Impact |
N/A |
DSR 7.0.1.0.0-70.28.0 IDIH 7.0.1.0.0-70.28.1 |
No Impact |
N/A |
SDS 5.0.1-50.23.0* |
ComAgent related alarms:
- ComAgent Event ID 19814: Communication Agent Peer has not responded to heartbeat (observed in Oracle lab)
DB Replication alarms:
- Platform Alarm ID 31100: DB Replication Fault
- Platform Alarm ID 31101: DB Replication To Slave Failure
- Platform Alarm ID 31102: DB Replication from Master Failure
- Platform Alarm ID 31119: Database updatelog overrun
- Platform Alarm ID 31127: DB Replication Audit Complete
- Platform Alarm ID 31147: DB upsynclog overrun
All traffic-impacting alarms on the system cleared at the end of the one-second event, both in past leap second occurrences and in simulation.
SDS to DSR interaction (FABR): No Impact
PDB relay to HLRR: SDS uses a timestamp-based approach to find the initial record when relaying provisioning data to HLRR. If a server switchover happens during the leap second adjustment, PDB relay on the active SDS will miss new provisioning data written during the leap second cycle. However, any new records added after the leap second window will be relayed to HLRR successfully.
|
Although a switchover or pdbrelay process restart on the active NOAM is unlikely, it is still possible. To avoid DB inconsistency between SDS and HLRR, recommend that the customer disable SDS provisioning for the duration of the leap second.
For alarms seen during the event, the system is expected to self-heal in less than 2 hours. No manual intervention is recommended for alarms.
Recommend that the customer re-enable SDS provisioning 2 hours after the leap second event.
|
DSR 6.0 GA (EOL) |
DSR 6.0.0-60.24.0 DSR 6.0.1-60.33.0 IDIH 6.0.0-60.22.0 |
No Impact |
N/A |
SDS 5.0.1-50.23.0* |
See impact above for same SDS release |
See recommendation above for same SDS release |
DSR 5.1 GA (EOL) |
DSR 5.1 GA ** |
PDRA related alarms:
- Application Event ID 22704: Policy DRA Communication Agent Error
- Application Event ID 22712: Policy SBR Communication Error
- Application Event ID 22713: Policy SBR Alternate Key Creation
ComAgent related alarms:
- ComAgent Alarm ID 19825: Communication Agent Transaction Failure Rate
- ComAgent Event ID 19832: Communication Agent Reliable Transaction Failed
- ComAgent Event ID 19814: Communication Agent Peer has not responded to heartbeat (observed in Oracle lab)
DB Replication alarms:
- Platform Alarm ID 31100: DB Replication Fault
- Platform Alarm ID 31101: DB Replication To Slave Failure
- Platform Alarm ID 31102: DB Replication from Master Failure
- Platform Alarm ID 31119: Database updatelog overrun
- Platform Alarm ID 31127: DB Replication Audit Complete
- Platform Alarm ID 31147: DB upsynclog overrun
All traffic-impacting alarms on the system cleared at the end of the one-second event, both in past leap second occurrences and in simulation.
|
The system is expected to self-heal in less than 2 hours. No manual intervention is recommended. |
DSR 5.0 GA (EOL) |
DSR 5.0.2-50.27.1 DSR 5.0.1-50.26.0 DSR 5.0.0-50.21.0 |
No impact to system traffic lasting longer than one second is expected. |
No manual intervention recommended. |
SDS 5.0.0-50.19.0 |
ComAgent related alarms:
- ComAgent Event ID 19814: Communication Agent Peer has not responded to heartbeat (observed in Oracle lab)
DB Replication alarms:
- Platform Alarm ID 31100: DB Replication Fault
- Platform Alarm ID 31101: DB Replication To Slave Failure
- Platform Alarm ID 31102: DB Replication from Master Failure
- Platform Alarm ID 31119: Database updatelog overrun
- Platform Alarm ID 31127: DB Replication Audit Complete
- Platform Alarm ID 31147: DB upsynclog overrun
All traffic-impacting alarms on the system cleared at the end of the one-second event, both in past leap second occurrences and in simulation.
SDS to DSR interaction (FABR): No Impact
PDB relay to HLRR: SDS uses a timestamp-based approach to find the initial record when relaying provisioning data to HLRR. If a server switchover happens during the leap second adjustment, PDB relay on the active SDS will miss new provisioning data written during the leap second cycle. However, any new records added after the leap second window will be relayed to HLRR successfully.
|
Although a switchover or pdbrelay process restart on the active NOAM is unlikely, it is still possible. To avoid DB inconsistency between SDS and HLRR, recommend that the customer disable SDS provisioning for the duration of the leap second.
For alarms seen during the event, the system is expected to self-heal in less than 2 hours. No manual intervention is recommended for alarms.
Recommend that the customer re-enable SDS provisioning 2 hours after the leap second event.
|
HLRR |
HLRR 4.1 (Premier Support) |
No Impact |
N/A |
HLRR 4.0 (Premier Support) ** |
Possible Alarms that may be generated:
- Alarm ID 31119 - DB Updatelog Overrun, with an error ID of "IDB_BADUPSEQ/WRN updatelog sequence number inconsistency encountered".
- Alarm ID 31147 - DB upsynclog overrun
- Alarm ID 31101 - DB Replication To Slave Failure - this alarm will be raised on the active server that is having problems
- Alarm ID 31102 - DB Replication From Master Failure - this alarm will be raised on standby/spare/observer servers having problems
|
The system is expected to self-heal in less than 2 hours. No manual intervention is recommended. |
* Note: SDS 5.0.1-50.23.0 is included with both DSR 6.0 and DSR 7.0. However, DSR 7.1 and above include SDS 7.1.
** These alarms may be set and cleared repeatedly as replication links go up and down, because the inconsistency in the update log timestamp is encountered again and again until the bad timestamp rolls off, which takes some hours. There should be no impact to signaling traffic while replication is down, but if the active server is lost or an HA switchover occurs, updates written to the active server will be lost.
Note: Test system setups and results may differ from customer production systems, and not all possible alarms, events and error scenarios can be predicted.
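The PDB relay risk described for SDS can be illustrated with a toy model. This is not the SDS implementation: `Record`, `records_to_relay` and the integer timestamps are hypothetical, and the sketch assumes the server clock repeats one second while the leap second is applied. It only shows why a relay that resumes from a timestamp checkpoint (for example after a switchover) can skip records written during the repeated second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    written_at: int  # wall-clock second as seen by the server (toy value)


def records_to_relay(log, resume_after):
    """Timestamp-based resume: select records stamped later than the checkpoint."""
    return [r for r in log if r.written_at > resume_after]


# Assumed clock behaviour around the leap second: ..., 98, 99, 99, 100, ...
# (second 99 occurs twice because the clock is stepped back by one second).
log = [
    Record("sub-A", 98),
    Record("sub-B", 99),   # last record relayed before the switchover
    Record("sub-C", 99),   # written during the repeated second
    Record("sub-D", 100),  # written after the leap second window
]

checkpoint = 99  # timestamp of the last relayed record (sub-B)
relayed = [r.key for r in records_to_relay(log, checkpoint)]
print(relayed)  # ['sub-D']
```

In this model `sub-C` is never relayed while `sub-D` is, which matches the behaviour described above and is the reason provisioning is kept disabled during the leap second.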
History
23-12-2016 - Published
Attachments
This solution has no attachment