Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1606013.1
Update Date:2013-12-16

Solution Type  Problem Resolution Sure

Solution  1606013.1 :   Time lag issue, some of the compute nodes showing different time (when compared to the other compute nodes) in Exalogic X2-2 Rack


Related Items
  • Oracle Exalogic Elastic Cloud Software
  • Oracle Exalogic Elastic Cloud X2-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Exalogic>MW: Exalogic Core


Time lag issue on a few of the compute nodes in the Exalogic rack. It usually starts after a compute node is restarted. Initially the affected compute node shows only a small deviation, but the deviation gradually grows to hours.

In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-7863102671>

Applies to:

Oracle Exalogic Elastic Cloud Software - Version 2.0.3.0.0 to 2.0.3.0.4
Oracle Exalogic Elastic Cloud X2-2 Hardware - Version All Versions and later
Linux x86-64

Symptoms

Time lag issue on a few of the compute nodes in an Exalogic X2-2 rack. It usually starts after a compute node is restarted. Initially the affected compute node shows only a small deviation, but the deviation gradually grows to hours.

[root@orarcxlpacn01 ~]# dcli -l root -g cnodes "date"
orarcxlpacn01: Wed Nov 13 22:49:43 EST 2013
orarcxlpacn02: Wed Nov 13 22:49:43 EST 2013
orarcxlpacn03: Wed Nov 13 22:49:43 EST 2013
orarcxlpacn04: Wed Nov 13 22:49:43 EST 2013
orarcxlpacn05: Wed Nov 13 22:49:43 EST 2013
orarcxlpacn06: Wed Nov 13 22:49:42 EST 2013
orarcxlpacn07: Wed Nov 13 22:49:43 EST 2013

After some time, the lag on the affected node can be seen to increase:

[root@orarcxlpacn01 ~]# dcli -l root -g cnodes "date"
orarcxlpacn01: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn02: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn03: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn04: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn05: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn06: Wed Nov 13 22:47:55 EST 2013
orarcxlpacn07: Wed Nov 13 22:48:18 EST 2013
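The lagging node can also be picked out programmatically from the dcli output. The following is a minimal sketch only, not part of the original note: the function names parse_dcli_dates and offsets_from_median are hypothetical, the input is assumed to be in the "host: date" format shown above, and all nodes are assumed to be in the same time zone (the zone abbreviation is ignored).

```python
from datetime import datetime
from statistics import median

# Second dcli sample from the Symptoms section above.
SAMPLE = """\
orarcxlpacn01: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn02: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn03: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn04: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn05: Wed Nov 13 22:48:18 EST 2013
orarcxlpacn06: Wed Nov 13 22:47:55 EST 2013
orarcxlpacn07: Wed Nov 13 22:48:18 EST 2013
"""


def parse_dcli_dates(output):
    """Map each host to the time it reported (hypothetical helper)."""
    times = {}
    for line in output.strip().splitlines():
        host, _, stamp = line.partition(": ")
        parts = stamp.split()
        # Drop the zone abbreviation (assumption: every node uses the same
        # zone, so only the relative differences matter).
        naive = " ".join(parts[:4] + parts[5:])
        times[host.strip()] = datetime.strptime(naive, "%a %b %d %H:%M:%S %Y")
    return times


def offsets_from_median(times):
    """Return each host's offset in seconds from the median reported time."""
    base = min(times.values())
    secs = {h: (t - base).total_seconds() for h, t in times.items()}
    ref = median(secs.values())
    return {h: s - ref for h, s in secs.items()}


if __name__ == "__main__":
    for host, off in sorted(offsets_from_median(parse_dcli_dates(SAMPLE)).items()):
        flag = "  <<<<<<" if abs(off) >= 2 else ""
        print(f"{host}: {off:+.0f} s{flag}")
```

For the sample above, orarcxlpacn06 is reported 23 seconds behind the median while the other nodes show no offset. The 2 second flagging threshold is an arbitrary choice for illustration. Because dcli runs the command on each node at slightly different moments, differences of about a second are not conclusive on their own.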

 

Cause

The system clock can experience large drifts following a reboot. Drifts of around 2-3 seconds per elapsed minute, compared to the reference (correct) time, can occur. NTP, even when configured with the most aggressive polling settings possible, is often unable to compensate for this drift, because it exceeds NTP's maximum tolerance of 500 PPM.
The system initially works without issues, and then starts exhibiting symptoms following a reboot. No other hardware or software issues besides the time drift in the system clock are observed.
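To see why NTP cannot keep up, the drift figures above can be converted to parts per million (PPM) and compared with the 500 PPM limit. This is a small worked calculation only. The helper name drift_ppm is hypothetical and not part of any NTP tooling.

```python
NTP_MAX_SLEW_PPM = 500  # maximum frequency error NTP tolerates, per the text above


def drift_ppm(drift_seconds, elapsed_seconds):
    """Convert an observed drift over an interval into parts per million."""
    return drift_seconds * 1_000_000 / elapsed_seconds


if __name__ == "__main__":
    for drift in (2, 3):
        ppm = drift_ppm(drift, 60)
        ratio = ppm / NTP_MAX_SLEW_PPM
        print(f"{drift} s/min = {ppm:.0f} PPM ({ratio:.0f}x the NTP limit)")
    # For comparison: the largest drift NTP could absorb, in seconds per minute.
    print(f"500 PPM = {NTP_MAX_SLEW_PPM * 60 / 1_000_000} s/min")
```

A drift of 2-3 seconds per minute corresponds to roughly 33,000 to 50,000 PPM, which is about 67 to 100 times the 500 PPM limit. At 500 PPM, NTP can correct at most 0.03 seconds per minute.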

The kernel printed the following messages over the last 3 reboots of the affected node. Note the 2942.008 MHz value computed by tsc_init during the symptomatic bootup:

Aug 31 22:06:33 tx2xdb08 kernel: Detected 2892.712 MHz processor.
Sep 18 10:59:24 tx2xdb08 kernel: Detected 2892.833 MHz processor.
Sep 18 16:10:47 tx2xdb08 kernel: Detected 2942.008 MHz processor. <<<<<<

 On unaffected nodes, we see: 

Aug 31 22:10:11 tx2xdb04 kernel: Detected 2892.653 MHz processor.
Aug 31 22:27:26 tx2xdb04 kernel: Detected 2892.955 MHz processor.
Sep 18 11:01:39 tx2xdb04 kernel: Detected 2893.103 MHz processor.
Sep 18 16:07:49 tx2xdb04 kernel: Detected 2892.926 MHz processor.

 The cause of the issue is that the TSC clock source is unstable.
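The size of the calibration error can be estimated directly from the "Detected ... MHz processor" messages. The sketch below is illustrative only: the function names are hypothetical, and using the median of a node's own boot samples as the reference is an assumption made here, not something stated in the note.

```python
import re
from statistics import median

# Messages from the affected node, copied from the Cause section above.
AFFECTED_LOG = """\
Aug 31 22:06:33 tx2xdb08 kernel: Detected 2892.712 MHz processor.
Sep 18 10:59:24 tx2xdb08 kernel: Detected 2892.833 MHz processor.
Sep 18 16:10:47 tx2xdb08 kernel: Detected 2942.008 MHz processor.
"""

DETECTED_RE = re.compile(r"Detected (\d+\.\d+) MHz processor")


def parse_detected_mhz(log_text):
    """Extract the calibrated frequencies (MHz) in the order they appear."""
    return [float(m.group(1)) for m in DETECTED_RE.finditer(log_text)]


def deviations_ppm(freqs_mhz):
    """Deviation of each sample from the median of all samples, in PPM."""
    ref = median(freqs_mhz)
    return [(f - ref) * 1_000_000 / ref for f in freqs_mhz]


if __name__ == "__main__":
    freqs = parse_detected_mhz(AFFECTED_LOG)
    for f, ppm in zip(freqs, deviations_ppm(freqs)):
        flag = "  <<<<<<" if abs(ppm) > 500 else ""
        print(f"{f:.3f} MHz  {ppm:+9.0f} PPM{flag}")
```

For the affected node, the 2942.008 MHz calibration is about 1.7% (roughly 17,000 PPM) above the other two boots, while the normal boot-to-boot variation is well under 100 PPM. An overestimated TSC frequency makes the kernel advance time too slowly, which is consistent with the affected node falling behind, and an error of this size is far beyond what NTP can correct at 500 PPM.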

 

Solution

Instead of the TSC clock, use the much more reliable HPET clock source. Review the KM Note: Time Difference On Cluster Nodes Even With NTP Service Enabled (ODA, Exadata) (Doc ID 1577310.1)
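Before making any change, it can help to confirm which clock source a node is currently using. On Linux this is exposed through the standard sysfs files current_clocksource and available_clocksource. The following is a read-only sketch: the function names read_clocksources and assess are hypothetical, and it does not change the clock source. Follow the referenced note for the supported procedure.

```python
from pathlib import Path

# Standard Linux sysfs location for clock source information.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(sysfs_dir=SYSFS_DIR):
    """Return (current, available) clock sources as reported by the kernel."""
    base = Path(sysfs_dir)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def assess(current, available):
    """Summarize whether HPET is in use or could be selected."""
    if current == "hpet":
        return "hpet is already the active clock source"
    if "hpet" in available:
        return f"active clock source is {current}; hpet is available"
    return f"active clock source is {current}; hpet is not offered by this kernel"


if __name__ == "__main__":
    cur, avail = read_clocksources()
    print(assess(cur, avail))
```

Running this on each compute node (for example through dcli) shows whether the affected node is still on the TSC clock source.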

A kernel patch, “2.6.32-400.21.1.el5uek.bug17569550v2”, has been created for this issue, but it is not for Exalogic. Applying a patch that basically changes the clock source to HPET would also not be ideal. The fix is in mainline in UEK#3, which is yet to be released.

The motherboard has to be replaced. Please open a service request with Oracle Support. 

How to Replace a Motherboard on a compute node Server that is part of an Oracle Exalogic Elastic Cloud X2-2, X3-2, or X4-2 (Doc ID 1496559.1)

References

<NOTE:1511921.1> - Exalogic: Time Drift in Compute Nodes
<NOTE:1577310.1> - Time Difference On Cluster Nodes Even With NTP Service Enabled (ODA, Exadata)
<BUG:17569550> - COMPUTED FREQ VARIATION IN QUICK_PIT_CALIBRATE() LEADS TO INACCURATE SYSTEM TIME

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.