Bug 666558 - ntpd fails to synchronise, keeps diverging
Summary: ntpd fails to synchronise, keeps diverging
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-12-31 19:59 UTC by Nivag
Modified: 2012-08-16 18:47 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-08-16 18:47:13 UTC
Type: ---
Embargoed:



Description Nivag 2010-12-31 19:59:07 UTC
Description of problem:
From time to time ntpd fails to synchronise: it starts, and the offset gets progressively larger until it resets itself and starts the cycle again of getting further out of sync.  This has happened for several releases of Fedora.

Version-Release number of selected component (if applicable):
Applies to current version of ntpd and some previous versions.

How reproducible:
Intermittent.


Steps to Reproduce:
1. Run ntpd repeatedly; may need to hibernate and resume from hibernation.
2. Unclear exactly how to reproduce.
  
Actual results:
Gets progressively out of kilter

Expected results:
Should converge and sync within a reasonable time frame.

Additional info:
Problem occurred several releases of Fedora ago, and still occurs in Fedora 14.

grep ntpd /var/log/messages
[...]
Jan  1 08:13:36 saturn ntpd[5185]: ntpd 4.2.6p3-RC10 Thu Nov 25 16:18:33 UTC 2010 (1)
Jan  1 08:13:36 saturn ntpd[5186]: proto: precision = 0.094 usec
Jan  1 08:13:36 saturn ntpd[5186]: 0.0.0.0 c01d 0d kern kernel time sync enabled
Jan  1 08:13:36 saturn ntpd[5186]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listen and drop on 1 v6wildcard :: UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listen normally on 2 lo 127.0.0.1 UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listen normally on 3 eth1 192.168.1.204 UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listen normally on 4 eth0 10.1.1.3 UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listen normally on 5 eth1 fe80::208:54ff:fe55:e1e7 UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listen normally on 6 lo ::1 UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listen normally on 7 eth0 fe80::224:8cff:fe44:49bc UDP 123
Jan  1 08:13:36 saturn ntpd[5186]: Listening on routing socket on fd #24 for interface updates
Jan  1 08:13:38 saturn ntpd[5186]: 0.0.0.0 c016 06 restart
Jan  1 08:13:38 saturn ntpd[5186]: 0.0.0.0 c012 02 freq_set kernel -16.306 PPM
Jan  1 08:13:46 saturn ntpd[5186]: 0.0.0.0 c61c 0c clock_step +1.553619 s
Jan  1 08:13:46 saturn ntpd[5186]: 0.0.0.0 c614 04 freq_mode
Jan  1 08:13:47 saturn ntpd[5186]: 0.0.0.0 c618 08 no_sys_peer
Jan  1 08:29:04 saturn ntpd[5186]: 0.0.0.0 c612 02 freq_set kernel 3224.736 PPM
Jan  1 08:29:04 saturn ntpd[5186]: 0.0.0.0 c61c 0c clock_step +2.965553 s
Jan  1 08:29:06 saturn ntpd[5186]: 0.0.0.0 c618 08 no_sys_peer
Jan  1 08:44:27 saturn ntpd[5186]: 0.0.0.0 c612 02 freq_set kernel 3285.768 PPM
Jan  1 08:44:27 saturn ntpd[5186]: 0.0.0.0 c61c 0c clock_step +2.562907 s
Jan  1 08:44:27 saturn ntpd[5186]: 0.0.0.0 c615 05 clock_sync
Jan  1 08:44:28 saturn ntpd[5186]: 0.0.0.0 c618 08 no_sys_peer
Jan  1 08:47:11 saturn ntpd[5186]: 0.0.0.0 0613 03 spike_detect +0.602972 s
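A rough way to read the log above is to divide each clock_step by the time since the previous step, which gives an apparent drift rate in ppm. The following is only a sketch: it assumes the clock was correct immediately after the previous step, ignores any frequency correction ntpd applied in between, and the year passed to the parser is an assumption because syslog timestamps do not carry one. The function names are made up for illustration.

```python
import re
from datetime import datetime

# The three clock_step lines copied from the log above.
LOG = """\
Jan  1 08:13:46 saturn ntpd[5186]: 0.0.0.0 c61c 0c clock_step +1.553619 s
Jan  1 08:29:04 saturn ntpd[5186]: 0.0.0.0 c61c 0c clock_step +2.965553 s
Jan  1 08:44:27 saturn ntpd[5186]: 0.0.0.0 c61c 0c clock_step +2.562907 s
"""

STEP_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d) .*clock_step ([+-]\d+\.\d+) s")


def step_events(text, year=2011):
    """Return (timestamp, step_seconds) for every clock_step line.

    The year is an assumption: syslog lines do not include it.
    """
    events = []
    for line in text.splitlines():
        m = STEP_RE.match(line)
        if m:
            stamp = " ".join(m.group(1).split())  # collapse the padded day field
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            events.append((when, float(m.group(2))))
    return events


def apparent_drift_ppm(events):
    """Step size divided by the time since the previous step, in ppm."""
    rates = []
    for (t0, _), (t1, step) in zip(events, events[1:]):
        elapsed = (t1 - t0).total_seconds()
        rates.append(step / elapsed * 1e6)
    return rates


if __name__ == "__main__":
    for rate in apparent_drift_ppm(step_events(LOG)):
        print(f"{rate:.0f} ppm")
```

For the two intervals in this log the estimate comes out at roughly 3200 ppm and 2800 ppm, the same order of magnitude as the freq_set values ntpd logged.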

Comment 1 Nivag 2010-12-31 20:07:07 UTC
# cat /etc/sysconfig/ntpd
# Command line options for ntpd
OPTIONS="-g -N"


# cat /etc/ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).

driftfile /var/lib/ntp/drift

# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
#restrict default kod nomodify notrap nopeer noquery

# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
#restrict 127.0.0.1 
#restrict 192.168.1.0/24 
#restrict -6 ::1

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
peer 192.168.1.205  burst minpoll 4 maxpoll 6
peer 192.168.1.210  burst minpoll 4 maxpoll 6
#server msltime.irl.cri.nz iburst  
server p1.ntp.net.nz iburst
server p2.ntp.net.nz iburst
server p3.ntp.net.nz iburst
server ntp.hugpar.gen.nz minpoll 4
server 1.au.pool.ntp.org 
server 2.au.pool.ntp.org
server 0.us.pool.ntp.org
server 1.us.pool.ntp.org
server jp.pool.ntp.org
server 0.europe.pool.ntp.org
server 1.europe.pool.ntp.org

#broadcast 192.168.1.255 autokey	# broadcast server
#broadcastclient			# broadcast client
#broadcast 224.0.1.1 autokey		# multicast server
#multicastclient 224.0.1.1		# multicast client
#manycastserver 239.255.254.254		# manycast server
#manycastclient 239.255.254.254 autokey # manycast client

# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available. 
server	127.127.1.0	# local clock
fudge	127.127.1.0 stratum 10

# Enable public key cryptography.
#crypto

includefile /etc/ntp/crypto/pw

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography. 
keys /etc/ntp/keys

# Specify the key identifiers which are trusted.
#trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
#requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8

# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats
#restrict 0.fedora.pool.ntp.org mask 255.255.255.255 nomodify notrap noquery
#restrict 1.fedora.pool.ntp.org mask 255.255.255.255 nomodify notrap noquery
#restrict 2.fedora.pool.ntp.org mask 255.255.255.255 nomodify notrap noquery


Every 10.0s: ntpstat ; ntpq -p                                         Sat Jan  1 08:59:55 2011

synchronised to NTP server (202.46.191.123) at stratum 2
   time correct to within 2267 ms
   polling server every 64 s
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 jupiter         .STEP.          16 u    -   64    0    0.000    0.000   0.000
-neptune         LOCAL(0)        13 u    9   64  177    4.911  2566.64 843.587
*ntp1.ntp.net.nz .GPS.            1 u   11   64  377   35.928  1783.00 555.050
+ntp2.ntp.net.nz .GPS.            1 u    6   64  377   35.446  2747.85 863.854
+ntp3.ntp.net.nz .GPS.            1 u    2   64  377   45.309  2184.64 483.450
+manhire.hugpar. 131.203.16.10    2 u   46   64  377   45.671  2453.48 673.210
+warrane.connect 192.189.54.17    3 u    5   64  377   62.230  1632.61 676.702
+203.171.85.237. 130.194.10.150   2 u   10   64  377  104.997  2169.14 464.656
xdione.cbane.org 204.123.2.5      2 u   62   64  377  194.519  1273.60 832.031
+ns1.anodized.co 128.118.25.5     2 u    4   64  377  256.227  2190.81 457.757
+s246.GkanagawaF 133.100.9.2      2 u    2   64  377  191.339  1658.56 665.750
+web01.ookoo.org 81.25.192.148    3 u    2   64  377  297.303  2193.64 465.955
xmerlin.ensma.fr 131.188.3.221    2 u   62   64  377  351.923  1282.23 827.071
 LOCAL(0)        .LOCL.          10 l  788   64    0    0.000    0.000   0.000
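To see whether one bad server or the local clock is at fault, the offsets in the ntpq -p output can be summarised across sources. This is a minimal sketch, assuming the standard peers layout in which the last three columns are delay, offset and jitter in milliseconds and the reach column is octal; the four sample rows are copied from the output above, and the function name is hypothetical.

```python
from statistics import median

# A few rows copied from the ntpq -p output above.
ROWS = """\
-neptune         LOCAL(0)        13 u    9   64  177    4.911  2566.64 843.587
*ntp1.ntp.net.nz .GPS.            1 u   11   64  377   35.928  1783.00 555.050
+ntp2.ntp.net.nz .GPS.            1 u    6   64  377   35.446  2747.85 863.854
xdione.cbane.org 204.123.2.5      2 u   62   64  377  194.519  1273.60 832.031
"""


def parse_peers(text):
    """Parse ntpq -p rows into dicts; only the columns needed here are kept."""
    peers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        tally = line[0]            # ' ', '*', '+', '-', 'x', '#', ...
        fields = line[1:].split()
        delay, offset, jitter = (float(v) for v in fields[-3:])
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "reach": int(fields[-4], 8),  # reach is an octal shift register
            "offset_ms": offset,
            "jitter_ms": jitter,
        })
    return peers


if __name__ == "__main__":
    peers = parse_peers(ROWS)
    offsets = [p["offset_ms"] for p in peers]
    print(f"min {min(offsets):.2f} ms, median {median(offsets):.2f} ms, "
          f"max {max(offsets):.2f} ms")
```

When every source, near and far, reports an offset of the same sign and well over a second, the common factor is the local clock rather than any single server.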

Comment 2 Miroslav Lichvar 2011-01-03 09:15:05 UTC
This looks like a hw or kernel problem.

ntpd cannot cope with frequency errors this large: it makes corrections only up to 500 ppm. In the meantime, you can try chronyd, which should handle errors up to 100000 ppm.
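The 500 ppm figure can be checked against the freq_set lines in the original report. A small sketch follows; the constant and function names are made up for illustration, and the sample lines are copied from the Description.

```python
import re

NTPD_MAX_FREQ_PPM = 500.0  # correction limit quoted in this comment

FREQ_RE = re.compile(r"freq_set kernel (-?\d+\.\d+) PPM")

# freq_set lines copied from the log in the Description.
SAMPLE = [
    "Jan  1 08:13:38 saturn ntpd[5186]: 0.0.0.0 c012 02 freq_set kernel -16.306 PPM",
    "Jan  1 08:29:04 saturn ntpd[5186]: 0.0.0.0 c612 02 freq_set kernel 3224.736 PPM",
    "Jan  1 08:44:27 saturn ntpd[5186]: 0.0.0.0 c612 02 freq_set kernel 3285.768 PPM",
]


def out_of_range(lines, limit=NTPD_MAX_FREQ_PPM):
    """Return the freq_set values whose magnitude exceeds the limit."""
    bad = []
    for line in lines:
        m = FREQ_RE.search(line)
        if m and abs(float(m.group(1))) > limit:
            bad.append(float(m.group(1)))
    return bad


if __name__ == "__main__":
    print(out_of_range(SAMPLE))  # [3224.736, 3285.768]
```

The first value (-16.306 PPM, presumably read from the drift file at start-up) is well within range; the two values computed afterwards are more than six times the limit.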

Comment 3 Nivag 2011-01-03 10:26:12 UTC
Most of the time ntpd works fine (see below), just from time to time it can't seem to get its act together!  The problem always seems to be cleared up by a reboot, and sometimes (every time?) by hibernating and restoring from hibernation.  It has been good all day, as it is most days.

Is there anything I can practicably do to provide more useful diagnostics?  So long as such procedures are not too drastic, as this is both my work machine and the gateway to the Internet for my household.

I noted from the log that if ntpd's offset exceeds some threshold, it makes a large step adjustment.

Every 10.0s: ntpstat ; ntpq -p                                       Mon Jan  3 23:02:48 2011

synchronised to NTP server (202.46.191.123) at stratum 2
   time correct to within 54 ms
   polling server every 1024 s
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 jupiter         LOCAL(0)        13 u   32   16  354    0.104   -0.014 124.304
 neptune         192.168.1.204    3 u   13   64  377    0.137    0.127   0.034
*ntp1.ntp.net.nz .GPS.            1 u  535 1024  377   37.129    8.648   1.866
+ntp2.ntp.net.nz .GPS.            1 u  914 1024  377   36.445    9.533   0.892
+ntp3.ntp.net.nz .GPS.            1 u  315 1024  377   48.701    8.656   1.455
+manhire.hugpar. 131.203.16.10    2 u  842 1024  377   48.599    9.987   0.731
-202.81.208.160  203.12.160.2     3 u  269 1024  377  117.759    3.266   1.167
-pond.thecave.ws 18.26.4.105      2 u  687 1024  377  113.987    4.707   2.219
+private.ssl119. .CDMA.           1 u  355 1024  377  214.075    9.811   3.059
+monitor.uplogon 204.9.54.119     2 u  831 1024  377  227.456    9.649   2.516
#doga.jp         210.171.226.40   2 u  865 1024  377  238.641   20.818   9.568
-ntp.univ-angers 195.220.94.163   2 u  756 1024  377  328.895    6.374  13.304
-farnsworth.1270 193.190.230.65   2 u  833 1024  377  299.413   10.449   5.568
 LOCAL(0)        .LOCL.          10 l  11h   64    0    0.000    0.000   0.000

$ uname -a
Linux saturn 2.6.35.10-74.fc14.x86_64 #1 SMP Thu Dec 23 16:04:50 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
$ 

up to date Fedora 14 install
AMD 810 quad core 64 bit
8 GB DDR3 RAM
5 * 500GB in software RAID-6 configuration
ASUS M4A78T-E motherboard 
Bus 001 Device 004: ID 046d:0990 Logitech, Inc. QuickCam Pro 9000

Comment 4 Philip Walden 2011-01-21 00:05:03 UTC
I have the same or a similar problem on my F14 instance. The message log shows frequent large (+/- 1 minute) clock adjustments. My F12 instance is very stable, with hardly any clock adjustments.

[pwalden@walden6 ~]$ ntpstat ; ntpq -p
unsynchronised
   polling server every 64 s
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*cheezum.mattnor 129.7.1.66       2 u    1   64  177   72.964  447.656 178.267
+211.229.223.67. 149.20.68.17     3 u   66   64  177  122.984  408.859 181.426
+javanese.kjsl.c 69.36.224.15     2 u   65   64  177  113.806  416.892 178.484
+208.94.240.2    204.123.2.5      2 u   62   64  177   77.757  408.507 187.000

Log excerpt

Jan 20 14:15:33 walden6 ntpd[24098]: 0.0.0.0 c612 02 freq_set kernel -24762.012 PPM
Jan 20 14:15:33 walden6 ntpd[24098]: 0.0.0.0 c61c 0c clock_step -75.000893 s
Jan 20 14:15:33 walden6 ntpd[24098]: 0.0.0.0 c615 05 clock_sync
Jan 20 14:15:34 walden6 ntpd[24098]: 0.0.0.0 c618 08 no_sys_peer
Jan 20 14:17:58 walden6 ntpd[24098]: 0.0.0.0 0628 08 no_sys_peer
Jan 20 15:16:28 walden6 ntpd[24098]: 0.0.0.0 0613 03 spike_detect -100.940453 s
Jan 20 15:21:55 walden6 ntpd[24098]: 0.0.0.0 0618 08 no_sys_peer
Jan 20 15:35:43 walden6 ntpd[24098]: 0.0.0.0 061c 0c clock_step -119.122743 s
Jan 20 15:33:44 walden6 ntpd[24098]: 0.0.0.0 0615 05 clock_sync
Jan 20 15:33:45 walden6 ntpd[24098]: 0.0.0.0 c618 08 no_sys_peer
Jan 20 15:34:55 walden6 ntpd[24098]: 0.0.0.0 0613 03 spike_detect +0.130698 s
Jan 20 15:50:35 walden6 ntpd[24098]: 0.0.0.0 061c 0c clock_step +0.524019 s
Jan 20 15:50:36 walden6 ntpd[24098]: 0.0.0.0 0614 04 freq_mode
Jan 20 15:50:37 walden6 ntpd[24098]: 0.0.0.0 c618 08 no_sys_peer
[

Comment 5 Alan Altmann 2011-10-26 22:54:47 UTC
I am having the same problem on F15. Offset (always a lag) is about 5 minutes/day.  Stopping ntpd, running ntpdate, and restarting ntpd resets the time but the problem reasserts itself.
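For comparison with the limits quoted in Comment 2, a drift given in seconds per day converts to ppm as follows. This is plain arithmetic, with helper names made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def daily_drift_to_ppm(seconds_per_day):
    """Convert a drift in seconds per day to parts per million."""
    return seconds_per_day / SECONDS_PER_DAY * 1e6


def ppm_to_daily_drift(ppm):
    """Convert a frequency error in ppm to seconds of drift per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY


if __name__ == "__main__":
    # About 5 minutes per day, as reported in this comment.
    print(f"{daily_drift_to_ppm(5 * 60):.0f} ppm")
    # The 500 ppm limit from Comment 2, expressed as drift per day.
    print(f"{ppm_to_daily_drift(500):.1f} s/day")
```

Five minutes per day is about 3470 ppm, while 500 ppm corresponds to only about 43 seconds per day, so a clock this far off is well beyond what ntpd can correct by frequency adjustment alone.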

Comment 6 Philip Walden 2011-10-27 16:48:16 UTC
After much fiddling with my F14 system (Comment 4) and reading of ntp documents, I have come to believe the problem is with the HW clock on my system.

If the internal clock loses or gains too rapidly, then ntpd eventually gives up trying to synchronize. The protocol is to gradually slew the internal clock to synchronization. If the internal clock is so poor as to be outside these limits, ntpd gives up.
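A toy model of this behaviour, as a sketch only: if the clock's intrinsic error exceeds what ntpd can correct, the remainder accumulates as offset until a step becomes necessary. The 500 ppm correction limit is the figure from Comment 2; the 0.128 s step threshold is ntpd's documented default and is an assumption here, since it is not mentioned in this report. The function name is hypothetical, and real ntpd behaviour (poll intervals, the stepout timer, filtering) is more involved.

```python
def seconds_until_step(error_ppm, max_correction_ppm=500.0, step_threshold_s=0.128):
    """Time for the uncorrectable part of the drift to reach the step threshold.

    Returns None when the error is within the correction range, i.e. the
    frequency adjustment can keep up and no step should be needed.
    """
    residual_ppm = abs(error_ppm) - max_correction_ppm
    if residual_ppm <= 0:
        return None
    return step_threshold_s / (residual_ppm * 1e-6)


if __name__ == "__main__":
    # Error values taken from this thread: a healthy clock, about 5 min/day,
    # and the -24762 ppm freq_set value from Comment 4.
    for ppm in (16.3, 3472.0, -24762.0):
        t = seconds_until_step(ppm)
        if t is None:
            print(f"{ppm:>10.1f} ppm: within range")
        else:
            print(f"{ppm:>10.1f} ppm: threshold reached after about {t:.0f} s")
```

Under these assumptions a clock that is several thousand ppm off crosses the threshold in well under a minute, which is consistent with ntpd never settling on such hardware.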

Comment 7 Nivag 2011-10-27 18:08:24 UTC
I never have the problem after a reboot, only sometimes after recovering from hibernation.  Also, amazingly, sometimes hibernating again appears to cure the problem!

The problem still happens for kernel: 
2.6.35.14-97.fc14.x86_64 #1 SMP Sat Sep 17 00:15:37 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Comment 8 Fedora End Of Life 2012-08-16 18:47:16 UTC
This message is a notice that Fedora 14 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 14. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained.  At this time, all open bugs with a Fedora 'version'
of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this 
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen 
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we were unable to fix it before Fedora 14 reached end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" (top right of this page) and open it against that 
version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

