Bug 187364

Summary: ntpd not updating drift value
Product: [Fedora] Fedora
Reporter: Craig Goodyear <goodyca48>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 5
CC: dad, mlichvar, pfrields, wtogami
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-11-12 21:17:53 UTC
Attachments:
  Description: ntpd entries in /var/log/messages
  Flags: none

Description Craig Goodyear 2006-03-30 15:15:57 UTC
Description of problem:
Drift value in /var/lib/ntp/drift is not being updated.  The value stays at the
starting value of 0.000

Version-Release number of selected component (if applicable):
ntp-4.2.0.a.20050816-11

How reproducible:
always

Steps to Reproduce:
1. Start ntpd.
  
Actual results:
drift = 0

Expected results:
drift equal to some non-zero value

Additional info:

Comment 1 Miroslav Lichvar 2006-03-30 15:45:23 UTC
How long had it been running? It can take some time before the value is updated.

If it was longer than a few hours, then ntp probably hasn't reached a stable state.
Can you please post the output of the command "ntpq -p" and the relevant lines from syslog?

Comment 2 Craig Goodyear 2006-03-30 21:00:36 UTC
(In reply to comment #1)
The system had been up a couple of days with no change to /var/lib/ntp/drift

Output from "ntpq -p"

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 lolly.dreamcomm 194.109.22.18    3 u   49   64   37  155.759  -1065.9 667.140
 ntp04.oal.ul.pt 194.117.9.129    2 u   47   64   37  201.944  -1308.3 470.732
 ntpt1.core.thep 195.92.195.222   3 u   53   64   37  131.798  -1321.3 453.473

> relevant lines from syslog?

Are you asking for ntpd entries in /var/log/messages?


Comment 3 Miroslav Lichvar 2006-03-31 07:26:21 UTC
Yes, grep 'ntpd' /var/log/messages is fine. Can you please also send another
ntpq -p output once ntpd has been running for at least an hour?

Comment 4 Craig Goodyear 2006-03-31 14:39:24 UTC
Created attachment 127127 [details]
ntpd entries in /var/log/messages

Comment 5 Craig Goodyear 2006-03-31 14:44:56 UTC
(In reply to comment #3)
> Yes, grep 'ntpd' /var/log/messages is fine. Can you please also send another
> ntpq -p output when the ntpd runs for at least an hour?

ntpd had been running for about 9 hours when the previous ntpq listing was
given.  Here is a current listing with ntpd running for about 24 hours.

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 lolly.dreamcomm 194.109.22.18    3 u   47   64  177  155.713  -1138.8 819.464
 ntp04.oal.ul.pt 194.117.9.129    2 u   40   64  177  575.602  -1229.2 686.196
*ntpt1.core.thep 194.152.64.68    3 u   46   64  177  131.764  -1657.3 597.747


Comment 6 Miroslav Lichvar 2006-03-31 19:32:22 UTC
Ok, there are two problems. One is a network connectivity problem: the reach
column is an octal bitmask of the last 8 polls, and 37 (binary 00011111) means
that only 5 of the last 8 attempts to reach an NTP server were successful.

The second is the system time: every 15-20 minutes there is a time reset (about
-3 seconds). The time is running much faster than ntp can handle. Probably a
hw/kernel problem.

Reassigning to kernel component.
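
Editor's note: the reach column in "ntpq -p" is an 8-bit shift register printed in octal, with the most recent poll in the lowest bit. The following is a minimal sketch of how the values seen in this report (37 and 177) decode; the function name decode_reach is made up for illustration and is not part of any ntp tool.

```python
def decode_reach(reach_octal):
    """Return the last 8 poll results as booleans, most recent poll first."""
    value = int(reach_octal, 8) & 0xFF  # reach is displayed in octal
    return [bool((value >> bit) & 1) for bit in range(8)]

# 37 and 177 appear in the ntpq listings in this report; 377 means all 8 polls answered.
for reach in ("37", "177", "377"):
    polls = decode_reach(reach)
    bits = format(int(reach, 8), "08b")
    print(f"reach {reach:>3} = {bits}: {sum(polls)}/8 recent polls answered")
```

A value that is all ones in the low bits (37, 177) can also simply mean that the register is still filling up after it was cleared, which ntpd does when it steps the clock.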

Comment 7 Craig Goodyear 2006-04-23 13:07:18 UTC
I have found a workaround for this problem.  I am using an MSI K7N2 motherboard
with an Nvidia nForce2 chipset.  The problem goes away when I disable the FSB
spread spectrum in the BIOS.

Comment 8 David A. De Graaf 2006-06-04 20:22:38 UTC
I have exactly the same problem.  Since installing FC5 the ntpd servo
operation has failed to adjust the clock rate, but merely did a time reset
when the time error grew too large.  For example, logwatch for the past
three days reported:

 Time Reset 54 times (total: -127.148698 s  average: -2.354606 s)
 Time Reset 64 times (total: -115.399111 s  average: -1.803111 s)
 Time Reset 66 times (total: -96.624774 s  average: -1.464012 s)

The /var/lib/ntp/drift file has been stuck with a value of 0.000 for
the past two weeks.  Before that it had a fixed value of -510.000, which
gave rise to logwatch complaints that such a drift rate exceeded the
max tolerable value of 500.  In a futile attempt to restore normal
servo operation, I removed the drift file.  I noted that a new one
was created with the 0.000 value, but it never budged from that value.
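
Editor's note: a rough calculation shows why the servo could not keep up. The sketch below converts the three logwatch totals quoted above into an implied frequency error in parts per million and compares it with the 500 ppm limit mentioned in this comment. It assumes each logwatch line covers one 24-hour period, which the report does not state explicitly.

```python
SECONDS_PER_DAY = 86_400
MAX_DRIFT_PPM = 500.0  # limit quoted in the comment above

# (number of resets, total step in seconds) from the three logwatch lines.
# Assumption: each line summarises exactly one day.
daily_totals = [(54, -127.148698), (64, -115.399111), (66, -96.624774)]

def implied_ppm(total_step_s, period_s=SECONDS_PER_DAY):
    """Frequency error implied by stepping the clock total_step_s over period_s."""
    return abs(total_step_s) / period_s * 1e6

for resets, total in daily_totals:
    ppm = implied_ppm(total)
    print(f"{resets} resets, {total:+.3f} s/day -> about {ppm:.0f} ppm "
          f"({ppm / MAX_DRIFT_PPM:.1f}x the {MAX_DRIFT_PPM:.0f} ppm limit)")
```

Under that assumption all three days come out well above 500 ppm, which is consistent with ntpd stepping the clock repeatedly instead of converging on a drift value.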

Yesterday I discovered this bugzilla report with Craig Goodyear's
workaround.  I immediately tried turning off the "spread spectrum"
feature in the BIOS - both of them - and there have been no time
resets at all in the past 48 hours.  Normal ntpd servo operation is
working again.  The drift file has stepped thru values: -149.814, 
-164.576, -174.299, -178.087.

This machine has an ABIT NF7 motherboard with an Athlon Thunderbird
1200 MHz CPU.

The ntpd servo has worked perfectly in the past on this machine.
The yum.log.1 file shows I installed FC5 on May 3, 19:38.
The messages.4 file shows the superabundance of "time reset" messages
commenced May 3, 20:32.

This clearly proves that FC5 introduced this new bug.


Comment 9 Dave Jones 2006-10-16 18:56:34 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 10 Craig Goodyear 2006-10-18 21:33:13 UTC
I am no longer seeing this bug with a fully updated 
FC5 system and FSB spread spectrum enabled in the BIOS.

Comment 11 David A. De Graaf 2006-10-21 15:19:02 UTC
I updated to kernel-2.6.18-1.2200.fc5 and edited my BIOS, changing the FSB
spread-spectrum from OFF to 0.5% (whatever that means).  There is another
available setting of 1.0%, but I haven't tried it.  I removed /var/lib/ntp/drift
by renaming it and restarted ntpd.  Since this change ntpd has run a bit more
erratically than with SS OFF, but /var/lib/ntp/drift has been updated, with
values similar to the old one: -135.106 currently vs. -155.325 (old).

However, the clock is now being reset occasionally, whereas it wasn't before:
$ grep ntpd /var/log/messages | grep reset
Oct 19 04:56:06 datium ntpd[3843]: time reset +0.183967 s
Oct 19 05:15:23 datium ntpd[3843]: time reset -0.353233 s
Oct 19 05:37:58 datium ntpd[3843]: time reset +0.232632 s
Oct 20 07:12:55 datium ntpd[3843]: time reset +0.167106 s
Oct 20 07:34:20 datium ntpd[3843]: time reset -0.295322 s
Oct 20 14:57:41 datium ntpd[3843]: time reset -0.140890 s
Oct 20 23:38:10 datium ntpd[3843]: time reset +0.167025 s
Oct 20 23:58:37 datium ntpd[3843]: time reset -0.191109 s
Oct 21 03:44:25 datium ntpd[3843]: time reset +0.131058 s

I will try the larger value of SS in a few days, but will probably revert to
OFF, which seems to work more smoothly.
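
Editor's note: the grep output above can be summarised with a few lines of Python. This is only a sketch. The regular expression is written against the log lines quoted in this comment, and the helper name reset_offsets is hypothetical; other ntpd versions may format the message differently.

```python
import re

# Matches lines such as: "... ntpd[3843]: time reset +0.183967 s"
RESET_RE = re.compile(r"ntpd\[\d+\]: time reset ([+-]\d+\.\d+) s")

def reset_offsets(lines):
    """Yield the step size in seconds from each ntpd 'time reset' line."""
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            yield float(match.group(1))

# Sample data copied from the comment above.
sample = """\
Oct 19 04:56:06 datium ntpd[3843]: time reset +0.183967 s
Oct 19 05:15:23 datium ntpd[3843]: time reset -0.353233 s
Oct 19 05:37:58 datium ntpd[3843]: time reset +0.232632 s
Oct 20 07:12:55 datium ntpd[3843]: time reset +0.167106 s
Oct 20 07:34:20 datium ntpd[3843]: time reset -0.295322 s
Oct 20 14:57:41 datium ntpd[3843]: time reset -0.140890 s
Oct 20 23:38:10 datium ntpd[3843]: time reset +0.167025 s
Oct 20 23:58:37 datium ntpd[3843]: time reset -0.191109 s
Oct 21 03:44:25 datium ntpd[3843]: time reset +0.131058 s
""".splitlines()

offsets = list(reset_offsets(sample))
print(f"{len(offsets)} resets, net {sum(offsets):+.6f} s, "
      f"largest {max(offsets, key=abs):+.6f} s")
```

To run it against a live log, the sample list could be replaced with an open file handle on /var/log/messages. Unlike the earlier resets of one to three seconds in a single direction, these steps alternate in sign and stay below half a second.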

Comment 12 Dave Jones 2006-11-12 06:15:36 UTC
There was an important fix in the kernel update released yesterday which may help.
The ntp code was being miscompiled by the fc5 gcc (that was bz 191458).

It'd be good to know whether or not this fixes your problem.

The spread spectrum stuff is probably completely unrelated btw. It's some voodoo
that prevents RFI by modulating the system clock signal.



Comment 13 Craig Goodyear 2006-11-12 17:04:11 UTC
I can't say which kernel update fixed my ntpd problem, but 
I haven't had any trouble with ntpd since I disabled the 
FSB spread spectrum in the BIOS on 2006-04-22.

After installing kernel 2.6.18-1.2200.fc5 on 2006-10-14, 
I enabled the FSB spread spectrum in the BIOS and tested 
ntpd.  The problem had been corrected: ntpd worked normally 
with FSB spread spectrum enabled.

Kernel 2.6.18-1.2239.fc5 is now installed and ntpd is 
still working with FSB spread spectrum enabled.

Comment 14 Dave Jones 2006-11-12 21:17:53 UTC
cool, thanks for re-testing.

Comment 15 David A. De Graaf 2006-11-16 16:59:16 UTC
Ntpd is working perfectly.  I updated to  kernel.i686 2.6.18-1.2239.fc5 on Nov
13, and there have been >no< time resets for the past three days.
Of course, these events have been minimal with kernel.i686 2.6.18-1.2200.fc5 as
well.  There were 13 resets with the 2200 kernel, but none since Nov 6. 
Evidently, ntp stabilized somewhat.  

With kernel 2239 I also increased the BIOS setting for FSB spread spectrum from
0.5% to 1.0%, and there haven't been any resets at all.  (FWIW)

Thanks for fixing it.