Bug 154625 - ntpd -x won't synchronise & reports frequency error messages
Summary: ntpd -x won't synchronise & reports frequency error messages
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: ntp
Version: 4.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Miroslav Lichvar
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-04-13 04:37 UTC by Katrina Maffey
Modified: 2007-11-30 22:07 UTC

Fixed In Version: RHBA-2007-0189
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-05-01 17:40:34 UTC
Target Upstream Version:
Embargoed:


Attachments
syslogs for ntp versions 4.2.0 and 4.1.2 with and without -x (5.77 KB, text/plain)
2006-08-14 03:06 UTC, Katrina Maffey
Use adjtime when offset is more than 0.5s (1.04 KB, patch)
2006-08-15 08:37 UTC, Miroslav Lichvar


Links
System: Red Hat Product Errata
ID: RHBA-2007:0189
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: ntp bug fix update
Last Updated: 2007-05-01 17:39:38 UTC

Description Katrina Maffey 2005-04-13 04:37:24 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.2) Gecko/20040308

Description of problem:
When ntpd is run with the -x option and the system clock is off by more than ~0.5 sec, ntpd eventually starts printing errors like these to /var/log/messages every few minutes:
ntpd[16406]: frequency error 621 PPM exceeds tolerance 500 PPM
ntpd[16406]: frequency error 655 PPM exceeds tolerance 500 PPM
ntpd[16406]: frequency error 655 PPM exceeds tolerance 500 PPM
ntpd[16406]: frequency error 540 PPM exceeds tolerance 500 PPM
After that the time just drifts further away from reality and never resynchronises.
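
For anyone triaging the same symptom, the messages above can be pulled out of a syslog file with a short script. This is only a minimal sketch, not part of ntp: the name `frequency_errors` and the regular expression are assumptions based solely on the four lines quoted above, and the optional minus sign is a guess that negative errors may be logged too.

```python
import re

# Matches the ntpd syslog lines quoted in this report, e.g.
# "ntpd[16406]: frequency error 621 PPM exceeds tolerance 500 PPM".
# The pattern is an assumption derived from those samples only.
FREQ_ERROR_RE = re.compile(
    r"ntpd\[(?P<pid>\d+)\]: frequency error (?P<error>-?\d+) PPM "
    r"exceeds tolerance (?P<tolerance>\d+) PPM"
)


def frequency_errors(lines):
    """Return a list of (error_ppm, tolerance_ppm) for each matching line."""
    results = []
    for line in lines:
        match = FREQ_ERROR_RE.search(line)
        if match:
            results.append(
                (int(match.group("error")), int(match.group("tolerance")))
            )
    return results


# The four messages from the description above.
sample = [
    "ntpd[16406]: frequency error 621 PPM exceeds tolerance 500 PPM",
    "ntpd[16406]: frequency error 655 PPM exceeds tolerance 500 PPM",
    "ntpd[16406]: frequency error 655 PPM exceeds tolerance 500 PPM",
    "ntpd[16406]: frequency error 540 PPM exceeds tolerance 500 PPM",
]

for error_ppm, tolerance_ppm in frequency_errors(sample):
    print(f"{error_ppm} PPM is {error_ppm - tolerance_ppm} PPM over the limit")
```

In practice the lines would come from iterating over an open /var/log/messages rather than a hard-coded list.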

The symptoms look just like those described in bug 75558 (comments #5 to #13, not the original problem with the option). That bug was fixed for us in ntp version 4.1.2, but something very similar is back in 4.2.0.


Version-Release number of selected component (if applicable):
ntp-4.2.0.a.20040617-4

How reproducible:
Always

Steps to Reproduce:
1. start ntpd
2. set the date so the time is out by a couple of seconds
3. wait until the error messages start to appear
  

Actual Results:  "frequency error" error messages, and the time continues to slew further from the ntp server.

Expected Results:  Time slowly slews back in sync over a couple of hours, like it does with ntp-4.1.2-0.rc1.2.
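
As a rough check on that expectation: the ntpd documentation for -x notes that the slew rate is limited to 0.5 ms/s (500 PPM), so each second of offset needs about 2000 s to amortise. The sketch below only does that arithmetic. `min_slew_seconds` is a hypothetical helper, and the result is a lower bound that ignores the clock's own drift.

```python
def min_slew_seconds(offset_seconds, max_slew_ppm=500.0):
    """Lower bound on the time needed to slew away an offset.

    max_slew_ppm is the maximum slew rate in parts per million;
    500 PPM corresponds to the tolerance in the log messages above.
    """
    if max_slew_ppm <= 0:
        raise ValueError("max_slew_ppm must be positive")
    # 1 PPM corrects 1 microsecond of offset per second of elapsed time.
    return abs(offset_seconds) * 1e6 / max_slew_ppm


# A clock that is 2 seconds out, as in the steps to reproduce.
seconds = min_slew_seconds(2.0)
print(seconds)                 # 4000.0
print(round(seconds / 60, 1))  # 66.7 (minutes)
```

So an offset of a couple of seconds should be gone in one to two hours, which matches the behaviour seen with ntp-4.1.2.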

Additional info:

Comment 1 Miroslav Lichvar 2006-08-09 15:28:54 UTC
Is this still an issue?

ntpd doesn't support such large errors. It's usually a hardware or kernel
problem when the 500 PPM limit isn't enough. Can you provide more info from the
log file when ntpd is running with -x and without it?

Comment 2 Katrina Maffey 2006-08-14 03:03:58 UTC
Yes, this is still an issue as we would like to be able to upgrade to the latest
version of ntpd and use -x.
I'll attach some log files (ntp messages from syslog), but there isn't a lot of
information in them - I'd be happy to re-run the experiments with some level of
debugging turned on if you let me know what you need.

Comment 3 Katrina Maffey 2006-08-14 03:06:11 UTC
Created attachment 134114
syslogs for ntp versions 4.2.0 and 4.1.2 with and without -x

Comment 4 Miroslav Lichvar 2006-08-14 15:53:03 UTC
Ok, I can reproduce it. The bug is reported upstream already:
https://ntp.isc.org/bugs/show_bug.cgi?id=628

Do you experience this bug when you don't set the date manually?


Comment 5 Katrina Maffey 2006-08-14 23:11:46 UTC
We used to see it all the time because we had a setup where we did not connect to
the external network at boot time, so the initial 'ntpdate' in the ntpd
startup script would fail. When we did connect, the system time would inevitably
be a second or two out, and the bug would occur.

We no longer need to boot without the external network, but it's still possible
to lose touch with the ntp servers for a significant time and have the system
clock drift far enough to induce the problem when the servers come back.

Comment 6 Miroslav Lichvar 2006-08-15 08:37:49 UTC
Created attachment 134197
Use adjtime when offset is more than 0.5s

Comment 7 Miroslav Lichvar 2006-08-15 08:50:44 UTC
It is also possible to work around this bug by disabling the kernel discipline
with "disable kernel" in ntp.conf.
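
To confirm the workaround is actually in place on a machine, the configuration can be checked with a few lines of script. This is a hedged sketch: `kernel_discipline_disabled` is a hypothetical helper, it assumes that `enable` and `disable` lines may carry several flags, and it assumes that the last line mentioning `kernel` wins. The server name in the sample is a placeholder.

```python
def kernel_discipline_disabled(conf_text):
    """Return True if the config text leaves the kernel discipline disabled.

    Assumption: when several enable/disable lines mention "kernel",
    the last one takes effect.
    """
    disabled = False
    for raw_line in conf_text.splitlines():
        # Drop trailing comments, then split into command and flags.
        words = raw_line.split("#", 1)[0].split()
        if len(words) < 2:
            continue
        command, flags = words[0], words[1:]
        if "kernel" in flags:
            if command == "disable":
                disabled = True
            elif command == "enable":
                disabled = False
    return disabled


# Placeholder config text; a real check would read /etc/ntp.conf instead.
sample_conf = """\
server ntp.example.com
# disable kernel   (commented out, ignored)
disable kernel
"""

print(kernel_discipline_disabled(sample_conf))  # True
```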

Comment 9 Jay Turner 2006-08-22 02:26:30 UTC
QE ack for 4.5.

Comment 16 Red Hat Bugzilla 2007-05-01 17:40:34 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0189.html


