From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.2) Gecko/20040308

Description of problem:
When using the -x option with ntpd, if the time offset is off by more than ~0.5 sec, it eventually starts printing errors every few minutes to /var/log/messages like these:

ntpd[16406]: frequency error 621 PPM exceeds tolerance 500 PPM
ntpd[16406]: frequency error 655 PPM exceeds tolerance 500 PPM
ntpd[16406]: frequency error 655 PPM exceeds tolerance 500 PPM
ntpd[16406]: frequency error 540 PPM exceeds tolerance 500 PPM

After that the time just drifts further away from reality and never resynchronises.

The symptoms look just like those described in bug 75558 (comments #5 to #13, not the original problem with the option), and this bug was fixed for us with ntp version 4.1.2, but something very similar is back in 4.2.0.

Version-Release number of selected component (if applicable):
ntp-4.2.0.a.20040617-4

How reproducible:
Always

Steps to Reproduce:
1. start ntpd
2. set the date so the time is out by a couple of seconds
3. wait until the error messages start to appear

Actual Results:
"frequency error" error messages, and the time continues to slew further from the ntp server.

Expected Results:
Time slowly slews back in sync over a couple of hours, like it does with ntp-4.1.2-0.rc1.2.

Additional info:
Is this still an issue? ntpd doesn't support such large errors. It's usually a hardware or kernel problem when the 500PPM limit isn't enough. Can you provide more info from the log file when ntpd is running with -x and without it?
Yes, this is still an issue as we would like to be able to upgrade to the latest version of ntpd and use -x. I'll attach some log files (ntp messages from syslog), but there isn't a lot of information in them - I'd be happy to re-run the experiments with some level of debugging turned on if you let me know what you need.
Created attachment 134114 [details] syslogs for ntp versions 4.2.0 and 4.1.2 with and without -x
Ok, I can reproduce it. The bug is reported upstream already:
https://ntp.isc.org/bugs/show_bug.cgi?id=628

Do you experience this bug when you don't set the date manually?
We used to see it all the time because we had a setup where we did not connect to the external network at boot time, so the initial 'ntpdate' in the ntpd startup script would fail. When we did connect, the system time would inevitably be a second or two out, and the bug would occur. We no longer need to boot without the external network, but it's still possible to lose touch with the ntp servers for a significant time and have the system clock drift far enough to induce the problem when the servers come back.
Created attachment 134197 [details] Use adjtime when offset is more than 0.5s
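For a rough sense of what slewing a large offset involves, here is a back-of-the-envelope sketch in Python. It assumes a maximum slew rate of 500 PPM, the same figure as the tolerance in the log messages; the actual rate adjtime() achieves depends on the kernel. The function name slew_seconds is hypothetical and is not part of ntpd or the attached patch.

```python
def slew_seconds(offset_s: float, max_slew_ppm: float = 500.0) -> float:
    """Estimate how long it takes to remove a clock offset by slewing.

    offset_s     -- clock offset in seconds (sign is ignored)
    max_slew_ppm -- assumed maximum slew rate in parts per million
    """
    if max_slew_ppm <= 0:
        raise ValueError("max_slew_ppm must be positive")
    # 1 PPM corrects one microsecond of offset per second of elapsed time.
    return abs(offset_s) / (max_slew_ppm * 1e-6)


if __name__ == "__main__":
    # Offsets chosen around the ~0.5 s threshold mentioned in this report.
    for offset in (0.5, 2.0, 5.0):
        minutes = slew_seconds(offset) / 60
        print(f"{offset:.1f} s offset: about {minutes:.0f} min at 500 PPM")
```

Under that assumption a 2 second offset needs about 4000 seconds, a little over an hour, which is in line with the "couple of hours" recovery seen with ntp-4.1.2.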
It is also possible to work around this bug by disabling the kernel discipline: add "disable kernel" to ntp.conf.
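A minimal sketch of that workaround as a config fragment. The file location /etc/ntp.conf is an assumption (it may differ on your system), and ntpd most likely needs a restart to pick up the change:

```
# /etc/ntp.conf (path assumed; adjust for your installation)
# Disable the kernel clock discipline so that ntpd adjusts the clock
# itself rather than through the kernel discipline.
disable kernel
```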
QE ack for 4.5.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0189.html