Description of problem: Fatal error : adjtimex failed for set_frequency, freq_ppm=-3.5554e+01 required_freq=3.5554e+01 required_tick=10000 Version-Release number of selected component (if applicable): kernel-3.18.5-200.fc21.i686 chrony-1.31-1.fc21.i686 How reproducible: always Steps to Reproduce: 1. systemctl start chronyd, or reboot 2. 3. Actual results: adjtimex fails, causing chronyd to refuse to start. Expected results: chronyd and adjtimex work as in previous kernels Additional info: While chronyd is reporting the error - it only starting failing with this kernel. Also, this is an updates-testing kernel. I'm not sure if I should be reporting bugs against testing kernels - but how else will you know what is broken? Jan 31 08:33:17 elissa.gathman.org chronyd[722]: Frequency -35.554 +/- 1.838 ppm read from /var/lib/chrony/drift Jan 31 08:33:17 elissa.gathman.org chronyd[722]: Fatal error : adjtimex failed for set_frequency, freq_ppm=-3.5554e+01 required_freq=3.5554e+01 required_tick=10000 Jan 31 08:33:17 elissa.gathman.org chronyd[718]: adjtimex failed for set_frequency, freq_ppm=-3.5554e+01 required_freq=3.5554e+01 required_tick=10000
It is of course possible that chronyd has been broken all along, and previous kernels were just ignoring the garbage.
(In reply to Stuart D Gathman from comment #1) > It is of course possible that chronyd has been broken all along, and > previous kernels were just ignoring the garbage. I've emailed upstream. They included: commit e8e3b6ddd0a6afa267029141946918392b425cf4 Author: Sasha Levin <sasha.levin> Date: Wed Dec 3 19:25:05 2014 -0500 time: adjtimex: Validate the ADJ_FREQUENCY values commit 5e5aeb4367b450a28f447f6d5ab57d8f2ab16a5f upstream. Verify that the frequency value from userspace is valid and makes sense. Unverified values can cause overflows later on. Cc: Thomas Gleixner <tglx> Cc: Ingo Molnar <mingo> Signed-off-by: Sasha Levin <sasha.levin> [jstultz: Fix up bug for negative values and drop redunent cap check] Signed-off-by: John Stultz <john.stultz> Signed-off-by: Greg Kroah-Hartman <gregkh> in 3.18.5, which is very likely hitting this. Your comment about it being broken all along might be true, but getting it unbroken is something we should figure out either way.
To me it looks like that commit should used MAXFREQ_SCALE instead of LONG_MAX/MIN. The frequency tolerance is 500 ppm, that's 500 * (1<<16) in the timex freq field. --- a/kernel/time/ntp.c +++ b/kernel/time/ntp.c @@ -634,9 +634,9 @@ int ntp_validate_timex(struct timex *txc) return -EPERM; if (txc->modes & ADJ_FREQUENCY) { - if (LONG_MIN / PPM_SCALE > txc->freq) + if (-MAXFREQ_SCALED / PPM_SCALE > txc->freq) return -EINVAL; - if (LONG_MAX / PPM_SCALE < txc->freq) + if (MAXFREQ_SCALED / PPM_SCALE < txc->freq) return -EINVAL; }
ntpd works ok, and I assume it also uses adjtimex.
ntpd sets the frequency directly only on start (after that the kernel PLL takes over) and it doesn't check for adjtimex errors. Try ntptime -f 100, or any value between -500 and 500.
(In reply to Stuart D Gathman from comment #0) > Jan 31 08:33:17 elissa.gathman.org chronyd[722]: Frequency -35.554 +/- 1.838 > ppm read from /var/lib/chrony/drift > Jan 31 08:33:17 elissa.gathman.org chronyd[722]: Fatal error : adjtimex > failed for set_frequency, freq_ppm=-3.5554e+01 required_freq=3.5554e+01 > required_tick=10000 > Jan 31 08:33:17 elissa.gathman.org chronyd[718]: adjtimex failed for > set_frequency, freq_ppm=-3.5554e+01 required_freq=3.5554e+01 > required_tick=10000 Could you share your /var/lib/chrony/drift please?
# cat /var/lib/chrony/drift -35.554442 1.838092 # ntptime -f -35.554442 ntp_gettime() returns code 0 (OK) time d87a2511.ebc62160 Mon, Feb 2 2015 11:32:17.920, (.920992747), maximum error 82579 us, estimated error 25421 us, TAI offset 0 ntp_adjtime() call fails: Invalid argument
Feb 02 10:44:33 localhost ntpd[9491]: ntpd 4.2.6p5 Fri Dec 19 19:02:22 UTC 2014 (1) Feb 02 10:44:33 localhost ntpd[9492]: proto: precision = 1.467 usec Feb 02 10:44:33 localhost ntpd[9492]: 0.0.0.0 c01d 0d kern kernel time sync enabled ... Feb 02 10:44:34 localhost ntpd[9492]: 0.0.0.0 c016 06 restart Feb 02 10:44:34 localhost ntpd[9492]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM Feb 02 10:44:34 localhost ntpd[9492]: 0.0.0.0 c011 01 freq_not_set Feb 02 10:44:34 localhost ntpd[9492]: 0.0.0.0 c61c 0c clock_step +0.159777 s Feb 02 10:44:34 localhost ntpd[9492]: 0.0.0.0 c614 04 freq_mode Feb 02 10:44:34 localhost systemd[1443]: Time has been changed Feb 02 10:44:34 localhost systemd[1321]: Time has been changed Feb 02 10:44:35 localhost ntpd[9492]: 0.0.0.0 c618 08 no_sys_peer Feb 02 11:00:16 localhost ntpd[9492]: 0.0.0.0 c612 02 freq_set kernel 38.555 PPM Feb 02 11:00:16 localhost ntpd[9492]: 0.0.0.0 c615 05 clock_sync
So I've also heard issues on arm. Since the report is against i686, just wanted to confirm that this hasn't been reported on x86_64? thanks -john
Looking at the Fedora update filed, I think that is correct, John. Both reports I've seen have been i686 thus far. That being said, this particular kernel is still in updates-testing at the moment, so it isn't wide spread.
Yea, looks like it should have been LLONG_MAX / PPM_SCALE.
Created attachment 987224 [details] Potential fix to avoid false positives on 32bit systems Can folks try this patch to validate it resolves the issue for them?
I'm seeing this on ARMv7 with 3.19rc7 too btw: 01:00 AM root@cubietruck:~ $ uname -a Linux cubietruck.home.roving-it.com 3.19.0-0.rc7.git0.1.fc22.armv7hl+lpae #1 SMP Mon Feb 2 15:17:41 UTC 2015 armv7l armv7l armv7l GNU/Linux 01:00 AM root@cubietruck:~ $ systemctl status chronyd.service -l ● chronyd.service - NTP client/server Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled) Active: failed (Result: exit-code) since Thu 1970-01-01 01:00:16 BST; 46s ago Process: 389 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=1/FAILURE) Jan 01 01:00:16 cubietruck.home.roving-it.com systemd[1]: Starting NTP client/server... Jan 01 01:00:16 cubietruck.home.roving-it.com chronyd[389]: adjtimex failed for set_frequency, freq_ppm=-7.0716e+00 required_freq=7.0716e+00 required_tick=10000 Jan 01 01:00:16 cubietruck.home.roving-it.com systemd[1]: chronyd.service: control process exited, code=exited status=1 Jan 01 01:00:16 cubietruck.home.roving-it.com systemd[1]: Failed to start NTP client/server. Jan 01 01:00:16 cubietruck.home.roving-it.com systemd[1]: Unit chronyd.service entered failed state. Jan 01 01:00:16 cubietruck.home.roving-it.com systemd[1]: chronyd.service failed.
Peter: Yes, 32bit systems are affected, and the patch attached to this bug should resolve it. Let me know if it does not.
kernel-3.18.5-201.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/kernel-3.18.5-201.fc21
kernel-3.18.5-101.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/kernel-3.18.5-101.fc20
So I assume this means the patch is working for you? I was hoping to get some tested-by lines to add to the submission.
The bug seems fixed on 3.18.5-201. Now chronyd works correctly for me on my i686 laptop. Below is journal logs of chronyd.service on 3.18.5-200 and 3.18.5-201. *3.18.5-200 %%%%%%%%%%%%%%%%%%%%%%% Feb 03 21:12:42 vaiox.localdomain systemd[1]: Starting NTP client/server... Feb 03 21:12:42 vaiox.localdomain chronyd[1690]: chronyd version 1.31 starting Feb 03 21:12:42 vaiox.localdomain chronyd[1690]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift Feb 03 21:12:42 vaiox.localdomain systemd[1]: Started NTP client/server. Feb 03 21:13:08 vaiox.localdomain chronyd[1690]: Selected source 157.7.154.23 Feb 03 21:13:08 vaiox.localdomain chronyd[1690]: Fatal error : adjtimex failed for set_frequency, freq_ppm=8.3333e+04 required_freq=-3.3333e+01 required_tick=9167 Feb 03 21:13:08 vaiox.localdomain systemd[1]: chronyd.service: main process exited, code=exited, status=1/FAILURE Feb 03 21:13:08 vaiox.localdomain systemd[1]: Unit chronyd.service entered failed state. Feb 03 21:13:08 vaiox.localdomain systemd[1]: chronyd.service failed. %%%%%%%%%%%%%%%%%%%%%%% *3.18.5-201 %%%%%%%%%%%%%%%%%%%%%%% Feb 03 21:16:53 vaiox.localdomain systemd[1]: Starting NTP client/server... Feb 03 21:16:53 vaiox.localdomain chronyd[1692]: chronyd version 1.31 starting Feb 03 21:16:53 vaiox.localdomain chronyd[1692]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift Feb 03 21:16:53 vaiox.localdomain systemd[1]: Started NTP client/server. Feb 03 21:17:09 vaiox.localdomain chronyd[1692]: Selected source 106.186.114.89 Feb 03 21:17:09 vaiox.localdomain chronyd[1692]: System clock wrong by 0.745615 seconds, adjustment started %%%%%%%%%%%%%%%%%%%%%%%
John, sorry for the miscommunication. We looked at the patch and it looked correct so we added it to a build we were doing for another bug. Happily, Yu has confirmed it does indeed work.
Package kernel-3.18.5-201.fc21: * should fix your issue, * was pushed to the Fedora 21 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing kernel-3.18.5-201.fc21' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2015-1657/kernel-3.18.5-201.fc21 then log in and leave karma (feedback).
kernel-3.18.5-201.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.18.5-101.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.