Bug 786765

Summary: Feature Request: sanity checks against bogus time servers
Product: [Fedora] Fedora Reporter: Endre "Hrebicek" Balint-Nagy <endre>
Component: chrony Assignee: Miroslav Lichvar <mlichvar>
Status: CLOSED ERRATA QA Contact: Endre "Hrebicek" Balint-Nagy <endre>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17 CC: benl, mlichvar
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: chrony-1.27-0.1.pre1.fc17 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-03-08 06:04:06 UTC Type: ---

Description Endre "Hrebicek" Balint-Nagy 2012-02-02 10:53:12 UTC
Description of problem:
At the moment chronyd accepts a bogus time server at startup when that server is the only configured source. This behaviour is unacceptable.
As an old UNIX sysadmin I expect some minimal sanity checks (for example |offset| < 15 min and |drift| < 15 ppm, though configurable limits would be better).

Version-Release number of selected component (if applicable):


How reproducible:

bogus offset: https://beaker.engineering.redhat.com/jobs/187920
bogus drift: https://beaker.engineering.redhat.com/jobs/187912
Steps to Reproduce:
1. Configure a bogus offset/drift on an NTP server.
2. Configure the previously prepared time server as the only clock source in chrony.conf.
3. Wait for chronyd to settle and see the consequences.
  
Actual results:
chronyd at the moment is happily synchronizing to the bogus reference clock

Expected results:
chronyd remains unsynchronized

Additional info:

Comment 2 Miroslav Lichvar 2012-02-03 12:36:37 UTC
I think we can add a new option setting the maximum acceptable offset and frequency offset, so that measurements exceeding those limits are ignored with a warning in syslog.

Comment 3 Endre "Hrebicek" Balint-Nagy 2012-02-06 12:37:22 UTC
Hello Mirek,
A new configuration option will do.
We need some noise in syslog as you offered, and we need a loud complaint when chrony is unable to synchronise for a significant amount of time.
Thanks for the (so far only theoretical) solution.

Comment 4 Endre "Hrebicek" Balint-Nagy 2012-02-06 12:40:42 UTC
The tests I mentioned in the bug description are available to check the new configuration option. (Some additional work needs to be done once its syntax is defined.)

Comment 5 Miroslav Lichvar 2012-02-21 13:38:49 UTC
Upstream git now has a maxchange directive, it will be in the next release.

    This directive sets the maximum allowed offset corrected on a clock
    update.  The check is performed only after the specified number of
    updates to allow a large initial adjustment of the system clock.  When
    an offset larger than the specified maximum occurs, it will be ignored
    for the specified number of times and then chronyd will give up
    and exit (a negative value can be used to never exit).  In both cases
    a message is sent to syslog.
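
The behaviour described above can be modelled with a small sketch. This is a toy illustration under stated assumptions, not chrony's implementation: the class name MaxChangeCheck, its field names and the "apply"/"ignore"/"exit" labels are hypothetical, and the handling of the counters is only one plausible reading of the text (an exempt initial window, then a budget of oversized offsets that are ignored before giving up).

```python
from dataclasses import dataclass


@dataclass
class MaxChangeCheck:
    """Toy model of the maxchange description (hypothetical, not chrony code)."""

    max_offset: float  # largest correction in seconds accepted once checking starts
    start: int         # number of initial clock updates exempt from the check
    ignore: int        # oversized offsets to ignore before exiting; negative = never exit

    def update(self, offset: float) -> str:
        """Classify one clock update as 'apply', 'ignore' or 'exit'."""
        if self.start > 0:
            # Still inside the initial window: a large first adjustment is allowed.
            self.start -= 1
            return "apply"
        if abs(offset) <= self.max_offset:
            return "apply"
        if self.ignore == 0:
            # Ignore budget used up: give up (chronyd would log to syslog and exit).
            return "exit"
        if self.ignore > 0:
            self.ignore -= 1
        # A negative budget is never decremented, so this model never exits.
        return "ignore"


if __name__ == "__main__":
    check = MaxChangeCheck(max_offset=1000.0, start=1, ignore=2)
    for offset in (5000.0, 0.2, 2000.0, -3000.0, 4000.0):
        print(offset, check.update(offset))
```

In this model the first 5000-second step is applied because it falls inside the initial window, the next two oversized offsets are ignored, and the third one ends the run. As far as I know the released directive takes the same three values in the order offset, start, ignore (for example `maxchange 1000 1 2` in chrony.conf), but check the chrony documentation of your version before relying on that.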

Comment 6 Fedora Update System 2012-02-28 15:16:12 UTC
chrony-1.27-0.1.pre1.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/chrony-1.27-0.1.pre1.fc17

Comment 7 Fedora Update System 2012-02-28 20:41:34 UTC
Package chrony-1.27-0.1.pre1.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing chrony-1.27-0.1.pre1.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-2629/chrony-1.27-0.1.pre1.fc17
then log in and leave karma (feedback).

Comment 8 Endre "Hrebicek" Balint-Nagy 2012-03-01 09:32:55 UTC
I was probably too brief in describing the feature request.
My original intention was to limit the initial change, since chronyd filters out falsetickers properly as long as a truechimer reports the true time.
My case is the one where no truechimer is available: as I configured in the test, only a single falseticker is present. In this case only the local clock gives some sense of the correct time, even if it is uncertain to +/- 0.5 sec. My intention is to prevent insane stepping of the clock even in such a degenerate case. We will certainly step the clock in this case (before going multi-user and starting other services), but stepping too far may not be what the local system administrator intends, and he/she might wish to set a limit for it, perhaps in the 1-15 minute range. From that point onward it is clearly the sysadmin's task to keep the clock within the aforementioned range of the true time, and to intervene manually to mend the clock if it has drifted that far.
So I am arguing for an option that lets the local sysadmin decide within which tolerance limits he/she intends to keep the local clock, and thus the RTC.
Manual intervention comes in when the system has been powered down for a really long time. In my practice there were only a few such cases, where a system was kept in stock as a cold standby for a year. Otherwise the RTC will keep a sane time. So there is no need for a waiver at boot time, but we surely want to allow such a deviation at installation time.
Sorry if I was too lengthy this time, but I wanted to make my position clear in case I failed to do so in the first place.
Hrebicek.

Comment 9 Miroslav Lichvar 2012-03-01 09:42:53 UTC
I'm not sure I follow. The new maxchange option can be used to limit even the first update. Or is that not enough to cover the case you have described?

Comment 10 Endre "Hrebicek" Balint-Nagy 2012-03-01 10:13:26 UTC
Hello Mirek, previously you wrote "The check is performed only after the specified number of updates to allow a large initial adjustment of the system clock.", which initiated my musings. Which one holds: the quoted sentence, or the opposite as stated in Comment 9?

Comment 11 Miroslav Lichvar 2012-03-01 11:12:43 UTC
The specified number of updates can be zero to always perform the check. The second number can be zero too to exit when the limit is reached. That would be similar to ntpd without the -g option.

Comment 12 Fedora Update System 2012-03-08 06:04:06 UTC
chrony-1.27-0.1.pre1.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.