Created attachment 992270
Description of problem:
RFE: to better support regaining sync while maintaining monotonic time,
implement separate "tinker step" variant commands for the config file,
so that the forward and backward slew-size limits (the step thresholds)
can be set separately.
Patch series attached.
A typical configuration would keep a small (perhaps default) forward
slew limit and set a large backward one. Sizeable forward offsets would
then be corrected by stepping, while medium-sized backward adjustments
would still be made by slewing.
This better supports the many time-sensitive applications that fail when
time appears to repeat because of a backward step. Forward steps, which
trouble fewer applications, will still be made.
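To illustrate the intended behaviour, here is a minimal sketch of the step-or-slew
decision with separate limits per direction. It is not taken from the patch: the
names StepLimits and choose_adjustment, the sign convention, and the 600 s backward
limit are all assumptions made for illustration, and ntpd's stepout timing is left
out. The 0.128 s value is ntpd's documented default step threshold.

```python
from dataclasses import dataclass

DEFAULT_STEP_THRESHOLD = 0.128  # seconds; ntpd's documented default


@dataclass(frozen=True)
class StepLimits:
    """Hypothetical pair of step thresholds, in seconds, one per direction."""
    forward: float = DEFAULT_STEP_THRESHOLD
    backward: float = DEFAULT_STEP_THRESHOLD


def choose_adjustment(offset: float, limits: StepLimits) -> str:
    """Return "step" or "slew" for a measured offset.

    Assumed convention: offset >= 0 means the clock must move forward,
    offset < 0 means it must move backward.
    """
    threshold = limits.forward if offset >= 0 else limits.backward
    return "step" if abs(offset) > threshold else "slew"


# Example config: default forward limit, large (illustrative) backward limit.
limits = StepLimits(forward=DEFAULT_STEP_THRESHOLD, backward=600.0)
for offset in (5.0, -5.0, -900.0, 0.05):
    print(f"{offset:+8.2f} s -> {choose_adjustment(offset, limits)}")
```

With these example limits a 5 s forward offset is stepped, while a 5 s backward
offset is slewed and time stays monotonic.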
Version-Release number of selected component (if applicable):
RHEL7 has 4.2.6p5-19.el7_0
Patchset applies to ntp-4.2.8p1
Use case: a virtual machine where the VM support does not provide good time
service, resulting in high apparent jitter with large forward steps. It might
also be useful on systems that hibernate or sleep.
ntp.conf(5) does not actually discuss the tinker command.
I think this would be an interesting feature. The patch looks good to me, except that it should probably disable the kernel discipline when either of the two thresholds is larger than 0.5 s, as the kernel doesn't accept larger offsets.
Will you file a bug with the patch in the upstream bugzilla (https://bugs.ntp.org) or do you want me to do it?
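The check suggested above could look roughly like the following sketch. The function
name and the constant are hypothetical and this is not the code in ntpd; only the
0.5 s limit comes from the comment above.

```python
KERNEL_MAX_OFFSET = 0.5  # seconds; the limit quoted in the comment above


def kernel_discipline_usable(step_fwd: float, step_back: float) -> bool:
    """Return False when either step threshold exceeds the kernel's limit."""
    return max(step_fwd, step_back) <= KERNEL_MAX_OFFSET


print(kernel_discipline_usable(0.128, 0.128))
print(kernel_discipline_usable(0.128, 600.0))
```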
The change you point out seems reasonable. We should probably also add a docs
note clarifying the side-effect and noting (any?) disadvantages. Would we expect
a higher-noise lock running the userland rather than kernel implementation?
I'm happy for you to propose it upstream; thanks.
(In reply to Jeremy Harris from comment #3)
> Would we expect a higher-noise lock running the userland rather
> than kernel implementation?
Possibly. It depends on how noisy the NTP sources are and how stable the clock is. Sometimes disabling the kernel discipline worsens the quality of the timekeeping, sometimes it improves it. The two disciplines are slightly different.
This is now included in the upstream code.
However, there seems to be a problem when recovering from a large slew when the step threshold in the opposite direction has the default value. We may need to limit the minimum value when the thresholds are allowed to differ.
For my interest, in that problem noted upstream (bug 2811) - why did the PLL accumulate a frequency offset? Is it unlocked during the slew, and if so why given that we know the slew rate?
(In reply to Jeremy Harris from comment #7)
> For my interest, in that problem noted upstream (bug 2811) - why did the PLL
> accumulate a frequency offset? Is it unlocked during the slew, and if so
> why given that we know the slew rate?
It's a limitation of the loop design. It can't adjust phase without also adjusting frequency. With a large phase offset a large frequency offset will be accumulated, which will then need a long time to get back to the correct value. This is one of the things that chrony does much better.
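The effect can be seen in a toy proportional-integral loop. This is only an
illustrative sketch and not ntpd's actual discipline algorithm: the gains kp and ki,
the update form, and the 10 s starting offset are assumptions. The simulated clock
has no real frequency error, yet the loop's frequency term moves away from zero
while the phase offset is being worked off.

```python
def simulate_pi_loop(initial_offset, kp=0.1, ki=0.002, steps=1000):
    """Toy PI clock loop; the true frequency error is zero throughout.

    Returns a list of (offset, freq) pairs, one per update.
    """
    offset = initial_offset  # remaining phase error, seconds
    freq = 0.0               # loop's frequency correction, seconds per update
    history = []
    for _ in range(steps):
        freq += ki * offset           # integral term: phase error feeds frequency
        offset -= kp * offset + freq  # apply phase and frequency corrections
        history.append((offset, freq))
    return history


history = simulate_pi_loop(initial_offset=10.0)
peak_freq = max(abs(f) for _, f in history)
overshoot = min(o for o, _ in history)
print(f"peak frequency correction: {peak_freq:.4f} s/update")
print(f"largest overshoot past zero: {overshoot:.4f} s")
```

In this sketch the offset has to overshoot past zero before the accumulated
frequency term can drain away, which is the slow recovery described above.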
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.