Bug 1565580 - RFE: improve stability of system clock
Summary: RFE: improve stability of system clock
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 7.7
Assignee: Prarit Bhargava
QA Contact: Qiao Zhao
URL:
Whiteboard:
Depends On: 1131131
Blocks: 1586275 1614007 1643962 1598750
Reported: 2018-04-10 10:59 UTC by Miroslav Lichvar
Modified: 2019-06-03 11:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-03 11:01:39 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
IBM Linux Technology Center 166642 None None None 2019-05-31 02:08:04 UTC

Description Miroslav Lichvar 2018-04-10 10:59:14 UTC
Description of problem:
There is a long-standing issue with the kernel's internal steering of the system clock according to adjustments made by NTP/PTP daemons. Because clock updates may be irregular due to nohz, the kernel changes the frequency slowly and within a limited range to avoid instability, which was a problem in the past. This slow response may prevent NTP/PTP daemons from controlling the system clock with a stability/accuracy better than a few tens or hundreds of nanoseconds.

Recent changes in the upstream code fix the issue by using a division to determine the clock multiplier directly from the frequency set by the NTP/PTP daemons and simplifying the error correction to a +1/0 adjustment of the multiplier. This requires the GENERIC_TIME_VSYSCALL_OLD code to be removed from the kernel. For powerpc there is bug #1131131.
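
As a rough illustration of the approach described above, the following sketch derives a multiplier with a single division and then applies the +1/0 correction. It is a simplified model and not the kernel code: the function name, the parameter names, and the example values (a 3 GHz counter, a 1 ms tick, a shift of 24) are assumptions chosen for illustration, and details such as sub-nanosecond tick lengths and remainder handling are left out.

```python
def compute_mult(tick_ns: int, shift: int, cycle_interval: int, ntp_error: int) -> int:
    """Return an illustrative clock multiplier (hypothetical helper).

    tick_ns        -- tick length requested by the NTP/PTP daemon, in ns
    shift          -- clocksource shift, so that ns = (cycles * mult) >> shift
    cycle_interval -- counter cycles per tick
    ntp_error      -- accumulated error; positive means the clock is behind
    """
    # Direct division: the multiplier follows the requested frequency at once
    # instead of approaching it in small steps.
    mult = (tick_ns << shift) // cycle_interval
    # Error correction reduced to +1/0: run one unit faster only when behind.
    if ntp_error > 0:
        mult += 1
    return mult


# Assumed example: 3 GHz counter, 1 ms tick, shift of 24.
nominal = compute_mult(1_000_000, 24, 3_000_000, ntp_error=0)
# The same clock with a +100 ppm frequency adjustment (tick longer by 100 ns).
adjusted = compute_mult(1_000_100, 24, 3_000_000, ntp_error=0)
print(nominal, adjusted)
```

With these assumed values the rounding of the division leaves a small residual error, which is what the +1/0 adjustment is meant to absorb.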

Upstream commits:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2cda2a5bda9f1369c9d1ab54a20571c13cf2743
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=78b98e3c5a66d569a53b8f57b6a698f912794a43

Version-Release number of selected component (if applicable):
kernel-3.10.0-857.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. make sure /sys/devices/system/clocksource/clocksource0/current_clocksource shows tsc
2. stop any services controlling the clock (e.g. ntpd, chronyd, ptp4l, phc2sys)
3. compile and run tools/testing/selftests/timers/freq-step.c from the upstream kernel tree (you may need to increase MAX_PRECISION or disable KPTI)
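
The precondition in step 1 can be checked with a small script. This is only a convenience sketch: the sysfs path is the one given in step 1, while the function names are made up for illustration.

```python
from pathlib import Path

# Path taken from step 1 of the reproducer.
CLOCKSOURCE_PATH = "/sys/devices/system/clocksource/clocksource0/current_clocksource"


def current_clocksource(path: str = CLOCKSOURCE_PATH) -> str:
    """Return the name of the active clocksource (hypothetical helper)."""
    return Path(path).read_text().strip()


def is_tsc(path: str = CLOCKSOURCE_PATH) -> bool:
    """True if the active clocksource is tsc, as the reproducer requires."""
    return current_clocksource(path) == "tsc"
```

Calling is_tsc() on the test machine before running freq-step should return True; the path argument exists only so the helper can be pointed at another file.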

Actual results:
  Step           1st interval              2nd interval  
             Freq    Dev     Max       Freq    Dev     Max
 40960     +0.146      1       3     +0.146      1       2      [OK]
 40960     +0.001      5      35     +0.000      1       2      [OK]
 40960     +3.346  25352  172950     +0.045      1       5      [OK]
 40960     +0.161      2      12     +0.161      1       3      [OK]
 40960     +3.897  23014  156184     +0.136      1       4      [OK]
   640     +0.104      1       2     +0.103      3      16      [OK]
   640     -0.119      4      28     -0.010     19      64      [OK]
   640     +0.153      1       2     +0.118     50     128      [OK]
   640     +0.029      1       3     +0.029      1       2      [OK]
   640     +0.102      1       3     +0.102      1       2      [OK]
    10     +0.006      9      28     -0.000      2       5      [OK]
    10     +0.009     20      76     +0.000      2       5      [OK]
    10     +0.000      2       5     -0.000      2       4      [OK]
    10     -0.000      1       4     -0.000      1       4      [OK]
    10     +0.001      3      10     +0.000      2       5      [OK]


Expected results:
  Much smaller errors like this:

  Step           1st interval              2nd interval
             Freq    Dev     Max       Freq    Dev     Max
 40960     +0.000      2       5     -0.000      2       5      [OK]
 40960     -0.000      2       5     -0.000      3       8      [OK]
 40960     -0.000      3       7     +0.000      3       6      [OK]
 40960     +0.000      3       6     -0.000      2       6      [OK]
 40960     -0.000      2       5     +0.000      3       7      [OK]
   640     +0.001      2       9     -0.000      2       5      [OK]
   640     -0.000      3       6     +0.000      2       5      [OK]
   640     -0.001      2       4     +0.000      1       3      [OK]
   640     -0.000      2       5     +0.000      2       4      [OK]
   640     +0.000      3       6     -0.000      3       6      [OK]
    10     -0.000      2       5     +0.000      2       5      [OK]
    10     -0.000      2       5     +0.000      2       5      [OK]
    10     +0.000      2       5     +0.000      2       5      [OK]
    10     +0.000      2       5     +0.000      2       5      [OK]
    10     +0.000      3      12     +0.000      4      11      [OK]
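
To compare runs like the two above without reading every row, the output can be parsed and reduced to the worst Max value. This is a sketch that assumes the column layout shown in this report (step, then Freq/Dev/Max for each interval, then a status field); the function names are hypothetical.

```python
def parse_freq_step(output: str):
    """Parse freq-step output rows into dictionaries (hypothetical helper)."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have 8 fields and start with the numeric step size;
        # the two header lines do not match and are skipped.
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        f1, d1, m1, f2, d2, m2 = (float(x) for x in fields[1:7])
        rows.append({
            "step": int(fields[0]),
            "first": {"freq": f1, "dev": d1, "max": m1},
            "second": {"freq": f2, "dev": d2, "max": m2},
            "status": fields[7],
        })
    return rows


def worst_max(rows) -> float:
    """Largest Max value over both intervals of all rows."""
    return max(max(r["first"]["max"], r["second"]["max"]) for r in rows)
```

For the actual results in this report the worst Max is 172950, while for the expected results it is 12.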

Comment 2 Miroslav Lichvar 2018-04-10 11:42:13 UTC
A different reproducer is to use phc2sys to synchronize the system clock to a stable PTP hardware clock, starting with a large initial offset that is corrected by slewing rather than stepping, so that the NTP error is not reset. The frequency is then less stable and the measured offset larger than would be expected on the hardware.

For example:
# phc2sys -s /dev/ptp6 -m -O 0 -F 0.0
phc2sys[3445287.444]: phc offset 999954026 s0 freq  -16868 delay   1915
phc2sys[3445288.445]: phc offset 999954049 s2 freq  -16845 delay   1909
phc2sys[3445289.445]: phc offset 999954060 s2 freq +100000000 delay   1922
phc2sys[3445290.445]: phc offset 889018287 s2 freq +100000000 delay   1722
phc2sys[3445291.445]: phc offset 777881303 s2 freq +100000000 delay   1734
phc2sys[3445292.445]: phc offset 666744093 s2 freq +100000000 delay   1718
phc2sys[3445293.445]: phc offset 555610150 s2 freq +100000000 delay   1732
phc2sys[3445294.445]: phc offset 444473218 s2 freq +100000000 delay   1724
phc2sys[3445295.445]: phc offset 333336247 s2 freq +100000000 delay   1740
phc2sys[3445296.445]: phc offset 222199334 s2 freq +100000000 delay   1722
phc2sys[3445297.445]: phc offset 111062205 s2 freq +100000000 delay   1736
phc2sys[3445298.445]: phc offset    -74458 s2 freq  -91303 delay   1730
phc2sys[3445299.445]: phc offset    -43689 s2 freq  -82871 delay   1937
phc2sys[3445300.446]: phc offset     22285 s2 freq  -30004 delay   1913
...
phc2sys[3445372.454]: phc offset       -16 s2 freq  -16916 delay   1909
phc2sys[3445373.454]: phc offset       -38 s2 freq  -16942 delay   1925
phc2sys[3445374.454]: phc offset       -57 s2 freq  -16973 delay   1919
phc2sys[3445375.454]: phc offset       -85 s2 freq  -17018 delay   1926
phc2sys[3445376.454]: phc offset        83 s2 freq  -16875 delay   1921
phc2sys[3445377.454]: phc offset        50 s2 freq  -16884 delay   1929
phc2sys[3445378.454]: phc offset        54 s2 freq  -16865 delay   1911
phc2sys[3445379.454]: phc offset        26 s2 freq  -16876 delay   1936
phc2sys[3445380.454]: phc offset        14 s2 freq  -16881 delay   1911
phc2sys[3445381.455]: phc offset       -25 s2 freq  -16915 delay   1920
phc2sys[3445382.455]: phc offset       -41 s2 freq  -16939 delay   1920
phc2sys[3445383.455]: phc offset       -73 s2 freq  -16983 delay   1912

After restarting phc2sys with a small non-zero step threshold, which forces a step and thereby resets the NTP error, the clock settles down to smaller offsets:

# phc2sys -s /dev/ptp6 -m -O 0 -F 1e-9
phc2sys[3445424.661]: phc offset      -898 s0 freq  -16807 delay   1920
phc2sys[3445425.661]: phc offset     -1075 s1 freq  -16984 delay   1913
phc2sys[3445426.661]: phc offset       158 s2 freq  -16826 delay   1921
phc2sys[3445427.661]: phc offset       194 s2 freq  -16743 delay   1921
...
phc2sys[3445481.667]: phc offset         2 s2 freq  -16821 delay   1928
phc2sys[3445482.667]: phc offset        -1 s2 freq  -16824 delay   1923
phc2sys[3445483.667]: phc offset         6 s2 freq  -16817 delay   1912
phc2sys[3445484.668]: phc offset        -7 s2 freq  -16828 delay   1911
phc2sys[3445485.668]: phc offset         1 s2 freq  -16822 delay   1927
phc2sys[3445486.668]: phc offset        12 s2 freq  -16811 delay   1906
phc2sys[3445487.668]: phc offset        -1 s2 freq  -16821 delay   1934
phc2sys[3445488.668]: phc offset       -12 s2 freq  -16832 delay   1908
phc2sys[3445489.668]: phc offset        -4 s2 freq  -16827 delay   1901
phc2sys[3445490.668]: phc offset        -1 s2 freq  -16826 delay   1925
phc2sys[3445491.668]: phc offset         3 s2 freq  -16822 delay   1923
phc2sys[3445492.668]: phc offset        13 s2 freq  -16811 delay   1914
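
The difference between the two phc2sys runs can be put into a single number by taking the RMS of the last few offsets. The sketch below assumes the log line format shown in this comment and treats lines in servo state s2 as the synchronized samples; the function names and the default window of 12 samples are arbitrary choices for illustration.

```python
import math
import re

# Matches lines such as:
# phc2sys[3445481.667]: phc offset         2 s2 freq  -16821 delay   1928
LINE_RE = re.compile(
    r"phc offset\s+(-?\d+)\s+s(\d)\s+freq\s+([+-]?\d+)\s+delay\s+(\d+)"
)


def s2_offsets(log: str):
    """Return the offsets (ns) of all log lines in servo state s2."""
    return [int(m.group(1)) for m in LINE_RE.finditer(log) if m.group(2) == "2"]


def tail_rms(log: str, count: int = 12) -> float:
    """RMS of the last `count` s2 offsets, skipping the initial convergence."""
    tail = s2_offsets(log)[-count:]
    if not tail:
        raise ValueError("no s2 samples found in log")
    return math.sqrt(sum(v * v for v in tail) / len(tail))
```

Applied to the last 12 lines of each log above, this gives about 52 ns for the run without a step and about 7 ns for the run with the step.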

Comment 6 IBM Bug Proxy 2019-05-31 00:20:42 UTC
------- Comment From seg@us.ibm.com 2019-05-30 20:16 EDT-------
It is not clear to me whether this is already fixed and should be closed, or is not fixed and will not be addressed for RHEL 7.

Comment 7 Miroslav Lichvar 2019-06-03 11:01:39 UTC
We cannot fix this without bug #1131131, which was closed as WONTFIX. The nohz=off kernel option is recommended as a workaround when highly accurate synchronization is needed.
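
Whether a running system was booted with the workaround can be checked from the kernel command line. This is a minimal sketch that assumes the command line is available in /proc/cmdline; the function names are hypothetical.

```python
def has_nohz_off(cmdline: str) -> bool:
    """True if the given kernel command line contains the nohz=off option."""
    # Split on whitespace so that options such as nohz_full=... do not match.
    return "nohz=off" in cmdline.split()


def running_kernel_has_nohz_off(path: str = "/proc/cmdline") -> bool:
    """Check the command line of the running kernel (assumes a Linux /proc)."""
    with open(path) as f:
        return has_nohz_off(f.read())
```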

