Description of problem:
There is a long-standing issue with the kernel's internal steering of the system clock according to adjustments made by NTP/PTP daemons. Because clock updates may be irregular due to nohz, the frequency is changed slowly and within a limited range to avoid instability, which was an issue in the past. The slow response may prevent the NTP/PTP daemons from controlling the system clock with stability/accuracy better than a few tens or hundreds of nanoseconds.

Recent changes in the upstream code fix the issue by using a division to determine the clock multiplier directly from the frequency set by the NTP/PTP daemons and simplifying the error correction to a +1/0 adjustment of the multiplier.

This requires the GENERIC_TIME_VSYSCALL_OLD code to be removed from the kernel. For powerpc there is bug #1131131.

Upstream commits:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2cda2a5bda9f1369c9d1ab54a20571c13cf2743
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=78b98e3c5a66d569a53b8f57b6a698f912794a43

Version-Release number of selected component (if applicable):
kernel-3.10.0-857.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. make sure /sys/devices/system/clocksource/clocksource0/current_clocksource shows tsc
2. stop any services controlling the clock (e.g. ntpd, chronyd, ptp4l, phc2sys)
3. compile and run tools/testing/selftests/timers/freq-step.c from the upstream kernel tree (you may need to increase MAX_PRECISION or disable KPTI)

Actual results:
    Step        1st interval              2nd interval
              Freq     Dev     Max      Freq     Dev     Max
   40960    +0.146       1       3    +0.146       1       2  [OK]
   40960    +0.001       5      35    +0.000       1       2  [OK]
   40960    +3.346   25352  172950    +0.045       1       5  [OK]
   40960    +0.161       2      12    +0.161       1       3  [OK]
   40960    +3.897   23014  156184    +0.136       1       4  [OK]
     640    +0.104       1       2    +0.103       3      16  [OK]
     640    -0.119       4      28    -0.010      19      64  [OK]
     640    +0.153       1       2    +0.118      50     128  [OK]
     640    +0.029       1       3    +0.029       1       2  [OK]
     640    +0.102       1       3    +0.102       1       2  [OK]
      10    +0.006       9      28    -0.000       2       5  [OK]
      10    +0.009      20      76    +0.000       2       5  [OK]
      10    +0.000       2       5    -0.000       2       4  [OK]
      10    -0.000       1       4    -0.000       1       4  [OK]
      10    +0.001       3      10    +0.000       2       5  [OK]

Expected results:
Much smaller errors like this:
    Step        1st interval              2nd interval
              Freq     Dev     Max      Freq     Dev     Max
   40960    +0.000       2       5    -0.000       2       5  [OK]
   40960    -0.000       2       5    -0.000       3       8  [OK]
   40960    -0.000       3       7    +0.000       3       6  [OK]
   40960    +0.000       3       6    -0.000       2       6  [OK]
   40960    -0.000       2       5    +0.000       3       7  [OK]
     640    +0.001       2       9    -0.000       2       5  [OK]
     640    -0.000       3       6    +0.000       2       5  [OK]
     640    -0.001       2       4    +0.000       1       3  [OK]
     640    -0.000       2       5    +0.000       2       4  [OK]
     640    +0.000       3       6    -0.000       3       6  [OK]
      10    -0.000       2       5    +0.000       2       5  [OK]
      10    -0.000       2       5    +0.000       2       5  [OK]
      10    +0.000       2       5    +0.000       2       5  [OK]
      10    +0.000       2       5    +0.000       2       5  [OK]
      10    +0.000       3      12    +0.000       4      11  [OK]

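
For illustration, the approach described above (a multiplier obtained by division, plus a +1/0 error correction) can be modelled in a few lines of Python. This is a simplified sketch and not the kernel code: the names simulate, SHIFT, CYCLE_INTERVAL and TICK_SHIFTED are hypothetical, the numeric values (a 3 GHz counter, a 1 ms tick, a +12.345 ppm adjustment) are assumptions chosen only for the example, and other details of the real implementation are ignored.

```python
# Simplified model (NOT kernel code) of a clock multiplier computed by
# division with a +1/0 error correction. All names and values here are
# hypothetical and chosen only for illustration.

SHIFT = 24                        # assumed fixed-point shift of the clocksource
CYCLE_INTERVAL = 3_000_000        # assumed counter cycles per tick (3 GHz, 1 ms)

# Requested tick length in units of 2**-SHIFT ns: 1 ms plus 12.345 ppm.
BASE_TICK_SHIFTED = 1_000_000 << SHIFT
TICK_SHIFTED = BASE_TICK_SHIFTED + BASE_TICK_SHIFTED * 12_345 // 1_000_000_000


def simulate(tick_shifted, cycle_interval, ticks):
    """Return (mult, history) where history holds the accumulated error
    (in units of 2**-SHIFT ns) after each simulated tick."""
    # The multiplier follows directly from the requested tick length.
    mult = tick_shifted // cycle_interval
    error = 0       # requested time minus time the clock actually advanced
    err_mult = 0    # +1/0 correction applied on top of mult
    history = []
    for _ in range(ticks):
        advanced = cycle_interval * (mult + err_mult)
        error += tick_shifted - advanced
        # If the clock is behind the requested time, run slightly faster.
        err_mult = 1 if error > 0 else 0
        history.append(error)
    return mult, history


if __name__ == "__main__":
    mult, history = simulate(TICK_SHIFTED, CYCLE_INTERVAL, 10_000)
    worst = max(abs(e) for e in history)
    print("mult:", mult)
    print("worst error (ns):", worst / (1 << SHIFT))
```

In this model the accumulated error never exceeds one multiplier step per tick (CYCLE_INTERVAL in shifted units, a fraction of a nanosecond with the assumed values), because the division leaves a remainder smaller than CYCLE_INTERVAL and the +1 correction removes at most CYCLE_INTERVAL per tick.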
A different reproducer is to use phc2sys to synchronize the system clock to a stable PTP hardware clock with a large initial offset that is corrected by slewing rather than stepping, so that the NTP error is not reset. This causes the frequency to be less stable and the measured offset to be larger than would be expected on the hardware. For example:

# phc2sys -s /dev/ptp6 -m -O 0 -F 0.0
phc2sys[3445287.444]: phc offset 999954026 s0 freq -16868 delay 1915
phc2sys[3445288.445]: phc offset 999954049 s2 freq -16845 delay 1909
phc2sys[3445289.445]: phc offset 999954060 s2 freq +100000000 delay 1922
phc2sys[3445290.445]: phc offset 889018287 s2 freq +100000000 delay 1722
phc2sys[3445291.445]: phc offset 777881303 s2 freq +100000000 delay 1734
phc2sys[3445292.445]: phc offset 666744093 s2 freq +100000000 delay 1718
phc2sys[3445293.445]: phc offset 555610150 s2 freq +100000000 delay 1732
phc2sys[3445294.445]: phc offset 444473218 s2 freq +100000000 delay 1724
phc2sys[3445295.445]: phc offset 333336247 s2 freq +100000000 delay 1740
phc2sys[3445296.445]: phc offset 222199334 s2 freq +100000000 delay 1722
phc2sys[3445297.445]: phc offset 111062205 s2 freq +100000000 delay 1736
phc2sys[3445298.445]: phc offset -74458 s2 freq -91303 delay 1730
phc2sys[3445299.445]: phc offset -43689 s2 freq -82871 delay 1937
phc2sys[3445300.446]: phc offset 22285 s2 freq -30004 delay 1913
...
phc2sys[3445372.454]: phc offset -16 s2 freq -16916 delay 1909
phc2sys[3445373.454]: phc offset -38 s2 freq -16942 delay 1925
phc2sys[3445374.454]: phc offset -57 s2 freq -16973 delay 1919
phc2sys[3445375.454]: phc offset -85 s2 freq -17018 delay 1926
phc2sys[3445376.454]: phc offset 83 s2 freq -16875 delay 1921
phc2sys[3445377.454]: phc offset 50 s2 freq -16884 delay 1929
phc2sys[3445378.454]: phc offset 54 s2 freq -16865 delay 1911
phc2sys[3445379.454]: phc offset 26 s2 freq -16876 delay 1936
phc2sys[3445380.454]: phc offset 14 s2 freq -16881 delay 1911
phc2sys[3445381.455]: phc offset -25 s2 freq -16915 delay 1920
phc2sys[3445382.455]: phc offset -41 s2 freq -16939 delay 1920
phc2sys[3445383.455]: phc offset -73 s2 freq -16983 delay 1912

After restarting phc2sys with a small non-zero step threshold, which forces a step and thereby resets the NTP error, it settles down to smaller offsets:

# phc2sys -s /dev/ptp6 -m -O 0 -F 1e-9
phc2sys[3445424.661]: phc offset -898 s0 freq -16807 delay 1920
phc2sys[3445425.661]: phc offset -1075 s1 freq -16984 delay 1913
phc2sys[3445426.661]: phc offset 158 s2 freq -16826 delay 1921
phc2sys[3445427.661]: phc offset 194 s2 freq -16743 delay 1921
...
phc2sys[3445481.667]: phc offset 2 s2 freq -16821 delay 1928
phc2sys[3445482.667]: phc offset -1 s2 freq -16824 delay 1923
phc2sys[3445483.667]: phc offset 6 s2 freq -16817 delay 1912
phc2sys[3445484.668]: phc offset -7 s2 freq -16828 delay 1911
phc2sys[3445485.668]: phc offset 1 s2 freq -16822 delay 1927
phc2sys[3445486.668]: phc offset 12 s2 freq -16811 delay 1906
phc2sys[3445487.668]: phc offset -1 s2 freq -16821 delay 1934
phc2sys[3445488.668]: phc offset -12 s2 freq -16832 delay 1908
phc2sys[3445489.668]: phc offset -4 s2 freq -16827 delay 1901
phc2sys[3445490.668]: phc offset -1 s2 freq -16826 delay 1925
phc2sys[3445491.668]: phc offset 3 s2 freq -16822 delay 1923
phc2sys[3445492.668]: phc offset 13 s2 freq -16811 delay 1914

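
To compare the two phc2sys runs numerically, the settled offsets can be extracted from the log lines with a small script. The following is a sketch only: the helper names parse_offsets and max_abs_offset are hypothetical, the regular expression assumes the log format shown in this report, and the sample data is the last twelve lines of each run quoted above.

```python
import math
import re

# Matches the "phc offset <value>" field in the phc2sys lines shown in this
# report (the format is assumed from those samples).
OFFSET_RE = re.compile(r"phc offset\s+(-?\d+)")

SLEWED_RUN = """\
phc2sys[3445372.454]: phc offset -16 s2 freq -16916 delay 1909
phc2sys[3445373.454]: phc offset -38 s2 freq -16942 delay 1925
phc2sys[3445374.454]: phc offset -57 s2 freq -16973 delay 1919
phc2sys[3445375.454]: phc offset -85 s2 freq -17018 delay 1926
phc2sys[3445376.454]: phc offset 83 s2 freq -16875 delay 1921
phc2sys[3445377.454]: phc offset 50 s2 freq -16884 delay 1929
phc2sys[3445378.454]: phc offset 54 s2 freq -16865 delay 1911
phc2sys[3445379.454]: phc offset 26 s2 freq -16876 delay 1936
phc2sys[3445380.454]: phc offset 14 s2 freq -16881 delay 1911
phc2sys[3445381.455]: phc offset -25 s2 freq -16915 delay 1920
phc2sys[3445382.455]: phc offset -41 s2 freq -16939 delay 1920
phc2sys[3445383.455]: phc offset -73 s2 freq -16983 delay 1912
"""

STEPPED_RUN = """\
phc2sys[3445481.667]: phc offset 2 s2 freq -16821 delay 1928
phc2sys[3445482.667]: phc offset -1 s2 freq -16824 delay 1923
phc2sys[3445483.667]: phc offset 6 s2 freq -16817 delay 1912
phc2sys[3445484.668]: phc offset -7 s2 freq -16828 delay 1911
phc2sys[3445485.668]: phc offset 1 s2 freq -16822 delay 1927
phc2sys[3445486.668]: phc offset 12 s2 freq -16811 delay 1906
phc2sys[3445487.668]: phc offset -1 s2 freq -16821 delay 1934
phc2sys[3445488.668]: phc offset -12 s2 freq -16832 delay 1908
phc2sys[3445489.668]: phc offset -4 s2 freq -16827 delay 1901
phc2sys[3445490.668]: phc offset -1 s2 freq -16826 delay 1925
phc2sys[3445491.668]: phc offset 3 s2 freq -16822 delay 1923
phc2sys[3445492.668]: phc offset 13 s2 freq -16811 delay 1914
"""


def parse_offsets(log_text):
    """Return the list of offsets (ns) found in phc2sys log text."""
    return [int(m.group(1)) for m in OFFSET_RE.finditer(log_text)]


def max_abs_offset(offsets):
    """Largest absolute offset in the sample."""
    return max(abs(o) for o in offsets)


def rms_offset(offsets):
    """Root-mean-square offset of the sample."""
    return math.sqrt(sum(o * o for o in offsets) / len(offsets))


if __name__ == "__main__":
    for name, text in (("slewed", SLEWED_RUN), ("stepped", STEPPED_RUN)):
        offsets = parse_offsets(text)
        print(name, "max |offset|:", max_abs_offset(offsets),
              "rms:", round(rms_offset(offsets), 1))
```

On the samples above, the largest absolute offset is 85 ns for the run without stepping and 13 ns for the run restarted with the step threshold.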
------- Comment From seg.com 2019-05-30 20:16 EDT-------
It is not clear to me whether this is already fixed and the bug should be closed, or whether it is not fixed and we are not going to bother, for RHEL 7 at a minimum.
We cannot fix this without bug #1131131, which was closed as WONTFIX. The nohz=off kernel option is recommended as a workaround when highly accurate synchronization is required.
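
Whether the workaround is active on a given machine can be checked from the kernel command line. The snippet below is a minimal sketch: the helper name has_nohz_off is hypothetical, and it assumes the option appears literally as nohz=off among the space-separated boot parameters in /proc/cmdline.

```python
def has_nohz_off(cmdline):
    """Return True if 'nohz=off' is one of the boot parameters in cmdline."""
    return "nohz=off" in cmdline.split()


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the parameters the running kernel was booted with.
        with open("/proc/cmdline") as f:
            current = f.read()
    except OSError:
        current = ""
    print("nohz=off active:", has_nohz_off(current))
```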