Bug 1260140

Summary: ntpdate -u -q -p 2 clock.redhat.com takes almost 3.5 times longer in RHEL 7
Product: Red Hat Enterprise Linux 7 Reporter: Bryan Totty <btotty>
Component: ntp Assignee: Miroslav Lichvar <mlichvar>
Status: CLOSED NOTABUG QA Contact: qe-baseos-daemons
Severity: high Docs Contact:
Priority: medium    
Version: 7.3 CC: bwelterl, mlichvar, mvanderw
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-09-29 07:59:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Bryan Totty 2015-09-04 14:11:35 UTC
Description of problem:

ntpdate -u -q -p 2 clock.redhat.com

takes about 3.5 times longer to run in RHEL 7 than it does in RHEL 6.

With the query-only option (-q) and a samples value (-p option) >= 2, the command takes about 2 seconds more on RHEL 7 (7.1) than the same ntpdate command does on RHEL 6 (6.7).


RHEL 6:

# time ntpdate -u -q -p 2 clock.redhat.com
server 10.5.26.10, stratum 1, offset 0.002779, delay 0.07281
server 10.11.160.238, stratum 1, offset 0.002903, delay 0.02704
server 10.16.255.1, stratum 1, offset 0.003301, delay 0.04623
server 10.5.27.10, stratum 2, offset 0.002761, delay 0.07288
 3 Sep 00:27:02 ntpdate[17655]: adjust time server 10.11.160.238 offset 0.002903 sec

real	0m0.799s  <<<<<<<
user	0m0.000s
sys	0m0.002s

RHEL 7:

# time ntpdate -u -q -p 2 clock.redhat.com
server 10.11.160.238, stratum 1, offset -0.000219, delay 0.02773
server 10.16.255.1, stratum 1, offset 0.000375, delay 0.04633
server 10.5.27.10, stratum 2, offset -0.000113, delay 0.07291
server 10.5.26.10, stratum 1, offset 0.000068, delay 0.07382
 3 Sep 12:38:20 ntpdate[16786]: adjust time server 10.11.160.238 offset -0.000219 sec

real	0m2.754s  <<<<<<<
user	0m0.001s
sys	0m0.003s


Note the approximate 2 second difference every time. You can try this multiple times with the same time sources on systems in the same subnet and reproduce it consistently. 

Version-Release number of selected component (if applicable):
4.2.6p5-19.el7_1.1

How reproducible:
Always.

Steps to Reproduce:
1. Run the command once first, because the first run can sometimes take longer and lead to inconsistent data:

ntpdate -u -q -p 2 clock.redhat.com

2. Then rerun as many times as desired and note the approximate 2 second difference between RHEL 6 and RHEL 7

time ntpdate -u -q -p 2 clock.redhat.com


Actual results:
real	0m2.754s


Expected results:
real	0m0.799s
Results on RHEL 7 that are close to or better than those on RHEL 6.

Additional info:
Using ntpq is an alternative, but it does not address the problem with ntpdate in existing production environments.



RHEL 6:

# time ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ntp.newfxlabs.c .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 131.107.13.100  .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 ntp.glorb.com   .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 104.41.150.68   .INIT.          16 u    - 1024    0    0.000    0.000   0.000
*clock1.rdu2.red .CDMA.           1 u   92 1024  377    1.989    3.145   1.442
+clock.bos.redha .CDMA.           1 u 1046 1024  377   20.035    2.977   1.802
+clock.util.phx2 .CDMA.           1 u  969 1024  377   47.781    2.778   1.713
+clock02.util.ph 10.5.26.10       2 u  845 1024  377   46.793    2.648   1.801

real	0m0.060s  <<<<<<
user	0m0.007s
sys	0m0.004s


RHEL 7:


# time ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 propjet.latt.ne .INIT.          16 u    -   64    0    0.000    0.000   0.000
 209.118.204.201 .INIT.          16 u    -   64    0    0.000    0.000   0.000
 2600:3c03:e000: .INIT.          16 u    -   64    0    0.000    0.000   0.000
 blue.1e400.net  .INIT.          16 u    -   64    0    0.000    0.000   0.000
*clock1.rdu2.red .CDMA.           1 u    9   64    1    2.498    0.021   0.000
 clock.bos.redha .CDMA.           1 u    8   64    1   19.925    0.077   0.000
 clock.util.phx2 .CDMA.           1 u    7   64    1   47.222   -0.100   0.000
 clock02.util.ph 10.5.26.10       2 u    6   64    1   46.865   -0.387   0.000

real	0m0.040s  <<<<<<
user	0m0.017s
sys	0m0.005s

Comment 2 Miroslav Lichvar 2015-09-04 14:34:39 UTC
This is the standard behavior of ntpdate since 4.2.6. The 2s spacing between packets was added to avoid a rapid burst, which can trigger the KoD RATE response on the server. NTP clients are not supposed to use such short polling intervals. When ntp in RHEL6 was rebased to 4.2.6, a patch restored the original ntpdate behavior from 4.2.4 so that a minor RHEL update would not cause problems like the one described in this bug report, but I don't think the same should be done in the RHEL7 ntp.

My suggestion would be to use -p 1 to send just one packet to each server. For monitoring purposes there shouldn't be any need to send more packets.
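
As a rough illustration of why -p 1 avoids the delay, the sketch below estimates the waiting time added by the 2s packet spacing for a given -p value. It assumes a simple model in which N samples to a server add about (N - 1) * 2 seconds. That model and the helper name estimated_wait are assumptions made for illustration, not something taken from the ntpdate source.

```shell
#!/bin/sh
# Rough model (an assumption, not taken from the ntpdate source):
# packets to one server are spaced 2 s apart, so N samples add
# about (N - 1) * 2 seconds of waiting on top of the network delay.
SPACING_SECONDS=2

# estimated_wait is a hypothetical helper: it prints the estimated
# added wait in seconds for the sample count given as $1.
estimated_wait() {
    samples=$1
    echo $(( ($samples - 1) * $SPACING_SECONDS ))
}

# Compare a few values of the -p option.
for p in 1 2 4; do
    echo "-p $p: about $(estimated_wait "$p") s of added spacing"
done
```

Under this model -p 2 adds about 2 seconds, which is in line with the difference reported above (2.754s on RHEL 7 against 0.799s on RHEL 6), while -p 1 adds none. The actual timing can be checked with: time ntpdate -u -q -p 1 clock.redhat.com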

Comment 6 Miroslav Lichvar 2015-09-14 08:21:24 UTC
Here is the original upstream bug report which requested increasing the ntpdate polling interval to 2 seconds:
https://bugs.ntp.org/show_bug.cgi?id=1504

Some comments about the new ntpdate behavior:
https://bugs.ntp.org/show_bug.cgi?id=1853