Bug 1461196 - leap second acceptance only if majority of servers breaks hierarchical NTP configuration with peers (since 4.2.6)
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ntp
Version: 7.3
Hardware: All Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Miroslav Lichvar
QA Contact: qe-baseos-daemons
Reported: 2017-06-13 15:34 EDT by Laurent Deniel
Modified: 2018-06-20 22:14 EDT (History)
Clone Of: 1410457
Type: Bug


Description Laurent Deniel 2017-06-13 15:34:29 EDT
+++ This bug was initially created as a clone of Bug #1410457 +++

Description of problem:

Since 4.2.6, a leap second is only accepted if:
- it is provided by a file or by a REF_CLOCK
- or a majority of servers announce it (previously it was accepted as soon as a single server announced it)

This breaks hierarchical NTP configurations in which, for instance, stratum 2 servers are configured as peers of each other for fault tolerance (and other) reasons.

Example:

  (stratum 1)        (stratum 2)          (stratum 3)
[NTP server 1]  <----- [NTP server A]  <------- NTP Clients 
[NTP server 2]  <----- [NTP server B]  <------- ...
                <----- [NTP server C]  <------- ...
                                       <------- ....
1 and 2 are servers for A,B,C
A,B,C are peers 

There is no distinction between the stratum 1 NTP servers and the stratum 2 peers when computing the majority. So in this case the leap second announced by the two servers is not taken into account by any of the stratum 2 servers.

The code is:

} else if (leap_vote > sys_survivors / 2) {

For the example this gives (the division is an integer division):

2 > 5/2 ==> 2 > 2 ==> KO

This would only work with 2 machines at stratum 2 (i.e. 1 peer each), but it could fail again if one of the 2 stratum 1 NTP servers is not available at that time.

So the new algorithm is not optimal and breaks existing configurations in production.

Version-Release number of selected component: any 4.2.6 version

How reproducible: yes

Steps to Reproduce:
1. Configure more peers than servers of higher stratum
2. Insert a leap second at the higher stratum

Actual results: received leap second indication is ignored

Expected results: leap second is injected

Additional info:

Suggested changes:

- revert the change ;-) i.e. if (leap_vote > 1)
  but I agree that this might not be ideal in all cases, so a more complex
  scheme might need to be found:

- always accept the leap if the announcing server is at a higher stratum, else
  still require the majority (of servers/peers of the same or lower strata)

- or introduce a new configurable parameter: if (leap_vote > parameter)

- or ?

--- Additional comment from Miroslav Lichvar on 2017-01-05 09:56:31 EST ---

The problem with the original approach was that it was too easy to spread false leap seconds, and this did happen in the past. Your suggestion to take the stratum into account sounds interesting, but this is a complex problem and I'd strongly suggest reporting it on bugs.ntp.org, or on one of the ntp mailing lists, so more people can discuss it.

--- Additional comment from Laurent Deniel on 2017-01-05 11:10:01 EST ---

As a customer of Red Hat and a user of RHEL, I would assume it is part of Red Hat's job to check/discuss with upstream if the patch is not trivial ;-)
Anyway: http://bugs.ntp.org/show_bug.cgi?id=3364

--- Additional comment from Chris Williams on 2017-06-13 14:34:46 EDT ---

Red Hat Enterprise Linux 6 transitioned to the Production 3 Phase on May 10, 2017.  During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.
 
The official life cycle policy can be reviewed here:
 
http://redhat.com/rhel/lifecycle
 
This issue does not appear to meet the inclusion criteria for the Production Phase 3 and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification.  Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:
 
https://access.redhat.com
