Bug 1461196 - leap second acceptance only if majority of servers breaks hierarchical NTP configuration with peers (since 4.2.6)
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ntp
Version: 7.3
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Miroslav Lichvar
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-06-13 19:34 UTC by Laurent Deniel
Modified: 2019-05-22 14:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1410457
Environment:
Last Closed: 2019-05-22 14:38:30 UTC
Target Upstream Version:
Embargoed:



Description Laurent Deniel 2017-06-13 19:34:29 UTC
+++ This bug was initially created as a clone of Bug #1410457 +++

Description of problem:

Since 4.2.6 the leap second is only accepted if:
- provided by file or by a REF_CLOCK
- or if a majority of servers announce it (while it was previously accepted if only one server announces it)

This breaks hierarchical NTP configuration when for instance you have stratum 2 servers configured as peers for fault tolerance (and other) reasons.

Example :

  (stratum 1)        (stratum 2)          (stratum 3)
[NTP server 1]  <----- [NTP server A]  <------- NTP Clients 
[NTP server 2]  <----- [NTP server B]  <------- ...
                <----- [NTP server C]  <------- ...
                                       <------- ....
1 and 2 are servers for A,B,C
A,B,C are peers 

There is no distinction between the stratum 1 NTP servers and the stratum 2 peers when computing the majority. So the leap second announced by the two servers is not taken into account in this case by any of the stratum 2 servers.

The code is :

} else if (leap_vote > sys_survivors / 2) {

so this gives for the example :

2 > 5/2 ==> 2 > 2 ==> KO
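
The arithmetic above can be checked with a minimal sketch. The function name leap_accepted is hypothetical and not taken from the ntp sources; only the comparison itself comes from the line quoted above. Because both operands are integers, 5/2 truncates to 2, so two announcing servers out of five survivors do not form a majority.

```c
#include <stdbool.h>

/* Hypothetical helper that mirrors the quoted condition
 *     leap_vote > sys_survivors / 2
 * Both operands are ints, so the division truncates toward zero. */
bool leap_accepted(int leap_vote, int sys_survivors)
{
    return leap_vote > sys_survivors / 2;
}
```

With the numbers from the example, leap_accepted(2, 5) is false. With two stratum 1 servers and a single peer, leap_accepted(2, 3) is true, and it becomes false again (1 vote out of 2 survivors) as soon as one of the stratum 1 servers drops out.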

This would only work with 2 machines at stratum 2 (i.e. 1 peer), but it could again fail if one of the 2 stratum 1 NTP servers is not available at that time.

So the new algorithm is not that optimal and breaks existing configurations in production.

Version-Release number of selected component: any 4.2.6 version

How reproducible: yes

Steps to Reproduce: 
1. configure more peers than servers of higher stratum
2. insert leap second at higher stratum

Actual results: received leap second indication is ignored

Expected results: leap second is injected

Additional info:

Suggested changes:

- revert change ;-) i.e. if (leap_vote > 1)
  but I agree that this might not be ideal in all cases so a more complex 
  scheme might need to be found:

- always accept the leap if the announcing server is at a higher stratum, else still
  take the majority (of servers/peers of the same or lower strata)

- or introduce a new configurable parameter: if (leap_vote > parameter)

- or ?
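
To make the stratum-based suggestion concrete, here is a rough sketch under stated assumptions. The struct source, its fields, and the function leap_accepted_by_stratum are all hypothetical and do not exist in ntp; "higher stratum" is read as a numerically smaller stratum value than our own. This is only one possible reading of the idea, not a proposed patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one surviving time source. */
struct source {
    int  stratum;         /* stratum reported by the source */
    bool announces_leap;  /* true if its leap indicator is set */
};

/* Accept the leap immediately if any source closer to the reference
 * clock (numerically smaller stratum than ours) announces it.
 * Otherwise fall back to a strict majority among the remaining
 * sources, i.e. those at the same or a larger stratum value. */
bool leap_accepted_by_stratum(const struct source *src, size_t n,
                              int own_stratum)
{
    size_t votes = 0;   /* remaining sources announcing the leap */
    size_t voters = 0;  /* remaining sources taking part in the vote */

    for (size_t i = 0; i < n; i++) {
        if (src[i].stratum < own_stratum) {
            if (src[i].announces_leap)
                return true;
            continue;   /* silent upstream sources do not vote */
        }
        voters++;
        if (src[i].announces_leap)
            votes++;
    }
    return votes > voters / 2;
}
```

For the example topology (own stratum 2, two stratum 1 servers announcing the leap, peers silent) this returns true. The obvious trade-off is that a single upstream source announcing a false leap second would be accepted again.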

--- Additional comment from Miroslav Lichvar on 2017-01-05 09:56:31 EST ---

The problem with the original approach was that it was too easy to spread false leap seconds, and this did happen in the past. Your suggestion to take the stratum into account sounds interesting, but this is a complex problem and I'd strongly suggest reporting it on bugs.ntp.org, or on one of the ntp mailing lists, so more people can discuss it.

--- Additional comment from Laurent Deniel on 2017-01-05 11:10:01 EST ---

As a customer of Red Hat and a user of RHEL, I would assume it is part of Red Hat's job to check/discuss with upstream if the patch is not trivial ;-)
Anyway: http://bugs.ntp.org/show_bug.cgi?id=3364

--- Additional comment from Chris Williams on 2017-06-13 14:34:46 EDT ---

Red Hat Enterprise Linux 6 transitioned to the Production 3 Phase on May 10, 2017.  During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.
 
The official life cycle policy can be reviewed here:
 
http://redhat.com/rhel/lifecycle
 
This issue does not appear to meet the inclusion criteria for the Production Phase 3 and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification.  Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:
 
https://access.redhat.com

Comment 3 Miroslav Lichvar 2019-05-22 14:38:30 UTC
This issue is still not fixed in upstream. There doesn't seem to be a good solution that would not break some cases where the current approach worked well. It's unlikely to be fixed in RHEL7 as it will be entering the maintenance phase with the next minor release.

