Red Hat Bugzilla – Bug 1462081
/etc/NetworkManager/dispatcher.d/20-chrony is executed, resulting in the source being forced offline, which is not expected.
Last modified: 2018-04-10 08:06:44 EDT
[issue]
When I bring one of the NICs down (the other NIC can still reach the chrony source), /etc/NetworkManager/dispatcher.d/20-chrony is executed, and as a result the source is forced offline.

[version]
RHEL7.1
chrony-2.4-4.fc24.x86_64

[Env]
chrony server: two NICs on different networks.
~~~
inet 192.168.122.131/24 brd 192.168.122.255 scope global eth0
inet 192.168.100.189/24 brd 192.168.100.255 scope global bond2

# ip r
192.168.100.0/24 dev bond2 proto kernel scope link src 192.168.100.189 metric 300
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.131 metric 100
~~~
chrony source's IP: 192.168.122.111. eth0 is on the same network as the source, so a default route is unnecessary.

How reproducible:

Steps to Reproduce:
1. Check the status.
~~~
[root@localhost ~]# chronyc sources
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* r511                         11   6    77    41  -8410ms[  -37.8s] +/-   13ms
~~~
2. Bring bond2 down.
# ifdown bond2
3. Check the status.
~~~
[root@localhost ~]# chronyc activity
200 OK
0 sources online
1 sources offline    <--- the source is offline due to the down bond2.
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address
[root@localhost ~]# chronyc sources
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* r511                         11   6   177   446   +221ms[ -8410ms] +/-   13ms
^ this result tells us that the sync is normal.
~~~
4. Change the time on the source, so we can check whether the source is offline or not. I got this result:
~~~
[root@localhost ~]# chronyc sources
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* r511                         11   6   177  1018   +221ms[ -8410ms] +/-   13ms
~~~
"Reach" does not increase, while "LastRx" keeps growing. The time is not being synced.

Actual results:
Bringing an unrelated NIC down forces the source offline.

Expected results:
I still have a NIC that can reach the source when I bring the unrelated NIC down. The source shouldn't go offline.

I don't really understand why this script should be executed. Can anyone give some advice?

Thanks
mengyi
The chrony dispatcher script (incorrectly) assumes that a default route is needed to reach all NTP sources. That works in most cases, but may fail with local NTP servers. The simple solution would be to keep all sources in the online state. This would slow down resynchronization of sources that are not reachable when the network is down. A better solution would probably be to check each source separately to see whether it is in a local network. I'm not sure how feasible that would be in a shell script.
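To illustrate the per-source idea, here is a minimal sketch in POSIX shell. It is not the upstream fix and not the shipped 20-chrony script. The helper names (ip4_to_int, in_subnet, source_reachable) are hypothetical, it handles IPv4 only, and it assumes route lines in the "PREFIX/LEN dev ..." form shown in the report. Host routes without a prefix length are ignored.

```sh
#!/bin/sh
# Hypothetical helpers, IPv4 only. A sketch, not the real dispatcher script.

# Convert a dotted-quad address (e.g. 192.168.122.111) to an integer.
ip4_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside prefix $2 (e.g. 192.168.122.0/24).
in_subnet() {
    addr=$(ip4_to_int "$1")
    net=$(ip4_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xffffffff << (32 - bits)) & 0xffffffff ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Read a routing table on stdin (as printed by "ip -4 route list") and
# succeed if source $1 is covered by a default route or a prefix route.
source_reachable() {
    src=$1
    while read -r dest rest; do
        case $dest in
            default) return 0 ;;
            */*) in_subnet "$src" "$dest" && return 0 ;;
        esac
    done
    return 1
}

# Possible wiring in a dispatcher script (left commented out so the sketch
# has no side effects; the source address is the one from this report):
# if ip -4 route list | source_reachable 192.168.122.111; then
#     chronyc online 192.168.122.111
# else
#     chronyc offline 192.168.122.111
# fi
```

With the routing table from the report, the source 192.168.122.111 still matches the eth0 route after bond2 goes down, so a check like this would leave it online.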
A fix for this issue was included in upstream git: https://git.tuxfamily.org/chrony/chrony.git/commit/?id=ae82bbbacecec56e1f893fd038ca10323f8760f0
Sanity Only.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0753