Bug 2101063 - when chronyd cannot reach sources at startup they remain offline
Summary: when chronyd cannot reach sources at startup they remain offline
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: Documentation
Version: CentOS Stream
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Šárka Jana
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-25 03:43 UTC by Andrew Schorr
Modified: 2023-03-01 20:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-03-01 20:20:13 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System                 ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  RHELPLAN-126294  0        None      None    None     2022-06-25 03:50:57 UTC

Description Andrew Schorr 2022-06-25 03:43:12 UTC
Description of problem:
The After= conditions in the chronyd.service file cause chronyd to start very early in the boot process, before the network interfaces are up. It does not seem to connect to its configured sources even after the network comes up. The simple fix seems to be to patch the service file so that chronyd starts after the network comes up, but I'm guessing there's a bug here and that it ought to be able to find those sources once network connectivity becomes available. In my case, the servers are not on a directly connected network, so they require routing, but that aspect seems unlikely to matter.


Version-Release number of selected component (if applicable):
chrony-4.2-1.el9.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Configure /etc/chrony.conf with some servers
2. Reboot system
3. Run 'chronyc sources' and 'chronyc activity'

Actual results:
bash-5.1$ chronyc -n sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? 192.168.79.30                 0   7     0     -     +0ns[   +0ns] +/-    0ns
^? 192.168.79.32                 0   7     0     -     +0ns[   +0ns] +/-    0ns
^? 192.168.59.23                 0   7     0     -     +0ns[   +0ns] +/-    0ns
^? 192.168.59.50                 0   7     0     -     +0ns[   +0ns] +/-    0ns
bash-5.1$ chronyc activity
200 OK
0 sources online
4 sources offline
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address


Expected results:
It should connect to the servers and set the clock properly.


Additional info:
Restarting the service fixes the problem, and adding After=network-online.target also seems to fix it. But I feel as if the daemon ought to be able to connect after the network comes up without requiring a restart; maybe I am confused.
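
The symptom in "Actual results" can be checked from a script. The following is a minimal sketch that reads `chronyc activity` output on standard input and prints the number of offline sources. The function name count_offline_sources is hypothetical, and the sketch assumes the output format shown in the report above.

```shell
#!/bin/sh
# Hypothetical helper: print how many sources "chronyc activity" reports
# as offline. Assumes a line of the form "<N> sources offline", as in the
# output quoted in this report.
count_offline_sources() {
    # n stays unset when no matching line is seen, so "n + 0" prints 0.
    awk '/^[0-9]+ sources offline/ { n = $1 } END { print n + 0 }'
}
```

Typical use would be `chronyc activity | count_offline_sources`. A non-zero result some time after boot, combined with "0 sources online", matches the state described here.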

Comment 1 Miroslav Lichvar 2022-06-27 08:05:33 UTC
Is the network configured by NetworkManager? Static configuration or DHCP?

There is a NetworkManager-dispatcher script (/usr/lib/NetworkManager/dispatcher.d/20-chrony-onoffline) which calls "chronyc onoffline" on some specific events. If the NTP sources remain in the offline state, that indicates it ran at least once when there was no route to the servers and it didn't run again when they became reachable.
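
To illustrate the mechanism, here is a rough sketch of how a dispatcher script of this kind can be structured. It is not the contents of the shipped 20-chrony-onoffline script: the function name handle_nm_event, the CHRONYC override variable, and the exact list of events are assumptions for illustration. NetworkManager passes the interface name as the first argument and the action as the second.

```shell
#!/bin/sh
# Sketch only, NOT the shipped 20-chrony-onoffline script.
# $1 = interface name, $2 = action, as passed by NetworkManager-dispatcher.
# CHRONYC is an assumed override for dry runs; the default is chronyc.
handle_nm_event() {
    action=$2
    case "$action" in
        up|down|connectivity-change)
            # Assumed event list: ask chronyd to re-check which sources
            # currently have a route.
            ${CHRONYC:-chronyc} onoffline
            ;;
        *)
            # Any other action (for example "hostname") is ignored.
            return 0
            ;;
    esac
}
```

A route that is installed later by something other than NetworkManager produces none of these events, so nothing calls "chronyc onoffline" again and the sources stay offline.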

Comment 2 Andrew Schorr 2022-06-27 13:18:17 UTC
Yes. It is configured by NetworkManager. It is a static configuration using legacy-style
/etc/sysconfig/network-scripts/ifcfg-* files.

However, I am configuring static routes separately at a later stage. Maybe that's the problem.
I guess that the "chronyc onoffline" command is getting called after NetworkManager brings up
the interfaces but before the default route is in place. 

My solution was to add a drop-in file so that chronyd starts after the network is up. Is
there some benefit to starting chronyd before network-online.target?

Maybe this is a quirk of how I'm configuring routes, but isn't it actually a problem on any
system that doesn't have a static routing configuration that is loaded by NetworkManager?
What happens if a system is a router running quagga or FRRouting? In such a case, the route
to the time servers may not become available until after a bit of a delay as routes are learned.
Does chronyd stay stuck in that case? I did not have this issue in RHEL 8. Why isn't chronyd
smart enough to retry contacting its sources periodically?

Comment 3 Miroslav Lichvar 2022-06-28 08:58:03 UTC
chrony supports reference clocks and other modes of operation where it doesn't make sense to wait for the network connection.

chronyd polls all online sources regularly. The point of switching the sources between the offline and online states is to minimize the time needed for a resync on machines that are only rarely or briefly connected to network. If you don't need that, you can disable the dispatcher script by adding a symlink to /dev/null:

ln -s /dev/null /etc/NetworkManager/dispatcher.d/20-chrony-onoffline

Otherwise you would need to modify your scripts to run the chronyc onoffline command.

With the routing daemons it probably won't work. It doesn't seem to be a common configuration, or at least I don't recall any bug reports.

I'm not sure why it worked for you on RHEL8. Do you know what the chrony package version and release were?
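
The second option in this comment (running "chronyc onoffline" from the script that sets up the routes) could look roughly like the sketch below. The function name refresh_chrony_sources and the CHRONYC override variable are hypothetical; only the "chronyc onoffline" command itself comes from this thread.

```shell
#!/bin/sh
# Hypothetical helper: call it at the end of the script that installs the
# static routes, so that chronyd re-evaluates which sources are reachable.
# CHRONYC is an assumed override for dry runs; the default is chronyc.
refresh_chrony_sources() {
    chronyc_cmd=${CHRONYC:-chronyc}
    if ! command -v "$chronyc_cmd" >/dev/null 2>&1; then
        echo "chronyc not found; skipping onoffline" >&2
        return 0
    fi
    # "onoffline" switches each source online or offline according to
    # the current network configuration.
    "$chronyc_cmd" onoffline
}
```

Calling refresh_chrony_sources as the last step after the routes are added keeps the dispatcher script in place for other events while covering the late-route case.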

Comment 4 Miroslav Lichvar 2022-06-28 09:11:53 UTC
Note that servers specified by hostname are not switched to the offline state if their address is not resolved yet. If DNS depends on the same network route as NTP, this wouldn't be an issue.

Comment 5 Andrew Schorr 2022-06-28 15:27:57 UTC
I use numeric IPv4 addresses in /etc/chrony.conf to avoid hostname lookup issues.
On 8, I'm using chrony-4.1-1.el8.x86_64, but to be fair, I use the legacy
network.service to bring up interfaces instead of NetworkManager, so that could
affect the timing.

Comment 6 Miroslav Lichvar 2022-06-30 12:00:47 UTC
I don't see a good solution as there are conflicting requirements, but I think it would be good to at least document how the dispatcher script can be disabled to keep the sources online.

Comment 7 Andrew Schorr 2022-06-30 13:12:15 UTC
Agreed. Thanks for the explanation. For a site like mine where all of the clock sources are over the
network, I think simply adding /etc/systemd/system/chronyd.service.d/after-network.conf
is the simplest solution:

[Unit]
After=network-online.target

Regards,
Andy
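
For reference, the drop-in from Comment 7 could be created with a small script such as the sketch below. The function name install_chronyd_dropin and its optional directory argument are hypothetical (the argument exists only so the function can be tried against a scratch directory); the file name and contents are the ones given in the comment.

```shell
#!/bin/sh
# Hypothetical helper: write the drop-in described in Comment 7.
# The optional first argument overrides the target directory, which is
# useful for a dry run without root privileges.
install_chronyd_dropin() {
    dropin_dir=${1:-/etc/systemd/system/chronyd.service.d}
    mkdir -p "$dropin_dir" || return 1
    printf '[Unit]\nAfter=network-online.target\n' \
        > "$dropin_dir/after-network.conf"
}
```

On a real system this would be run as root and followed by `systemctl daemon-reload` so that systemd picks up the new drop-in. Whether network-online.target is actually pulled into the boot transaction on a given system is not covered in this thread and would need to be checked separately.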

