Bug 2040915
| Summary: | Rebooting OSP16.1 nodes can leave chronyd with all of its time sources marked as offline | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Mark Jones <marjones> |
| Component: | chrony | Assignee: | Miroslav Lichvar <mlichvar> |
| Status: | CLOSED WONTFIX | QA Contact: | rhel-cs-infra-services-qe <rhel-cs-infra-services-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 8.4 | CC: | aschultz, astupnik, bdobreli |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-01-25 22:28:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Mark Jones
2022-01-14 21:12:26 UTC
OSP does not control the startup of chrony, so this is likely an issue with the chrony shipped with RHEL8.4, as OSP16.2 is based on 8.4 while OSP16.1 is based on 8.2. I found Bug 1930468, which may be related to the reported issue. It should be noted that we do not use NetworkManager to manage interfaces in OSP, but it is still running.

The only difference between the RHEL8.2 and RHEL8.4 chrony packages is that the latter recommends dhcp-client in order to enable the NetworkManager dispatcher script (bug #1930468). In RHEL8.5 the script was reworked to not rely on the dhcp-client package, and it is no longer recommended by the chrony package.

From the logs in the referenced sosreport it seems the management interface is controlled by NetworkManager. The NTP servers specified in chrony.conf are in a different network and they are specified by IP addresses. That means if the management interface is up before the interface needed to access the NTP servers (which is not controlled by NetworkManager), the "chronyc onoffline" command called from the dispatcher script will set the sources to the offline state, and there is no later call of the command to set them online.

This is a limitation of the chrony dispatcher script. It doesn't track interfaces and routes (which can change over time) to set the state of the sources individually. The assumption of the script is that if it is called at least once with the up or down event, it will get all events needed to cover all configured NTP servers. Using NetworkManager only for an interface which doesn't give access to the servers breaks that assumption.

I don't see a good fix that would work in the default configuration. I guess we could add a new option to disable the dispatcher script completely, similarly to how PEERNTP=no disables NTP servers from DHCP.
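The stuck state described above can be checked for, and cleared by hand, with chronyc. The following is only a minimal sketch, assuming a root shell on the affected node and that "chronyc activity" prints its usual "N sources online" line. The function names count_online and recover_if_offline are hypothetical helpers and are not shipped with chrony.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the chrony package.

# Read `chronyc activity` output on stdin and print the number of
# sources that chronyd currently considers online.
count_online() {
    awk '/ sources online$/ { print $1 }'
}

# If no source is online, ask chronyd to re-evaluate every source
# against the current routes. This is the call that the dispatcher
# script does not repeat once the non-NetworkManager interfaces are up.
recover_if_offline() {
    online=$(chronyc activity | count_online)
    if [ "${online:-0}" -eq 0 ]; then
        chronyc onoffline
    fi
}
```

Running recover_if_offline once all interfaces are up should bring the sources back online if a route to the NTP servers exists by then. It is a manual workaround, not a fix for the dispatcher script.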
In RHEL9 the dispatcher scripts are located in /usr/lib/NetworkManager/dispatcher.d and can be disabled by creating a symlink to /dev/null of the same name in /etc/NetworkManager/dispatcher.d. I'm not sure if moving the chrony dispatcher script to that directory in a RHEL8 update would be acceptable.

What is controlling the other interfaces? Could it be patched to call the "chronyc onoffline" command when the interfaces are up?

The interfaces in OSP are managed via the legacy network-scripts package, as we do not support NetworkManager due to other limitations. OSP16.2 will never get updates to 8.5, as it is following 8.4 EUS. A fix to 8.4 would be required to address the customer issue.
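For reference, the RHEL9 masking mechanism mentioned above could look roughly like this. It is a sketch only: mask_dispatcher is a hypothetical helper, and it applies only where the packaged script lives under /usr/lib/NetworkManager/dispatcher.d, which per the comments above is not the case for the RHEL8 chrony package.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of NetworkManager or chrony.
#
# Mask a packaged dispatcher script by creating a /dev/null symlink of the
# same name in the override directory.
#   $1 = file name of the packaged dispatcher script
#   $2 = override directory (defaults to /etc/NetworkManager/dispatcher.d)
mask_dispatcher() {
    name=$1
    dir=${2:-/etc/NetworkManager/dispatcher.d}
    ln -sf /dev/null "$dir/$name"
}
```

It would be called as root with the file name of the chrony dispatcher script as the first argument. That name is deliberately left out here; it should be taken from the contents of /usr/lib/NetworkManager/dispatcher.d on the system in question.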