Bug 1903777

Summary: chronyd is disabled after upgrading RHV-H 4.4.2 -> 4.4.3
Product: Red Hat Enterprise Virtualization Manager
Reporter: amashah
Component: imgbased
Assignee: Asaf Rachmani <arachman>
Status: CLOSED ERRATA
QA Contact: peyu
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.4.3
CC: arachman, cshao, dfediuck, emarcus, lsurette, lsvaty, matteo.panella, mavital, patrizio.bassi, peyu, sacpatil, sbonazzo, shlei, weiwang, yaniwang, ycui
Target Milestone: ovirt-4.4.4-2
Keywords: Regression
Target Release: 4.4.4
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: imgbased-1.2.15
Doc Type: Bug Fix
Doc Text:
Previously, the chronyd symlink was removed during the upgrade process. As a result, the chronyd service was disabled following the upgrade. In this release, the chronyd service is enabled after upgrade.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-03-23 18:51:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1916659

Description amashah 2020-12-02 18:38:15 UTC
Description of problem:
After upgrading from RHV-H 4.4.2 to RHV-H 4.4.3, chronyd is disabled.

Version-Release number of selected component (if applicable):
4.4.3

How reproducible:
100%

Steps to Reproduce:
1. Update RHV-H 4.4.2 -> 4.4.3 
2. systemctl status chronyd

Actual results:
chronyd.service is not running and is disabled.

Expected results:
chronyd.service should be running and enabled.

Additional info:
This caused issues in a customer environment: even after enabling and starting chronyd, vdsmd was still confused and caused VMs to appear non-responsive even though they were running fine.

Restarting vdsmd on the affected node fixes the VM state, but this would cause serious issues when the update/reboot is performed as part of a cluster-wide update.
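
The manual recovery described above (enable and start chronyd, then restart vdsmd) could be scripted roughly as follows. This is only a sketch: the function name `recover` and the injectable `run` parameter are illustrative assumptions, and restarting vdsmd carries the disruption risk already mentioned, so the example below only performs a dry run that records the commands instead of executing them.

```python
import subprocess

# Commands taken from the workaround in this report: re-enable and start
# chronyd, then restart vdsmd so it re-evaluates VM state.
RECOVERY_COMMANDS = [
    ["systemctl", "enable", "--now", "chronyd.service"],
    ["systemctl", "restart", "vdsmd.service"],
]


def recover(run=subprocess.run):
    """Run each recovery command in order; 'run' is injectable for dry runs."""
    for cmd in RECOVERY_COMMANDS:
        run(cmd, check=True)


# Dry run: collect the commands as strings instead of touching the host.
issued = []
recover(run=lambda cmd, check: issued.append(" ".join(cmd)))
print("\n".join(issued))
```

Calling `recover()` with no arguments on an affected host would execute the commands for real and needs root privileges.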


- For some reason, the upgrade removes /etc/systemd/system/multi-user.target.wants/chronyd.service. The excerpt below is from imgbased.log:

~~~
2020-11-16 18:15:19,936 [DEBUG] (migrate_ntp_to_chrony) Calling: (['systemctl', 'status', 'ntpd.service'],) {'close_fds': True, 'stderr': -2}
2020-11-16 18:15:19,955 [DEBUG] (migrate_ntp_to_chrony) Exception! b'Unit ntpd.service could not be found.\n'
2020-11-16 18:15:19,955 [DEBUG] (migrate_ntp_to_chrony) ntpd is disabled, not migrating conf to chrony
2020-12-02 18:12:04,751 [DEBUG] (remediate_etc) Planning to remove /etc/systemd/system/multi-user.target.wants/chronyd.service
2020-12-02 18:12:05,033 [DEBUG] (migrate_var) Migrating files by rpm databases: ['/var/lock', '/var/cache/ldconfig/aux-cache', '/var/cache/dnf/expired_repos.json', '/var/cache/dnf/anaconda-filenames.s
2020-12-02 18:12:05,046 [DEBUG] (migrate_var) Calling: (['rpm', '-qf', '/var/lock', '/var/cache/ldconfig/aux-cache', '/var/cache/dnf/expired_repos.json', '/var/cache/dnf/anaconda-filenames.solvx', '/v
2020-12-02 18:12:10,440 [DEBUG] (remediate_etc) os.unlink(/tmp/mnt.J7x6l////etc/systemd/system/multi-user.target.wants/chronyd.service)
2020-12-02 18:12:25,133 [DEBUG] (migrate_var) Updated file /var/lib/selinux/targeted/active/modules/100/chronyd/cil
2020-12-02 18:12:25,135 [DEBUG] (migrate_var) Updated file /var/lib/selinux/targeted/active/modules/100/chronyd/hll
2020-12-02 18:12:26,999 [DEBUG] (migrate_etc) chrony changed from 995 to 996
2020-12-02 18:12:27,000 [DEBUG] (migrate_etc) chrony changed from 992 to 993
2020-12-02 18:12:27,002 [DEBUG] (migrate_etc) chrony changed from 995 to 996
2020-12-02 18:12:27,003 [DEBUG] (migrate_etc) chrony changed from 992 to 993
~~~
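
To check other hosts for the same removal, a small Python sketch like the one below can scan imgbased.log for `remediate_etc` entries that touch a systemd `.wants` directory. The function name `find_removed_units` and both regular expressions are assumptions derived only from the log lines quoted above; they are not part of imgbased.

```python
import re

# Patterns for the two remediate_etc message shapes seen in the excerpt.
# They are assumptions based on those sample lines only.
_PLANNED = re.compile(r"\(remediate_etc\) Planning to remove (?P<path>\S+)")
_UNLINKED = re.compile(r"\(remediate_etc\) os\.unlink\((?P<path>[^)]+)\)")


def find_removed_units(log_lines):
    """Return (planned, unlinked) lists of paths under a systemd *.wants dir."""
    planned, unlinked = [], []
    for line in log_lines:
        for pattern, bucket in ((_PLANNED, planned), (_UNLINKED, unlinked)):
            match = pattern.search(line)
            if match and ".wants/" in match.group("path"):
                bucket.append(match.group("path"))
    return planned, unlinked


sample = [
    "2020-12-02 18:12:04,751 [DEBUG] (remediate_etc) Planning to remove "
    "/etc/systemd/system/multi-user.target.wants/chronyd.service",
    "2020-12-02 18:12:10,440 [DEBUG] (remediate_etc) os.unlink(/tmp/mnt.J7x6l"
    "////etc/systemd/system/multi-user.target.wants/chronyd.service)",
    "2020-12-02 18:12:26,999 [DEBUG] (migrate_etc) chrony changed from 995 to 996",
]
planned, unlinked = find_removed_units(sample)
print(planned)
print(unlinked)
```

On a real host the same function can be fed an open file object for the log, since it only iterates over lines.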

- While on 4.4.2, the service is active/running/enabled:
~~~
# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2020-11-16 18:38:32 UTC; 2 weeks 1 days ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
 Main PID: 1311 (chronyd)
    Tasks: 1 (limit: 204376)
   Memory: 2.7M
   CGroup: /system.slice/chronyd.service
           └─1311 /usr/sbin/chronyd

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
~~~


The file removed is a symlink:

~~~
# ll /etc/systemd/system/multi-user.target.wants/chronyd.service
lrwxrwxrwx. 1 root root 39 Sep 29 21:52 /etc/systemd/system/multi-user.target.wants/chronyd.service -> /usr/lib/systemd/system/chronyd.service

# ll /usr/lib/systemd/system/chronyd.service
-rw-r--r--. 1 root root 491 May 22  2019 /usr/lib/systemd/system/chronyd.service
~~~



==============================

After updating to 4.4.3 and rebooting:


~~~
# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:chronyd(8)
           man:chrony.conf(5)

# ll /etc/systemd/system/multi-user.target.wants/chronyd.service
ls: cannot access '/etc/systemd/system/multi-user.target.wants/chronyd.service': No such file or directory
~~~
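
The before/after difference comes down to whether the enablement symlink exists. A minimal Python sketch of that check is shown here; the helper name `unit_is_wanted` and its `root` parameter are hypothetical, with `root` allowing the check to be pointed at a mounted image layer rather than the running system. It only looks at multi-user.target.wants, so `systemctl is-enabled` remains the authoritative answer on a live host.

```python
import os
import tempfile

# Relative path of the directory that held the chronyd symlink in this report.
WANTS_DIR = "etc/systemd/system/multi-user.target.wants"


def unit_is_wanted(unit, root="/"):
    """True if <root>/etc/systemd/system/multi-user.target.wants/<unit> is a symlink."""
    return os.path.islink(os.path.join(root, WANTS_DIR, unit))


# Demonstration against a throwaway directory tree, not the live system.
with tempfile.TemporaryDirectory() as fake_root:
    wants = os.path.join(fake_root, WANTS_DIR)
    os.makedirs(wants)
    link = os.path.join(wants, "chronyd.service")
    os.symlink("/usr/lib/systemd/system/chronyd.service", link)
    before = unit_is_wanted("chronyd.service", root=fake_root)
    os.unlink(link)  # mirrors the os.unlink call logged by remediate_etc
    after = unit_is_wanted("chronyd.service", root=fake_root)

print(before, after)
```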

Comment 1 Sandro Bonazzola 2020-12-03 07:56:14 UTC
Reducing severity to high since you can still re-enable chronyd. Asaf, can you please check why this service is being disabled on upgrade?

Comment 3 peyu 2020-12-04 04:10:15 UTC
QE reproduced this issue.

Test Steps:
1. Install RHVH-4.4-20200930.0-RHVH-x86_64-dvd1.iso
2. Check the chronyd service status
   # systemctl status chronyd
3. Set up local repo and point to "redhat-virtualization-host-4.4.3-20201116.0.el8_3"
4. Add host to RHVM
5. Upgrade the host via RHVM
6. Check the chronyd service status again after upgrade

Test results:
chronyd.service is inactive.
~~~~~~
# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:chronyd(8)
           man:chrony.conf(5)
~~~~~~
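
For repeated test runs, the enabled and active states can be pulled out of captured `systemctl status` text with a sketch like the following. The function name `parse_status` and the regular expressions are assumptions based on the output format shown in this report; the human-readable format is not a stable interface, so `systemctl is-enabled` and `systemctl is-active` are the more robust choice when the host can be queried directly.

```python
import re

# Patterns follow the "Loaded:" and "Active:" lines as they appear in this report.
_LOADED = re.compile(r"^\s*Loaded:\s+\S+\s+\((?P<path>[^;]+);\s*(?P<enabled>[^;]+);")
_ACTIVE = re.compile(r"^\s*Active:\s+(?P<state>\S+)\s+\((?P<sub>[^)]+)\)")


def parse_status(text):
    """Extract enablement and activity fields from 'systemctl status' output."""
    result = {}
    for line in text.splitlines():
        loaded = _LOADED.match(line)
        if loaded:
            result["enabled"] = loaded.group("enabled").strip()
        active = _ACTIVE.match(line)
        if active:
            result["active"] = active.group("state")
            result["sub"] = active.group("sub")
    return result


captured = """\
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:chronyd(8)
           man:chrony.conf(5)
"""
status = parse_status(captured)
print(status)
```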

Comment 4 peyu 2020-12-14 07:20:38 UTC
This issue does not occur when upgrading RHVH from "redhat-virtualization-host-4.4.3-20201116.0.el8_3" to the latest "redhat-virtualization-host-4.4.3-20201210.0.el8_3".

Comment 9 peyu 2021-01-18 02:20:31 UTC
Pending New Build

Comment 12 peyu 2021-03-09 05:17:38 UTC
QE verified this issue on "redhat-virtualization-host-4.4.4-20210307.0.el8_3".

Test Steps:
1. Install RHVH-4.4-20210202.0-RHVH-x86_64-dvd1.iso
2. Check the chronyd service status
   # systemctl status chronyd
3. Set up local repo and point to "redhat-virtualization-host-4.4.4-20210307.0.el8_3"
4. Add host to RHVM
5. Upgrade the host via RHVM
6. Check the chronyd service status again after upgrade

Test results:
After upgrade, chronyd.service is active.
~~~~~~
# imgbase w
You are on rhvh-4.4.4.2-0.20210307.0+1

# imgbase layout
rhvh-4.4.4.1-0.20210201.0
 +- rhvh-4.4.4.1-0.20210201.0+1
rhvh-4.4.4.2-0.20210307.0
 +- rhvh-4.4.4.2-0.20210307.0+1

# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-03-09 00:02:46 EST; 8min ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 1766 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 1736 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1751 (chronyd)
    Tasks: 1 (limit: 203464)
   Memory: 5.7M
   CGroup: /system.slice/chronyd.service
           └─1751 /usr/sbin/chronyd

Mar 09 00:02:45 localhost.localdomain chronyd[1751]: Using right/UTC timezone to obtain leap second data
Mar 09 00:02:46 localhost.localdomain systemd[1]: Started NTP client/server.
Mar 09 00:02:51 localhost.localdomain chronyd[1751]: Source 10.66.127.10 offline
Mar 09 00:02:51 localhost.localdomain chronyd[1751]: Source 134.226.81.3 offline
Mar 09 00:02:51 localhost.localdomain chronyd[1751]: Source 10.5.26.10 offline
Mar 09 00:02:58 dell-per730-35.lab.eng.pek2.redhat.com chronyd[1751]: Source 10.66.127.10 online
Mar 09 00:02:58 dell-per730-35.lab.eng.pek2.redhat.com chronyd[1751]: Source 134.226.81.3 online
Mar 09 00:02:58 dell-per730-35.lab.eng.pek2.redhat.com chronyd[1751]: Source 10.5.26.10 online
Mar 09 00:03:04 dell-per730-35.lab.eng.pek2.redhat.com chronyd[1751]: Selected source 10.5.26.10
Mar 09 00:03:04 dell-per730-35.lab.eng.pek2.redhat.com chronyd[1751]: System clock TAI offset set to 37 seconds
~~~~~~

Comment 15 errata-xmlrpc 2021-03-23 18:51:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Virtualization Host security, bug fix and enhancement update (4.4.4-2)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0976