Description of problem:
Upgrading RHV-H restores some hosted-engine configuration from previous layers. A host that was undeployed gets its deployment configuration back after the upgrade, which reverts the undeployment and makes the host join the HA cluster again.

Version-Release number of selected component (if applicable):
Several, including 4.1.8 Async; see the reproduction steps for specifics.

How reproducible:
Always

Steps to Reproduce:

1) Install host with 4.1.7 image, deploying Hosted-Engine

Results:
IMAGE: rhvh-4.1-0.20171101.0:
# ll /etc/ovirt-hosted-engine/hosted-engine.conf
-rw-r--r--. 1 root root 1089 Jan 18 14:50 /etc/ovirt-hosted-engine/hosted-engine.conf
# systemctl status ovirt-ha-agent | grep Active
   Active: active (running) since Thu 2018-01-18 14:55:49 AEST; 1min 58s ago

2) Switch host to maintenance and upgrade to 4.1.8 (not async yet), reboot.

# yum install redhat-virtualization-host-image-update-4.1-20171207.0.el7_4

Results:
IMAGE: rhvh-4.1-0.20171207.0+1
# ll /etc/ovirt-hosted-engine/hosted-engine.conf
-rw-r--r--. 1 root root 1089 Jan 18 14:50 /etc/ovirt-hosted-engine/hosted-engine.conf
# systemctl status ovirt-ha-agent | grep Active
   Active: active (running) since Thu 2018-01-18 15:27:03 AEST; 1min 38s ago

3) Undeploy Hosted Engine (Maintenance -> Re-install -> Hosted-Engine -> Undeploy)

Results:
# ls /etc/ovirt-hosted-engine/
hosted-engine.conf.20180118152956  virsh_auth.conf
# systemctl status ovirt-ha-agent | grep Loaded
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)

So all good until this point. The file was moved away and the HA daemon was disabled and stopped. Now comes the problem.

4) Upgrade to latest (4.1.8 async as of today) and reboot

Results:
IMAGE: rhvh-4.1-0.20180102.0+1
# ls /etc/ovirt-hosted-engine
hosted-engine.conf  hosted-engine.conf.20180118152956  virsh_auth.conf
# systemctl status ovirt-ha-agent | egrep 'Loaded|Active'
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-01-18 16:00:43 AEST; 1min 41s ago

At step 4, the hosted-engine.conf file should not be there and the HA services should be disabled. So the Hosted-Engine configuration came back from the dead.

imgbase logs:

First upgrade:
2018-01-18 15:13:49,180 [DEBUG] (migrate_etc) Calling binary: (['cp', '-a', '-r', u'/tmp/mnt.l56vW///etc/systemd/system/multi-user.target.wants/ovirt-ha-agent.service', u'/tmp/mnt.gERGc///etc/systemd/system/multi-user.target.wants/ovirt-ha-agent.service'],) {}
2018-01-18 15:13:49,707 [DEBUG] (migrate_etc) Calling binary: (['cp', '-a', '-r', u'/tmp/mnt.l56vW///etc/ovirt-hosted-engine/hosted-engine.conf', u'/tmp/mnt.gERGc///etc/ovirt-hosted-engine/hosted-engine.conf'],) {}

Second upgrade:
2018-01-18 15:46:36,547 [DEBUG] (migrate_etc) Calling binary: (['cp', '-a', '-r', u'/tmp/mnt.lbXZw///etc/systemd/system/multi-user.target.wants/ovirt-ha-agent.service', u'/tmp/mnt.2zgs8///etc/systemd/system/multi-user.target.wants/ovirt-ha-agent.service'],) {}
2018-01-18 15:46:37,610 [DEBUG] (migrate_etc) Calling binary: (['cp', '-a', '-r', u'/tmp/mnt.lbXZw///etc/ovirt-hosted-engine/hosted-engine.conf', u'/tmp/mnt.2zgs8///etc/ovirt-hosted-engine/hosted-engine.conf'],) {}
2018-01-18 15:46:37,614 [DEBUG] (migrate_etc) Calling binary: (['cp', '-a', '-r', u'/tmp/mnt.lbXZw///etc/ovirt-hosted-engine/hosted-engine.conf.20180118152956', u'/tmp/mnt.2zgs8///etc/ovirt-hosted-engine/hosted-engine.conf.20180118152956'],) {}

Actual results:
HE Deployment came back

Expected results:
HE Undeployed

Additional info:
This is similar to what was reported in BZ1501047, now for hosted engine.
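The symptom in this report (a file removed in the running layer reappearing after an upgrade) can be modelled with a toy example. The sketch below is not imgbase code; it only assumes, for illustration, that a migration replays /etc from every earlier layer, oldest first, without honouring deletions. The function names `migrate_all_layers` and `migrate_latest_only` are hypothetical, and the layers are plain temporary directories.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Sequence, Tuple


def copy_layer_etc(src: Path, dst: Path) -> None:
    """Copy every regular file under src into dst, overwriting existing files."""
    for path in sorted(src.rglob("*")):
        if path.is_file():
            target = dst / path.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)


def migrate_all_layers(layers: Sequence[Path], new_layer: Path) -> None:
    """Hypothetical naive migration: replay every earlier layer, oldest first."""
    for layer in layers:
        copy_layer_etc(layer, new_layer)


def migrate_latest_only(layers: Sequence[Path], new_layer: Path) -> None:
    """Hypothetical alternative: copy only the newest layer, so deletions hold."""
    copy_layer_etc(layers[-1], new_layer)


def demo() -> Tuple[bool, bool]:
    """Return whether hosted-engine.conf exists after each migration style."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        old, current = root / "old", root / "current"

        # Old layer: Hosted Engine is deployed.
        (old / "ovirt-hosted-engine").mkdir(parents=True)
        (old / "ovirt-hosted-engine" / "hosted-engine.conf").write_text("deployed\n")

        # Current layer: undeployed, the conf file was moved away.
        (current / "ovirt-hosted-engine").mkdir(parents=True)
        backup = current / "ovirt-hosted-engine" / "hosted-engine.conf.20180118152956"
        backup.write_text("deployed\n")

        naive, latest = root / "naive", root / "latest"
        naive.mkdir()
        latest.mkdir()
        migrate_all_layers([old, current], naive)
        migrate_latest_only([old, current], latest)

        conf = Path("ovirt-hosted-engine") / "hosted-engine.conf"
        return (naive / conf).exists(), (latest / conf).exists()


if __name__ == "__main__":
    print(demo())
```

Under this assumption, the replay-all variant ends up with both hosted-engine.conf and the timestamped backup, which matches the directory listing in step 4, while copying only the newest layer keeps the undeployed state.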
Can reproduce.

Additional host in the HostedEngine cluster:

1. Deploy HostedEngine on one host, then add another host to the HostedEngine cluster as a hosted-engine additional host
---------------------------------------------------------------------
[root@localhost ~]# imgbase w
You are on rhvh-4.1-0.20171101.0+1
ll /etc/ovirt-hosted-engine
total 8
-rw-r--r--. 1 root root 1039 Jan 22 15:12 hosted-engine.conf
-rw-------. 1 root root  103 Nov  2 04:34 virsh_auth.conf
[root@localhost ~]# systemctl status ovirt-ha-agent | grep Active
   Active: active (running) since Mon 2018-01-22 15:12:45 CST; 32min ago
---------------------------------------------------------------------

2. Upgrade to 4.1.8 (not async yet), reboot.
---------------------------------------------------------------------
[root@hp-dl385pg8-11 ~]# imgbase w
You are on rhvh-4.1-0.20171207.0+1
[root@hp-dl385pg8-11 ~]# ll /etc/ovirt-hosted-engine
total 8
-rw-r--r--. 1 root root 1039 Jan 22 15:12 hosted-engine.conf
-rw-------. 1 root root  103 Dec  8 04:16 virsh_auth.conf
[root@hp-dl385pg8-11 ~]# systemctl status ovirt-ha-agent | grep Active
   Active: active (running) since Mon 2018-01-22 16:00:24 CST; 1min 18s ago
---------------------------------------------------------------------

3. Undeploy Hosted Engine (Maintenance -> Re-install -> Hosted-Engine -> Undeploy)
---------------------------------------------------------------------
ll /etc/ovirt-hosted-engine
total 8
-rw-r--r--. 1 root root 1039 Jan 22 15:12 hosted-engine.conf.20180122160414
-rw-------. 1 root root  103 Dec  8 04:16 virsh_auth.conf
# systemctl status ovirt-ha-agent | grep Active
   Active: failed (Result: exit-code) since Mon 2018-01-22 16:05:41 CST; 1min 21s ago
# systemctl status ovirt-ha-agent | grep Loaded
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)
---------------------------------------------------------------------

4. Upgrade to latest (4.1.8 async as of today) and reboot
---------------------------------------------------------------------
[root@hp-dl385pg8-11 ~]# imgbase w
You are on rhvh-4.1-0.20180102.0+1
[root@hp-dl385pg8-11 ~]# ll /etc/ovirt-hosted-engine
total 12
-rw-r--r--. 1 root root 1039 Jan 22 16:17 hosted-engine.conf
-rw-r--r--. 1 root root 1039 Jan 22 15:12 hosted-engine.conf.20180122160414
-rw-------. 1 root root  103 Jan  3 07:17 virsh_auth.conf
[root@hp-dl385pg8-11 ~]# systemctl status ovirt-ha-agent | egrep 'Loaded|Active'
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2018-01-22 16:29:35 CST; 29s ago
---------------------------------------------------------------------
Created attachment 1384248 [details] /var/log/*
The bug is fixed. Here is the verification process.

Additional host in the HostedEngine cluster:

1. Deploy HostedEngine on one host, then add another host to the HostedEngine cluster as a hosted-engine additional host
---------------------------------------------------------------------
[root@hp-dl385pg8-11 ~]# imgbase w
You are on rhvh-4.1-0.20171101.0+1
ll /etc/ovirt-hosted-engine
total 8
-rw-r--r--. 1 root root 1039 Jan 30 14:00 hosted-engine.conf
-rw-------. 1 root root  103 Nov  2 04:34 virsh_auth.conf
[root@hp-dl385pg8-11 ~]# systemctl status ovirt-ha-agent | grep Active
   Active: active (running) since Tue 2018-01-30 14:00:46 CST; 2min 19s ago
---------------------------------------------------------------------

2. Upgrade to 4.1.8 (not async yet), reboot.
---------------------------------------------------------------------
[root@hp-dl385pg8-11 ~]# imgbase w
You are on rhvh-4.1-0.20171207.0+1
[root@hp-dl385pg8-11 ~]# ll /etc/ovirt-hosted-engine
total 8
-rw-r--r--. 1 root root 1039 Jan 30 14:00 hosted-engine.conf
-rw-------. 1 root root  103 Dec  8 04:16 virsh_auth.conf
[root@hp-dl385pg8-11 ~]# systemctl status ovirt-ha-agent | grep Active
   Active: active (running) since Tue 2018-01-30 14:22:36 CST; 2min 56s ago
---------------------------------------------------------------------

3. Undeploy Hosted Engine (Maintenance -> Re-install -> Hosted-Engine -> Undeploy)
---------------------------------------------------------------------
[root@hp-dl385pg8-11 ~]# ll /etc/ovirt-hosted-engine
total 8
-rw-r--r--. 1 root root 1039 Jan 30 14:00 hosted-engine.conf.20180130142716
-rw-------. 1 root root  103 Dec  8 04:16 virsh_auth.conf
[root@hp-dl385pg8-11 ~]# systemctl status ovirt-ha-agent | grep Active
   Active: failed (Result: exit-code) since Tue 2018-01-30 14:28:07 CST; 1min 18s ago
[root@hp-dl385pg8-11 ~]# systemctl status ovirt-ha-agent | grep Loaded
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)
---------------------------------------------------------------------

4. Upgrade to latest (latest 4.2.1) and reboot
---------------------------------------------------------------------
[root@hp-dl385pg8-11 ~]# imgbase w
You are on rhvh-4.2.1.2-0.20180126.0+1
[root@hp-dl385pg8-11 ~]# ll /etc/ovirt-hosted-engine
total 8
-rw-r--r--. 1 root root 1039 Jan 30 14:00 hosted-engine.conf.20180130142716
-rw-------. 1 root root  103 Jan 26 20:16 virsh_auth.conf
[root@hp-dl385pg8-11 ~]# systemctl status ovirt-ha-agent | egrep 'Loaded|Active'
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
---------------------------------------------------------------------

So, changing this bug's status to verified.
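The manual check after the final upgrade (no hosted-engine.conf, ovirt-ha-agent not enabled) could be scripted roughly as follows. This is a minimal sketch, not part of any oVirt tooling: the helper name `leftover_he_config` is made up for illustration, and it assumes that "enabled" corresponds to the multi-user.target.wants symlink that appears in the imgbase log in the description. It does not check whether the service is currently running.

```python
from pathlib import Path
from typing import List

# Paths relative to /etc, taken from the listings and imgbase log in this report.
HE_CONF = Path("ovirt-hosted-engine/hosted-engine.conf")
HA_AGENT_WANTS = Path("systemd/system/multi-user.target.wants/ovirt-ha-agent.service")


def leftover_he_config(etc_root: str = "/etc") -> List[str]:
    """Return the hosted-engine deployment artifacts still present under etc_root.

    An empty list means the host looks undeployed as far as these two files go.
    """
    root = Path(etc_root)
    found = []
    for rel in (HE_CONF, HA_AGENT_WANTS):
        candidate = root / rel
        # is_symlink() also catches a dangling enablement symlink.
        if candidate.exists() or candidate.is_symlink():
            found.append(str(candidate))
    return found


if __name__ == "__main__":
    leftovers = leftover_he_config()
    if leftovers:
        print("Hosted Engine configuration still present:")
        for path in leftovers:
            print("  " + path)
    else:
        print("Host looks undeployed")
```

Taking the /etc root as a parameter makes it possible to point the same check at a mounted layer instead of the running system.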
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1524