This clone is supposed to track the backport to handle unexpected issues and exceptions better.
Bug tickets must have version flags set prior to targeting them to a release. Please ask the maintainer to set the correct version flags and only then set the target milestone.
Feel free to push a backport sooner, but from my perspective this is fixed in 4.1, and the only thing it affects is bug 1362618, which is hopefully covered by bug 1392903. So this backport would just be a reassurance for HE stability.
Works for me with these components.

On the hosts:
libvirt-client-2.0.0-10.el7.x86_64
qemu-kvm-rhev-2.6.0-27.el7.x86_64
ovirt-hosted-engine-ha-2.0.4-1.el7ev.noarch
ovirt-setup-lib-1.0.2-1.el7ev.noarch
ovirt-imageio-daemon-0.4.0-0.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
mom-0.5.8-1.el7ev.noarch
vdsm-4.18.15.3-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.0.3-2.el7ev.noarch
sanlock-3.4.0-1.el7.x86_64
ovirt-host-deploy-1.5.3-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-imageio-common-0.3.0-0.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
Linux 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

On the engine:
rhevm-setup-plugins-4.0.0.3-1.el7ev.noarch
rhevm-4.0.6.1-0.1.el7ev.noarch
rhevm-spice-client-x86-msi-4.0-3.el7ev.noarch
rhev-release-4.0.6-3-001.noarch
rhevm-dependencies-4.0.0-1.el7ev.noarch
rhev-guest-tools-iso-4.0-6.el7ev.noarch
rhevm-spice-client-x64-msi-4.0-3.el7ev.noarch
rhevm-branding-rhev-4.0.0-5.el7ev.noarch
rhevm-guest-agent-common-1.0.12-3.el7ev.noarch
rhevm-doc-4.0.6-1.el7ev.noarch
Linux 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

I deployed a clean rhevm-appliance-20161116.0-1.el7ev.noarch (el7.3 based) over NFS, upgraded the engine to the latest components listed above, then added an NFS data storage domain so that hosted_storage was auto-imported into the engine's WebUI. I then added an additional hosted engine host via the WebUI.

Ran at least 8 iterations of steps 1-10 (a rough automation sketch of this loop follows below):
1. HE-VM running on alma03 with HA score 3400.
2. Set alma03 into maintenance via the WebUI.
3. HE-VM migrated to alma04 successfully; alma03 went into maintenance with HA score 0 and was seen in local maintenance from both CLI and WebUI.
4. Activated alma03 back without any issues and the host became active again.
5. Waited a few minutes for alma03 to reach an HA score of 3400.
6. HE-VM running on alma04 with HA score 3400.
7. Set alma04 into maintenance via the WebUI.
8. HE-VM migrated to alma03 successfully; alma04 went into maintenance with HA score 0 and was seen in local maintenance from both CLI and WebUI.
9. Activated alma04 back without any issues and the host became active again.
10. Waited a few minutes for alma04 to reach an HA score of 3400.

I did not run into the initially reported issue, hence moving this bug to VERIFIED.
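For reference, a minimal sketch of automating the maintenance/activation loop above, assuming the oVirt Python SDK v3 (ovirtsdk, matching the ovirt-engine-sdk-python-3.6.9.1 package listed). The engine URL, credentials, sleep/timeout values and host names are illustrative only, and the deactivate()/activate() calls stand in for the WebUI actions actually used during verification.

# Hedged sketch of the alma03/alma04 maintenance loop, not the exact
# procedure used for verification (which was driven through the WebUI).
import time
from ovirtsdk.api import API

api = API(url='https://engine.example.com/ovirt-engine/api',  # illustrative URL/credentials
          username='admin@internal', password='secret', insecure=True)

def wait_until_up(name, timeout=600):
    # Poll the engine until the host reports status 'up' again.
    deadline = time.time() + timeout
    while time.time() < deadline:
        if api.hosts.get(name=name).status.state == 'up':
            return True
        time.sleep(10)
    return False

for iteration in range(8):
    for host_name in ('alma03', 'alma04'):
        api.hosts.get(name=host_name).deactivate()  # steps 2/7: maintenance
        time.sleep(120)                             # let the HE-VM migrate (steps 3/8)
        api.hosts.get(name=host_name).activate()    # steps 4/9: activate back
        wait_until_up(host_name)                    # steps 5/10: wait before the next round
        # The HA score itself (3400 / 0) is reported by `hosted-engine --vm-status`
        # on the hosts, not by this loop.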
Changing back to ASSIGNED, as the bug was eventually reproduced after the ~10th iteration, when I did not wait for the target host to become active with a positive score: I tried to set the host with the HE-VM on it (alma04) into maintenance while alma03 was active but did not yet have a positive HA score. Attaching sosreports from my environment.
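For comparison, here is a minimal sketch of the precondition that was skipped in the failing iteration, i.e. checking that the other host already reports a positive HA score before putting the host running the HE-VM into maintenance. It assumes the JSON output mode of `hosted-engine --vm-status` and that each per-host entry exposes "hostname" and "score" fields; the flag and field names, as well as the host name, are assumptions for illustration rather than taken from the attached logs.

# Hedged sketch: only deactivate the HE host once the migration target
# reports a positive HA score.
import json
import subprocess

def ha_scores():
    # Assumes `hosted-engine --vm-status --json` is available on the host.
    out = subprocess.check_output(['hosted-engine', '--vm-status', '--json'])
    status = json.loads(out)
    # One dict entry per HA host; skip non-dict values such as global flags.
    return {v.get('hostname'): v.get('score', 0)
            for v in status.values() if isinstance(v, dict)}

target = 'alma03'  # illustrative migration target
scores = ha_scores()
if scores.get(target, 0) > 0:
    print('%s reports score %s, safe to move the HE host to maintenance' % (target, scores[target]))
else:
    print('%s has no positive HA score yet, wait before entering maintenance' % target)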
Target release should be set once a package build is known to fix an issue. Since this bug is not in the MODIFIED state, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
Created attachment 1225889 [details] sosreport from alma03
Created attachment 1225892 [details] sosreport from alma04
Created attachment 1225893 [details] sosreport from engine
Moving this back to VERIFIED: what I found will be documented and tracked in a separate bug, and this exact issue itself was not reproduced.
See also https://bugzilla.redhat.com/show_bug.cgi?id=1069269, which was opened following https://bugzilla.redhat.com/show_bug.cgi?id=1391933#c5 by msivak.
Sorry, I added the wrong link; this is the correct one: https://bugzilla.redhat.com/show_bug.cgi?id=1399766.