Created attachment 1185418 [details]
See line 31

Description of problem:
During the upgrade of a host via the engine, this log message appears:

2016-07-29 08:49:35,653 INFO [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (DefaultQuartzScheduler2) [2a46ccda] Host 'red' failed to move to maintenance mode. Upgrade process is terminated.

This happens while packages are being upgraded, before yum reports "Downloading Packages".

Version-Release number of selected component (if applicable):
ovirt-engine-backend-4.0.2.1-0.1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Have a host running an old version
2. In the WebAdmin portal, click Upgrade
3. Check engine.log

Actual results:
An INFO message about a failed upgrade is logged on a successful attempt.

Expected results:
If this message appears, the upgrade process should be interrupted rather than continue, and it should be logged at ERROR level; however, I believe this message should not be present at all.

Additional info:
Adding log, see line 31.
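To check whether a given engine.log contains the misleading message (and at which severity it was logged), a grep along these lines can be used. This is a minimal sketch: the log line below is copied from this report into a variable so the example is self-contained; in practice you would grep the engine's actual log file (commonly /var/log/ovirt-engine/engine.log).

```shell
# Sample line copied from this report; on a real engine you would grep
# /var/log/ovirt-engine/engine.log instead of this here-string.
sample_log='2016-07-29 08:49:35,653 INFO  [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (DefaultQuartzScheduler2) [2a46ccda] Host '\''red'\'' failed to move to maintenance mode. Upgrade process is terminated.'

# Count occurrences of the misleading message.
matches=$(printf '%s\n' "$sample_log" | grep -c 'failed to move to maintenance mode')

# Extract the severity (third whitespace-separated field: date, time, level).
level=$(printf '%s\n' "$sample_log" | grep 'failed to move to maintenance mode' | awk '{print $3}')

echo "matches=$matches level=$level"
```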
Ravi, could you please investigate?
Moving to 4.1 for now; if needed, it can be backported to 4.0.z.
The fix for this issue should be included in oVirt 4.1.0 beta 1, released on December 1st. If it is not included, please move the bug back to MODIFIED.
Dec 27, 2016 4:19:29 PM - Failed to upgrade Host puma18.scl.lab.tlv.redhat.com (User: admin@internal-authz).

After the upgrade, the host stays in local maintenance, although it is seen in the UI as down.

puma19 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma18.scl.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : True
crc32                              : 97891d50
local_conf_timestamp               : 6309
Host timestamp                     : 6297
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=6297 (Tue Dec 27 16:28:30 2016)
	host-id=1
	score=0
	vm_conf_refresh_time=6309 (Tue Dec 27 16:28:42 2016)
	conf_on_shared_storage=True
	maintenance=True
	state=LocalMaintenance
	stopped=False

--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma19.scl.lab.tlv.redhat.com
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 81fbb545
local_conf_timestamp               : 4828
Host timestamp                     : 4815
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=4815 (Tue Dec 27 16:28:19 2016)
	host-id=2
	score=3400
	vm_conf_refresh_time=4828 (Tue Dec 27 16:28:31 2016)
	conf_on_shared_storage=True
	maintenance=False
	state=EngineUp
	stopped=False

Manual intervention on host puma18 with "hosted-engine --set-maintenance --mode=none" successfully removed it from local maintenance and it went to up in the CLI, but it still appears as down in WebAdmin.
puma19 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma18.scl.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 17058747
local_conf_timestamp               : 6574
Host timestamp                     : 6562
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=6562 (Tue Dec 27 16:32:55 2016)
	host-id=1
	score=3400
	vm_conf_refresh_time=6574 (Tue Dec 27 16:33:07 2016)
	conf_on_shared_storage=True
	maintenance=False
	state=EngineDown
	stopped=False

--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma19.scl.lab.tlv.redhat.com
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 0c4593ad
local_conf_timestamp               : 5098
Host timestamp                     : 5085
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=5085 (Tue Dec 27 16:32:49 2016)
	host-id=2
	score=3400
	vm_conf_refresh_time=5098 (Tue Dec 27 16:33:01 2016)
	conf_on_shared_storage=True
	maintenance=False
	state=EngineUp
	stopped=False

Clearing the cache and history in the web browser did not help. I set the host into maintenance via WebAdmin and then activated it back; only then did the host come back to active in the UI. The host's "upgrade is available" symbol was still shown next to the host.
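The "Local maintenance" flag that had to be cleared manually can be read straight out of the "hosted-engine --vm-status" output. A minimal parsing sketch over an excerpt of the output pasted above (stored in a variable here so the example is self-contained; on a real host you would pipe the command output directly):

```shell
# Excerpt of "hosted-engine --vm-status" output (copied from this report);
# on a real host, replace the here-string with the live command output.
status_output='Hostname                           : puma18.scl.lab.tlv.redhat.com
Local maintenance                  : True
Hostname                           : puma19.scl.lab.tlv.redhat.com
Local maintenance                  : False'

# List hosts whose "Local maintenance" flag is True.
in_maintenance=$(printf '%s\n' "$status_output" | awk -F': *' '
  /^Hostname/          { host = $2 }
  /^Local maintenance/ { if ($2 == "True") print host }')

echo "in_maintenance=$in_maintenance"
```

A host reported here can then be released manually with "hosted-engine --set-maintenance --mode=none", as was done in this case.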
Components on engine:
ovirt-engine-setup-plugin-ovirt-engine-common-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-imageio-proxy-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch
ovirt-iso-uploader-4.1.0-0.0.master.20160909154152.git14502bd.el7.centos.noarch
ovirt-engine-userportal-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-dbscripts-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-extensions-api-impl-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-imageio-common-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch
ovirt-host-deploy-1.6.0-0.0.master.20161215101008.gitb76ad50.el7.centos.noarch
python-ovirt-engine-sdk4-4.1.0-0.1.a0.20161215git77fce51.el7.centos.x86_64
ovirt-host-deploy-java-1.6.0-0.0.master.20161215101008.gitb76ad50.el7.centos.noarch
ovirt-release41-pre-4.1.0-0.6.beta2.20161221025826.gitc487776.el7.centos.noarch
ovirt-setup-lib-1.1.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.1.2-1.el7.noarch
ovirt-engine-dwh-setup-4.1.0-0.0.master.20161129154019.el7.centos.noarch
ovirt-imageio-proxy-setup-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch
ovirt-engine-tools-backup-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-websocket-proxy-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-setup-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-backend-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-tools-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-webadmin-portal-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-restapi-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-wildfly-overlay-10.0.0-1.el7.noarch
ovirt-engine-cli-3.6.9.2-1.el7.centos.noarch
ovirt-web-ui-0.1.1-2.el7.centos.x86_64
ovirt-engine-setup-base-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.centos.noarch
ovirt-engine-dwh-4.1.0-0.0.master.20161129154019.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-hosts-ansible-inventory-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-engine-dashboard-1.1.0-0.4.20161128git5ed6f96.el7.centos.noarch
ovirt-engine-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-guest-agent-common-1.0.13-1.20161220085008.git165fff1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
ovirt-engine-wildfly-10.1.0-1.el7.x86_64
ovirt-engine-lib-4.1.0-0.3.beta2.20161221085908.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.4-1.el7.centos.noarch

Linux version 3.10.0-514.2.2.el7.x86_64 (builder.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Dec 6 23:06:41 UTC 2016
Linux 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
CentOS Linux release 7.3.1611 (Core)

Components on puma18 (the host I tried to upgrade via the engine):
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
mom-0.5.8-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0-0.0.master.20161221071755.git46cacd3.el7.centos.noarch
ovirt-setup-lib-1.1.0-1.el7.centos.noarch
libvirt-client-2.0.0-10.el7_3.2.x86_64
ovirt-release41-pre-4.1.0-0.6.beta2.20161221025826.gitc487776.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
qemu-kvm-rhev-2.6.0-28.el7_3.2.x86_64
ovirt-hosted-engine-ha-2.1.0-0.0.master.20161221070856.20161221070854.git387fa53.el7.centos.noarch
ovirt-engine-appliance-4.1-20161222.1.el7.centos.noarch
sanlock-3.4.0-1.el7.x86_64
ovirt-host-deploy-1.6.0-0.0.master.20161215101008.gitb76ad50.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-imageio-common-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch
vdsm-4.18.999-1218.gitd36143e.el7.centos.x86_64
ovirt-imageio-daemon-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch

Linux version 3.10.0-514.2.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Wed Nov 16 13:15:13 EST 2016
Linux 3.10.0-514.2.2.el7.x86_64 #1 SMP Wed Nov 16 13:15:13 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

Components on puma19 (the host on which the engine was running; it was upgraded to the latest bits and appeared without the "Upgrade is available" symbol):
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
mom-0.5.8-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0-0.0.master.20161221071755.git46cacd3.el7.centos.noarch
ovirt-setup-lib-1.1.0-1.el7.centos.noarch
libvirt-client-2.0.0-10.el7_3.2.x86_64
ovirt-release41-pre-4.1.0-0.6.beta2.20161221025826.gitc487776.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
qemu-kvm-rhev-2.6.0-28.el7_3.2.x86_64
ovirt-hosted-engine-ha-2.1.0-0.0.master.20161221070856.20161221070854.git387fa53.el7.centos.noarch
rhevm-appliance-20161116.0-1.el7ev.noarch
sanlock-3.4.0-1.el7.x86_64
ovirt-host-deploy-1.6.0-0.0.master.20161215101008.gitb76ad50.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-imageio-common-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch
vdsm-4.18.999-1218.gitd36143e.el7.centos.x86_64
ovirt-imageio-daemon-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch

Linux version 3.10.0-514.2.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Wed Nov 16 13:15:13 EST 2016
Linux 3.10.0-514.2.2.el7.x86_64 #1 SMP Wed Nov 16 13:15:13 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)
Target release should be set once a package build is known to fix an issue. Since this bug is not in MODIFIED, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
Created attachment 1235523 [details] puma18 sosreport
Created attachment 1235524 [details] puma19 sosreport
Created attachment 1235525 [details] logs from the engine
Nikolai, according to the logs you have installed oVirt on RHEL hosts with RHV repositories enabled; that is why the upgrade failed, due to conflicts between qemu-kvm-ev (oVirt) and qemu-kvm-rhev (RHV). If you want to install oVirt on RHEL, you need a clean RHEL without any RHV-related repositories (only the base and optional channels enabled). In any case, raising the error and failing the host upgrade in the engine is valid here.

This bug is only about fixing the host upgrade code, which improperly detected a failure to move the host to maintenance when the host was already in maintenance status (that is why the confusing message "Host xxx failed to move to maintenance mode. Upgrade process is terminated." was displayed even though the upgrade process continued as normal). So I'm moving the bug back to ON_QA.
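The fix described above amounts to not treating "host already in Maintenance" as a failure when the callback polls the host status. A minimal shell sketch of that decision table (the function name, status strings, and return values are hypothetical, chosen for illustration; the real fix lives in the engine's Java HostUpgradeCallback):

```shell
# Hypothetical sketch of the corrected decision in the upgrade callback:
# moving to maintenance is only a failure when the host is neither
# already in Maintenance nor still on its way there.
upgrade_action() {
  case "$1" in
    Maintenance)             echo "continue-upgrade" ;;   # already there: proceed silently
    PreparingForMaintenance) echo "wait" ;;               # still draining VMs: poll again
    *)                       echo "terminate-upgrade" ;;  # genuine failure: stop and log ERROR
  esac
}

upgrade_action "Maintenance"
upgrade_action "PreparingForMaintenance"
upgrade_action "Up"
```

Under this scheme the original report's scenario (host reaches Maintenance before the callback checks) takes the first branch, so no "failed to move to maintenance mode" message is emitted for a successful upgrade.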
No error detected in engine.log; the upgrade finished successfully.

Works for me with these components on the hosts:
libvirt-client-2.0.0-10.el7_3.4.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64
rhevm-appliance-20160721.0-2.el7ev.noarch
mom-0.5.9-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0.3-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
sanlock-3.4.0-1.el7.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
vdsm-4.19.6-1.el7ev.x86_64
ovirt-host-deploy-1.6.0-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
ovirt-setup-lib-1.1.0-1.el7ev.noarch
ovirt-hosted-engine-ha-2.1.0.3-1.el7ev.noarch

Linux version 3.10.0-514.6.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Sat Dec 10 11:15:38 EST 2016
Linux 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

On the engine:
rhev-guest-tools-iso-4.1-3.el7ev.noarch
rhevm-dependencies-4.1.0-1.el7ev.noarch
rhevm-doc-4.1.0-2.el7ev.noarch
rhevm-branding-rhev-4.1.0-1.el7ev.noarch
rhevm-setup-plugins-4.1.0-1.el7ev.noarch
rhevm-4.1.1.2-0.1.el7.noarch

Linux version 3.10.0-514.6.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Fri Feb 17 19:21:31 EST 2017
Linux 3.10.0-514.6.2.el7.x86_64 #1 SMP Fri Feb 17 19:21:31 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)