Created attachment 1372137 [details]
playbook log

Description of problem:
When upgrading the hosts from 4.1 to 4.2, the host ends up in 'Install Failed' status.

Version-Release number of selected component (if applicable):
on the host:
vdsm-4.20.9.3-1.el7ev.x86_64
ovirt-host-4.2.0-1.el7ev.x86_64
on the engine:
ovirt-engine-4.2.0.2-0.1.el7.noarch
ovirt-host-deploy-1.7.0-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Upgrade the engine from 4.1 to 4.2
2. Set the repos on the hosts
3. Upgrade via REST or web UI

Actual results:
Host in 'Install Failed' status

Expected results:
Host should get updated

Additional info:
It looks like an ansible-playbook child process is defunct:

[root@jenkins-vm-16 host-deploy]# ps -ef | grep ansible
ovirt 26764 25035 0 12:13 ? 00:00:07 /usr/bin/python2 /usr/bin/ansible-playbook -v --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=/tmp/ansible-inventory1951036924750427419 /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml
ovirt 26775 26764 0 12:13 ? 00:00:00 [ansible-playboo] <defunct>
ovirt 29615 25035 1 13:33 ? 00:00:08 /usr/bin/python2 /usr/bin/ansible-playbook -v --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=/tmp/ansible-inventory7442594928060297520 /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml
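
The defunct entry in the ps output above can also be isolated by filtering on the process state instead of grepping by name. This is only a rough sketch: the helper names defunct_rows and list_defunct are made up for illustration, and it assumes a procps-style ps where a STAT value starting with Z marks a zombie.

```shell
# Hypothetical helpers, not part of oVirt or ansible.

# Filter "pid ppid stat comm" rows (the layout of `ps -eo pid,ppid,stat,comm`)
# down to zombies. The header row is skipped because "STAT" does not start with Z.
defunct_rows() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Print "pid ppid command" for every defunct process on this machine.
list_defunct() {
    ps -eo pid,ppid,stat,comm | defunct_rows
}
```

Running list_defunct on the engine machine while an upgrade hangs would show whether the zombie's parent PID matches the ansible-playbook process started by the engine.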
Created attachment 1372138 [details] engine log
It always fails on the first upgrade attempt, but succeeds when the process is re-triggered.
What was the source 4.1? Did it have ovirt-host installed?
build 4.1.8-5

before upgrade:
ovirt-engine-4.1.8.2-0.1.el7.noarch
ovirt-host-deploy-1.6.7-1.el7ev.noarch
ovirt-host is installed only in 4.2, afaik

engine & host after upgrade:
ovirt-host-4.2.0-1.el7ev.x86_64
ovirt-host-dependencies-4.2.0-1.el7ev.x86_64
ovirt-host-deploy-1.7.0-1.el7ev.noarch
Could you please also provide the host-deploy log? You have provided only the ansible part of the host-deploy log.
I'm afraid I don't have them, and I have been unable to reproduce the issue since.
We have increased the default timeout to 30 minutes, and users can now raise that timeout further.
# grep ANSIBLE /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf
# yum list ovirt-engine
ovirt-engine.noarch 4.2.0.2-0.1.el7 @rhv-4.2.x
-----
# yum list ovirt-engine
ovirt-engine.noarch 4.2.1.1-0.1.el7 @rhv-4.2x
# grep ANSIBLE /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf
ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT=30
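
The check above reads only the packaged defaults file. To see the value that would apply once a user has changed the timeout, something like the sketch below may help. The helper name ansible_timeout is made up for illustration, and two things are assumptions rather than facts from this report: that local overrides live in /etc/ovirt-engine/engine.conf.d/, and that the last assignment read wins.

```shell
# Hypothetical helper, not shipped with ovirt-engine.
# Prints the last value assigned to ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT
# across the given files, read in the order they are passed.
# Assumption: later files override earlier ones.
ansible_timeout() {
    # Missing files are ignored; surrounding double quotes are stripped.
    cat "$@" 2>/dev/null \
        | sed -n 's/^ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT=//p' \
        | tr -d '"' \
        | tail -n 1
}
```

A possible invocation, with the override directory being the assumption noted above: ansible_timeout /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf /etc/ovirt-engine/engine.conf.d/*.conf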
This bugzilla is included in oVirt 4.2.1 release, published on Feb 12th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.