+++ This bug is an upstream to downstream clone. The original bug is: +++
+++ bug 1728617 +++
======================================================================

Created attachment 1589048 [details]
engine log

Description of problem:
Upgrade of a host fails on timeout after 30 minutes.

Version-Release number of selected component (if applicable):
ovirt-engine-ui-extensions-1.0.6-1.el7ev.noarch
ovirt-engine-4.3.5.3-0.1.el7.noarch

How reproducible:
33% (1 host out of 3 failed)

Steps to Reproduce:
1. Deploy a 4.2 engine and add 3 hosts.
2. Upgrade the engine to 4.3.
3. Upgrade the hosts to 4.3 (in our case via the REST API host upgrade).

Actual results:
The host failed with an Ansible timeout error in engine.log:
2019-07-09 13:51:41,781+03 ERROR [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-3) [hosts_syncAction_7dc517e8-0819-42b8] Ansible playbook execution failed: Timeout occurred while executing Ansible playbook.

Expected results:
The timeout is increased so that the host upgrade can finish without failure.

Additional info:
We previously had the following bug related to cluster update:
https://bugzilla.redhat.com/show_bug.cgi?id=1697301
The timeout is defined in:
https://github.com/oVirt/ovirt-engine/blob/master/packaging/services/ovirt-engine/ovirt-engine.conf.in#L649

(Originally by Kobi Hakimi)
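For anyone triaging a similar failure, a minimal sketch of how the timeout entries could be pulled out of an engine.log is shown here. It assumes the log line layout matches the single sample quoted above; the function name `find_ansible_timeouts` and the regular expression are illustrative and not part of ovirt-engine.

```python
import re
from datetime import datetime

# Message text copied from the engine.log entry quoted in this report.
TIMEOUT_MARKER = "Timeout occurred while executing Ansible playbook"

# Assumed layout, inferred from the one sample line in this report:
# "<date> <time>,<millis><tz> <LEVEL> [<logger>] ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\S*\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]"
)


def find_ansible_timeouts(lines):
    """Return (timestamp, logger) pairs for lines reporting the Ansible timeout."""
    hits = []
    for line in lines:
        if TIMEOUT_MARKER not in line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            # Line does not follow the assumed layout; skip it rather than guess.
            continue
        timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        hits.append((timestamp, match.group("logger")))
    return hits
```

It can be fed an open file object directly, for example `find_ansible_timeouts(open(path))`, where `path` points at the attached engine log.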
Created attachment 1589050 [details] ansible host deploy log file (Originally by Kobi Hakimi)
Just to make my upgrade flow clearer:
- deployed rhv-4.2.10-1 with rhel-7.6
- upgraded to rhv-4.3.5-5 and to rhel 7.7

(Originally by Kobi Hakimi)
With ovirt-engine-4.3.7.0-0.1.el7.noarch the timeout is still 30 minutes, which is not enough, and our upgrade failed again. (Originally by Petr Matyas)
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [Found non-acked flags: '{'rhevm-4.3.z': '?'}', ] For more info please contact: rhv-devops
Verified on ovirt-engine-4.3.7.2-0.1.el7.noarch
Hi Ondra, Please can you review the doc text? The default maximum timeout for an Ansible playbook executed from the engine was 30 minutes. As a result, the upgrade process of the host failed due to the short timeout. In this release the timeout was raised to 120 minutes.
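For environments that still need a different limit, a hedged sketch of generating a local override follows. The key name `ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT` is an assumption about what the ovirt-engine.conf.in line linked in the description defines, and the value is assumed to be in minutes; verify both against the installed configuration before use. The helper name `render_timeout_override` is hypothetical.

```python
# Assumption: this is the key defined at the ovirt-engine.conf.in line linked
# in the bug description, and its value is expressed in minutes.
TIMEOUT_KEY = "ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT"


def render_timeout_override(minutes: int) -> str:
    """Render the contents of a one-line override file for the playbook timeout."""
    if minutes <= 0:
        raise ValueError("timeout must be a positive number of minutes")
    return f"{TIMEOUT_KEY}={minutes}\n"
```

The returned text would be written to a drop-in configuration file for the engine (the exact directory is deployment-specific and not stated in this report), followed by an engine service restart.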
It looks good, thank you.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:4229