Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1728617

Summary: upgrade of host fails on timeout after 30 minutes
Product: [oVirt] ovirt-engine
Reporter: Kobi Hakimi <khakimi>
Component: Frontend.Core
Assignee: Ondra Machacek <omachace>
Status: CLOSED CURRENTRELEASE
QA Contact: Petr Matyáš <pmatyas>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.3.5.3
CC: bugs, dagur, lleistne, lsvaty, mperina, omachace, pelauter, pmatyas
Target Milestone: ovirt-4.4.0
Keywords: AutomationBlocker, ZStream
Target Release: ---
Flags: pm-rhel: ovirt-4.4+, pelauter: planning_ack+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
The default maximum timeout for an ansible-playbook executed from the engine has been raised from 30 to 120 minutes. This timeout is defined by the configuration option ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT in /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf. Administrators who need to change the timeout can create the file /etc/ovirt-engine/engine.conf.d/99-ansible-timeout.conf with the following content: ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT=NNN, where NNN is the timeout in minutes.
Story Points: ---
Clone Of:
: 1765161
Environment:
Last Closed: 2020-05-20 20:02:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1765161
Attachments:
Description Flags
engine log none
ansible host deploy log file none

Description Kobi Hakimi 2019-07-10 09:35:04 UTC
Created attachment 1589048
engine log

Description of problem:
Upgrade of a host fails with a timeout after 30 minutes.

Version-Release number of selected component (if applicable):
ovirt-engine-ui-extensions-1.0.6-1.el7ev.noarch
ovirt-engine-4.3.5.3-0.1.el7.noarch

How reproducible:
33% (1 host out of 3 failed)

Steps to Reproduce:
1. Deploy a 4.2 engine and add 3 hosts.
2. Upgrade the engine to 4.3.
3. Upgrade the hosts to 4.3 (in our case via the REST API host upgrade).

Actual results:
A host failed with an Ansible timeout error in engine.log:
2019-07-09 13:51:41,781+03 ERROR [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-3) [hosts_syncAction_7dc517e8-0819-42b8] Ansible playbook execution failed: Timeout occurred while executing Ansible playbook.
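
To check which hosts or runs hit this failure, engine.log can be scanned for the error shown above. The following is a minimal sketch, assuming the log line format matches the single line quoted in this report; the function name find_ansible_timeouts and the regular expression are illustrative and are not part of ovirt-engine. The bracketed token after the thread name is returned as-is, without assuming what it means.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: this pattern is derived only from the single engine.log line
# quoted in this report; other engine versions may format the line differently.
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2,4}) ERROR "
    r"\[org\.ovirt\.engine\.core\.common\.utils\.ansible\.AnsibleExecutor\].*"
    r"\[(?P<tag>[^\]]+)\] Ansible playbook execution failed: Timeout occurred"
)


def find_ansible_timeouts(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, bracketed tag) for each Ansible timeout error line."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append((match.group("ts"), match.group("tag")))
    return hits
```

The function accepts any iterable of lines, so an open file object for engine.log can be passed directly.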

Expected results:
The timeout should be increased so that the host upgrade can finish without failing.

Additional info:
We previously had the following bug related to cluster update:
https://bugzilla.redhat.com/show_bug.cgi?id=1697301

The timeout is defined in:
https://github.com/oVirt/ovirt-engine/blob/master/packaging/services/ovirt-engine/ovirt-engine.conf.in#L649
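
The Doc Text of this bug describes overriding the timeout through a drop-in file, /etc/ovirt-engine/engine.conf.d/99-ansible-timeout.conf. A minimal sketch of that file follows; the value 180 is only an illustrative assumption and is not a value recommended anywhere in this report.

```
# /etc/ovirt-engine/engine.conf.d/99-ansible-timeout.conf
# Timeout in minutes; 180 is an example value chosen for illustration.
ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT=180
```

The engine service most likely has to be restarted before it picks up the new value; this report does not state that explicitly, so treat it as an assumption.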

Comment 1 Kobi Hakimi 2019-07-10 09:36:01 UTC
Created attachment 1589050
ansible host deploy log file

Comment 4 Kobi Hakimi 2019-07-11 07:38:12 UTC
Just to make my upgrade flow clearer:
 - deployed rhv-4.2.10-1 with rhel-7.6
 - upgraded to rhv-4.3.5-5 and to rhel 7.7

Comment 10 Petr Matyáš 2019-10-21 07:19:20 UTC
With ovirt-engine-4.3.7.0-0.1.el7.noarch the timeout is still 30 minutes, which is not enough, and our upgrade failed again.

Comment 12 Martin Perina 2019-10-29 13:32:28 UTC
*** Bug 1759478 has been marked as a duplicate of this bug. ***

Comment 13 Petr Matyáš 2020-01-06 16:26:14 UTC
Verified on ovirt-engine-4.4.0-0.13.master.el7.noarch

Comment 14 Sandro Bonazzola 2020-05-20 20:02:30 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.