Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1643036

Summary: [downstream clone - 4.2.8] Default timeout for upgrade flow is too short
Product: Red Hat Enterprise Virtualization Manager
Reporter: RHV bug bot <rhv-bugzilla-bot>
Component: ovirt-ansible-roles
Assignee: Ondra Machacek <omachace>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Petr Kubica <pkubica>
Severity: medium
Docs Contact:
Priority: unspecified
Version: unspecified
CC: amashah, lleistne, lsvaty, mgoldboi, michal.skrivanek, mperina, omachace, ratamir
Target Milestone: ovirt-4.2.8
Keywords: Automation, AutomationBlocker, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1619199
Environment:
Last Closed: 2018-11-26 13:27:36 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1619199
Bug Blocks:

Description RHV bug bot 2018-10-25 12:04:31 UTC
+++ This bug is an upstream to downstream clone. The original bug is: +++
+++   bug 1619199 +++
======================================================================

Description of problem:
The default timeout for the whole upgrade is 1200 s, which should be enough for installing packages. However, the upgrade process also migrates VMs off the host being upgraded and, since version 4.2, reboots the host at the end of the upgrade. Rebooting a physical host can take up to 10 minutes.
From this point of view, 1200 s is too short for upgrading a machine.

This bug was found on a clean hosted-engine deployment with no running VMs other than the hosted_engine VM.

All parts of the upgrade (moving to maintenance and migrating VMs, upgrading packages, rebooting) should be accounted for in the default timeout.

Note: the time to move into maintenance could be estimated from the number of VMs running on the host being upgraded.
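The estimate suggested above could be sketched as a play that derives a per-host timeout from the VM count. This is only an illustration: the per-VM migration cost (120 s) and reboot allowance (600 s) are assumed values, not figures taken from ovirt-ansible-roles.

```yaml
# Hypothetical sketch: derive an upgrade timeout from the number of
# running VMs on the host. All numeric costs here are assumptions
# for illustration, not values from the actual role.
- name: Estimate per-host upgrade timeout
  hosts: localhost
  gather_facts: false
  vars:
    running_vm_count: 5             # e.g. obtained via ovirt_vm_info
    package_upgrade_seconds: 1200   # original default, package install only
    per_vm_migration_seconds: 120   # assumed cost to migrate one VM away
    reboot_seconds: 600             # physical reboot can take ~10 minutes
  tasks:
    - name: Compute a timeout covering migration, upgrade, and reboot
      ansible.builtin.set_fact:
        upgrade_timeout: >-
          {{ package_upgrade_seconds
             + running_vm_count * per_vm_migration_seconds
             + reboot_seconds }}
```

With the assumed numbers, a host running 5 VMs would get 1200 + 5*120 + 600 = 2400 s, already double the old default.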


Version-Release number of selected component (if applicable):
rhv-4.2.6-3

(Originally by Petr Kubica)

Comment 1 RHV bug bot 2018-10-25 12:04:35 UTC
It should rather watch progress and bail out in case the VMs cannot migrate away, or reboot takes longer than 15mins, or pkgs download takes more than x, etc.

(Originally by michal.skrivanek)

Comment 4 RHV bug bot 2018-10-25 12:04:44 UTC
Let's change the default to 60 minutes, because there is no reliable way to compute a correct timeout. If 60 minutes is not enough, users need to set their own timeout.
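A user-side override might look like the sketch below. The role name matches the upstream Galaxy role, but the timeout variable name (shown here as `upgrade_timeout`) and the engine connection variables are assumptions that should be checked against the role's documentation:

```yaml
# Hypothetical example of overriding the default timeout when running
# the cluster-upgrade role; variable names are assumptions, not
# confirmed parameters of the role.
- name: Upgrade cluster with an extended timeout
  hosts: localhost
  gather_facts: false
  vars:
    engine_url: https://engine.example.com/ovirt-engine/api
    engine_user: admin@internal
    engine_password: "{{ vault_engine_password }}"
    cluster_name: production
    upgrade_timeout: 3600   # 60 minutes instead of the old 1200 s default
  roles:
    - oVirt.cluster-upgrade
```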

(Originally by Martin Perina)

Comment 5 RHV bug bot 2018-10-25 12:04:47 UTC
Verified in 
ovirt-ansible-cluster-upgrade-1.1.8-0.1.master.20180925135108.el7.noarch

(Originally by Petr Kubica)

Comment 6 RHV bug bot 2018-10-25 12:04:49 UTC
QE verification bot: the bug was verified upstream

(Originally by Raz Tamir)

Comment 7 Sandro Bonazzola 2018-10-26 08:04:32 UTC
This bug is targeted to 4.2.7; can you check whether the fix is included in the packages about to be released and move the bug to QE accordingly?

Comment 10 Martin Perina 2018-10-26 09:24:03 UTC
Moving the bug to 4.2.8 until we receive all the requested data and can continue the investigation.

Comment 11 Martin Perina 2018-11-26 13:27:36 UTC
Closing as insufficient data. If the issue is still reproducible with ovirt-ansible-cluster-upgrade-1.1.8, please reopen and attach the relevant logs.