Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1643036

Summary: [downstream clone - 4.2.8] Default timeout for upgrade flow is too short
Product: Red Hat Enterprise Virtualization Manager
Reporter: RHV bug bot <rhv-bugzilla-bot>
Component: ovirt-ansible-roles
Assignee: Ondra Machacek <omachace>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Petr Kubica <pkubica>
Severity: medium
Docs Contact:
Priority: unspecified
Version: unspecified
CC: amashah, lleistne, lsvaty, mgoldboi, michal.skrivanek, mperina, omachace, ratamir
Target Milestone: ovirt-4.2.8
Keywords: Automation, AutomationBlocker, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1619199
Environment:
Last Closed: 2018-11-26 13:27:36 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1619199
Bug Blocks:

Description RHV bug bot 2018-10-25 12:04:31 UTC
+++ This bug is an upstream to downstream clone. The original bug is: +++
+++   bug 1619199 +++
======================================================================

Description of problem:
The default timeout for the whole upgrade is 1200 s, which should be enough for installing packages. However, the upgrade process also migrates VMs off the host being upgraded and, since version 4.2, reboots the host at the end of the upgrade. Rebooting a physical host can take up to 10 minutes.
From this point of view, 1200 s is too short for upgrading a machine.

This bug was found on a clean hosted-engine deployment with no running VMs other than the hosted_engine VM.

All parts of the upgrade (moving to maintenance and migrating VMs, upgrading packages, rebooting) should be accounted for in the default timeout.

Note: the time to move into maintenance could be estimated from the number of VMs running on the host being upgraded.
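The estimate suggested above could be sketched as a play that derives a per-host timeout from the VM count. This is only an illustration: the per-VM migration cost (120 s) and reboot allowance (600 s) are assumed values, not figures taken from ovirt-ansible-roles.

```yaml
# Hypothetical sketch: derive an upgrade timeout from the number of
# running VMs on the host. All numeric costs here are assumptions
# for illustration, not values from the actual role.
- name: Estimate per-host upgrade timeout
  hosts: localhost
  gather_facts: false
  vars:
    running_vm_count: 5             # e.g. obtained via ovirt_vm_info
    package_upgrade_seconds: 1200   # original default, package install only
    per_vm_migration_seconds: 120   # assumed cost to migrate one VM away
    reboot_seconds: 600             # physical reboot can take ~10 minutes
  tasks:
    - name: Compute a timeout covering migration, upgrade, and reboot
      ansible.builtin.set_fact:
        upgrade_timeout: >-
          {{ package_upgrade_seconds
             + running_vm_count * per_vm_migration_seconds
             + reboot_seconds }}
```

With the assumed numbers, a host running 5 VMs would get 1200 + 5*120 + 600 = 2400 s, already double the old default.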


Version-Release number of selected component (if applicable):
rhv-4.2.6-3

(Originally by Petr Kubica)

Comment 1 RHV bug bot 2018-10-25 12:04:35 UTC
It should rather watch progress and bail out in case the VMs cannot migrate away, or reboot takes longer than 15mins, or pkgs download takes more than x, etc.

(Originally by michal.skrivanek)

Comment 4 RHV bug bot 2018-10-25 12:04:44 UTC
Let's change the default to 60 minutes, because there is no reliable way to compute a correct timeout. If 60 minutes is not enough, users need to set their own timeout.
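A user-side override might look like the sketch below. The role name matches the upstream Galaxy role, but the timeout variable name (shown here as `upgrade_timeout`) and the engine connection variables are assumptions that should be checked against the role's documentation:

```yaml
# Hypothetical example of overriding the default timeout when running
# the cluster-upgrade role; variable names are assumptions, not
# confirmed parameters of the role.
- name: Upgrade cluster with an extended timeout
  hosts: localhost
  gather_facts: false
  vars:
    engine_url: https://engine.example.com/ovirt-engine/api
    engine_user: admin@internal
    engine_password: "{{ vault_engine_password }}"
    cluster_name: production
    upgrade_timeout: 3600   # 60 minutes instead of the old 1200 s default
  roles:
    - oVirt.cluster-upgrade
```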

(Originally by Martin Perina)

Comment 5 RHV bug bot 2018-10-25 12:04:47 UTC
Verified in 
ovirt-ansible-cluster-upgrade-1.1.8-0.1.master.20180925135108.el7.noarch

(Originally by Petr Kubica)

Comment 6 RHV bug bot 2018-10-25 12:04:49 UTC
QE verification bot: the bug was verified upstream

(Originally by Raz Tamir)

Comment 7 Sandro Bonazzola 2018-10-26 08:04:32 UTC
This bug is targeted to 4.2.7; can you check whether the fix is included in the packages about to be released and move the bug to QE accordingly?

Comment 10 Martin Perina 2018-10-26 09:24:03 UTC
Moving the bug to 4.2.8 until we receive all the requested data and can continue the investigation.

Comment 11 Martin Perina 2018-11-26 13:27:36 UTC
Closing as insufficient data. If the issue is still reproducible with ovirt-ansible-cluster-upgrade-1.1.8, please reopen and attach the relevant logs.