Bug 1644605

Summary:	Provide backend support for cluster-upgrade role execution
Product:	Red Hat CloudForms Management Engine	Reporter:	Martin Perina <mperina>
Component:	Providers	Assignee:	Boriso <bodnopoz>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Angelina Vasileva <anikifor>
Severity:	medium	Docs Contact:
Priority:	high
Version:	5.10.0	CC:	dmetzger, gblomqui, jfrey, jhardy, mperina, obarenbo, simaishi
Target Milestone:	GA
Target Release:	5.10.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	5.10.0.28	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-02-12 16:52:58 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	RHEVM	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1570563

Description Martin Perina 2018-10-31 08:52:40 UTC

This is preparation for the RFE described in BZ1570563. In this bug we add the backend code required to execute the ovirt.cluster-upgrade role for a selected cluster. QE can verify this functionality using the console (detailed steps will follow), but we don't want to announce it as a ready feature until a proper UI is created for it.

Comment 1 Dave Johnson 2018-10-31 09:05:07 UTC

Please assess the impact of this issue and update the severity accordingly.  Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity for a reminder on each severity's definition.

If it's something like a tracker bug where it doesn't matter, please set the severity to Low.

Comment 5 Boriso 2018-12-03 11:05:31 UTC

Related PRs:
https://github.com/ManageIQ/manageiq/pull/18108
https://github.com/ManageIQ/manageiq/pull/18220
https://github.com/ManageIQ/manageiq/pull/18230
https://github.com/ManageIQ/manageiq/pull/18229
https://github.com/ManageIQ/manageiq-providers-ovirt/pull/313

Comment 6 CFME Bot 2018-12-03 15:21:03 UTC

New commits detected on ManageIQ/manageiq/hammer:

https://github.com/ManageIQ/manageiq/commit/8fb49aa604bfd052bfce9f4e44ed78bf21d95a32
commit 8fb49aa604bfd052bfce9f4e44ed78bf21d95a32
Author:     Adam Grare <agrare>
AuthorDate: Thu Nov 22 11:51:16 2018 -0500
Commit:     Adam Grare <agrare>
CommitDate: Thu Nov 22 11:51:16 2018 -0500

    Merge pull request #18229 from borod108/rfe/add_ca_ansible

    Pass CA when upgrading cluster through Ansible

    (cherry picked from commit 0598dc16c7e3db8e7ee32a0724e53dd9831bc9cc)

    https://bugzilla.redhat.com/show_bug.cgi?id=1644605

 app/models/ems_cluster/cluster_upgrade.rb | 7 +-
 spec/factories/ext_management_system.rb | 12 +
 spec/models/ems_cluster_spec.rb | 5 +-
 3 files changed, 19 insertions(+), 5 deletions(-)


https://github.com/ManageIQ/manageiq/commit/aa3b07422c46d5e11a7ee4b7236230b4b2dfbbea
commit aa3b07422c46d5e11a7ee4b7236230b4b2dfbbea
Author:     Adam Grare <agrare>
AuthorDate: Thu Nov 29 08:50:41 2018 -0500
Commit:     Adam Grare <agrare>
CommitDate: Thu Nov 29 08:50:41 2018 -0500

    Merge pull request #18230 from borod108/rfe/upgrade_cluster_role_options

    Add job_timeout parameter for upgrade_cluster

    (cherry picked from commit 0e1006bce4c40edaa7059247886ad08524f97b8c)

    https://bugzilla.redhat.com/show_bug.cgi?id=1644605

 app/models/ems_cluster/cluster_upgrade.rb | 4 +-
 spec/models/ems_cluster_spec.rb | 5 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

Comment 7 CFME Bot 2018-12-03 15:23:44 UTC

New commit detected on ManageIQ/manageiq-providers-ovirt/hammer:

https://github.com/ManageIQ/manageiq-providers-ovirt/commit/8a867a2f46a58c47bb4673d921e71d284c1a7f53
commit 8a867a2f46a58c47bb4673d921e71d284c1a7f53
Author:     Adam Grare <agrare>
AuthorDate: Thu Nov 22 08:57:26 2018 -0500
Commit:     Adam Grare <agrare>
CommitDate: Thu Nov 22 08:57:26 2018 -0500

    Merge pull request #313 from borod108/rfe/ca_str

    Add support for CA for Ansible role

    (cherry picked from commit aff6bd3a56909955b7f421c0ff1a8a746472eec5)

    https://bugzilla.redhat.com/show_bug.cgi?id=1644605

 app/models/manageiq/providers/redhat/ansible_role_workflow.rb | 27 +
 spec/models/manageiq/providers/redhat/ansible_role_workflow_spec.rb | 51 +
 2 files changed, 78 insertions(+)

Comment 8 Boriso 2018-12-04 05:24:24 UTC

Another related PR:
https://github.com/ManageIQ/manageiq-providers-ovirt/pull/312

Comment 9 CFME Bot 2018-12-04 13:17:02 UTC

New commit detected on ManageIQ/manageiq-providers-ovirt/hammer:

https://github.com/ManageIQ/manageiq-providers-ovirt/commit/0caa93ee4c27228db62cbc3f367e94ff1e7b5b07
commit 0caa93ee4c27228db62cbc3f367e94ff1e7b5b07
Author:     Piotr Kliczewski <piotr.kliczewski>
AuthorDate: Wed Nov 21 03:58:40 2018 -0500
Commit:     Piotr Kliczewski <piotr.kliczewski>
CommitDate: Wed Nov 21 03:58:40 2018 -0500

    Merge pull request #312 from borod108/ref/conn_details

    Refactor connection method

    (cherry picked from commit 4f66532b2cb51a7197b888aa8c08f10fef94ea7c)

    https://bugzilla.redhat.com/show_bug.cgi?id=1644605

 app/models/manageiq/providers/redhat/infra_manager/api_integration.rb | 21 +-
 1 file changed, 12 insertions(+), 9 deletions(-)

Comment 10 Ilanit Stein 2018-12-24 08:08:46 UTC

Tested on Dec 11. CFME-5.10.0.28/RHV-4.2.8-1

Using the following commands, which trigger an RHV cluster upgrade from CFME:

# vmdb
# rails c
# e = EmsCluster.where(name:'<cluster name>').first
# e.upgrade_cluster()

I ran the following test cases.

Severe issues are marked with ***, less severe issues with **.

1.
Environment:
4.2 cluster, one host with no available updates (vdsm of RHV-4.2.8-1).
Run:
e = EmsCluster.where(name:'golden_env_mixed_1').first
e.upgrade_cluster()
Result:
Task finished with the message: Playbook completed with no errors
The upgrade was indeed run on the single host.

This worked OK

2.
Environment:
Following a host reinstall (after a vdsm downgrade from RHV-4.2.8-1 to RHV-4.2.7-9),
the CFME refresh failed (on the CFME RHV provider details page, the Last refresh message contained an error).
Run:
e.upgrade_cluster()
Result:
No task was triggered, and there was no indication of what went wrong.

** Please consider fixing this, so that there is an indication that the cluster upgrade can't be run.
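
One way such an indication could be produced is a precondition check that raises a descriptive error instead of returning silently. The following is a minimal sketch only, not the ManageIQ implementation: ProviderState, UpgradePreconditionError and ensure_upgradable! are hypothetical names, and the assumption is that the last refresh error is available as a string.

```ruby
# Hypothetical sketch; none of these names come from ManageIQ.
# ProviderState stands in for whatever object exposes the last refresh error.
ProviderState = Struct.new(:last_refresh_error)

class UpgradePreconditionError < StandardError; end

# Raise a descriptive error when the last provider refresh failed,
# so the caller sees why no upgrade task was started.
def ensure_upgradable!(provider)
  error = provider.last_refresh_error.to_s
  return if error.empty?

  raise UpgradePreconditionError,
        "Cluster upgrade not started: last provider refresh failed (#{error})"
end
```

With a check like this, calling the upgrade after a failed refresh would surface an error in the console rather than doing nothing.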

3.
Environment: 2 hosts, only one of which has updates (vdsm of RHV-4.2.7-9) and a running VM.
Run upgrade cluster with check_upgrade: "true":
e.upgrade_cluster(check_upgrade:"true")
Result:
The VM migrated to the 2nd host, and the host with updates was upgraded.
There was no attempt to upgrade the 2nd host.
e.upgrade_cluster(check_upgrade:"true")

This worked as expected.

** After setting the check_upgrade variable to "true",
I could no longer change it back to the default value, check_upgrade:"false".
Running the upgrade on a cluster whose hosts have no updates
no longer actually ran the cluster upgrade, as it had before.
It would be good to solve this issue, to allow further testing of the cluster upgrade with role variable values other than the defaults.

** The CFME cluster upgrade will upgrade the host even if there are no updates (this is when the check_upgrade variable is left untouched, at "false").
This behavior differs from the RHV UI, which does not allow such an action: the upgrade operation is blocked in the RHV UI in this case.
I think it is worth considering making the CFME behavior consistent with the RHV UI.
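
The check_upgrade override issue above is about an option that appears to stick between calls. A minimal sketch of per-call option handling follows, assuming the default check_upgrade value of "false" reported in this comment; DEFAULT_UPGRADE_OPTIONS and build_upgrade_options are hypothetical names and this is not the actual upgrade_cluster code.

```ruby
# Hypothetical sketch; not the ManageIQ upgrade_cluster implementation.
# The defaults are frozen and merged into a fresh hash on every call,
# so an override passed once cannot leak into a later call.
DEFAULT_UPGRADE_OPTIONS = { :check_upgrade => "false" }.freeze

def build_upgrade_options(overrides = {})
  DEFAULT_UPGRADE_OPTIONS.merge(overrides)
end

first_run  = build_upgrade_options(:check_upgrade => "true")
second_run = build_upgrade_options
puts first_run[:check_upgrade]
puts second_run[:check_upgrade]
```

Because Hash#merge returns a new hash, the second call starts again from the defaults regardless of what the first call passed.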

4. Environment: Cluster with 2 hosts: one updated, the second not updated
and running a VM that is assigned to the host and allows only manual migration.
Run:
e.upgrade_cluster()
Results:
- RHV does not show any error in the UI in this case. The upgrade is simply not done.
engine.log shows that the host fails to move to maintenance since the VMs are not migratable.
Filed RHV bug 1658186 for this.
- On CFME, the task finished with the message: Playbook completed with no errors

*** The CFME task should not show such a message in this case.
*** CFME does not indicate that the cluster upgrade actually didn't run.

5. Environment: Cluster with 2 hosts: one with no updates, and a second with updates and a running VM
that can be migrated (assigned to a host, but allowing manual and automatic migration).
Run:
e.upgrade_cluster()
Results:
- For an unclear reason, the VM migration failed. I filed RHV bug 1658179 for it.
That caused the host's move to maintenance to fail, which in turn failed the upgrade on the RHV side.
- Still, on CFME the task finished with the message: Playbook completed with no errors

*** It seems CFME does not propagate the cluster upgrade failure.
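
Test cases 4 and 5 both end with a success message although no upgrade took place. A minimal sketch of a task status that also accounts for hosts that were left un-upgraded is shown below. Everything in it is an assumption for illustration: UpgradeResult, its fields and task_status are hypothetical, and how CFME would learn which hosts were not upgraded is not covered here.

```ruby
# Hypothetical sketch; the structure and field names are assumptions.
# playbook_rc:        exit code of the playbook run
# hosts_not_upgraded: names of hosts that still need the upgrade afterwards
UpgradeResult = Struct.new(:playbook_rc, :hosts_not_upgraded)

# Derive a status and message from both the playbook exit code
# and the actual outcome on the hosts.
def task_status(result)
  if result.playbook_rc != 0
    ["Error", "Playbook failed with exit code #{result.playbook_rc}"]
  elsif result.hosts_not_upgraded.any?
    hosts = result.hosts_not_upgraded.join(", ")
    ["Warn", "Playbook completed, but these hosts were not upgraded: #{hosts}"]
  else
    ["Ok", "Playbook completed with no errors"]
  end
end
```

The point of the sketch is only that a zero exit code alone would not be reported as success when hosts remain un-upgraded.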

Comment 11 Ilanit Stein 2019-01-09 12:39:57 UTC

Moving the bug to Verified based on the support requirements of the feature for this version, and the above testing.