1455145 – [RFE][M-5] Add functionality to suspend a provider

Summary: [RFE][M-5] Add functionality to suspend a provider

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat CloudForms Management Engine
Classification:	Red Hat
Component:	Providers
Sub Component:
Version:	5.7.0
Hardware:	All
OS:	All
Priority:	high
Severity:	high
Target Milestone:	MVP
Target Release:	5.11.0
Assignee:	Martin Slemr
QA Contact:	Matouš Mojžíš
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1590064

Reported:	2017-05-24 11:06 UTC by Ryan Spagnola
Modified:	2020-08-13 09:14 UTC
CC List:	17 users
Fixed In Version:	5.11.0.1
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-12-12 13:33:19 UTC
Category:	Feature
Cloudforms Team:	CFME Core
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:4199	0	None	None	None	2019-12-12 13:33:36 UTC

Description Ryan Spagnola 2017-05-24 11:06:50 UTC

Description of problem:
It is not possible to temporarily stop/suspend a provider, including all its workers, while maintenance is running (for example, network maintenance leading to connectivity issues to the provider).
Usually there are a lot of error messages in the logs related to workers that belong to the provider in maintenance.

It is only possible to Delete the provider, but that is not practical, as it deletes all related configuration items stored in the database.


Version-Release number of selected component (if applicable):
5.7.z

How reproducible:
all of the time

Steps to Reproduce:
1. Place the provider in maintenance mode

Actual results:
worker errors

Expected results:
a way to stop/pause/suspend a provider that is in maintenance mode

Additional info:

Comment 3 Bronagh Sorota 2017-06-16 13:39:03 UTC

Hi Brad,
Assigning this to you for evaluation.

Thanks
Bronagh

Comment 4 Bronagh Sorota 2017-06-16 16:07:48 UTC

Ryan,
I am told that the approach we recommend is to create an "In Maintenance" zone and place the provider in it, effectively taking away its credentials. I understand you may have started a tech note on this, is that correct?

Thanks
Bronagh

Comment 5 Bronagh Sorota 2017-06-16 16:09:51 UTC

Hi Dave,
I am assigning this to you, hoping the recommendation in comment 4 above can be tested.

Thanks
Bronagh

Comment 6 vaclav.miller 2017-06-16 16:36:44 UTC

Hello Bronagh,

I am wondering how an "in maintenance" zone will help.

Try to imagine there are more zones in a region and each zone has more than one appliance taking care of more than one provider. I will have 2 standalone providers called "A" and "B" connected (the type of provider does not matter; it may be the same, but they serve different requirements).

In the "Maintenance" zone scenario, if I want to put provider "B" into maintenance, I will have to move at least one appliance to this zone and deactivate its workers (or keep a spare appliance there). But this can lead to many issues. For example, removing an appliance from the "working" zone will cost some performance within that zone. In the database, the relation of the provider to the zone will change (probably many things must be reindexed). Additionally, the person who is responsible for this operation can make a mistake: activate the appliance in a different zone than required, forget to activate some worker, etc.

In my opinion, the only feasible solution is to have an option to stop / start a provider (so that it stops / starts its workers), similarly to how:
- workers are started when provider is connected (appliance started)
- workers are restarted when "Check authentication" action is initiated
- workers are stopped when appliance is stopped or provider deleted

Kind regards,
Vaclav

Comment 8 Bronagh Sorota 2017-06-30 15:03:30 UTC

Jan,
Why did the customer not like the workaround that was presented by Dave Johnson:
"...the customer should create a "parking" zone with no appliances in it to "park" the environment.  Since the zone has no appliances, all management should stop until the provider is moved back to its zone with appliances.  Hope that makes sense. "

Bronagh

Comment 9 vaclav.miller 2017-06-30 18:59:56 UTC

Hi Bronagh,

We are going to add client-dedicated environments to our implementation (client = provider). In the initial phase (July) there will be more than 10 providers; later in the year additional clients will be enabled (migration from the current "legacy" solution).

There are several technical reasons for this feature requirement:
- Testing on CFME 5.7.3 (released two days ago) showed that if the provider is already unavailable, its credentials must be validated before it can be saved in the "parking" zone. So this approach does not work (I can imagine it would be possible, in the case of planned maintenance, to park the provider before it is disconnected).
- If there are more providers in the zone (for example, we have 3 vCenters managed by one vCD), it is not possible to just disable workers because one provider is under maintenance.
Additionally, from a business perspective I would expect that an enterprise-ready application supports not only provider Registration, Deletion and "some workaround", but Stop and Start as well.

Kind regards,
Vaclav

Comment 11 CFME Bot 2018-05-25 07:42:12 UTC

https://github.com/ManageIQ/manageiq/pull/17452

Comment 12 Roman Blanco 2018-05-29 13:28:16 UTC

I've discussed with colleagues how to properly label the new buttons, and we agreed on using "Suspend" / "Resume".

While checking the toolbars where the feature should be added, I found a toolbar on the containers providers screen that already has buttons for the feature, introduced in manageiq-ui-classic/2603 [1].
Those buttons use the terminology "Pause" / "Resume".

I'm not sure which version is correct for this case.

I've tried to find the answer in Patternfly Terminology and Wording [2], but the information is missing (an issue was created though, the icons might be an issue as well [3]).

The OpenStack documentation [4] explains the difference between "pause" and "suspend" for VMs, but I'm not sure it is applicable in this case as well.

For now, I'll use the same terminology as in container providers to keep consistency. Later I can update the PR with correct labels for the new buttons and fix the merged ones as well.

Roman

[1] https://github.com/ManageIQ/manageiq-ui-classic/pull/2603
[2] http://www.patternfly.org/styles/terminology-and-wording/
[3] https://github.com/patternfly/patternfly-design/issues/670
[4] https://wiki.openstack.org/wiki/Kvm-Pause-Suspend

Comment 13 Roman Blanco 2018-05-31 13:40:18 UTC

Toolbar buttons for the functionality and notification for summary view added in:

* https://github.com/ManageIQ/manageiq/pull/17500
* https://github.com/ManageIQ/manageiq-ui-classic/pull/4012

Comment 14 Martin Slemr 2018-05-31 13:57:00 UTC

Issue with PR relations: https://github.com/ManageIQ/manageiq/issues/17489

Comment 15 CFME Bot 2018-06-18 14:45:07 UTC

https://github.com/ManageIQ/manageiq-schema/pull/222

Comment 16 CFME Bot 2018-06-18 20:46:24 UTC

https://github.com/ManageIQ/manageiq/pull/17602

Comment 17 Martin Slemr 2018-07-09 13:17:42 UTC

https://github.com/ManageIQ/manageiq/pull/17602 is not related - mistake

Comment 18 Dan Clarizio 2018-07-17 14:36:58 UTC

UI PR: https://github.com/ManageIQ/manageiq-ui-classic/pull/4269

Comment 19 CFME Bot 2018-07-27 10:18:30 UTC

https://github.com/ManageIQ/manageiq-api/pull/434

Comment 20 Josh Carter 2018-09-18 13:26:53 UTC

Dear customer, 

The CloudForms team is reviewing the current CloudForms RFE (Request for Enhancement) backlog in order to improve our responsiveness to customers. We are closing any requests for versions no longer within full support (link below to the lifecycle) or that do not have a clear spot on the product roadmap. We are committing to better management of the backlog as we move forward. If you have an RFE that you still have a strong business case for, please open a new BZ against the currently supported version 4.6.

Lifecycle page: https://access.redhat.com/support/policy/updates/cloudforms

If you have any concerns about this, please let us know.

Thanks and regards!

Comment 22 CFME Bot 2018-10-01 11:56:01 UTC

https://github.com/ManageIQ/manageiq/pull/18037

Comment 24 CFME Bot 2018-12-18 14:45:50 UTC

New commit detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/352ddb4f1b54186eef5fd9273828f6ee87a3250c
commit 352ddb4f1b54186eef5fd9273828f6ee87a3250c
Author:     Martin Slemr <mslemr>
AuthorDate: Mon May 21 09:51:29 2018 -0400
Commit:     Martin Slemr <mslemr>
CommitDate: Mon May 21 09:51:29 2018 -0400

    Pause/Resume EMS

    Enables/Disables ems with children and puts to maintenance zone

    https://bugzilla.redhat.com/show_bug.cgi?id=1455145

 app/models/ext_management_system.rb | 61 +-
 app/models/miq_queue.rb | 1 +
 app/models/zone.rb | 12 +
 spec/models/ext_management_system_spec.rb | 47 +
 spec/models/zone_spec.rb | 2 +-
 5 files changed, 111 insertions(+), 12 deletions(-)
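
The commit message above describes pausing as disabling the EMS together with its children and moving it to a maintenance zone. A minimal plain-Ruby sketch of that idea follows. It is not the ManageIQ implementation: the `Zone` struct, the `Provider` class, the `MAINTENANCE_ZONE` constant, and the `refresh` method are all hypothetical names chosen for illustration, and remembering the original zone so that it can be restored on resume is an assumption.

```ruby
# Illustrative model only; every name here is hypothetical, not ManageIQ code.
Zone = Struct.new(:name)

# Assumed: a single zone with no appliances, used to park paused providers.
MAINTENANCE_ZONE = Zone.new("maintenance")

class Provider
  attr_reader :name, :zone, :enabled, :children

  def initialize(name, zone, children = [])
    @name = name
    @zone = zone
    @enabled = true
    @children = children
    @zone_before_pause = nil
  end

  def paused?
    !@enabled
  end

  # Disable this provider and its children and park them in the maintenance zone.
  def pause!
    return if paused?
    @zone_before_pause = @zone
    @zone = MAINTENANCE_ZONE
    @enabled = false
    @children.each(&:pause!)
  end

  # Re-enable this provider and its children and restore the remembered zone.
  def resume!
    return unless paused?
    @zone = @zone_before_pause
    @zone_before_pause = nil
    @enabled = true
    @children.each(&:resume!)
  end

  # Stand-in for inventory refresh: nothing happens while the provider is paused.
  def refresh
    paused? ? :skipped : :refreshed
  end
end
```

Unlike the "parking zone" workaround discussed earlier in this report, the provider in this sketch keeps all of its configuration and only changes its enabled flag and zone, so no credentials need to be re-validated by hand.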

Comment 25 CFME Bot 2018-12-21 17:42:45 UTC

New commit detected on ManageIQ/manageiq-api/master:

https://github.com/ManageIQ/manageiq-api/commit/cc4dd1b981d6a6ebeb872dc72a95f847b5054fb5
commit cc4dd1b981d6a6ebeb872dc72a95f847b5054fb5
Author:     Dávid Halász <dhalasz>
AuthorDate: Fri Jul 27 06:09:13 2018 -0400
Commit:     Dávid Halász <dhalasz>
CommitDate: Fri Jul 27 06:09:13 2018 -0400

    Use the new universal methods for suspending/resuming a provider

    https://bugzilla.redhat.com/show_bug.cgi?id=1455145

 app/controllers/api/providers_controller.rb | 4 +-
 1 file changed, 2 insertions(+), 2 deletions(-)
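
The commit above only states that the API providers controller now calls the universal suspend/resume methods. As a hedged illustration of how a client might trigger such an action, the sketch below builds (but does not send) a request. The endpoint shape `/api/providers/:id`, the action names `pause` and `resume`, the host name, and the helper `provider_action_request` are assumptions, not taken from this report; check the API documentation for your version before relying on them.

```ruby
require "json"
require "net/http"
require "uri"

# Builds, without sending, a POST that asks the API to run an action on one
# provider. Endpoint path and action names are assumptions (see the lead-in).
def provider_action_request(base_url, provider_id, action)
  uri = URI.join(base_url, "/api/providers/#{provider_id}")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate("action" => action)
  request
end

# Hypothetical appliance URL and provider id.
pause_request = provider_action_request("https://cfme.example.com", 42, "pause")
```

Sending the request would additionally require authentication and TLS settings suitable for the appliance, which are left out here on purpose.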

Comment 26 Adam Grare 2019-01-31 14:33:09 UTC

Hey Martin/David can this be moved to POST?

Comment 27 Dávid Halász 2019-01-31 14:46:26 UTC

No, Martin is still testing a PR for foreman/ansible: https://github.com/ManageIQ/manageiq-ui-classic/pull/5173

Comment 28 Martin Slemr 2019-02-04 12:59:07 UTC

https://github.com/ManageIQ/manageiq-providers-ansible_tower/pull/155 added to support frontend on ansible

Comment 29 CFME Bot 2019-02-14 09:48:01 UTC

New commit detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/0634007d7c47ff18524a935296237aca5fade759
commit 0634007d7c47ff18524a935296237aca5fade759
Author:     Martin Slemr <mslemr>
AuthorDate: Thu Jan 24 07:15:03 2019 -0500
Commit:     Martin Slemr <mslemr>
CommitDate: Thu Jan 24 07:15:03 2019 -0500

    EMS.enable!/disable! removed from public

    Replaced by pause!/resume! instead

    https://bugzilla.redhat.com/show_bug.cgi?id=1455145

 app/models/ext_management_system.rb | 18 +-
 1 file changed, 6 insertions(+), 12 deletions(-)
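
The change described above narrows the public interface: callers use `pause!`/`resume!`, while `enable!`/`disable!` are no longer public. A small plain-Ruby sketch of that pattern follows. The class name `ManagedSystem` is hypothetical, and the method bodies are simplified stand-ins rather than the code from `ext_management_system.rb`.

```ruby
# Hypothetical class; it shows only the public/private split, not real EMS logic.
class ManagedSystem
  attr_reader :enabled

  def initialize
    @enabled = true
  end

  # Public entry points.
  def pause!
    disable!
  end

  def resume!
    enable!
  end

  private

  # Internal helpers, no longer callable from outside the object.
  def enable!
    @enabled = true
  end

  def disable!
    @enabled = false
  end
end
```

With this layout, code outside the class that still calls `disable!` directly fails with `NoMethodError`, which makes leftover callers easy to find.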

Comment 30 CFME Bot 2019-02-19 17:57:44 UTC

New commit detected on ManageIQ/manageiq-providers-ansible_tower/master:

https://github.com/ManageIQ/manageiq-providers-ansible_tower/commit/329999d8130588877c075bb204d6ebf5d428ca8f
commit 329999d8130588877c075bb204d6ebf5d428ca8f
Author:     Martin Slemr <mslemr>
AuthorDate: Mon Feb  4 07:28:04 2019 -0500
Commit:     Martin Slemr <mslemr>
CommitDate: Mon Feb  4 07:28:04 2019 -0500

    Changed provider zone when EMS paused/resumed

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1455145

 app/models/manageiq/providers/ansible_tower/shared/automation_manager.rb | 11 +
 app/models/manageiq/providers/ansible_tower/shared/provider.rb | 2 +-
 spec/support/ansible_shared/automation_manager.rb | 37 +
 3 files changed, 49 insertions(+), 1 deletion(-)

Comment 31 Dávid Halász 2019-02-20 03:44:58 UTC

The previous commit isn't the last one, just wrongly contains the fixes keyword, setting back to ON_DEV.

Comment 33 Martin Slemr 2019-02-26 10:27:34 UTC

List of all related PRs:
https://github.com/ManageIQ/manageiq/issues/17489

Comment 35 Matouš Mojžíš 2019-07-10 11:35:13 UTC

Verified in 5.11.0.13. Suspended a VMware provider and then resumed it after several hours. While suspended, no items were changed and no provider refresh was triggered. After resuming, items were changed and a refresh was triggered automatically.

Comment 37 errata-xmlrpc 2019-12-12 13:33:19 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4199
