Bug 1295574

Summary: Cloud or Infrastructure provider replicated from another region won't be deleted after that region's vmdb crash
Product: Red Hat CloudForms Management Engine
Reporter: Fabien CAMBI <fcambi>
Component: Replication
Assignee: Nick Carboni <ncarboni>
Status: CLOSED ERRATA
QA Contact: Alex Newman <anewman>
Severity: medium
Priority: medium
Version: 5.5.0
CC: dajohnso, gtanzill, jhardy, jprause, ncarboni, obarenbo
Target Milestone: GA
Target Release: 5.6.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: 5.6.0.0
Doc Type: Bug Fix
Last Closed: 2016-06-29 15:25:06 UTC
Type: Bug

Description Fabien CAMBI 2016-01-04 22:29:25 UTC
Title: Cloud or Infrastructure provider replicated from a non-master zone won't be deleted after the crash of that zone.

Describe the issue:
In a multi-region environment (at minimum two EVM appliances), when you activate database replication from one region (say, region 1) to the master region (say, region 99), all information from region 1 appears in region 99.

If region 1 (specifically its VMDB) crashes and you don't rebuild it the same way (same IP and same FQDN, for example), you won't be able to delete the provider(s) from the crashed region unless a working appliance remains in that region. If you try to delete them from inside region 99, this message pops up:
"The selected Provider is not in the current region".

Also, there is no indicator that region 1 is not available anymore. Region 99 still thinks region 1 is up.


Suggestions for improvement:
1/ One should be able to force-delete a provider from a crashed region. If that is too dangerous (because you could break a working environment), document a procedure to delete it properly and manually from the master VMDB.
2/ The master VMDB should know the state of the other DBs synchronized to it.


Additional information:

Comment 2 Fabien CAMBI 2016-01-04 22:31:46 UTC
Erratum:
-- Title: Cloud or Infrastructure provider replicated from a non-master zone won't be deleted after the crash of that zone.
++ Title: Cloud or Infrastructure provider replicated from another region won't be deleted after that region's vmdb crash

Comment 3 Nick Carboni 2016-02-29 16:45:09 UTC
https://github.com/ManageIQ/manageiq/pull/7004

This will add a rake task that can be run to remove a selected region's data from a local database.

It can be run as `rake evm:dbsync:destroy_local_region <region_number>`

It will also refuse to destroy the region of the database it is currently running on.
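
A minimal sketch of how such a task could be wired up, assuming helpers like MiqRegion.my_region_number and MiqRegion.destroy_region (the argument handling and names here are guesses; the PR above has the merged implementation):

# lib/tasks/evm_dbsync.rake -- hypothetical sketch, not the merged code
namespace :evm do
  namespace :dbsync do
    desc "Remove a replicated region's data from the local database"
    task :destroy_local_region => :environment do
      region_number = Integer(ARGV[1])  # e.g. rake evm:dbsync:destroy_local_region 1

      # Safety check: never destroy the region this database itself belongs to.
      if region_number == MiqRegion.my_region_number
        raise "Refusing to destroy the local database's own region (#{region_number})"
      end

      MiqRegion.destroy_region(ActiveRecord::Base.connection, region_number)
      exit  # stop rake from treating the region number as another task name
    end
  end
end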

Comment 4 CFME Bot 2016-02-29 21:20:30 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/0bf7b0c7570937fb2eaa327de6cc48df28a3d3b0

commit 0bf7b0c7570937fb2eaa327de6cc48df28a3d3b0
Author:     Nick Carboni <ncarboni>
AuthorDate: Mon Feb 29 11:39:00 2016 -0500
Commit:     Nick Carboni <ncarboni>
CommitDate: Mon Feb 29 11:39:00 2016 -0500

    Create a task which will remove another region's data from the local database
    
    If a user has replication configured and removes a regional database,
    that region's data will be "stuck" in the master database.
    
    The new task evm:dbsync:destroy_local_region takes a region number and
    uses the same logic as destroy_remote_region to remove the data, but locally.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1295574

 app/models/miq_region.rb        | 16 ++++++++++++++++
 app/models/miq_region_remote.rb | 16 +---------------
 lib/tasks/evm_dbsync.rake       | 15 +++++++++++++++
 3 files changed, 32 insertions(+), 15 deletions(-)
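
Per the commit message, the heavy lifting moves into MiqRegion so the same row-deletion logic can run against either a remote connection (destroy_remote_region) or the local one. A hedged sketch of what that shared helper might look like, using the per-region id ranges described earlier (method and constant names are assumptions, not the merged code):

# app/models/miq_region.rb -- hypothetical sketch of the shared helper
class MiqRegion < ActiveRecord::Base
  RAILS_SEQUENCE_FACTOR = 1_000_000_000_000  # id range reserved per region

  # Delete every row belonging to region_number from every table reachable
  # over the given connection. Passing the local connection makes this the
  # local counterpart of what destroy_remote_region already did remotely.
  def self.destroy_region(connection, region_number)
    lo = region_number * RAILS_SEQUENCE_FACTOR
    hi = lo + RAILS_SEQUENCE_FACTOR - 1

    connection.tables.each do |table|
      pk = connection.primary_key(table)
      next if pk.nil?  # skip tables without a surrogate primary key

      connection.execute(
        "DELETE FROM #{connection.quote_table_name(table)} " \
        "WHERE #{connection.quote_column_name(pk)} BETWEEN #{lo} AND #{hi}"
      )
    end
  end
end

With a helper like that in place, the rake task is just a thin wrapper that validates the region number and passes in the local connection.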

Comment 6 errata-xmlrpc 2016-06-29 15:25:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1348