Bug 1546902 - Replication stop working in global region if child region is switched to standby vmdb
Keywords:
Status: CLOSED DUPLICATE of bug 1391095
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: GA
Target Release: cfme-future
Assignee: Gregg Tanzillo
QA Contact: Alex Newman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-20 00:11 UTC by Giovanni Fontana
Modified: 2018-02-20 22:24 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-02-20 14:19:52 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Description Giovanni Fontana 2018-02-20 00:11:25 UTC
Created attachment 1398051 [details]
Screenshot evidences

Description of problem:
In a multi-region HA environment, when the primary VMDB of a child region becomes unavailable and repmgr and failover-monitor switch the workers to the standby VMDB, replication in the global region stops working and a "500 Internal Server Error" is shown in the Replication tab (see the attached screenshots).

Version-Release number of selected component (if applicable): 5.8.0


How reproducible:
Yes

Steps to Reproduce:
1. Set up a global region and a remote region.
2. Configure the remote region DB for HA.
3. Simulate a failure of the primary DB in the remote region. The standby VMDB is promoted to primary VMDB.
4. Access "Configuration -> Settings -> Region -> Replication tab". The "500 Internal Server Error" is presented.

Actual results:
- Replication stops and a "500 Internal Server Error" is presented.

Expected results:
- The global region should detect that the primary VMDB is down and start working with the standby VMDB, just like failover-monitor does with the workers in the region.
- No "Internal Server Error" should be presented.
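
To illustrate the expected behaviour, here is a minimal sketch of the kind of host selection I have in mind. It is not CloudForms code: the function name `pick_writable_host`, the probe callback, and the example host names are all hypothetical, and a real probe would have to query the database itself.

```python
from typing import Callable, Iterable, Optional


def pick_writable_host(
    candidates: Iterable[str],
    is_primary: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that reports itself as primary, else None."""
    for host in candidates:
        try:
            if is_primary(host):
                return host
        except OSError:
            # Host unreachable: treat it as failed and try the next one.
            continue
    return None


# Hypothetical state after failover: the old primary is down and the
# former standby has been promoted. These host names are made up.
def fake_probe(host: str) -> bool:
    if host == "remote-db-1.example.com":
        raise ConnectionError("host unreachable")
    return host == "remote-db-2.example.com"


hosts = ["remote-db-1.example.com", "remote-db-2.example.com"]
selected = pick_writable_host(hosts, fake_probe)
```

With the failed primary listed first, the sketch skips it and selects the promoted standby, which is what I would expect the global region to do for its subscription.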

Additional info:

Comment 2 luke couzens 2018-02-20 09:48:30 UTC
Is this not a duplicate of 1391095? 

With the way replication/HA currently works, it won't fail over correctly without some virtual IP usage, as stated in that RFE bug.

Comment 3 Giovanni Fontana 2018-02-20 13:15:30 UTC
I think so, except for the "500 Internal Server Error" issue (I didn't see any reference to this error there).

Comment 4 Nick Carboni 2018-02-20 14:19:52 UTC
The 500 error was fixed as part of https://bugzilla.redhat.com/show_bug.cgi?id=1540688 (specifically in https://github.com/ManageIQ/pg-pglogical/pull/20).

Marking this as a duplicate of bug 1391095.

*** This bug has been marked as a duplicate of bug 1391095 ***

Comment 5 Giovanni Fontana 2018-02-20 15:24:47 UTC
Hi Nick! The screenshot I have is a little bit different. Is it also fixed by this PR?

Regards,

Giovanni

