Bug 1361218 - RubyRep fails to start after 5.5 -> 5.6 migration
Summary: RubyRep fails to start after 5.5 -> 5.6 migration
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.7.0
Assignee: Nick Carboni
QA Contact: luke couzens
URL:
Whiteboard: black:upgrade:migration:replication
Depends On:
Blocks: 1361610
 
Reported: 2016-07-28 14:11 UTC by luke couzens
Modified: 2017-01-12 04:53 UTC
CC List: 7 users

Fixed In Version: 5.7.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1361610
Environment:
Last Closed: 2017-01-11 20:12:36 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:



Description luke couzens 2016-07-28 14:11:04 UTC
Description of problem: RubyRep is stuck in a restart loop after migrating from 5.5.5.4 to 5.6.1.0. The same issue seems to be present for both the standard migration and the in-place upgrade.


Version-Release number of selected component (if applicable): 5.6.1.0


How reproducible: 100%


Steps to Reproduce:
1. Provision 2x 5.5 appliances
2. Configure the 1st db with region 99 (r99)
3. Configure the 2nd db with region 0 (r0)
4. Log in to the webui of the r0 appliance
5. Set up the replication worker (Configure -> Configuration -> Workers)
6. Point it at the r99 appliance
7. Enable db synchronization (Configure -> Configuration -> Server)
8. Test replication by adding a provider and checking that it also shows up in r99
9. Follow the migration docs to upgrade to 5.6 [0]

Actual results: rubyrep fails to start


Expected results: replication starts correctly


Additional info:
[0] https://access.redhat.com/articles/2297391 - in-place upgrade


evm.log
http://pastebin.test.redhat.com/396957

IPs for standard migration:
rr99 - 10.16.6.208
rr0 - 10.16.6.85

IPs for in-place upgrade:
rr99 - 10.8.199.223
rr0 - 10.16.6.131

Comment 2 Nick Carboni 2016-07-28 14:19:55 UTC
The issue is a unique constraint error on the cloud_subnets_network_ports table. It was caused by a region-agnostic data migration: when run on a global region, it created join table rows for data belonging to a remote region, but assigned those rows ids from the global region's sequence.
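
For context, here is a minimal Ruby sketch of how region-encoded ids come into play. This is illustrative only; the 10**12 id offset per region is standard ManageIQ behaviour and not specific to this bug.

  REGION_FACTOR = 1_000_000_000_000

  def id_range_for_region(region_number)
    (region_number * REGION_FACTOR)..((region_number + 1) * REGION_FACTOR - 1)
  end

  id_range_for_region(0)   # => 0..999_999_999_999                      (remote region r0)
  id_range_for_region(99)  # => 99_000_000_000_000..99_999_999_999_999  (global region r99)

  # The migration, run on the global (r99) database, built join rows for r0's
  # replicated data but gave them ids in the r99 range. When r0 ran the same
  # migration and rubyrep tried to push r0's own copies of those rows, the
  # insert on the global side failed with the unique constraint error.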

This was introduced in https://github.com/ManageIQ/manageiq/pull/7237 

Unfortunately this issue also appears to be present in 5.6.0.

We can document a fix to apply after the migration (see the sketch after the list below):
On the global region:
  - DELETE from cloud_subnets_network_ports;
On each remote region:
  - Stop the replication worker
  - bin/rake evm:dbsync:local_uninstall cloud_subnets_network_ports
  - Start the replication worker
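
As an illustration only (not the official remediation text), the DELETE on the global region could equally be run from the Rails console rather than psql; it is the same statement as in the list above:

  # On the GLOBAL region appliance, from the application root:
  #   bin/rails console
  ActiveRecord::Base.connection.execute("DELETE FROM cloud_subnets_network_ports")

The remote-region steps (stop the replication worker, run the rake task, start the worker) are unchanged.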

I'm also currently working on a fix for the migration itself.

Comment 4 CFME Bot 2016-07-29 13:30:49 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/6c0ab8b32f8f621474588bfe94b62fc25692bc0d

commit 6c0ab8b32f8f621474588bfe94b62fc25692bc0d
Author:     Nick Carboni <ncarboni>
AuthorDate: Thu Jul 28 12:39:15 2016 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Thu Jul 28 12:41:57 2016 -0400

    Only migrate rows in the current region
    
    This was causing an issue with replication when rows which actually
    belonged to a region were migrated.
    
    The new rows in the join table got an id in the global region.
    
    This caused replication to fail with a unique constraint error
    when trying to replicate the new rows in the regional database.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1361218

 ...oud_subnet_id_to_network_ports_cloud_subnets.rb |  6 +-
 ...ubnet_id_to_network_ports_cloud_subnets_spec.rb | 71 ++++++++++++++++++++++
 2 files changed, 74 insertions(+), 3 deletions(-)
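
For anyone reading along without opening the diff, the shape of the fix is roughly the following. This is a loose sketch, not the actual commit; current_region_number stands in for however the migration determines the local region, and NetworkPort stands in for the migration's own handle on that table.

  REGION_FACTOR = 1_000_000_000_000

  def region_id_range(region_number)
    (region_number * REGION_FACTOR)..((region_number + 1) * REGION_FACTOR - 1)
  end

  # Before: every network_ports row was walked, including rows replicated in
  # from remote regions, and join rows were created for all of them with
  # locally generated (global-region) ids.
  #
  # After: only rows whose id falls inside the current region's id range are
  # touched, so replicated remote-region data is left alone.
  NetworkPort.where(:id => region_id_range(current_region_number)).find_each do |port|
    # ... build the cloud_subnets_network_ports row for this port ...
  end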

Comment 7 luke couzens 2016-09-16 18:37:01 UTC
If we are expected to support a 5.5 -> 5.7 in-place upgrade, then we are currently blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1376888

Comment 8 luke couzens 2016-11-02 17:45:33 UTC
Verified in 5.7.0.7

