Bug 1361218

Summary: RubyRep fails to start after 5.5 -> 5.6 migration
Product: Red Hat CloudForms Management Engine
Component: Appliance
Version: 5.6.0
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Reporter: luke couzens <lcouzens>
Assignee: Nick Carboni <ncarboni>
QA Contact: luke couzens <lcouzens>
CC: abellott, cpelland, jhardy, jkrocil, ncarboni, obarenbo, simaishi
Keywords: TestOnly, ZStream
Target Milestone: GA
Target Release: 5.7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: black:upgrade:migration:replication
Fixed In Version: 5.7.0.0
Type: Bug
Last Closed: 2017-01-11 20:12:36 UTC
Bug Blocks: 1361610

Description luke couzens 2016-07-28 14:11:04 UTC
Description of problem: RubyRep is stuck in a restart loop after migrating from 5.5.5.4 -> 5.6.1.0. The same issue appears in both the standard migration and the in-place upgrade.


Version-Release number of selected component (if applicable): 5.6.1.0


How reproducible: 100%


Steps to Reproduce:
1. Provision two 5.5 appliances.
2. Configure the first database with region 99 (r99).
3. Configure the second database with region 0 (r0).
4. Log in to the web UI of the r0 appliance.
5. Set up the replication worker (Configure > Configuration > Workers).
6. Point it at the r99 appliance.
7. Enable database synchronization (Configure > Configuration > Server).
8. Test replication by adding a provider and checking that it also shows up in r99.
9. Follow the migration docs to upgrade to 5.6 [0].

Actual results: RubyRep fails to start


Expected results: Replication starts correctly


Additional info:
[0] https://access.redhat.com/articles/2297391 (in-place upgrade)


evm.log:
http://pastebin.test.redhat.com/396957

IPs for standard migration:
rr99 - 10.16.6.208
rr0 - 10.16.6.85

IPs for in-place upgrade:
rr99 - 10.8.199.223
rr0 - 10.16.6.131

Comment 2 Nick Carboni 2016-07-28 14:19:55 UTC
The issue is a unique constraint error on the cloud_subnets_network_ports table. It was caused by a region-agnostic migration: when the global region was migrated, the migration also processed data belonging to a remote region and created join table rows for it with ids from the global region.

This was introduced in https://github.com/ManageIQ/manageiq/pull/7237 

Unfortunately this issue seems to also be in 5.6.0.

We can document a workaround, to be applied after the migration:
On the global region:
  - DELETE from cloud_subnets_network_ports;
On each remote region:
  - Stop the replication worker
  - bin/rake evm:dbsync:local_uninstall cloud_subnets_network_ports
  - Start the replication worker

I'm also currently working on a fix for the migration itself.
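
To illustrate the failure described above, here is a minimal Ruby sketch. It is not the ManageIQ code. It assumes that each region owns a block of 10^12 ids and that the join table is unique on (cloud_subnet_id, network_port_id); neither detail is stated in this report. SEQUENCE_FACTOR, region_of, migrate_port, and the sample rows are hypothetical names chosen for the example.

```ruby
require "set"

# Assumption: each region owns a contiguous block of this many ids.
SEQUENCE_FACTOR = 1_000_000_000_000

# Region that owns a given id under the assumed layout.
def region_of(id)
  id / SEQUENCE_FACTOR
end

# Hypothetical migration step: build a join table row for one network port.
# The new row's id is taken from the sequence of the region running the migration.
def migrate_port(port, region, seq)
  { id:              region * SEQUENCE_FACTOR + seq,
    cloud_subnet_id: port[:cloud_subnet_id],
    network_port_id: port[:id] }
end

# A network port owned by remote region 0 and replicated to global region 99.
port = { id: 7, cloud_subnet_id: 3 }

remote_row = migrate_port(port, 0, 1)  # migration run on r0
global_row = migrate_port(port, 99, 1) # region-agnostic migration run on r99

# Assumed unique key of the join table.
key = ->(row) { row.values_at(:cloud_subnet_id, :network_port_id) }

# The global database already holds the row it created for itself.
global_table = Set[key.(global_row)]

# Replicating the remote row then collides on the assumed unique key.
# Set#add? returns nil when the element is already present.
conflict = global_table.add?(key.(remote_row)).nil?

puts "row created on r99 has an id in region #{region_of(global_row[:id])}"
puts "replicating the r0 row conflicts: #{conflict}"
```

Under these assumptions, clearing the table on the global region and letting the remote regions replicate their own rows again leaves a single copy of each pair, which matches the intent of the workaround above.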

Comment 4 CFME Bot 2016-07-29 13:30:49 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/6c0ab8b32f8f621474588bfe94b62fc25692bc0d

commit 6c0ab8b32f8f621474588bfe94b62fc25692bc0d
Author:     Nick Carboni <ncarboni>
AuthorDate: Thu Jul 28 12:39:15 2016 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Thu Jul 28 12:41:57 2016 -0400

    Only migrate rows in the current region
    
    This was causing an issue with replication when rows which actually
    belonged to a region were migrated.
    
    The new rows in the join table got an id in the global region.
    
    This caused replication to fail with a unique constraint error
    when trying to replicate the new rows in the regional database.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1361218

 ...oud_subnet_id_to_network_ports_cloud_subnets.rb |  6 +-
 ...ubnet_id_to_network_ports_cloud_subnets_spec.rb | 71 ++++++++++++++++++++++
 2 files changed, 74 insertions(+), 3 deletions(-)
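
The idea behind the commit ("Only migrate rows in the current region") can be sketched as follows. This is a standalone illustration, not the contents of the changed migration file: it assumes a layout of 10^12 ids per region, and region_id_range, rows_in_region, and the sample rows are hypothetical.

```ruby
# Assumption: each region owns a contiguous block of this many ids.
SEQUENCE_FACTOR = 1_000_000_000_000

# Range of ids owned by a region under the assumed layout.
def region_id_range(region)
  first = region * SEQUENCE_FACTOR
  first..(first + SEQUENCE_FACTOR - 1)
end

# Keep only the rows whose id falls in the given region's range.
def rows_in_region(rows, region)
  range = region_id_range(region)
  rows.select { |row| range.cover?(row[:id]) }
end

# Hypothetical network_ports rows as seen in the global (r99) database.
ports = [
  { id: 7, cloud_subnet_id: 3 },                      # replicated from r0
  { id: 99 * SEQUENCE_FACTOR + 2,
    cloud_subnet_id: 99 * SEQUENCE_FACTOR + 4 }       # owned by r99
]

# A migration running on r99 would process only its own rows and leave
# the r0 rows to the migration that runs on r0.
to_migrate = rows_in_region(ports, 99)
puts to_migrate.map { |row| row[:id] }.inspect
```

With a filter like this, each join table row is created only in the region that owns the source row, so replication never delivers a second copy of a pair the global region already created.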

Comment 7 luke couzens 2016-09-16 18:37:01 UTC
If we are supposed to do a 5.5 -> 5.7 in-place upgrade, we are currently blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1376888

Comment 8 luke couzens 2016-11-02 17:45:33 UTC
Verified in 5.7.0.7