Bug 1331114 - pglogical replication initial sync fails with a migrated database
Summary: pglogical replication initial sync fails with a migrated database
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Replication
Version: 5.6.0
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: GA
: 5.6.0
Assignee: Nick Carboni
QA Contact: luke couzens
URL:
Whiteboard: replication:migration:database
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-04-27 18:50 UTC by Jared Deubel
Modified: 2016-07-20 07:24 UTC (History)
4 users (show)

Fixed In Version: 5.6.0.8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-29 15:55:40 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1348 0 normal SHIPPED_LIVE CFME 5.6.0 bug fixes and enhancement update 2016-06-29 18:50:04 UTC

Description Jared Deubel 2016-04-27 18:50:49 UTC
Description of problem:
The event_streams tables do not match as we are getting the string "MiqServer" for an integer column. The ordering of target_type and target_id are different.
Remote DB:

                                            Table "public.event_streams"
           Column           |            Type             |                         Modifiers                          
----------------------------+-----------------------------+------------------------------------------------------------
 id                         | bigint                      | not null default nextval('event_streams_id_seq'::regclass)
 event_type                 | character varying(255)      | 
 message                    | text                        | 
 timestamp                  | timestamp without time zone | 
 host_name                  | character varying(255)      | 
 host_id                    | bigint                      | 
 vm_name                    | character varying(255)      | 
 vm_location                | character varying(255)      | 
 vm_or_template_id          | bigint                      | 
 dest_host_name             | character varying(255)      | 
 dest_host_id               | bigint                      | 
 dest_vm_name               | character varying(255)      | 
 dest_vm_location           | character varying(255)      | 
 dest_vm_or_template_id     | bigint                      | 
 source                     | character varying(255)      | 
 chain_id                   | bigint                      | 
 ems_id                     | bigint                      | 
 is_task                    | boolean                     | 
 full_data                  | text                        | 
 created_on                 | timestamp without time zone | 
 username                   | character varying(255)      | 
 ems_cluster_id             | bigint                      | 
 ems_cluster_name           | character varying(255)      | 
 ems_cluster_uid            | character varying(255)      | 
 dest_ems_cluster_id        | bigint                      | 
 dest_ems_cluster_name      | character varying(255)      | 
 dest_ems_cluster_uid       | character varying(255)      | 
 availability_zone_id       | bigint                      | 
 container_node_id          | bigint                      | 
 container_node_name        | character varying           | 
 container_group_id         | bigint                      | 
 container_group_name       | character varying           | 
 container_namespace        | character varying           | 
 type                       | character varying           | 
 target_id                  | bigint                      | 
 target_type                | character varying           | 
 container_id               | bigint                      | 
 container_name             | character varying           | 
 container_replicator_id    | bigint                      | 
 container_replicator_name  | character varying           | 
 middleware_server_id       | bigint                      | 
 middleware_server_name     | character varying           | 
 middleware_deployment_id   | bigint                      | 
 middleware_deployment_name | character varying           |


Global Db:

                                         Table "public.event_streams"
           Column           |            Type             |                         Modifiers                          
----------------------------+-----------------------------+------------------------------------------------------------
 id                         | bigint                      | not null default nextval('event_streams_id_seq'::regclass)
 event_type                 | character varying           | 
 message                    | text                        | 
 timestamp                  | timestamp without time zone | 
 host_name                  | character varying           | 
 host_id                    | bigint                      | 
 vm_name                    | character varying           | 
 vm_location                | character varying           | 
 vm_or_template_id          | bigint                      | 
 dest_host_name             | character varying           | 
 dest_host_id               | bigint                      | 
 dest_vm_name               | character varying           | 
 dest_vm_location           | character varying           | 
 dest_vm_or_template_id     | bigint                      | 
 source                     | character varying           | 
 chain_id                   | bigint                      | 
 ems_id                     | bigint                      | 
 is_task                    | boolean                     | 
 full_data                  | text                        | 
 created_on                 | timestamp without time zone | 
 username                   | character varying           | 
 ems_cluster_id             | bigint                      | 
 ems_cluster_name           | character varying           | 
 ems_cluster_uid            | character varying           | 
 dest_ems_cluster_id        | bigint                      | 
 dest_ems_cluster_name      | character varying           | 
 dest_ems_cluster_uid       | character varying           | 
 availability_zone_id       | bigint                      | 
 container_node_id          | bigint                      | 
 container_node_name        | character varying           | 
 container_group_id         | bigint                      | 
 container_group_name       | character varying           | 
 container_namespace        | character varying           | 
 type                       | character varying           | 
 target_type                | character varying           | 
 target_id                  | bigint                      | 
 container_id               | bigint                      | 
 container_name             | character varying           | 
 container_replicator_id    | bigint                      | 
 container_replicator_name  | character varying           | 
 middleware_server_id       | bigint                      | 
 middleware_server_name     | character varying           | 
 middleware_deployment_id   | bigint                      | 
 middleware_deployment_name | character varying           | 


Version-Release number of selected component (if applicable):
5.6 beta2

Comment 2 Nick Carboni 2016-04-27 20:53:45 UTC
https://github.com/rails/rails/commit/d26704a15f88d384dd282425daa832affdb5f8c1 has changed the order in which the columns are added to a table for polymorphic relationships.

This is what caused the issue in the event_streams table.
The same issue also occurred in computer_systems and security_contexts.

Comment 4 Nick Carboni 2016-04-28 21:05:25 UTC
https://github.com/ManageIQ/manageiq/pull/8337 will add a spec to be sure that we do not cause this issue by having migrations run in the wrong order in customer environments.

Comment 5 Nick Carboni 2016-05-03 14:19:48 UTC
https://github.com/ManageIQ/manageiq/pull/8380 will add methods to validate the schema currently being used against a local file which contains the "expected" schema.

We will use this to keep track of schema changes, validate that replication can run before saving a subscription, and determine what tables to fix if a customer schema does not match the expected.

Comment 7 Nick Carboni 2016-05-13 21:47:52 UTC
https://github.com/ManageIQ/manageiq/pull/8698 will ensure that we detect this situation before attempting to save a subscription and give adequate feedback

Comment 8 CFME Bot 2016-05-16 16:35:47 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/480752c6e987ec03f003b2b163e7502d9ca373f5

commit 480752c6e987ec03f003b2b163e7502d9ca373f5
Author:     Nick Carboni <ncarboni>
AuthorDate: Thu May 5 18:16:14 2016 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Mon May 9 16:36:12 2016 -0400

    Add tool to alter the column ordering of a table
    
    Use of this tool will allow migrated databases to replicate
    to newly deployed schemas.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1331114

 spec/tools/column_ordering/column_ordering_spec.rb | 114 ++++++++++
 spec/tools/column_ordering/data/metrics_04.sql     | 153 ++++++++++++++
 spec/tools/column_ordering/data/metrics_04.yml     |  75 +++++++
 .../tools/column_ordering/data/metrics_04_cols.yml |  73 +++++++
 .../data/metrics_04_constraints.yml                |   3 +
 .../column_ordering/data/metrics_04_create.sql     |  74 +++++++
 .../column_ordering/data/metrics_04_new_params     |  71 +++++++
 spec/tools/column_ordering/data/metrics_04_params  |  72 +++++++
 .../tools/column_ordering/data/metrics_04_rest.sql |  80 +++++++
 spec/tools/column_ordering/data/test.sql           |  78 +++++++
 tools/column_ordering/column_ordering.rb           | 231 +++++++++++++++++++++
 tools/fix_column_ordering.rb                       |  23 ++
 12 files changed, 1047 insertions(+)
 create mode 100644 spec/tools/column_ordering/column_ordering_spec.rb
 create mode 100644 spec/tools/column_ordering/data/metrics_04.sql
 create mode 100644 spec/tools/column_ordering/data/metrics_04.yml
 create mode 100644 spec/tools/column_ordering/data/metrics_04_cols.yml
 create mode 100644 spec/tools/column_ordering/data/metrics_04_constraints.yml
 create mode 100644 spec/tools/column_ordering/data/metrics_04_create.sql
 create mode 100644 spec/tools/column_ordering/data/metrics_04_new_params
 create mode 100644 spec/tools/column_ordering/data/metrics_04_params
 create mode 100644 spec/tools/column_ordering/data/metrics_04_rest.sql
 create mode 100644 spec/tools/column_ordering/data/test.sql
 create mode 100644 tools/column_ordering/column_ordering.rb
 create mode 100644 tools/fix_column_ordering.rb

Comment 9 Nick Carboni 2016-05-18 12:40:54 UTC
Tested the new column ordering script with the original failing database and everything is now working.

Here is procedure I went through to fix the issue:

Determine the tables that are out of order:
`rake evm:db:check_schema`

For each of those tables run:
`rails r tools/fix_column_ordering.rb <table_name>`

Make sure all tables are correct by running evm:db:check_schema again.

After this, the database will be properly configured to run replication with a newly deployed target database.

Comment 10 Nick Carboni 2016-05-18 17:56:19 UTC
This will need some consideration from a user experience perspective. And likely some text in the migration doc. I'm not sure that people will really know what to do if/when they get the error while configuring replication.

I think it would be best to just have people run the steps in comment 9 as a part of the migration to 5.6

Comment 11 CFME Bot 2016-05-18 18:00:55 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/0a73fa0206bd09d2935509cfb11207a200b8aa6b

commit 0a73fa0206bd09d2935509cfb11207a200b8aa6b
Author:     Nick Carboni <ncarboni>
AuthorDate: Fri May 13 15:12:07 2016 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Tue May 17 15:52:22 2016 -0400

    Compare the databases to schema.yml in PglogicalSubscription#save!
    
    Before this change we would attempt to subscribe to any database
    that we were pointed at if we could connect to it
    
    If the local and remote databases had different schemas we would
    likely fail to complete the data sync as in
    https://bugzilla.redhat.com/show_bug.cgi?id=1331114
    
    This change will check both the local and remote databases against
    schema.yml file before attempting to save the subscription and will
    raise an exception if either does not match

 app/models/pglogical_subscription.rb       | 10 +++++++++
 spec/models/pglogical_subscription_spec.rb | 34 ++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

Comment 12 luke couzens 2016-06-02 10:30:14 UTC
Verified on 5.6.0.8

Comment 14 errata-xmlrpc 2016-06-29 15:55:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1348


Note You need to log in before you can comment on or make changes to this bug.