Description of problem: Errors from pg_logical on worker initialization when syncing config: /var/www/miq/vmdb/lib/extensions/ar_adapter/ar_pglogical/pglogical_raw.rb:301:in `async_exec': ERROR: replication set mapping -1931769956:25998 not found (PG::InternalError) from /var/www/miq/vmdb/lib/extensions/ar_adapter/ar_pglogical/pglogical_raw.rb:301:in `typed_exec' from /var/www/miq/vmdb/lib/extensions/ar_adapter/ar_pglogical/pglogical_raw.rb:281:in `replication_set_remove_table' from /var/www/miq/vmdb/lib/miq_pglogical.rb:79:in `block in refresh_excludes' from /var/www/miq/vmdb/lib/miq_pglogical.rb:78:in `each' from /var/www/miq/vmdb/lib/miq_pglogical.rb:78:in `refresh_excludes' from /var/www/miq/vmdb/lib/vmdb/config/activator.rb:50:in `workers' from /var/www/miq/vmdb/lib/vmdb/config/activator.rb:24:in `block in activate' from /var/www/miq/vmdb/lib/vmdb/config/activator.rb:20:in `each_key' from /var/www/miq/vmdb/lib/vmdb/config/activator.rb:20:in `activate' from /var/www/miq/vmdb/lib/vmdb/settings.rb:42:in `activate' from /var/www/miq/vmdb/lib/vmdb/settings.rb:20:in `reload!' from /var/www/miq/vmdb/app/models/miq_worker/runner.rb:283:in `sync_config' from /var/www/miq/vmdb/app/models/miq_worker/runner.rb:63:in `worker_initialization' from /var/www/miq/vmdb/app/models/miq_worker/runner.rb:51:in `initialize' from /var/www/miq/vmdb/app/models/miq_worker/runner.rb:21:in `new' from /var/www/miq/vmdb/app/models/miq_worker/runner.rb:21:in `start_worker' from /var/www/miq/vmdb/app/models/miq_worker.rb:347:in `block in start' Version-Release number of selected component (if applicable): 5.6.1.2
This looks like it is being caused by a race condition in MiqPglogical#refresh_excludes I was able to reproduce this by: - Creating two appliances pointing to the same database - Set replication type to "remote" on either appliance - Remove a table from the excludes on one server (I used log_files) - Wait for the config to reload on the workers (it may not happen every time) By taking advantage of a separate issue which causes the replication set to "thrash" when two servers have different excludes each worker that reloads the config will likely have to change the excludes. It is possible for one worker to read the current replication set, have a different worker remove the table and then have the table be missing when the first one also tries to remove it from the set. I will look into process synchronization methods to solve this.
https://bugzilla.redhat.com/show_bug.cgi?id=1389821 - Issue referenced in comment 4
https://github.com/ManageIQ/manageiq/pull/12318
New commit detected on ManageIQ/manageiq/euwe: https://github.com/ManageIQ/manageiq/commit/b89c5ab912994cc26b8aee49546b6b6bd6ef52ee commit b89c5ab912994cc26b8aee49546b6b6bd6ef52ee Author: Jason Frey <fryguy9> AuthorDate: Mon Oct 31 16:36:51 2016 -0400 Commit: Oleg Barenboim <chessbyte> CommitDate: Tue Nov 1 13:37:50 2016 -0400 Merge pull request #12280 from carbonin/add_logging_to_refresh_excludes Add logging when the replication set is altered (cherry picked from commit 49cfe5798b581595dc91d4d00a9038000c2fd24a) https://bugzilla.redhat.com/show_bug.cgi?id=1387420 https://bugzilla.redhat.com/show_bug.cgi?id=1386686 lib/miq_pglogical.rb | 2 ++ 1 file changed, 2 insertions(+)
New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/320058db5c63a03ebb11e354fb34b983a41ac657 commit 320058db5c63a03ebb11e354fb34b983a41ac657 Author: Nick Carboni <ncarboni> AuthorDate: Mon Oct 31 17:02:26 2016 -0400 Commit: Nick Carboni <ncarboni> CommitDate: Mon Oct 31 18:02:23 2016 -0400 Wrap the contents of MiqPglogical#refresh_excludes in a lock This will prevent multiple processes from executing this logic at the same time which will prevent us from adding a table that is already included in the set or removing a table that is not in the set. https://bugzilla.redhat.com/show_bug.cgi?id=1387420 lib/miq_pglogical.rb | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
New commit detected on ManageIQ/manageiq/darga: https://github.com/ManageIQ/manageiq/commit/7b4b3d961f18a847b9c395e65c0f2575ad108c1a commit 7b4b3d961f18a847b9c395e65c0f2575ad108c1a Author: Nick Carboni <ncarboni> AuthorDate: Mon Oct 31 17:02:26 2016 -0400 Commit: Nick Carboni <ncarboni> CommitDate: Tue Nov 8 14:57:08 2016 -0500 Wrap the contents of MiqPglogical#refresh_excludes in a lock This will prevent multiple processes from executing this logic at the same time which will prevent us from adding a table that is already included in the set or removing a table that is not in the set. https://bugzilla.redhat.com/show_bug.cgi?id=1387420 lib/miq_pglogical.rb | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)