When a primary node in an HA cluster fails, it retains the replication slot(s) that were being used by any standby servers. When this node is reintroduced into the cluster and starts generating WAL, those slots cause the new WAL to be retained and will eventually cause disk space issues. The slots should be dropped when the node is reintroduced into the cluster.

---

The challenge here will be deciding whether all the replication slots should be dropped or picking the particular ones that should be removed. Generally I feel this should be the responsibility of repmgr when running `repmgr standby follow`, so maybe we can open an RFE on that project as well (if this is not already included in a newer version).
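For illustration only (not part of any proposed fix), a minimal Ruby sketch of dropping the stale slots with the pg gem; the connection parameters, database name, and the choice to drop only inactive slots are assumptions:

require "pg"

# Drop any replication slots left over on the failed primary before it is
# reintroduced as a standby, so the retained WAL can be recycled.
# Connection parameters below are placeholders for illustration.
conn = PG.connect(dbname: "vmdb_production", user: "postgres")

conn.exec("SELECT slot_name FROM pg_replication_slots WHERE NOT active") do |result|
  result.each do |row|
    # pg_drop_replication_slot removes the slot and releases its WAL reservation
    conn.exec_params("SELECT pg_drop_replication_slot($1)", [row["slot_name"]])
  end
end

conn.close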
https://github.com/ManageIQ/manageiq-gems-pending/pull/124
https://github.com/ManageIQ/manageiq-gems-pending/pull/126
New commit detected on ManageIQ/manageiq-gems-pending/master:

https://github.com/ManageIQ/manageiq-gems-pending/commit/3c90a73de7c8ec34364050de8ef677f72ac424d7

commit 3c90a73de7c8ec34364050de8ef677f72ac424d7
Author:     Nick Carboni <ncarboni>
AuthorDate: Tue Apr 18 16:22:16 2017 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Mon Apr 24 15:52:52 2017 -0400

    Alter PostgresAdmin.prep_data_directory to remove all contents

    This will allow us to use it for reinitializing a database server as a
    standby when it was previously a primary.

    https://bugzilla.redhat.com/show_bug.cgi?id=1426769

 lib/gems/pending/util/postgres_admin.rb | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
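In rough terms, the change means prep_data_directory now empties the data directory entirely. A hedged sketch of that behavior (not the actual PostgresAdmin code; the path is a placeholder):

require "fileutils"
require "pathname"

# Remove every child of the PostgreSQL data directory, but keep the directory
# itself so its ownership and permissions are preserved.
data_directory = Pathname.new("/var/lib/pgsql/data") # placeholder path
FileUtils.rm_rf(data_directory.children)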
New commit detected on ManageIQ/manageiq-gems-pending/master:

https://github.com/ManageIQ/manageiq-gems-pending/commit/63a179ea2b419007df07a0385989f8f20978ee8f

commit 63a179ea2b419007df07a0385989f8f20978ee8f
Author:     Nick Carboni <ncarboni>
AuthorDate: Wed Apr 19 17:43:17 2017 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Mon Apr 24 15:52:53 2017 -0400

    Offer to clear the data directory for new standby servers

    This will allow seamless reintegration of failed primary servers after a
    failover. When this happens the user will be given the option to clear
    the existing database and re-clone the new primary into this server and
    then continue to set up a standby as before.

    https://bugzilla.redhat.com/show_bug.cgi?id=1426718
    https://bugzilla.redhat.com/show_bug.cgi?id=1426769
    https://bugzilla.redhat.com/show_bug.cgi?id=1442911

 .../database_replication_standby.rb      |  20 +--
 .../database_replication_standby_spec.rb | 143 ++++++++++++++-------
 2 files changed, 112 insertions(+), 51 deletions(-)
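The user-facing flow described above boils down to a confirm-then-wipe step before the usual clone. A hedged sketch of that idea only (not the actual DatabaseReplicationStandby code; the method name and prompts are made up):

# Hypothetical helper: if the data directory already holds data (e.g. this
# node was the old primary), ask before wiping it, then continue with the
# normal standby setup (clone from the new primary).
def confirm_data_directory_clear(data_directory)
  return true if Dir.empty?(data_directory)

  puts "The PostgreSQL data directory is not empty."
  puts "Clearing it will destroy the existing database on this appliance."
  print "Clear the data directory and re-clone from the new primary? (y/N): "
  gets.to_s.strip.casecmp?("y")
end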
This should be fixed because a failed master (which would have replication slots on it) is now completely wiped and re-initialized when it is re-introduced into the cluster as a standby using the console's "Configure Standby" option.
Verified in 5.9.0.2
New commit detected on ManageIQ/manageiq-appliance_console/master:

https://github.com/ManageIQ/manageiq-appliance_console/commit/012aaefe755d8a0c7264381e6196a37166f4558d

commit 012aaefe755d8a0c7264381e6196a37166f4558d
Author:     Nick Carboni <ncarboni>
AuthorDate: Tue Apr 18 20:22:16 2017 +0000
Commit:     Nick LaMuro <nicklamuro>
CommitDate: Tue Mar 16 19:25:16 2021 +0000

    Alter PostgresAdmin.prep_data_directory to remove all contents

    This will allow us to use it for reinitializing a database server as a
    standby when it was previously a primary.

    https://bugzilla.redhat.com/show_bug.cgi?id=1426769

    (transferred from ManageIQ/manageiq-gems-pending@3c90a73de7c8ec34364050de8ef677f72ac424d7)

 lib/manageiq/appliance_console/postgres_admin.rb | 3 +-
 1 file changed, 1 insertion(+), 2 deletions(-)