Description of problem:
After failing back over to a reintroduced node, $APPLIANCE_PG_SERVICE shows as failed and appliance_console info shows "Local Database Server: initialized and stopped".

Version-Release number of selected component (if applicable):
5.7.1.0

How reproducible:
100%

Steps to Reproduce:
1. Set up HA following [0]
2. Fail over to the secondary db
3. Reintroduce the primary as a secondary node following [0]
4. Fail back over to the new secondary node

Actual results:
Failover works correctly; however, appliance_console info and systemctl status show postgres as not running.

Expected results:
$APPLIANCE_PG_SERVICE shows as running and appliance_console info shows the correct status.

Additional info:
[0] https://access.redhat.com/documentation/en/red-hat-cloudforms/4.2/single/configuring-high-availability/

I can see that postgres is running correctly via the following command:
ps aux | grep post
*** Bug 1426721 has been marked as a duplicate of this bug. ***
Right now the console summary only lists the database as running if it is running under systemd. In some cases (repmgr) the database is started using pg_ctl directly rather than systemctl. This will cause the console summary to report that the database is not running. I propose that we use `pg_ctl -D $APPLIANCE_PG_DATA status` to determine the status of the database as that will always be correct.
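To illustrate the proposal above, here is a minimal sketch of a status check built on `pg_ctl -D $APPLIANCE_PG_DATA status`. It is not the appliance console's actual code: the function names `interpret_pg_ctl_status` and `database_status` are hypothetical, and the exit-code mapping (0 running, 3 not running, 4 no accessible data directory) is taken from the PostgreSQL documentation for pg_ctl's status mode.

```python
import os
import subprocess

# Exit codes documented for `pg_ctl status`; any other code is reported as unknown.
PG_CTL_STATUS = {
    0: "running",
    3: "not running",
    4: "no accessible data directory",
}


def interpret_pg_ctl_status(returncode):
    """Map a `pg_ctl status` exit code to a human-readable label (hypothetical helper)."""
    return PG_CTL_STATUS.get(returncode, "unknown (exit %d)" % returncode)


def database_status(data_dir, pg_ctl="pg_ctl"):
    """Run `pg_ctl -D <data_dir> status` and return a label for the result.

    This does not depend on how the server was started (systemctl or pg_ctl).
    """
    try:
        result = subprocess.run(
            [pg_ctl, "-D", data_dir, "status"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        return "pg_ctl not found"
    return interpret_pg_ctl_status(result.returncode)


if __name__ == "__main__":
    data_dir = os.environ.get("APPLIANCE_PG_DATA")
    if data_dir:
        print("Local Database Server:", database_status(data_dir))
    else:
        print("APPLIANCE_PG_DATA is not set")
```

Note that pg_ctl normally has to run as the database owner rather than root, so a real check would likely need to switch users first.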
A configuration option for how to start the PG service after a follow was added in repmgr 3.2. To fix this issue we would have to upgrade and set the service_*_command in repmgr.conf. ref: http://www.repmgr.org/release-notes-3.2.html
https://github.com/ManageIQ/manageiq-appliance_console/pull/49
https://github.com/ManageIQ/manageiq-appliance/pull/197
New commit detected on ManageIQ/manageiq-appliance_console/master:
https://github.com/ManageIQ/manageiq-appliance_console/commit/08dd210add84b0f714fcb89912aaa25ece08761a

commit 08dd210add84b0f714fcb89912aaa25ece08761a
Author:     Nick Carboni <ncarboni>
AuthorDate: Thu Jun 28 16:34:04 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Thu Jun 28 16:34:04 2018 -0400

    Fix the repmgr.conf file for the new version

    - Rename changed keys in the repmgr config file
    - Set commands to use to manage postgres services
    - Add newly required data_directory parameter
    - Add new options to promote and follow commands

    The --log-to-file option ensures that all output gets to the
    repmgrd log file.

    The --upstream-node-id ensures that the correct node is chosen in
    the case that the old primary comes back online after a successful
    failover, but before a follow can be completed.

    https://bugzilla.redhat.com/show_bug.cgi?id=1418080
    https://www.pivotaltracker.com/story/show/135779733

 lib/manageiq/appliance_console/database_replication.rb | 15 +-
 spec/database_replication_spec.rb                      | 23 +-
 2 files changed, 30 insertions(+), 8 deletions(-)
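For readers unfamiliar with the repmgr settings this commit message refers to, the following is a rough sketch of how such lines could be rendered. It is not the code from the commit: the function name `render_repmgr_service_settings`, the default config path, and the exact key spellings (`service_start_command` and friends, plus the `%n` placeholder in `follow_command`) are assumptions based on the repmgr documentation rather than on the diff itself. The service name in the usage example is the one mentioned elsewhere in this bug.

```python
def render_repmgr_service_settings(service_name, data_directory,
                                   config_path="/etc/repmgr.conf"):
    """Return repmgr.conf lines for service management, promote and follow.

    Hypothetical helper: key names follow the repmgr documentation and are
    assumptions, not copied from the ManageIQ commit.
    """
    lines = [
        # data_directory is described as newly required in the commit message.
        f"data_directory='{data_directory}'",
        # --log-to-file sends the command output to the repmgrd log file.
        f"promote_command='repmgr standby promote -f {config_path} --log-to-file'",
        # --upstream-node-id pins the node to follow; %n is filled in by repmgrd.
        f"follow_command='repmgr standby follow -f {config_path} "
        "--log-to-file --upstream-node-id=%n'",
    ]
    # Manage postgres through systemd so that systemctl status stays accurate.
    for action in ("start", "stop", "restart", "reload"):
        lines.append(
            f"service_{action}_command='sudo systemctl {action} {service_name}'"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The data directory below is a placeholder, not the appliance's real path.
    print(render_repmgr_service_settings("rh-postgresql95-postgresql",
                                         "/path/to/pgdata"))
```

Routing start and stop through `sudo systemctl` is what would require a sudoers entry for the user running repmgrd.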
https://github.com/ManageIQ/manageiq-appliance/pull/201
New commit detected on ManageIQ/manageiq-appliance/master:
https://github.com/ManageIQ/manageiq-appliance/commit/8aba2f893cfbe878eea55c1da0de0b98f60024ab

commit 8aba2f893cfbe878eea55c1da0de0b98f60024ab
Author:     Nick Carboni <ncarboni>
AuthorDate: Wed Aug 1 16:53:52 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Wed Aug 1 16:53:52 2018 -0400

    Bump versions of the console and HA admin gem

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1544854
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1418080
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1535345
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1586186

    https://www.pivotaltracker.com/story/show/135779733
    https://www.pivotaltracker.com/story/show/141523501
    https://www.pivotaltracker.com/story/show/121849185

 manageiq-appliance-dependencies.rb | 4 +-
 1 file changed, 2 insertions(+), 2 deletions(-)

https://github.com/ManageIQ/manageiq-appliance/commit/92f565db0341156889d1d022f65285de350fa279

commit 92f565db0341156889d1d022f65285de350fa279
Author:     Nick Carboni <ncarboni>
AuthorDate: Fri Jun 29 16:34:53 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Fri Jun 29 16:34:53 2018 -0400

    Add a sudoers include file for repmgr

    This will allow the postgresql user to use systemctl to manage
    the rh-postgresql95-postgresql service when run from repmgrd

    https://www.pivotaltracker.com/story/show/141523501
    https://www.pivotaltracker.com/story/show/135779733
    https://bugzilla.redhat.com/show_bug.cgi?id=1418080

 LINK/etc/sudoers.d/repmgr | 2 +
 1 file changed, 2 insertions(+)
On the original primary node (the one that failed first), after adding it back and then failing the second node, I see this in the appliance_console:

Local Database Server: running (primary)
CFME Version: 5.10.0.19
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:0212