Bug 1561075 - Unable to restore database to any ha node in a cluster
Summary: Unable to restore database to any ha node in a cluster
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: GA
Target Release: 5.10.0
Assignee: Nick Carboni
QA Contact: luke couzens
URL:
Whiteboard: black:ha
Duplicates: 1418256
Depends On:
Blocks: 1578957
Reported: 2018-03-27 15:00 UTC by luke couzens
Modified: 2019-02-11 14:07 UTC (History)

Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1578957
Environment:
Last Closed: 2019-02-11 14:07:34 UTC
Category: ---
Cloudforms Team: CFME Core
Target Upstream Version:
Embargoed:



Description luke couzens 2018-03-27 15:00:39 UTC
Description of problem: Unable to restore database to promoted secondary HA node


Version-Release number of selected component (if applicable): 5.9.2.0


How reproducible: 100%


Steps to Reproduce:
1. Configure an HA environment
2. Simulate failover to the secondary node
3. Create a backup with pg_dump or pg_basebackup
4. Stop evm on the non-VMDB appliance
5. Stop rh-postgresql95-repmgr on the VMDB appliances
6. Run dropdb/createdb
7. Use the appliance console to restore the database using option 4/1
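
As a rough illustration, steps 3 to 6 above could be written out as the following command plan. This is a sketch only and executes nothing. The database name vmdb_production, the evmserverd unit name, and the pg_dump custom-format choice are assumptions not stated in this report; only rh-postgresql95-repmgr and /tmp/evm_db.backup come from the bug itself. The commands would also run on different appliances, as the comments note.

```python
import shlex


def build_repro_commands(db_name="vmdb_production",
                         backup_path="/tmp/evm_db.backup"):
    """Return the commands for steps 3-6 as argument lists (nothing is run).

    db_name and the evmserverd unit name are assumptions for illustration.
    """
    return [
        # Step 3: logical backup with pg_dump in custom format (on the DB node).
        ["pg_dump", "--format=custom", f"--file={backup_path}", db_name],
        # Step 4: stop EVM on the non-VMDB appliance (unit name assumed).
        ["systemctl", "stop", "evmserverd"],
        # Step 5: stop repmgr on each VMDB appliance (service named in the report).
        ["systemctl", "stop", "rh-postgresql95-repmgr"],
        # Step 6: drop and recreate the empty database before restoring.
        ["dropdb", db_name],
        ["createdb", db_name],
    ]


if __name__ == "__main__":
    for cmd in build_repro_commands():
        print(shlex.join(cmd))
```

Step 7 (the restore itself) is driven interactively through the appliance console menu and is deliberately left out of the sketch.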

Actual results:
Restore Database From Backup

Note: A database restore cannot be undone.  The restore will use the file: /tmp/evm_db.backup.
Are you sure you would like to restore the database? (Y/N): y

Restoring the database...

Database restore failed. Check the logs for more information

Press any key to continue.

Expected results:
Backup restored correctly

Additional info:
Looking at appliance_console.log, we can see that the restore seems to fail due to a missing database.yml and v2_key.

http://pastebin.test.redhat.com/569287

These two files are not created during configuration of the standby node.

A workaround is to copy the files from another appliance; the restore then works correctly.
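
A small pre-flight check along these lines could confirm whether the two files are present before attempting a restore. It is a minimal sketch, not part of the appliance console: the VMDB root /var/www/miq/vmdb and the relative paths certs/v2_key and config/database.yml are assumed locations, since this report only names the files.

```python
from pathlib import Path

# Assumed locations relative to the VMDB root; the report only names the files.
REQUIRED_FILES = ("certs/v2_key", "config/database.yml")


def missing_restore_files(vmdb_root="/var/www/miq/vmdb"):
    """Return the required files (relative paths) that are absent under vmdb_root."""
    root = Path(vmdb_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_restore_files()
    if missing:
        print("Missing before restore: " + ", ".join(missing))
    else:
        print("v2_key and database.yml are both present")
```

If the check reports a missing file, copying it from an appliance that has it (the workaround described above) should make the check pass.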

Comment 2 luke couzens 2018-04-12 16:10:04 UTC
I was testing this on 5.9.2.2, and further investigation revealed this is not just a secondary node issue. It is actually a regression that prevents restoring a database to any HA node in a cluster.

The only workaround is to make sure you copy the v2_key and database.yml from the non-VMDB appliance (assuming that's where they are stored in your cluster).

It seems like there was a change since 5.9.0.22 causing this issue.

Adding regression flag, raising priority to high and renaming bug.

Comment 3 Nick Carboni 2018-04-30 20:16:42 UTC
> It seems like there was a change since 5.9.0.22 causing this issue.

Does this mean that it worked in 5.9.0.22 or that 5.9.0.22 was the first version it was broken in?

Additionally, we don't put the v2_key or database.yml on any standalone DB appliance, so I'm not sure HA really has anything to do with it in this case.

Can you try restoring to a standalone DB appliance through the console? I imagine it will fail in the same way.

Comment 4 luke couzens 2018-05-01 11:23:19 UTC
Hi Nick,

Sorry for the confusion: it works correctly in 5.9.0.22, and builds after that don't work correctly. You are also correct that it's not directly connected to HA; it's down to standalone DB configurations.

Comment 5 Nick Carboni 2018-05-01 19:26:09 UTC
I believe the commit that introduced this issue was https://github.com/ManageIQ/manageiq/commit/4d7af028694bc612b5d66f55bc65781b71329bac which was part of the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1553903

That was only included in 5.9.2 which means this should work in 5.9.1 also.

Also, are you running the restore through the console or using the rake task directly? It looks like the console case would have also failed on a standalone db even before that commit. If you are running it through the rake task are you doing that because we expect the console case to fail?

Comment 6 Nick Carboni 2018-05-01 19:29:51 UTC
Running this through the task previously would have bypassed the connection checking which got moved into the rake task after the commit from comment 5.

If we still want the connection checking, I think we will probably need to add the v2_key and database.yml to the standalone database machines rather than trying to connect to the database through a rake task without using activerecord.

Comment 7 luke couzens 2018-05-02 08:59:15 UTC
Hey Nick, sorry, I didn't try it on 5.9.1, so it's possible that it works there. As for the restore, I am running everything through appliance_console.

Comment 9 CFME Bot 2018-05-02 18:14:47 UTC
New commits detected on ManageIQ/manageiq-appliance_console/master:

https://github.com/ManageIQ/manageiq-appliance_console/commit/2243b5a00632d7aad08fc5ea35111dd1cfa62a83
commit 2243b5a00632d7aad08fc5ea35111dd1cfa62a83
Author:     Nick Carboni <ncarboni>
AuthorDate: Tue May  1 16:29:12 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Tue May  1 16:29:12 2018 -0400

    Move the key logic to a method and call it from standby configuration

    This will ensure the v2_key is present when configuring a standby
    database

    https://bugzilla.redhat.com/show_bug.cgi?id=1561075

 bin/appliance_console | 35 +-
 1 file changed, 20 insertions(+), 15 deletions(-)


https://github.com/ManageIQ/manageiq-appliance_console/commit/dac69bf7ca8cb6608578705f895f51ffea9f089c
commit dac69bf7ca8cb6608578705f895f51ffea9f089c
Author:     Nick Carboni <ncarboni>
AuthorDate: Tue May  1 17:25:32 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Tue May  1 17:25:32 2018 -0400

    Save database.yml even when configuring a database-only appliance

    https://bugzilla.redhat.com/show_bug.cgi?id=1561075

 lib/manageiq/appliance_console/internal_database_configuration.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


https://github.com/ManageIQ/manageiq-appliance_console/commit/fdde0fec5b976da5c419a30d6bcce60a4ed2645f
commit fdde0fec5b976da5c419a30d6bcce60a4ed2645f
Author:     Nick Carboni <ncarboni>
AuthorDate: Tue May  1 17:26:09 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Tue May  1 17:26:09 2018 -0400

    Save database.yml when configuring a standby database appliance

    https://bugzilla.redhat.com/show_bug.cgi?id=1561075

 lib/manageiq/appliance_console/database_replication_standby.rb | 5 +
 spec/database_replication_standby_spec.rb | 11 +
 2 files changed, 16 insertions(+)

Comment 11 CFME Bot 2018-05-03 13:42:31 UTC
New commit detected on ManageIQ/manageiq-appliance/master:

https://github.com/ManageIQ/manageiq-appliance/commit/c56df73314f300a54d7e03e8e7a545725cf0a6d1
commit c56df73314f300a54d7e03e8e7a545725cf0a6d1
Author:     Nick Carboni <ncarboni>
AuthorDate: Wed May  2 15:17:01 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Wed May  2 15:17:01 2018 -0400

    Update the included console version to 2.0.2

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1561075

 manageiq-appliance-dependencies.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comment 12 Nick Carboni 2018-05-03 18:18:08 UTC
*** Bug 1418256 has been marked as a duplicate of this bug. ***

Comment 13 CFME Bot 2018-05-03 19:50:16 UTC
New commit detected on ManageIQ/manageiq-appliance_console/master:

https://github.com/ManageIQ/manageiq-appliance_console/commit/19693ff7f445e3927e331eabe29fdef3b7ee2f44
commit 19693ff7f445e3927e331eabe29fdef3b7ee2f44
Author:     Nick Carboni <ncarboni>
AuthorDate: Thu May  3 14:05:34 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Thu May  3 14:05:34 2018 -0400

    Fix the ensure_key_configured helper method

    This method needs to be defined outside of the console module
    which runs the control loop during the module definition

    https://bugzilla.redhat.com/show_bug.cgi?id=1561075

 bin/appliance_console | 36 +-
 1 file changed, 18 insertions(+), 18 deletions(-)

Comment 14 CFME Bot 2018-05-03 22:02:29 UTC
New commit detected on ManageIQ/manageiq-appliance/master:

https://github.com/ManageIQ/manageiq-appliance/commit/b363cc66380d7750d03bcb53613fd5da172446a5
commit b363cc66380d7750d03bcb53613fd5da172446a5
Author:     Nick Carboni <ncarboni>
AuthorDate: Thu May  3 17:51:45 2018 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Thu May  3 17:51:45 2018 -0400

    Update the console to version 2.0.3

    https://bugzilla.redhat.com/show_bug.cgi?id=1561075

 manageiq-appliance-dependencies.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comment 16 luke couzens 2018-06-15 17:33:33 UTC
Verified in 5.10

