Bug 2172540 - "Restoring postgresql global objects" step is buggy and not required
Summary: "Restoring postgresql global objects" step is buggy and not required
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Satellite Maintain
Version: 6.13.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: 6.13.0
Assignee: Evgeni Golov
QA Contact: Griffin Sullivan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-02-22 13:29 UTC by Sayan Das
Modified: 2023-12-02 04:26 UTC
CC List: 6 users

Fixed In Version: rubygem-foreman_maintain-1.2.7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-03 13:25:13 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 36152 0 High New "Restoring postgresql global objects" step is buggy and not required 2023-03-02 07:47:21 UTC
Red Hat Issue Tracker SAT-16117 0 None None None 2023-02-23 15:19:33 UTC
Red Hat Product Errata RHSA-2023:2097 0 None None None 2023-05-03 13:25:23 UTC

Description Sayan Das 2023-02-22 13:29:08 UTC
Description of problem:

An online backup can be taken successfully, but the same backup cannot be restored on a new instance of Satellite 6.13.


Version-Release number of selected component (if applicable):

# rpm -q satellite rubygem-foreman_maintain satellite-maintain 
satellite-6.13.0-6.el8sat.noarch
rubygem-foreman_maintain-1.2.4-1.el8sat.noarch
satellite-maintain-0.0.1-1.el8sat.noarch



How reproducible:

Always


Steps to Reproduce:

1. Install Satellite 6.13 and configure repositories, content views, and other content

2. Mount an NFS share on /mnt to store the backup data

3. Collect an online backup of the Satellite, without the pulp data, inside /mnt (for me it's /mnt/breakfix3/online/):

   # satellite-maintain backup online -s -y /mnt/breakfix3/online/

4. Verify the backup content (see the listing sketch after these steps) and then unmount the NFS share

5. Rebuild the OS of that instance with the exact same details and install a fresh Satellite 6.13 on it.

6. Now, mount the NFS share back and try restoring the backup, e.g.

   # satellite-maintain restore -y /mnt/breakfix3/online/satellite-backup-2023-02-22-14-57-56/
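
As referenced in step 4, a quick way to verify the backup content is to list the dump files in the backup directory. This is a minimal sketch; the file names are inferred from the restore output later in this report, so the exact set may vary:

   # ls -lh /mnt/breakfix3/online/satellite-backup-*/

Expect to see, among others, pg_globals.dump plus the foreman, candlepin, and pulpcore dumps.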


Actual results:


\ All services stopped                                                [OK]      
--------------------------------------------------------------------------------
Drop postgresql databases: 
/ Dropping pulpcore database                                          [OK]      
--------------------------------------------------------------------------------
Start applicable services: 

Starting the following service(s):
postgresql
/ All services started                                                [OK]      
--------------------------------------------------------------------------------
Restore any existing postgresql global objects from backup: 
| Restoring postgresql global objects                                 [FAIL]    
Failed executing PGPASSWORD='[FILTERED]' psql -h localhost  -p 5432 -U foreman -f /mnt/breakfix3/online/satellite-backup-2023-02-22-14-57-56/pg_globals.dump postgres 2>/dev/null, exit status 2:
 psql: error: FATAL:  password authentication failed for user "foreman"
--------------------------------------------------------------------------------
Scenario [Restore backup] failed.

The following steps ended up in failing state:

  [restore-pg-global-objects]

Resolve the failed steps and rerun the command.

If the situation persists and, you are unclear what to do next,
contact Red Hat Technical Support.

In case the failures are false positives, use
--whitelist="restore-pg-global-objects"



Running Rescue Restore backup
================================================================================
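
To see the authentication failure in isolation, the psql call from the log above can be re-run by hand. This is a hypothetical probe, not part of the restore procedure; it simply tests whether the "foreman" role accepts the supplied password (the password value is a placeholder):

   # PGPASSWORD='<password>' psql -h localhost -p 5432 -U foreman -c 'SELECT 1' postgres

Seeing the same 'password authentication failed' error here confirms that the credentials the step uses no longer match the database.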


Expected results:


No such errors should occur when restoring the backup.


Additional info:

Comment 2 Sayan Das 2023-02-22 15:24:38 UTC
JFYI, another colleague of mine has confirmed the issue is reproducible with online backup restoration.

Comment 3 Sayan Das 2023-02-23 12:52:40 UTC
Also, I tested today and had no issues restoring a Capsule 6.13 from an online backup.

Comment 4 Brad Buckingham 2023-02-23 15:18:37 UTC
Is this a regression from Satellite 6.12?

Comment 5 Sayan Das 2023-02-23 15:34:27 UTC
(In reply to Brad Buckingham from comment #4)
> Is this a regression from Satellite 6.12?

I am positive it works just fine on 6.11, but I could not test it on 6.12 recently due to a lack of resources.

So until I am able to test it on Sat 6.12, I am unfortunately unable to confirm whether it's a regression from 6.12 or not.

Comment 14 Evgeni Golov 2023-03-01 11:18:16 UTC
@saydas I've created a PR that just drops this code, as it is broken and unnecessary: https://github.com/theforeman/foreman_maintain/pull/691

If you have a system where you can reliably reproduce the issue, could you please try the RPM attached to that PR and see if it manages to restore a working system for you?
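
One way to install such a scratch build locally is a plain dnf install of the downloaded file; the exact RPM name below is the one Sayan reports in comment 16:

   # dnf install ./rubygem-foreman_maintain-1.2.6-1.20230301111458534583.pr691.1.g53dac01.el8.noarch.rpm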

Comment 15 Sayan Das 2023-03-01 11:30:12 UTC
Hello Evgeni

Sure. Let me try it out by EOD today/tomorrow and then get back to you. 

( Keeping the needinfo flag enabled )


-- Sayan

Comment 16 Sayan Das 2023-03-01 15:36:10 UTC
Before I tried the RPM from the PR, Evgeni took a peek at my latest reproducer, and the only conclusion was that something had changed the DB password while the restoration was still in progress, though we had no idea what or how (I was the sole admin working on that system).

So I rebuilt that system, installed a fresh instance of 6.13, installed rubygem-foreman_maintain-1.2.6-1.20230301111458534583.pr691.1.g53dac01.el8.noarch.rpm from the PR, and could then successfully restore the online backup without any issues. 

The Satellite UI looked healthy as well. 


# rpm -q satellite rubygem-foreman_maintain
satellite-6.13.0-6.el8sat.noarch
rubygem-foreman_maintain-1.2.6-1.20230301111458534583.pr691.1.g53dac01.el8.noarch



Restore :

...

Confirm dropping databases and running restore: 

WARNING: This script will drop and restore your database.
Your existing installation will be replaced with the backup database.
Once this operation is complete there is no going back.
Do you want to proceed? (assuming yes)
                                                                      [OK]      
--------------------------------------------------------------------------------
Setting file security: 
- Restoring SELinux context                                           [OK]      
--------------------------------------------------------------------------------
Restore configs from backup: 
\ Restoring configs                                                   [OK]      
--------------------------------------------------------------------------------
Run installer reset: 
/ Installer reset                                                     [OK]      
--------------------------------------------------------------------------------
Stop applicable services: 

Stopping the following service(s):
redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-api.socket, pulpcore-content.socket, pulpcore-worker, pulpcore-worker, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, foreman.socket, dynflow-sidekiq@worker-1, dynflow-sidekiq@worker-hosts-queue-1, foreman-proxy
/ All services stopped                                                [OK]      
--------------------------------------------------------------------------------
Drop postgresql databases: 
- Dropping pulpcore database                                          [OK]      
--------------------------------------------------------------------------------
Start applicable services: 

Starting the following service(s):
postgresql
- All services started                                                [OK]      
--------------------------------------------------------------------------------
Restore candlepin postgresql dump from backup: 
- Restoring candlepin dump                                            [OK]      
--------------------------------------------------------------------------------
Restore foreman postgresql dump from backup: 
\ Restoring foreman dump                                              [OK]      
--------------------------------------------------------------------------------
Restore pulpcore postgresql dump from backup: 
\ Restoring pulpcore dump                                             [OK]      
--------------------------------------------------------------------------------
Stop applicable services: 

Stopping the following service(s):
postgresql
- All services stopped                                                [OK]      
--------------------------------------------------------------------------------
Migrate pulpcore db: 
/ Migrating pulpcore database                                         [OK]      
--------------------------------------------------------------------------------
Ensure Candlepin runs all migrations after restoring the database:    [OK]
--------------------------------------------------------------------------------
Start applicable services: 

Starting the following service(s):
redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-worker, pulpcore-worker, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, dynflow-sidekiq@worker-1, dynflow-sidekiq@worker-hosts-queue-1, foreman-proxy
| All services started                                                [OK]      
--------------------------------------------------------------------------------
Run daemon reload:                                                    [OK]
--------------------------------------------------------------------------------
Procedures::Installer::Upgrade:                                       [OK]
--------------------------------------------------------------------------------
Execute upgrade:run rake task:                                        [OK]
--------------------------------------------------------------------------------


# hammer ping
database:         
    Status:          ok
    Server Response: Duration: 0ms
candlepin:        
    Status:          ok
    Server Response: Duration: 29ms
candlepin_auth:   
    Status:          ok
    Server Response: Duration: 32ms
candlepin_events: 
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
katello_events:   
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
pulp3:            
    Status:          ok
    Server Response: Duration: 62ms
pulp3_content:    
    Status:          ok
    Server Response: Duration: 51ms
foreman_tasks:    
    Status:          ok
    Server Response: Duration: 3ms

Comment 17 Evgeni Golov 2023-03-02 07:46:20 UTC
Updating the "real" description of this bug to:

The "Restoring postgresql global objects" step in "foreman-maintain restore" is buggy:
- it executes the restore as the "foreman" user, who does not have sufficient permissions to perform the actions (modifying global objects requires superuser permissions)
- it sometimes fails with 'password authentication failed for user "foreman"', as the user's password is changed during the overall restore procedure

However, it's also not required: the only thing it's supposed to do is (re-)create the users in the database, and that's already handled by the Installer run that happens before the Globals Restore.

To avoid any further issues, we can drop this step completely and rely on the Installer for the management of the database users.
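
For context: pg_globals.dump is a globals-only dump (roles and tablespaces) of the kind produced by pg_dumpall --globals-only, and restoring one requires a PostgreSQL superuser. A sketch of what the step would have had to run to succeed, shown for illustration only (the path is a placeholder), since the chosen fix removes the step entirely:

   # su - postgres -c 'psql -f /path/to/pg_globals.dump postgres'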

Comment 18 Evgeni Golov 2023-03-02 07:47:20 UTC
Created redmine issue https://projects.theforeman.org/issues/36152 from this bug

Comment 19 Bryan Kearney 2023-03-02 08:03:29 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/36152 has been resolved.

Comment 21 Griffin Sullivan 2023-03-23 13:47:38 UTC
Verified on 6.13 snap 15

Backup and restore work without the "Restoring postgresql global objects" step

Steps to Reproduce:

1) Set up some content on the Satellite

2) # satellite-maintain backup online -s -y /mnt/online/

3) # satellite-maintain restore -y /mnt/online/<backup-name>

4) Check /var/log/foreman-maintain/foreman-maintain.log and verify pg_globals.dump was not run (see the check below)
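
A minimal version of that check, assuming the log path from step 4:

   # grep -i pg_globals /var/log/foreman-maintain/foreman-maintain.log

No output means the globals restore step never ran.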


Results:

Backup and restore run successfully and pg_globals.dump is not present in the logs

Comment 24 errata-xmlrpc 2023-05-03 13:25:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.13 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2097

Comment 25 Red Hat Bugzilla 2023-12-02 04:26:21 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

