Bug 1483033

Summary: [RFE] Fail when backing up to a directory postgres cannot write to
Product: Red Hat Satellite
Reporter: Christine Fouant <cfouant>
Component: Backup & Restore
Assignee: Christine Fouant <cfouant>
Status: CLOSED ERRATA
QA Contact: Peter Ondrejka <pondrejk>
Severity: high
Priority: unspecified
Version: 6.2.11
CC: akarimi, bbuckingham, bkearney, cwelton, dleatherman, gpatil, hmore, jomitsch, mmccune, omankame, pondrejk, rajgupta, rhbgs.10.bigi_gigi, sthirugn
Target Milestone: 6.4.0
Keywords: FutureFeature, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Last Closed: 2018-10-16 15:28:38 UTC
Type: Bug
Attachments: Repoduced (flags: none)

Description Christine Fouant 2017-08-18 16:02:58 UTC
Description of problem:
Currently, the backup needs to be written to either /tmp or /var/tmp (preferred). Writing to any other root directory fails because Postgres cannot write files to those locations.

The backup should fail with a warning if the user selects a directory to which Postgres doesn't have write access.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run: # katello-backup /any/directory/not/tmp


Actual results:
The backup fails with no explanation other than that tar file creation failed.

Expected results:
Fail with a note to write to /var/tmp or /tmp so that Postgres has write access.

Additional info:

Comment 1 Christine Fouant 2017-08-18 19:59:56 UTC
Created redmine issue http://projects.theforeman.org/issues/20650 from this bug

Comment 2 Peter Ondrejka 2017-08-21 08:34:50 UTC
A note on the reproducer: as of snap 11, absolute paths like /backup-dir/... work all right. The problem arises with relative paths that resolve under /root, e.g. `katello-backup backup-dir` where backup-dir is actually /root/backup-dir.
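
To illustrate the point about relative paths, here is a minimal Python sketch of resolving the supplied directory to an absolute path before any check runs. It is not the katello-backup code (which is a Ruby script); the helper name `resolve_backup_dir` and its parameters are hypothetical.

```python
import os


def resolve_backup_dir(supplied, cwd):
    """Return the absolute, normalized path the backup would be written to.

    `supplied` is the directory argument as typed by the user and `cwd` is
    the directory the command was run from. Both names are illustrative.
    """
    # os.path.join discards `cwd` when `supplied` is already absolute.
    return os.path.normpath(os.path.join(cwd, supplied))


# A relative argument typed in root's home directory lands under /root.
print(resolve_backup_dir("backup-dir", "/root"))   # /root/backup-dir
# An absolute argument is returned unchanged.
print(resolve_backup_dir("/backup-dir", "/root"))  # /backup-dir
```

Resolving first means that any later access check or error message refers to the directory that will actually be used, rather than to the string the user typed.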

Comment 4 Satellite Program 2017-09-29 18:32:50 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/20650 has been resolved.

Comment 5 Peter Ondrejka 2017-10-16 09:33:26 UTC
Blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1497957

Comment 6 Corey Welton 2017-10-23 17:44:37 UTC
Snap 21:

I tried backing up to a user's home directory and this occurred:

Creating backup folder /home/corey/satellite-backup-20171023134042
****cancelled****
Postgres user needs write access to the backup directory
Please select a directory, such as /tmp or /var/tmp which allows Postgres write access
Cleaning up backup folder and starting any stopped services... 
Redirecting to /bin/systemctl start mongod.service
Redirecting to /bin/systemctl start postgresql.service
Redirecting to /bin/systemctl start tomcat.service
Redirecting to /bin/systemctl start pulp_workers.service
Redirecting to /bin/systemctl start pulp_resource_manager.service
Redirecting to /bin/systemctl start pulp_streamer.service
Redirecting to /bin/systemctl start pulp_celerybeat.service
Redirecting to /bin/systemctl start httpd.service
Redirecting to /bin/systemctl start foreman-tasks.service


My question: why are we unnecessarily (re)starting services if the action never succeeded? This stuff needs to be in a loop and not processed if we fail the check in the step above.

This may well be a different bug/behavior, but it is really sort of an undesirable outcome of the 'fix'...

Will leave this for pondrejka and cfouant to hash out.

Comment 7 Christine Fouant 2017-10-23 17:55:45 UTC
The cleanup goes through starting all services in case the backup was further along in the process. I can suppress the output, but would need to start services in other scenarios. Possibly I could run a check to only start services that had been stopped, I'm sure that's an option, although it might end up making the hang time in the cleanup process minimally longer.
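
The idea of starting only the services that had been stopped could be sketched as follows. This is a Python illustration under stated assumptions, not the actual backup script: the class `ServiceTracker` and the injected `run` callable are hypothetical names, and restarting in reverse order of stopping is a design choice of this sketch.

```python
class ServiceTracker:
    """Remember which services this run stopped so cleanup restarts only those."""

    def __init__(self, run):
        # `run` is any callable taking (action, service). It is injected so
        # the sketch can be exercised without touching real services.
        self._run = run
        self._stopped = []

    def stop(self, service):
        """Stop a service and record that this run stopped it."""
        self._run("stop", service)
        self._stopped.append(service)

    def restart_stopped(self):
        """Start the recorded services, last stopped first, and return them."""
        restarted = []
        while self._stopped:
            service = self._stopped.pop()
            self._run("start", service)
            restarted.append(service)
        return restarted


calls = []
tracker = ServiceTracker(lambda action, service: calls.append((action, service)))
# Nothing was stopped before the failure, so cleanup starts nothing.
print(tracker.restart_stopped())  # []
tracker.stop("httpd")
tracker.stop("postgresql")
print(tracker.restart_stopped())  # ['postgresql', 'httpd']
```

With this bookkeeping, a failure that happens before any service is stopped produces no start calls at all, so there is no extra output and no added wait in the cleanup step.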

Comment 9 Corey Welton 2017-10-23 19:44:03 UTC
> Possibly I could run a check to only start services that had been stopped, 
> I'm sure that's an option, although it might end up making the hang time in 
> the cleanup process minimally longer.

I guess I'm just not sure why we are stopping/starting services at all if the backup location is inaccessible. It seems to me that the access check should be the first thing we do, and then we error out/abort prior to stopping or starting services.

Comment 10 Peter Ondrejka 2017-10-24 13:32:34 UTC
This is definitely not ideal, as it actually creates the directory before doing the validity check. This happens even with nonsense input:

~]# katello-backup kjkjlk

DEPRECATION WARNING: katello-backup is deprecated and will
be removed in the next Satellite release. It is being replaced by
satellite-backup. Redirecting to satellite-backup now.

Starting backup: 2017-10-24 09:06:22 -0400
Creating backup folder kjkjlk/satellite-backup-20171024090622
****cancelled****
Postgres user needs write access to the backup directory
Please select a directory, such as /tmp or /var/tmp which allows Postgres write access
Cleaning up backup folder and starting any stopped services... 
Redirecting to /bin/systemctl start mongod.service
Redirecting to /bin/systemctl start postgresql.service
Redirecting to /bin/systemctl start tomcat.service
Redirecting to /bin/systemctl start pulp_workers.service
Redirecting to /bin/systemctl start pulp_resource_manager.service
Redirecting to /bin/systemctl start pulp_streamer.service
Redirecting to /bin/systemctl start pulp_celerybeat.service
Redirecting to /bin/systemctl start httpd.service
Redirecting to /bin/systemctl start foreman-tasks.service
Done.

~]# ll
drwxr-xr-x. 3 root root    45 Oct 24 09:06 kjkjlk

So the requirements check needs to be done before anything else. In this case we could have some pre-check to verify that the supplied path starts with /tmp/ or /var/tmp, or something like that. Hostname-change has a similar pre-check function: https://github.com/Katello/katello-packaging/blob/master/katello/hostname-change.rb#L84
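
A pre-check along the lines suggested above could look like the following Python sketch. It only inspects the path string and creates nothing on disk. The allow-list and the function name `is_under_allowed_root` are assumptions made for illustration; this is not the check that was eventually shipped, and it does not resolve symlinks.

```python
import os

# Assumed allow-list, taken from the directories named in this report.
ALLOWED_ROOTS = ("/tmp", "/var/tmp")


def is_under_allowed_root(path, allowed_roots=ALLOWED_ROOTS):
    """Return True if `path` is one of the allowed roots or lies inside one.

    Only the path string is inspected; nothing is created on disk.
    """
    if not os.path.isabs(path):
        return False  # relative input must be resolved before calling this
    normalized = os.path.normpath(path)
    for root in allowed_roots:
        # commonpath compares whole components, so /tmpfoo does not match /tmp.
        if os.path.commonpath([normalized, root]) == root:
            return True
    return False


print(is_under_allowed_root("/var/tmp/backups"))  # True
print(is_under_allowed_root("kjkjlk"))            # False
```

Comparing whole path components rather than using a plain string prefix avoids accepting paths such as /tmpfoo, and normalizing first rejects inputs such as /tmp/../root.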

Comment 14 Bengt Giger 2018-01-15 15:12:22 UTC
Please do not mix up online and offline backup! Offline backup does not depend on write access for postgres user, it is a simple tar operation. 

After the upgrade to 6.2.13, we had several days of failed backups before we realized that the function validate_directory() is also used for offline backups. Backup tars are written by root, and because of the unnecessary check the whole backup process exited.
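
The distinction described in this comment can be expressed as a small Python sketch in which the postgres write check is selected only for online backups. The function name `required_checks` and the check names are hypothetical labels for illustration; they do not correspond to functions in the backup script.

```python
def required_checks(online):
    """Return the names of the directory checks to run for a backup mode.

    `online` is True for an online backup and False for an offline one.
    The check names are illustrative labels only.
    """
    # Per this comment, the backup tars are written by root in both modes.
    checks = ["writable_by_root"]
    if online:
        # Per this comment, only online backups depend on the postgres
        # user being able to write to the backup directory.
        checks.append("writable_by_postgres")
    return checks


print(required_checks(online=False))  # ['writable_by_root']
print(required_checks(online=True))   # ['writable_by_root', 'writable_by_postgres']
```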

Comment 17 David Leatherman 2018-03-28 15:01:49 UTC
Created attachment 1414270
Repoduced

I am experiencing the same exact issue with satellite-backup.

/usr/sbin/satellite-backup /backup/bkup --logical-db-backup -y >> $LOG 2>&1

Result in log:
Starting backup: 2018-03-28 02:00:04 +0000
Creating backup folder /backup/bkup/satellite-backup-20180328020004
sudo: sorry, you must have a tty to run sudo
****cancelled****
Postgres user needs write access to the backup directory
Please select a directory, such as /tmp or /var/tmp which allows Postgres write access
Cleaning up backup folder and starting any stopped services...
Redirecting to /bin/systemctl start mongod.service
Redirecting to /bin/systemctl start postgresql.service
Redirecting to /bin/systemctl start tomcat.service
Redirecting to /bin/systemctl start pulp_workers.service
Redirecting to /bin/systemctl start pulp_resource_manager.service
Redirecting to /bin/systemctl start pulp_streamer.service
Redirecting to /bin/systemctl start pulp_celerybeat.service
Redirecting to /bin/systemctl start httpd.service
Redirecting to /bin/systemctl start foreman-tasks.service
Done.
Wed Mar 28 02:00:22 UTC 2018

Comment 18 Peter Ondrejka 2018-09-05 11:43:46 UTC
Verified on Satellite 6.4 snap 20. katello-backup is now replaced by foreman-maintain backup, which handles the situation with the following check:

foreman-maintain backup online test
...
Check if the directory exists and is writable:                        [FAIL]
Postgres user needs write access to the backup directory 
Please allow the postgres user write access to /root/test/satellite-backup-2018-09-05-07-19-14 or choose another directory.

There is still a minor cleanup issue under certain circumstances (filed https://projects.theforeman.org/issues/24825 for it), though I don't think it should block verification here.
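
For readers wondering why a directory under /root fails such a check, here is a hedged Python sketch of testing whether a given user could write to a directory using only the classic owner/group/other permission bits. It is not how foreman-maintain implements its check; the function names are hypothetical, ACLs and SELinux are ignored, and the uid 26 and mode 0550 used in the demo are illustrative values.

```python
import os
import stat
import types


def mode_allows(st, uid, gids, wanted):
    """Return True if the permission bits in `st` grant `wanted` to the user.

    `wanted` is a mask in the "other" position, e.g. stat.S_IWOTH.
    """
    if uid == 0:
        return True  # simplification: root is not limited by these bits
    if st.st_uid == uid:
        shift = 6  # owner bits apply
    elif st.st_gid in gids:
        shift = 3  # group bits apply
    else:
        shift = 0  # other bits apply
    return ((st.st_mode >> shift) & wanted) == wanted


def user_can_write_dir(path, uid, gids):
    """Check write+search on `path` and search on every ancestor directory."""
    path = os.path.abspath(path)
    if not os.path.isdir(path):
        return False
    if not mode_allows(os.stat(path), uid, gids, stat.S_IWOTH | stat.S_IXOTH):
        return False
    parent = os.path.dirname(path)
    while True:
        if not mode_allows(os.stat(parent), uid, gids, stat.S_IXOTH):
            return False  # the user cannot traverse this ancestor
        if parent == os.path.dirname(parent):
            return True  # reached the filesystem root
        parent = os.path.dirname(parent)


# Synthetic example: a directory with mode 0550 owned by root cannot even be
# traversed by an unrelated user (uid and gid 26 are illustrative).
home_like = types.SimpleNamespace(st_mode=0o40550, st_uid=0, st_gid=0)
print(mode_allows(home_like, 26, {26}, stat.S_IXOTH))  # False
```

The ancestor walk is the relevant part: even a world-writable backup folder is unusable by another user if any directory above it denies search permission to that user.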

Comment 20 errata-xmlrpc 2018-10-16 15:28:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927