Bug 1465182

Summary: Add warning when using --online-backup
Product: Red Hat Satellite
Reporter: Mike McCune <mmccune>
Component: Backup & Restore
Assignee: Christine Fouant <cfouant>
Status: CLOSED ERRATA
QA Contact: Ales Dujicek <adujicek>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.2.10
CC: adujicek, bkearney, cduryee, dhawke, ehelms, jcallaha, mmccune, sjayapra
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: tfm-rubygem-katello-3.4.4
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-02-21 17:32:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Mike McCune 2017-06-26 23:09:54 UTC
We offer the --online-backup flag in 6.2, but running with this flag needs to print a warning, because there is a caveat with its use.

Satellite 6 uses two database systems, Postgres and Mongo. Some records exist in both Postgres and Mongo and need to remain consistent and in sync between the two systems.
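To make the consistency requirement concrete, here is a minimal sketch of what a mismatch between the two stores looks like. It is illustrative only: the function name `find_mismatches` and the ID values are hypothetical, and this is not how Satellite itself compares the databases.

```python
def find_mismatches(postgres_ids, mongo_ids):
    """Return the record IDs that are present in only one of the two stores."""
    pg, mg = set(postgres_ids), set(mongo_ids)
    return {
        "only_in_postgres": sorted(pg - mg),
        "only_in_mongo": sorted(mg - pg),
    }


# Hypothetical case: "repo-3" was created after the Mongo dump was taken
# but before the Postgres dump, so only the Postgres backup contains it.
report = find_mismatches(["repo-1", "repo-2", "repo-3"], ["repo-1", "repo-2"])
print(report)
```

A restore from two such dumps would bring back a Postgres record that points at data Mongo never saw.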

When you shut down all the services, you can ensure that no modifications occur to either database. The --online-backup flag keeps all services running, so data may be modified while the backups are in progress. The backup routine includes a basic check for whether the database was modified during the backup. If it was, the routine starts over and backs up the data again.
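The check-and-restart behaviour described above can be sketched roughly as follows. This is not the katello-backup implementation: `backup_with_recheck`, `FakeDatabase`, the fingerprint comparison, and the `max_attempts` cap are all assumptions made for illustration.

```python
def backup_with_recheck(take_fingerprint, run_backup, max_attempts=3):
    """Run a backup, repeating it if the data changed while it was running.

    take_fingerprint: callable returning a comparable summary of the data.
    run_backup: callable that performs one backup pass.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        before = take_fingerprint()
        run_backup()
        after = take_fingerprint()
        if before == after:
            return attempt  # nothing changed during this pass
    raise RuntimeError("data kept changing during %d attempts" % max_attempts)


class FakeDatabase:
    """Stand-in database that is written to during its first N backup passes."""

    def __init__(self, busy_passes):
        self.version = 0
        self.busy_passes = busy_passes

    def fingerprint(self):
        return self.version

    def backup(self):
        if self.busy_passes > 0:
            self.busy_passes -= 1
            self.version += 1  # simulate a concurrent write


db = FakeDatabase(busy_passes=1)
attempts = backup_with_recheck(db.fingerprint, db.backup)
print(attempts)  # 2: the first pass saw a write, the second was clean
```

The attempt cap is a design choice of this sketch. Without it, a database that is written to on every pass would keep the loop running indefinitely.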

This check is rudimentary and can't ensure with 100% certainty that there were no modifications to the Pulp or Postgres database while the backup was running. It can also loop repeatedly if the database is being modified continuously. Users who still wish to use the --online-backup flag in production will need to ensure that no modifications occur during the backup runs.

We need to add a warning that explains this, for example (the wording is open for discussion):

# katello-backup /var/tmp --online-backup

*** WARNING: The online backup flag is intended for making a copy of the data
*** for debugging purposes only. The backup routine can not ensure 100% consistency while the 
*** backup is taking place as there is a chance there may be data mismatch between 
*** Mongo and Postgres databases while the services are live. If you wish to utilize the --online-backup
*** flag for production use you need to ensure that there are no modifications occurring during 
*** your backup run. 
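For discussion, a minimal sketch of how such a warning could be gated on the flag. It is written in Python for illustration only and is not the katello-backup code; the names `build_parser`, `warning_for`, and the program name `backup-sketch` are hypothetical, and the wording is the draft proposed above.

```python
import argparse

WARNING_LINES = [
    "WARNING: The online backup flag is intended for making a copy of the data",
    "for debugging purposes only. The backup routine can not ensure 100% consistency while the",
    "backup is taking place as there is a chance there may be data mismatch between",
    "Mongo and Postgres databases while the services are live. If you wish to utilize the --online-backup",
    "flag for production use you need to ensure that there are no modifications occurring during",
    "your backup run.",
]


def build_parser():
    """Parser for a stand-in backup command with an --online-backup flag."""
    parser = argparse.ArgumentParser(prog="backup-sketch")
    parser.add_argument("backup_dir")
    parser.add_argument("--online-backup", action="store_true")
    return parser


def warning_for(argv):
    """Return the warning text if --online-backup was given, else an empty string."""
    args = build_parser().parse_args(argv)
    if not args.online_backup:
        return ""
    return "\n".join("*** " + line for line in WARNING_LINES)


print(warning_for(["/var/tmp", "--online-backup"]))
```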

Satellite 6.3 will offer LVM-based snapshots as a 'hot backup' function, and we can then remove this warning.

Comment 1 Mike McCune 2017-06-26 23:11:51 UTC
We also need to document possible mitigation steps that users of Satellite 6.2 can take to minimize the chances of data mismatch between Postgres and Mongo.

This would involve things like:

* Bulk disable all sync plans while --online-backup is running
* Disable pulp workers while --online-backup is running
* Possible firewall rules to block API access to pulp to ensure nothing is modifying it
* Disable any cron tasks that operate on pulp
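One way to structure these steps is to pair each "disable" action with its matching "re-enable" action and undo them in reverse order even if the backup fails. The sketch below shows only that structure. The function `run_with_mitigations` and the placeholder steps are hypothetical, and the actual commands for each step are deliberately left out.

```python
from contextlib import ExitStack


def run_with_mitigations(mitigations, backup):
    """Apply each (pause, resume) pair, run the backup, then resume in reverse.

    The resume callables run even if a later pause or the backup raises.
    """
    with ExitStack() as stack:
        for pause, resume in mitigations:
            pause()
            stack.callback(resume)  # callbacks run last-in, first-out on exit
        return backup()


log = []


def step(name):
    """Build a placeholder (pause, resume) pair that only records what it did."""
    return (lambda: log.append("pause " + name),
            lambda: log.append("resume " + name))


mitigations = [
    step("sync plans"),
    step("pulp workers"),
    step("pulp API access"),
    step("pulp cron tasks"),
]
run_with_mitigations(mitigations, lambda: log.append("backup"))
print("\n".join(log))
```

In real use, each placeholder would be replaced by whatever command disables and re-enables the corresponding item on the Satellite server.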

Comment 2 Chris Duryee 2017-06-27 00:58:46 UTC
Implementing comment #1 is very similar to https://bugzilla.redhat.com/show_bug.cgi?id=1420648. It may be possible to roll them both into one feature.

Comment 5 Christine Fouant 2017-07-10 19:47:22 UTC
Created redmine issue http://projects.theforeman.org/issues/20268 from this bug

Comment 6 Satellite Program 2017-07-10 20:06:30 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/20268 has been resolved.

Comment 8 Satellite Program 2017-08-03 22:08:21 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/20268 has been resolved.

Comment 14 Bryan Kearney 2017-10-18 19:48:15 UTC
clearing the needinfo.

Comment 15 Bryan Kearney 2018-02-21 17:32:20 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336