1532147 – katello-backup needs DBs running - not checked, doc says contrary

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Satellite
Classification:	Red Hat
Component:	Satellite Maintain
Sub Component:
Version:	Unspecified
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	6.4.0
Assignee:	Martin Bacovsky
QA Contact:	Lukáš Hellebrandt
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-01-08 07:32 UTC by Pavel Moravec
Modified:	2021-03-11 16:50 UTC (History)
CC List:	11 users (show)
Fixed In Version:	rubygem-foreman_maintain-0.2.4
Doc Type:	Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-10-16 19:28:50 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Foreman Issue Tracker	24366	0	Normal	Closed	mongo_db_up check cannot start mongo34	2020-04-03 10:37:29 UTC
Red Hat Knowledge Base (Solution)	3312651	0	None	None	None	2018-01-08 08:05:05 UTC

Description Pavel Moravec 2018-01-08 07:32:27 UTC

Description of problem:
When katello-backup is run with postgresql and mongod down, a few issues can be hit:

1) "foreman-rake plugin:list" fails to collect plugin list and raises error "PG::Error: could not connect to server"

2) when running with --online-backup, katello-backup fails to call pg_dump* and mongodump commands.


Since our documentation in:

https://access.redhat.com/documentation/en-us/red_hat_satellite/6.2/html-single/server_administration_guide/#sect-Red_Hat_Satellite-Server_Administration_Guide-Backup_and_Disaster_Recovery-Backing_up_Satellite_Server_or_Capsule_Server

recommends running the backup after running katello-service stop, we are effectively instructing *every* customer to make the tool fail.


I suggest that the backup tool ensure the DB services are running when it starts.



Version-Release number of selected component (if applicable):
Sat 6.2.13 / katello-common-3.0.0-31.el7sat.noarch

How reproducible:
100%


Steps to Reproduce:
1. katello-service stop
2. katello-backup -y /tmp
3. katello-backup -y /tmp --online-backup


Actual results:
2. raises error "PG::Error: could not connect to server:" and fails to collect plugins list (metadata.yml in archive will miss it)
3. fails to dump any DB


Expected results:
2. not to raise an error and to collect plugin list in metadata.yml
3. to dump all DBs


Additional info:

Comment 2 Pavel Moravec 2018-01-08 07:50:59 UTC

katello-backup IMHO needs to ensure that the postgresql and mongod services are running - we can't assume this prior to a katello-backup run in any case.

I.e. removing the notice about "run katello-service stop prior to a backup" from the Doc isn't sufficient - there can still be customers and scenarios where the backup tool is run with all services, or just the DB services, down.
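
A minimal sketch of the kind of pre-flight check proposed here, assuming systemd-managed services named postgresql and mongod (the names used in this comment). CHECK_CMD and inactive_services are hypothetical names for illustration and are not part of katello-backup.

```shell
# Sketch only: report which of the given services are not active.
# CHECK_CMD is a hypothetical override hook; by default systemctl is asked.
CHECK_CMD=${CHECK_CMD:-systemctl is-active --quiet}

inactive_services() {
    for svc in "$@"; do
        # $CHECK_CMD is left unquoted on purpose so it splits into command + args.
        if ! $CHECK_CMD "$svc"; then
            echo "$svc"
        fi
    done
}

# Possible usage before starting a backup (service names are assumptions):
#   down=$(inactive_services postgresql mongod)
#   [ -z "$down" ] || { echo "Not running: $down" >&2; exit 1; }
```

Whether the tool should abort or start the missing services itself is a design decision this sketch leaves open.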

Comment 4 Bengt Giger 2018-01-15 15:23:38 UTC

Please note that the introduction of the "-y" option made non-interactive offline backup scripts break! The backup continued to the offline backup procedure without stopping the databases, leading to backups that are probably corrupt. Some days passed before we became aware of the issue; had a restore been necessary, the results would have been disastrous.

Comment 5 Pavel Moravec 2018-01-16 08:02:30 UTC

(In reply to Bengt Giger from comment #4)
> Please note that the introduction of the "-y" option made non interactive
> offline backup scripts break! The backup continued to the offline backup
> procedure without stopping the databases, leading to backups probably being
> corrupt. Some days passed until we were aware of the issue, should a restore
> have been necessary, the results would have been disastrous.

Good point.

Currently, our documentation says "katello-service stop" must be run before running the backup, so that the databases will be stopped.

BUT:
- when katello-service (or just the DBs) is stopped, "foreman-rake plugin:list" fails and the plugin list is not collected
- when the DBs are left running, their backup can be inconsistent


Therefore katello-backup needs:
- running DBs when calling "foreman-rake plugin:list"
- stopped DBs when taking an offline backup
- the above to work regardless of the service status at the beginning
- a relevant Doc update (is "katello-service stop" really needed?)


Christine, do you need an extra BZ for this different but related flaw (stop DBs before taking offline backup)? I would vote to deal with it within this BZ to cover all scenarios wrt. "above working regardless of services status at the beginning".
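
The ordering implied by the requirements in this comment can be sketched as below. This is only an illustration: the step names are descriptive labels, not real katello-backup internals, offline_backup_plan is a hypothetical function, and returning the services to their initial state at the end is an assumption that the comment does not state.

```shell
# Sketch only: print the order of steps for an offline backup, given whether
# the DB services were running when the tool started ("up" or "down").
offline_backup_plan() {
    initial_state=$1
    # The plugin list can only be collected while the DBs are running.
    [ "$initial_state" = "up" ] || echo "start databases"
    echo "collect plugin list"
    # The offline copy is only consistent while the DBs are stopped.
    echo "stop databases"
    echo "copy database files"
    # Assumption: leave the services as they were found.
    [ "$initial_state" = "up" ] && echo "start databases"
    return 0
}

# Possible usage:
#   offline_backup_plan down
```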

Comment 6 Christine Fouant 2018-04-04 18:11:45 UTC

(In reply to Pavel Moravec from comment #0)

Hi Pavel - the docs specifically state not to stop services beforehand if you are running the backup script, but I will put a couple of measures in place to ensure that the necessary services are not down.

Comment 7 Christine Fouant 2018-04-04 19:17:06 UTC

Created redmine issue http://projects.theforeman.org/issues/23124 from this bug

Comment 12 Kavita 2018-08-16 11:42:05 UTC

This change is available in foreman_maintain-0.2.6 so marking it as ON_QA.

Comment 13 Lukáš Hellebrandt 2018-08-29 13:33:49 UTC

FailedQA with Sat 6.4 snap 19.

The results differ based on service status before running the backup:

# mkdir /tmp/bup
# rm -rf /tmp/bup/*
# katello-service stop
# foreman-maintain backup online /tmp/bup
# katello-service restart
# foreman-maintain backup online /tmp/bup
# diff /tmp/bup/*/metadata.yml

Comment 15 Martin Bacovsky 2018-09-14 09:04:24 UTC

I checked the differences and what I see is:

# diff /tmp/bup/*/metadata.yml
18c18,28
< proxy_features: ''
---
> proxy_features:
> - ansible
> - discovery
> - dynflow
> - logs
> - openscap
> - pulp
> - puppet
> - puppetca
> - ssh
> - tftp

That means the proxy was down and the list of proxy features couldn't be queried. This data is not used in restore, so this is technically a valid backup, and the issue originally reported was IMO fixed. To make the current state clearer, we could add a note such as "Internal proxy is down features couldn't be listed" to the metadata, but I'd suggest doing that in a separate low-prio issue. Is that an acceptable solution?

Also, I don't think we should enforce a proxy start prior to the backup, as there may be reasons for the service being down (data integrity).
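
For anyone who wants to spot such backups, here is a small sketch that checks a metadata.yml for the empty proxy feature list shown in the diff in this comment. It assumes the empty case is always written exactly as the line "proxy_features: ''"; proxy_features_empty is a hypothetical helper name.

```shell
# Sketch only: succeed when metadata.yml recorded an empty proxy feature list.
proxy_features_empty() {
    # Match the exact line seen in the diff: proxy_features: ''
    grep -q "^proxy_features: ''\$" "$1"
}

# Possible usage (the path is a placeholder):
#   if proxy_features_empty /path/to/backup/metadata.yml; then
#       echo "proxy was down during backup; feature list not recorded"
#   fi
```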

Comment 16 Lukáš Hellebrandt 2018-09-17 14:37:19 UTC

That makes sense.

Verified with Sat 6.4 snap 21.

Tried online backup, offline backup, restore. Checked plugin list presence.

Comment 17 Bryan Kearney 2018-10-16 19:28:50 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927