1726595 – [RFE] Speed up the restore process

Bug 1726595 - [RFE] Speed up the restore process

Summary: [RFE] Speed up the restore process

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Satellite
Classification:	Red Hat
Component:	Satellite Maintain
Sub Component:
Version:	6.3.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	6.8.0
Assignee:	Anurag Patel
QA Contact:	Lucie Vrtelova
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1725409
Depends On:
Blocks:

Reported:	2019-07-03 09:07 UTC by Mikel Olasagasti
Modified:	2023-10-06 18:24 UTC
CC List:	8 users
Fixed In Version:	rubygem-foreman_maintain-0.6.3
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1851133
Environment:
Last Closed:	2020-10-27 12:38:20 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Foreman Issue Tracker	28881	0	High	Closed	[RFE] Speed up the restore process	2021-01-12 12:47:51 UTC
Red Hat Product Errata	RHBA-2020:4365	0	None	None	None	2020-10-27 12:38:41 UTC

Description Mikel Olasagasti 2019-07-03 09:07:17 UTC

TL;DR: during a Pulp backup, save only the RPM files; once Satellite is restored, launch a job that creates one task per CV to regenerate its metadata, so that the work runs in parallel.

Description of problem:
Currently a user can choose between backing up the complete Satellite environment and skipping the Pulp content.

If the user decides to back up Pulp, pulp_data.tar will contain all the RPMs and all the symbolic links that have been created for Content Views.

In a large environment, with multiple CVs and multiple versions of each CV, pulp_data.tar can contain millions of symbolic links that have to be recreated during the restore. Reading a tar file is a linear, single-threaded process, and writing millions of symbolic links can take days, as it uses just one CPU core.
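To gauge how much of an archive is symbolic links before attempting a restore, the entries can be tallied by type. The following is a minimal sketch using Python's standard tarfile module; the function name count_member_types is an assumption made for illustration and is not part of foreman_maintain.

```python
import tarfile
from collections import Counter


def count_member_types(tar_path):
    """Stream through a tar archive once and tally its entries by type.

    Hypothetical helper for illustration; not part of foreman_maintain.
    """
    counts = Counter()
    with tarfile.open(tar_path, "r:*") as archive:
        # Iterating reads the headers sequentially, one entry at a time,
        # which mirrors why a restore cannot be spread over several cores.
        for member in archive:
            if member.issym():
                counts["symlink"] += 1
            elif member.isfile():
                counts["file"] += 1
            elif member.isdir():
                counts["dir"] += 1
            else:
                counts["other"] += 1
    return counts
```

Calling count_member_types("pulp_data.tar") returns a Counter whose "symlink" and "file" entries can be compared directly.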

If the user decides not to back up Pulp, pulp_data.tar will be empty. The restore will be much faster, but Satellite needs to sync all the content again. Any custom RPMs that were uploaded through the UI or hammer have to be uploaded again. Once all content is available, metadata has to be regenerated for each version of each CV.

One way to improve this would be to back up Pulp, but only the RPM data and not the symbolic links. With this method, restoring pulp_data.tar won't take as long as when the tar file contains symbolic links. Once Satellite is running again, an automated process should launch metadata regeneration for each of the CVs, prioritizing the ones that are promoted. This would create a task for each regeneration and allow the use of multiple CPUs, parallelizing the work and giving a much faster restore than when the symbolic links have to be created from the tar file.
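The "RPM data but no symbolic links" idea can be illustrated with the filter hook of Python's tarfile module. This is only a sketch of the technique, not how foreman_maintain builds pulp_data.tar; the names skip_symlinks and backup_without_symlinks are hypothetical.

```python
import tarfile


def skip_symlinks(member):
    """tarfile filter: returning None leaves the entry out of the archive."""
    return None if member.issym() else member


def backup_without_symlinks(source_dir, tar_path):
    """Archive source_dir, keeping regular files and directories only.

    Hypothetical helper for illustration; not part of foreman_maintain.
    """
    with tarfile.open(tar_path, "w") as archive:
        # add() recurses into source_dir and calls the filter per entry.
        archive.add(source_dir, arcname=".", filter=skip_symlinks)
```

An archive produced this way restores quickly, but the published Content View trees are absent until metadata regeneration recreates them.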

How reproducible:
Always

Steps to Reproduce:
1. Sync multiple RHEL releases
2. Create different CVs
3. Backup with full Pulp content
4. Restore with Pulp content

Actual results:
The restore starts creating millions of symbolic links from the tar file, using a single CPU with no parallelism. This process can take days.

Expected results:
A running Satellite is available sooner, even if in degraded mode until the recovery fully completes.
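The post-restore step proposed in the description (one task per CV version, promoted ones first) could be dispatched roughly as follows. This is a sketch under stated assumptions: regenerate is a hypothetical callable standing in for whatever actually triggers one metadata regeneration, and the (name, promoted) tuple shape is invented for illustration. In a real Satellite the heavy work would run in the task workers, not in these dispatch threads.

```python
from concurrent.futures import ThreadPoolExecutor


def regenerate_all(versions, regenerate, workers=4):
    """Run regenerate(name) for every CV version, promoted ones first.

    versions: iterable of (name, promoted) tuples (hypothetical shape).
    regenerate: hypothetical callable that performs one regeneration.
    """
    # sorted() is stable and False sorts before True, so keying on
    # "not promoted" moves promoted versions to the front while
    # keeping the original relative order within each group.
    ordered = sorted(versions, key=lambda version: not version[1])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks are submitted in priority order and run concurrently.
        futures = [(name, pool.submit(regenerate, name)) for name, _ in ordered]
        return {name: future.result() for name, future in futures}
```

Because promoted versions are queued first, the content that hosts actually consume becomes available earliest, which matches the "degraded mode until fully recovered" expectation.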

Comment 8 Mike McCune 2020-01-27 04:01:36 UTC

via an internal email thread:

"""
From: 	Hao Chang Yu <hyu>
To: 	Mike McCune <mmccune>

This may be related to the massive number of symlinks in the tar file, which causes high memory consumption.

It seems that the "-P" option can be used when extracting the tar file to prevent delayed link creation. This worked for the customer in Case No. 02461599.

More details on why extracting symlinks is slow:
https://bugzilla.redhat.com/show_bug.cgi?id=1759140

If this really works, then we need to add tar's "-P" option to the foreman_maintain restore script.
"""
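For reference, GNU tar's "-P" is the short form of "--absolute-names". A minimal sketch of an extraction command using it is shown below; the helper name, the archive path and the target directory are assumptions for illustration, and the exact command foreman_maintain runs is not shown in this report.

```python
import subprocess


def build_extract_command(tar_path, target_dir="/"):
    """Build a GNU tar extraction command that passes --absolute-names (-P).

    Hypothetical helper; the arguments used by foreman_maintain may differ.
    """
    return [
        "tar",
        "--extract",
        "--absolute-names",  # long form of -P
        "--file", tar_path,
        "--directory", target_dir,
    ]


def run_extract(tar_path, target_dir="/"):
    """Run the extraction and raise CalledProcessError if tar fails."""
    subprocess.run(build_extract_command(tar_path, target_dir), check=True)
```

Note that "-P" also stops tar from stripping leading "/" from member names, so it should only be used on archives from a trusted source, such as the Satellite's own backup.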

Comment 10 Bryan Kearney 2020-01-29 09:02:29 UTC

Upstream bug assigned to mmccune

Comment 11 Bryan Kearney 2020-01-29 09:02:32 UTC

Upstream bug assigned to mmccune

Comment 13 Bryan Kearney 2020-02-12 19:02:07 UTC

Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/28881 has been resolved.

Comment 15 Mike McCune 2020-04-06 17:13:40 UTC

*** Bug 1725409 has been marked as a duplicate of this bug. ***

Comment 19 errata-xmlrpc 2020-10-27 12:38:20 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.8 Satellite Maintenance Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4365
