Bug 1315326 - Capsule sync redundantly generates metadata for all repos
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Repositories
x86_64 Linux
Priority: medium  Severity: high
: GA
: --
Assigned To: Tomas Strachota
Katello QA List
: Triaged
Depends On:
Blocks: 1327338
Reported: 2016-03-07 08:35 EST by Pavel Moravec
Modified: 2017-08-22 07:31 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1327338
Last Closed: 2016-05-06 14:48:07 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

External Trackers
Tracker ID Priority Status Summary Last Updated
Foreman Issue Tracker 14807 None None None 2016-04-26 13:06 EDT

Description Pavel Moravec 2016-03-07 08:35:40 EST
Description of problem:
Invoking capsule sync:
1) Sat orders capsule to sync _all_ repositories in _all_ content views / lifecycle environments to the capsule
2) for every such repository, pulp on the capsule generates new repo metadata

Assume a use case where Sat has a few hundred repositories in different content views, and only one repo might need to be synced (because there are doubts whether it was synced properly). A capsule sync would then do a great deal of redundant work, taking hours.

To see the scope of the ridiculous work being done:
- assume a use case where large repos (say the RHEL 5-7 base repos) are present in many content views as a base; repo metadata takes nontrivial time to compute, and that time is multiplied by the number of repos
- particular example: the customer behind this bug has 1250 repos, and a capsule sync takes 3-4 hours(!) doing nothing useful

Please optimize either step 1 or step 2. While I understand that Sat does not know which repos need to be synced to the capsule and which do not (i.e. 1 sounds legit), there should be an option to, e.g., fetch the metadata from Sat to the Capsule, compare the two copies, and do nothing if they are the same (and do the sync if they differ or the metadata is missing on the Capsule).
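A minimal sketch of the comparison proposed above, not how Satellite or Pulp actually behaves. It assumes both copies of the repodata are readable as local directories (a real Capsule would fetch the Satellite copy over HTTPS) and that identical content yields a byte-identical repomd.xml on both sides, which may not hold if the Capsule regenerates metadata with new timestamps. The function names `repomd_digest` and `needs_sync` are hypothetical.

```python
import hashlib
from pathlib import Path


def repomd_digest(repodata_dir):
    """Return the SHA-256 hex digest of repomd.xml, or None if it is missing."""
    repomd = Path(repodata_dir) / "repomd.xml"
    if not repomd.is_file():
        return None
    return hashlib.sha256(repomd.read_bytes()).hexdigest()


def needs_sync(satellite_repodata, capsule_repodata):
    """Sync only when the Capsule copy is missing or differs from the Satellite copy."""
    capsule = repomd_digest(capsule_repodata)
    return capsule is None or capsule != repomd_digest(satellite_repodata)
```

With such a check, a repeated capsule sync over unchanged repos would reduce to one small file comparison per repo instead of a full metadata regeneration.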

Version-Release number of selected component (if applicable):
Sat 6.1.7

How reproducible:

Steps to Reproduce:
1. Have multiple pulp repos enabled (e.g. have the sat61-tools repo in 10 published content views)
2. Run capsule sync repeatedly, without any repo / content view manipulation in the meantime
3. Compare the execution time of the 1st sync with that of the subsequent syncs
4. ll /var/lib/pulp/published/yum/https/repos/Default_Organization/Library/ContentViewName1/content/dist/rhel/server/7/7Server/x86_64/sat-tools/6.1/os/repodata/
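The check in step 4 can also be scripted instead of eyeballing `ll` output. This is a rough sketch with hypothetical helper names (`snapshot`, `regenerated`): take a snapshot of the repodata directory before a capsule sync and another one after it, then compare them.

```python
import os


def snapshot(repodata_dir):
    """Map each file name in repodata_dir to its (mtime in ns, size in bytes)."""
    result = {}
    with os.scandir(repodata_dir) as entries:
        for entry in entries:
            if entry.is_file():
                st = entry.stat()
                result[entry.name] = (st.st_mtime_ns, st.st_size)
    return result


def regenerated(before, after):
    """Return the sorted names of files that were added, removed, or rewritten."""
    names = before.keys() | after.keys()
    return sorted(name for name in names if before.get(name) != after.get(name))
```

A non-empty result from `regenerated(before, after)` after a sync with no content changes means the repodata was rewritten.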

Actual results:
3. The syncs take the same or very similar, nontrivial time
4. repodata is recalculated every time

Expected results:
3. The sync takes little time if there is no work to be done
4. repodata is not updated / recalculated every time (only when there is a need to)

Additional info:
Comment 2 Tomas Strachota 2016-04-06 05:15:14 EDT
After a discussion with Ina Panova from the Pulp team, we found out that Satellite creates all repository distributors with auto-publish set to true. As a result, the publish action (which regenerates the metadata) is executed on every sync.
We should be able to fix the issue by turning auto-publish off and publishing repos from the Satellite side only if some content was synced.
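The idea can be sketched as follows. This is only an illustration of the control flow, not the Katello or Pulp API: `sync` and `publish` are hypothetical callables, and `sync` is assumed to return the number of units it added, updated, or removed.

```python
def sync_and_maybe_publish(repo, sync, publish):
    """Sync repo and publish it only when the sync changed some content.

    sync(repo) is assumed to return the number of changed units;
    publish(repo) is assumed to regenerate the repo metadata.
    Returns True if a publish was triggered, False otherwise.
    """
    changed = sync(repo)
    if changed > 0:
        publish(repo)
        return True
    return False
```

With auto-publish off, a sync that brings in no content skips the publish step, so the metadata is not regenerated.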
Comment 4 Tomas Strachota 2016-04-26 03:56:56 EDT
Created redmine issue http://projects.theforeman.org/issues/14807 from this bug
Comment 5 Bryan Kearney 2016-04-26 04:14:44 EDT
Upstream bug component is Repositories
Comment 7 Tomas Strachota 2016-04-29 11:22:52 EDT
I'm removing the linked upstream issue. Optimizations have been made on Pulp's side, and the metadata generation process is much faster now. Therefore it's not necessary to skip it upstream.
Comment 8 Bryan Kearney 2016-04-29 12:13:26 EDT
Moving to POST since upstream bug http://projects.theforeman.org/issues/14807 has been closed
Comment 9 Brad Buckingham 2016-05-03 09:52:00 EDT
Hi Tomas,  It appears that the upstream PR has been closed.  Will there be additional changes to address this issue in Satellite 6.2?  If not, should we move this to ON_QA or CLOSED?
Comment 10 Tomas Strachota 2016-05-06 06:19:21 EDT
Hi Brad, this is a Sat 6.1-only issue. Upstream and 6.2 are not affected. That's why I closed the upstream PR. I didn't know about the optimizations that had been done in Pulp at the time I was writing the upstream patch. Please see the discussion in the upstream PR for details.

This change makes sense only in Sat 6.1, which has an older Pulp and a different approach to handling capsule content synchronization. I also don't think we should track this as sat-6.2.0+.
Comment 11 Brad Buckingham 2016-05-06 14:48:07 EDT
Tomas,  Thanks!  Based on the feedback, I am going to close this bug on 6.2.  The fix/plans for 6.1.z will be tracked in the associated clone bug 1327338.
