Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1844713

Summary: Optimised Capsule Sync should skip syncing up-to-date repos
Product: Red Hat Satellite Reporter: Pavel Moravec <pmoravec>
Component: Capsule - ContentAssignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE QA Contact: Vladimír Sedmík <vsedmik>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.7.0CC: dsynk, jsherril
Target Milestone: UnspecifiedKeywords: Patch, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-03-15 12:53:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pavel Moravec 2020-06-06 13:59:50 UTC
Description of problem:
User story: having a fully updated Capsule, invoking an (even optimised) Capsule sync attempts to sync each and every repo to the Capsule. Since any no-op repo sync takes seconds to tens of seconds each, this generates a nontrivial redundant work to the Capsule. Also, redundantly many dynflow steps are created, which puts dynflowd under redundant scaled load.

Satellite should skip syncing a Capsule repo with a sync time (or last unit added/remove) that is newer than on Sat - then we expect the repo is up to date.

In particular, I think we should filter repos via code like:

--- a/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.14.0.20/app/lib/actions/katello/capsule_content/sync_capsule.rb
+++ b/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.14.0.20/app/lib/actions/katello/capsule_content/sync_capsule.rb
@@ -4,6 +4,18 @@ module Actions
       class SyncCapsule < ::Actions::EntryAction
         include Actions::Katello::PulpSelector
 
+	# auxiliary function for detecting up-to-date repos on a Capsule
+	# for an array with items [pulp_id, last_unit_added, last_unit_removed], it returns
+	# a hash h[pulp_id] = max(last_unit_added, last_unit_removed)
+	# with ignoring null values
+	def hash_repo_ts(data)
+  	  h = {}
+	  data.each do |item|
+            h[item.first] = [ item.second, item.third ].grep(String).map(&:to_datetime).max
+          end
+          h
+        end
+
         # rubocop:disable MethodLength
         def plan(smart_proxy, options = {})
           action_subject(smart_proxy)
@@ -13,12 +25,27 @@ module Actions
           skip_metadata_check = options.fetch(:skip_metadata_check, false)
 
           smart_proxy_helper = ::Katello::SmartProxyHelper.new(smart_proxy)
+
+          # get details of Satellite's and Capsule's pulp repos to identify which Capsule's ones are up to date
+          caps_repo_data = smart_proxy.smart_proxy_service.current_repositories_data(environment, content_view).pluck(:id, :last_unit_added, :last_unit_removed)
+          sat_repo_data = ::SmartProxy.default_capsule.smart_proxy_service.current_repositories_data(environment, content_view).pluck(:id, :last_unit_added, :last_unit_removed)
+	  # build a hash of repos and their latest modification time - on Sat and Caps
+          sat_repo_hash = hash_repo_ts(sat_repo_data)
+	  caps_repo_hash = hash_repo_ts(caps_repo_data)
+	  # remove repos that were updated on Caps later than on Sat (or that dont exist on Sat)
+	  # TODO: what if some datetime is null/empty?
+          caps_repo_hash.delete_if{|id,ts| not(sat_repo_hash.key?(id) or sat_repo_hash[id] < ts }
+	  caps_older_repo_ids = caps_repo_hash.keys
+
           concurrence do
             smart_proxy_helper.repos_available_to_capsule(environment, content_view, repository).each do |repo|
-              plan_pulp_action([Actions::Pulp::Orchestration::Repository::SmartProxySync,
-                                Actions::Pulp3::CapsuleContent::Sync],
-                                 repo, smart_proxy,
-                                 skip_metadata_check: skip_metadata_check)
+	      # invoke the repo sync only if skipping metadata check (i.e. full sync), or if repo in older Caps repos
+	      if false==skip_metadata_check or repo.pulp_id.in?(caps_older_repo_ids)
+                plan_pulp_action([Actions::Pulp::Orchestration::Repository::SmartProxySync,
+                                  Actions::Pulp3::CapsuleContent::Sync],
+                                   repo, smart_proxy,
+                                   skip_metadata_check: skip_metadata_check)
+              end
 
               if repo.is_a?(::Katello::Repository) &&
                 repo.distribution_bootable? &&


The above code has one TODO: if last_unit_added = last_unit_removed = nil for a repo, then the comparison in delete_if would fail. nil values must be replaced by some very old datetime value - which requires some ruby-fu I dont have /o\.

I suggest (also in the code) to skip the up-to-date repos only for Optimised Caps sync. Full Caps sync (skip_metadata_check==true) should still attempt to sync all repos (in fact my optimisation is kind of metadata check, right?).


Version-Release number of selected component (if applicable):
Sat 6.7.0


How reproducible:
100%


Steps to Reproduce:
1. Have a Sat and Caps with many repos synced
2. Repeatedly invoke Optimised Capsule sync
3. Check that each such sync invokes as many Actions::Pulp::Consumer::SyncCapsule dynflow steps as the # of repos on Caps are
4. Check duration of such sync


Actual results:
3. # of the dynflow steps matches with # of repos on the Capsule
4. it takes many minutes to hours for a scaled environment to perform a no-op sync


Expected results:
3. # of the dynflow steps should correspond to the repos having an updated content on Sat, only
4. On a fully updated Caps, an Optimised Caps Sync should take a minute or two only


Additional info:
Some gotchas of the patch:
- it fires "get me all repos" request to pulp both on Sat and Caps, while we should already have that info; at least Caps info we have (afaik app/views/katello/api/v2/capsule_content/sync_status.json.rabl does call current_repositories_data method before my code, already) - there might be a space for improvement of the wide requests my patch performs
- the patch relies on last_unit_added and last_unit_removed pulp repo metadata. Not sure how reliable this is or if there are no plans to deprecate it. Fetching individual repo details, one could get last_sync details for each importer - but that would be an overkill, I think.

Comment 1 Pavel Moravec 2020-06-08 10:56:06 UTC
There is a typo in the patch:


+          caps_repo_hash.delete_if{|id,ts| not(sat_repo_hash.key?(id) or sat_repo_hash[id] < ts }

should be:

+          caps_repo_hash.delete_if{|id,ts| not(sat_repo_hash.key?(id)) or sat_repo_hash[id] < ts }

Comment 2 Justin Sherrill 2020-06-25 17:22:34 UTC
one issue with the suggested patch is that katello currently uses a special 'speed up trick' to speed up content view publishes and promotes.  In these cases the repository on the katello server may actually be empty, and no units are really copied to it.  We copy the metadata on disk so that it appears that the content is there to clients and to capsules, but in reality pulp has knowledge of it.    We might could add some workarounds to look at the repos that actually hold the content for their timestamp, but then things like repromoting an older version wouldn't trigger the capsule sync properly.  

So this patch will work with some repos but not others and won't fully be reliable.  

I think a better solution would be for katello to track for each capsule the set of Content view versions last synced for each LCE/CV.  It could then compare that when it needed to sync and ignore the sync at that time.  Thats a bit more work though.

Comment 3 Pavel Moravec 2020-06-25 19:14:42 UTC
(In reply to Justin Sherrill from comment #2)
> one issue with the suggested patch is that katello currently uses a special
> 'speed up trick' to speed up content view publishes and promotes.  In these
> cases the repository on the katello server may actually be empty, and no
> units are really copied to it.  We copy the metadata on disk so that it
> appears that the content is there to clients and to capsules, but in reality
> pulp has knowledge of it.    We might could add some workarounds to look at
> the repos that actually hold the content for their timestamp, but then
> things like repromoting an older version wouldn't trigger the capsule sync
> properly.  
> 
> So this patch will work with some repos but not others and won't fully be
> reliable.  
> 
> I think a better solution would be for katello to track for each capsule the
> set of Content view versions last synced for each LCE/CV.  It could then
> compare that when it needed to sync and ignore the sync at that time.  Thats
> a bit more work though.

I see /o\ .

About the tracking CV/LE/Caps sync times: we might query foreman tasks for the latest time of successful Caps sync (optionally limited to given CV/LE). If that time is newer than Satellite's CV/LE promote time, we can skip it. If older or we dont know the time, let sync it. ?

Comment 5 Pavel Moravec 2021-03-15 12:53:15 UTC
I think we can close this in favour of https://bugzilla.redhat.com/show_bug.cgi?id=sat-smartsync .

*** This bug has been marked as a duplicate of bug 1899326 ***