Bug 1844713
| Summary: | Optimised Capsule Sync should skip syncing up-to-date repos | | |
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Pavel Moravec <pmoravec> |
| Component: | Capsule - Content | Assignee: | satellite6-bugs <satellite6-bugs> |
| Status: | CLOSED DUPLICATE | QA Contact: | Vladimír Sedmík <vsedmik> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.7.0 | CC: | dsynk, jsherril |
| Target Milestone: | Unspecified | Keywords: | Patch, Triaged |
| Target Release: | Unused | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-03-15 12:53:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
There is a typo in the patch:
+ caps_repo_hash.delete_if{|id,ts| not(sat_repo_hash.key?(id) or sat_repo_hash[id] < ts }
should be:
+ caps_repo_hash.delete_if{|id,ts| not(sat_repo_hash.key?(id)) or sat_repo_hash[id] < ts }
One issue with the suggested patch is that Katello currently uses a special 'speed up trick' to speed up content view publishes and promotes. In these cases the repository on the Katello server may actually be empty, and no units are really copied to it. We copy the metadata on disk so that the content appears to be there to clients and to capsules, but in reality Pulp has no knowledge of it. We might be able to add some workarounds to look at the repos that actually hold the content for their timestamp, but then things like repromoting an older version wouldn't trigger the capsule sync properly.

So this patch will work with some repos but not others and won't be fully reliable.

I think a better solution would be for Katello to track, for each capsule, the set of content view versions last synced for each LCE/CV. It could then compare that whenever it needed to sync and skip the sync at that time. That's a bit more work though.

(In reply to Justin Sherrill from comment #2)

I see /o\ . About tracking the CV/LCE/Capsule sync times: we might query foreman tasks for the latest time of a successful Capsule sync (optionally limited to a given CV/LCE). If that time is newer than Satellite's CV/LCE promote time, we can skip the sync. If it is older, or we don't know the time, let it sync.

I think we can close this in favour of https://bugzilla.redhat.com/show_bug.cgi?id=sat-smartsync .

*** This bug has been marked as a duplicate of bug 1899326 ***
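As a rough illustration of that timestamp comparison, a minimal Ruby sketch. The helpers `last_successful_capsule_sync_at` and `last_promote_at` are hypothetical placeholders for whatever Foreman-tasks query would end up being used; they are not existing Katello APIs.

```ruby
# Sketch only: both helpers below are hypothetical placeholders, not existing
# Katello/Foreman methods.
def capsule_sync_needed?(smart_proxy, content_view, environment)
  last_sync    = last_successful_capsule_sync_at(smart_proxy, content_view, environment)
  last_promote = last_promote_at(content_view, environment)

  # If we have never synced, or we cannot determine either timestamp,
  # play it safe and sync.
  return true if last_sync.nil? || last_promote.nil?

  # Skip the sync only when the capsule was synced after the last promote.
  last_sync < last_promote
end
```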
Description of problem:

User story: with a fully updated Capsule, invoking an (even Optimised) Capsule sync attempts to sync each and every repo to the Capsule. Since any no-op repo sync takes seconds to tens of seconds, this generates nontrivial redundant work on the Capsule. It also creates many redundant dynflow steps, which puts dynflowd under needless load at scale.

Satellite should skip syncing a Capsule repo whose sync time (or last unit added/removed time) is newer than on the Satellite - in that case we expect the repo to be up to date.

In particular, I think we should filter repos via code like:

```diff
--- a/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.14.0.20/app/lib/actions/katello/capsule_content/sync_capsule.rb
+++ b/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.14.0.20/app/lib/actions/katello/capsule_content/sync_capsule.rb
@@ -4,6 +4,18 @@ module Actions
     class SyncCapsule < ::Actions::EntryAction
       include Actions::Katello::PulpSelector
 
+      # auxiliary function for detecting up-to-date repos on a Capsule
+      # for an array with items [pulp_id, last_unit_added, last_unit_removed], it returns
+      # a hash h[pulp_id] = max(last_unit_added, last_unit_removed)
+      # while ignoring null values
+      def hash_repo_ts(data)
+        h = {}
+        data.each do |item|
+          h[item.first] = [item.second, item.third].grep(String).map(&:to_datetime).max
+        end
+        h
+      end
+
       # rubocop:disable MethodLength
       def plan(smart_proxy, options = {})
         action_subject(smart_proxy)
@@ -13,12 +25,27 @@ module Actions
         skip_metadata_check = options.fetch(:skip_metadata_check, false)
         smart_proxy_helper = ::Katello::SmartProxyHelper.new(smart_proxy)
 
+        # get details of Satellite's and Capsule's pulp repos to identify which Capsule repos are up to date
+        caps_repo_data = smart_proxy.smart_proxy_service.current_repositories_data(environment, content_view).pluck(:id, :last_unit_added, :last_unit_removed)
+        sat_repo_data = ::SmartProxy.default_capsule.smart_proxy_service.current_repositories_data(environment, content_view).pluck(:id, :last_unit_added, :last_unit_removed)
+        # build a hash of repos and their latest modification time - on Sat and Caps
+        sat_repo_hash = hash_repo_ts(sat_repo_data)
+        caps_repo_hash = hash_repo_ts(caps_repo_data)
+        # remove repos that were updated on Caps later than on Sat (or that don't exist on Sat)
+        # TODO: what if some datetime is null/empty?
+        caps_repo_hash.delete_if{|id,ts| not(sat_repo_hash.key?(id) or sat_repo_hash[id] < ts }
+        caps_older_repo_ids = caps_repo_hash.keys
+
         concurrence do
           smart_proxy_helper.repos_available_to_capsule(environment, content_view, repository).each do |repo|
-            plan_pulp_action([Actions::Pulp::Orchestration::Repository::SmartProxySync,
-                              Actions::Pulp3::CapsuleContent::Sync],
-                             repo, smart_proxy,
-                             skip_metadata_check: skip_metadata_check)
+            # invoke the repo sync only if skipping metadata check (i.e. full sync), or if repo is in the older Caps repos
+            if false==skip_metadata_check or repo.pulp_id.in?(caps_older_repo_ids)
+              plan_pulp_action([Actions::Pulp::Orchestration::Repository::SmartProxySync,
+                                Actions::Pulp3::CapsuleContent::Sync],
+                               repo, smart_proxy,
+                               skip_metadata_check: skip_metadata_check)
+            end
             if repo.is_a?(::Katello::Repository) &&
                repo.distribution_bootable? &&
```

The above code has one TODO: if last_unit_added = last_unit_removed = nil for a repo, the comparison in delete_if would fail. The nil values must be replaced by some very old datetime value - which requires some ruby-fu I don't have /o\ (see the sketch below).
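For that nil TODO, one possible approach (a sketch only, not part of the posted patch) is to fall back to a very old timestamp whenever a repo has neither last_unit_added nor last_unit_removed, so the delete_if comparison never sees nil:

```ruby
require 'date'

# Sketch of a nil-safe variant of hash_repo_ts: repos with no recorded
# last_unit_added / last_unit_removed fall back to the Unix epoch, so the
# later `<` comparison in delete_if always gets a real DateTime.
EPOCH = DateTime.new(1970, 1, 1)

def hash_repo_ts(data)
  data.each_with_object({}) do |(pulp_id, added, removed), h|
    h[pulp_id] = [added, removed].grep(String).map { |s| DateTime.parse(s) }.max || EPOCH
  end
end

# With that fallback, the corrected filter from the first comment cannot hit
# a nil comparison:
#   caps_repo_hash.delete_if { |id, ts| !sat_repo_hash.key?(id) || sat_repo_hash[id] < ts }
```

DateTime.parse is used here instead of ActiveSupport's String#to_datetime only to keep the snippet runnable outside Rails; inside the Katello codebase either works.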
I suggest (also in the code) skipping the up-to-date repos only for an Optimised Caps sync. A Full Caps sync (skip_metadata_check==true) should still attempt to sync all repos (in fact, my optimisation is a kind of metadata check, right?) - a small sketch of this gating follows at the end of this report.

Version-Release number of selected component (if applicable):
Sat 6.7.0

How reproducible:
100%

Steps to Reproduce:
1. Have a Sat and a Caps with many repos synced
2. Repeatedly invoke an Optimised Capsule sync
3. Check that each such sync invokes as many Actions::Pulp::Consumer::SyncCapsule dynflow steps as there are repos on the Caps
4. Check the duration of such a sync

Actual results:
3. The number of dynflow steps matches the number of repos on the Capsule
4. It takes many minutes to hours for a scaled environment to perform a no-op sync

Expected results:
3. The number of dynflow steps should correspond only to the repos whose content was updated on the Sat
4. On a fully updated Caps, an Optimised Caps sync should take only a minute or two

Additional info:
Some gotchas of the patch:
- It fires a "get me all repos" request to pulp on both the Sat and the Caps, while we should already have that info; at least the Caps info we have (afaik app/views/katello/api/v2/capsule_content/sync_status.json.rabl already calls the current_repositories_data method before my code)
- There might be room for improvement in the wide requests my patch performs
- The patch relies on the last_unit_added and last_unit_removed pulp repo metadata. I am not sure how reliable this is, or whether there are plans to deprecate it. Fetching individual repo details, one could get last_sync details for each importer - but that would be overkill, I think.
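For clarity, the intended per-repo decision described above, restated in isolation (a sketch following the prose, not the exact line from the diff; `caps_older_repo_ids` is the list built from the repo-timestamp hashes in the patch):

```ruby
# Sketch only: should a single repo be planned for sync?
# A Full ("complete") sync always syncs; an Optimised sync skips repos whose
# content on the Capsule is already at least as new as on the Satellite.
def plan_repo_sync?(repo, skip_metadata_check, caps_older_repo_ids)
  skip_metadata_check || caps_older_repo_ids.include?(repo.pulp_id)
end
```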