Description of problem:
When a Content View (CV) has a filter that includes or excludes thousands of errata, publishing the CV takes too long (2+ minutes for each repo in the CV that has the filter applied). For example, for a CV with 10 repos with such filters, planning the task takes approx. 30 minutes, and executing it (including CopyRpm or DistributorPublish) then takes just a few minutes.

This is bad for two reasons:
1) Overall performance is poor, because planning the task takes several times longer than executing it.
2) The user sees practically nothing for most of the task lifecycle. When checking why the publish task takes so long, the task details are empty, since the task is still in planning.

The code that takes so long:
https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/clone_yum_content.rb#L17 (clause_gen.generate)
which is this method:
https://github.com/Katello/katello/blob/master/app/lib/katello/util/filter_clause_generator.rb#L9-L12

The method is inefficient for arguments containing thousands of errata.

A workaround exists today: use the opposite filter, i.e. instead of "include all errata older than a month", use "exclude any newer errata" (and deal with packages outside errata). However, this workaround will become less and less applicable as the overall number of errata in a repo grows over time.

Version-Release number of selected component (if applicable):
Sat 6.2.12

How reproducible:
100%

Steps to Reproduce:
1. Sync several bigger repos with many errata.
2. Create a CV and add all the repos to it.
3. Add a filter "include all errata older than date X.Y.", with the date about a month in the past, so that most errata are included in the CV.
4. Click to publish the CV.
5. Check how long it takes to publish the CV (and when the task leaves the planning phase and starts executing the very first step).

Actual results:
5. CV publish takes 30+ minutes; most of the time is spent in planning (the very first dynflow step is kicked off only after a long time).

Expected results:
5. A reasonably short planning phase.

Additional info:
Add some debugging statements around the line https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/clone_yum_content.rb#L17 to see that the delay is right there.
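
The debugging statements suggested under "Additional info" could look roughly like the sketch below. It is only an illustration: the `timed` helper is hypothetical, the `sleep` block is a stand-in for the real `clause_gen.generate` call, and inside Katello one would presumably log to `Rails.logger` rather than to a standalone `Logger`.

```ruby
require 'benchmark'
require 'logger'

# Hypothetical helper (not part of Katello): runs the block, logs how long
# it took in wall-clock seconds, and returns the block's result unchanged.
def timed(label, logger)
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  logger.info(format('%s took %.2f s', label, elapsed))
  result
end

logger = Logger.new($stdout)

# Stand-in for the slow call; in clone_yum_content.rb the block would wrap
# clause_gen.generate instead of sleeping.
clauses = timed('clause_gen.generate', logger) do
  sleep 0.1
  { 'placeholder' => true }
end
```

Because the helper returns the block's value, it can wrap the existing call without changing what the surrounding code receives.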
It is also worth knowing that the errata list is handled quite inefficiently when the CV has multiple repos. Assume a CV with one big repo (e.g. RHEL6 6Server) and several small repos, with the errata filter "include every erratum older than today" applied. After adding the debug statements per "Additional info", one can see that publishing the CV spends the *same* (surprisingly long) time in the method at https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/clone_yum_content.rb#L17 for *each and every* repo the errata filter applies to, even if the repo has just a few errata. So the calculation time is disproportionate for small repos (when a big repo is present as well), and it seems the calculation is repeated from scratch for each repo in the CV.
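
To make the "repeated for each repo" observation concrete, here is a toy model of the cost pattern and one possible way around it. This is not Katello code and not necessarily how the upstream fix works: every name below is hypothetical, and it assumes that the expensive part (resolving the filters into a list of errata) does not depend on the repo, which is what the identical per-repo timings suggest but do not prove.

```ruby
# Toy model, not Katello code: every name below is hypothetical.
class ErrataClauseCache
  attr_reader :computations

  def initialize
    @cache = {}
    @computations = 0
  end

  # Returns the cached value for this set of filter ids, running the block
  # only the first time the set is seen.
  def fetch(filter_ids)
    key = filter_ids.sort
    return @cache[key] if @cache.key?(key)

    @computations += 1
    @cache[key] = yield
  end
end

# Stand-in for the expensive, assumed repo-independent part of the work.
def resolve_errata(filter_ids)
  filter_ids.map { |id| "errata-for-filter-#{id}" }
end

cache = ErrataClauseCache.new
filter_ids = [3, 1, 2]
repos = %w[rhel6-server small-repo-a small-repo-b]

# Every repo asks for the same filter set, but the block runs only once.
per_repo = repos.map do |repo|
  [repo, cache.fetch(filter_ids) { resolve_errata(filter_ids) }]
end.to_h

puts "resolved #{cache.computations} time(s) for #{repos.size} repos"
```

Without the cache, the stand-in would run once per repo, which matches the behaviour described above where small repos pay the same cost as the big one.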
I can replicate this, and I also have several customers who are seeing it. This is turning Satellite 6 into something that is not very useful.
Connecting redmine issue http://projects.theforeman.org/issues/21727 from this bug
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/21727 has been resolved.
Re-proposing for 6.3, as this has a high impact.
Partha, would you mind taking a look at cherry-picking this?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336