Description of problem:
Under unknown circumstances, and seemingly at _random_, the CV filter "exclude errata issued after date .." is ignored during CV publish when dependency solving is enabled.
Particular reproducer (sporadic occurrence, though):
Having a CV with:
- RHEL8 BaseOS
- RHEL8 AppStream
- RHEL8 Supplementary
(probably the more repos, the bigger the chance to reproduce)
Having dependency solving _enabled_ (this is crucial)
Then having two exclude filters:
- Exclude errata issued after 2021-10-01 (applied on all repos)
- Exclude puppet and puppet-agent packages (of all versions, from all repos)
- this *might* be irrelevant filter
and publishing the CV, it *sporadically* leaves RHEL8.5 errata with *some* content in it. E.g.:
RHBA-2021:4431 (issued on 2021-11-05)
are in, but e.g. device-mapper-event-1.02.177-10.el8.x86_64 (that is dependent on the above ones) is removed (which sounds correct).
Thus the CV - despite dependency solving being enabled - ends up with inconsistent content, with broken dependencies among packages.
The cause is that the pulp repo for the CV _version_ is published (DistributorPublish step) _before_ the unwanted content is unassociated (PurgeEmptyContent). And the "versioned" repo is then used as the source for the "CV in Library" repo during that repo's publish.
Anyway, the above sequence of steps happens every time, but we can reproduce the problem only sporadically - there must be some other factor contributing to it (or just an unlucky race condition?)
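To illustrate why the step ordering matters, here is a toy model only (plain Python sets; `publish`, `purge` and `run` are hypothetical names, not Pulp or Katello APIs, and "older-erratum" is a placeholder). It assumes that "publish" snapshots whatever is associated with the repo at that moment and that "purge" unassociates units afterwards:

```python
# Toy model: a repo is a set of associated units, publishing snapshots it.
# All names here are hypothetical and only illustrate the ordering problem.

def publish(repo_units):
    """Snapshot the currently associated units as 'published metadata'."""
    return frozenset(repo_units)


def purge(repo_units, unwanted):
    """Unassociate the unwanted units from the repo (in place)."""
    repo_units.difference_update(unwanted)


def run(order):
    """Run the two steps in the given order; return (metadata, db_state)."""
    repo = {"older-erratum", "RHBA-2021:4431"}
    unwanted = {"RHBA-2021:4431"}  # filtered out by the date filter
    metadata = frozenset()
    for step in order:
        if step == "publish":
            metadata = publish(repo)
        elif step == "purge":
            purge(repo, unwanted)
    return metadata, frozenset(repo)


# Ordering seen in the task: publish first, purge afterwards.
observed_meta, observed_db = run(["publish", "purge"])
# Suggested ordering: purge first, publish afterwards.
fixed_meta, fixed_db = run(["purge", "publish"])

print(sorted(observed_meta - observed_db))  # in metadata but not in the DB
print(sorted(fixed_meta - fixed_db))        # nothing left over
```

In this model the first ordering leaves the erratum in the published metadata although it is no longer associated with the repo. It does not explain why the real problem shows up only sporadically.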
Version-Release number of selected component (if applicable):
(newer Satellites probably affected the same)
Steps to Reproduce:
See above steps
Actual results:
CV content with broken dependencies and unwanted packages
Expected results:
CV content with "dependency closure" and without unwanted packages
Additional info:
- for the "versioned BaseOS repo", the _errata_ RHBA-2021:4431 is in published metadata, BUT NOT in mongo (repo_content_units)
- the device-mapper -177 version is in published metadata, not sure if in mongo (repo_content_units) - PROBABLY it is there
- a workaround of a force_full publish of the "versioned" repo followed by a force_full publish of the "Library" repo removes the errata from the repos' metadata, BUT keeps the package in (hence mongo probably keeps the association)
- I *think* the ordering of dynflow steps:
34: Actions::Pulp::Repository::DistributorPublish (success) [ 390.51s / 7.15s ]
39: Actions::Katello::Repository::PurgeEmptyContent (success) [ 1.53s / 0.49s ]
is wrong. We must call the Publish (of CV-versioned repo) _after_ PurgeEmptyContent
- BUT just the above change will probably not purge away the packages..? (as the workaround idea above failed)
Created redmine issue https://projects.theforeman.org/issues/34127 from this bug
I can reproduce very reliably now, on a few different sets of repos (every time BaseOS+App+Suppl(+some others)).
When I run *one* CV publish, I can reproduce easily. When I run *multiple* similar CV publishes, I can *not* reproduce.
That is an interesting clue (and a horrifying one - how can a parallel CV publish affect another one???). Concurrent CV publishes sound like a workaround (a very strange one).
Upstream bug assigned to aruzicka
I can *not* reproduce it on Sat6.9. Tried 50ish times, on "the customer reproducer setup", with "reordered" repos in the CV (cf https://bugzilla.redhat.com/show_bug.cgi?id=1917076#c16), no way.
Unless the reproducer also depends on some config (that is set on my 6.8 but not on my 6.9), or there is a tricky race condition elsewhere, it sounds like 6.9 fixes the bug.
I tried to spot a bugfix in the 6.9 or 6.9.z errata that might affect this, but none of the BZs sounds really applicable.
Most "applicable" BZs:
- this improves speed in 6.9, which can impact the expected concurrency bug. But alone, no real change in behaviour. Still maybe worth trying to patch 6.8 and test?
- Caps Sync speed improvement / similar to previous, if it ever applies to Sat
(In reply to Pavel Moravec from comment #9)
> I can *not* reproduce it on Sat6.9. Tried 50ish times, on "the customer
> reproducer setup", with "reordered" repos in the CV (cf
> https://bugzilla.redhat.com/show_bug.cgi?id=1917076#c16), no way.
We *can* reproduce on 6.9.6, but *different* errata and their content appear in the CV.
So far minimalistic reproducer (on 6.9.6):
- RHEL8 BaseOS + AppStream + Supplementary repos
- exclude errata issued after 2021-10-01
Publish the CV, and see e.g.:
zgrep -c "issued date=\"2021-1[0-2]" /var/lib/pulp/published/yum/master/yum_distributor/1-cv_03094967_BaseOS_App_Suppl-v10_0-a667af9c-7005-4ca0-9d4d-8d3434c51803/1639171203.43/repodata/43e53782a597d447acb29a7c6f7f00115f5662e8a68854b838af2f99aabb0d97-updateinfo.xml.gz
returns 20 errata issued after the cut-off date :-/
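To list *which* errata slipped through, rather than only counting lines, something like the sketch below might help. It assumes updateinfo.xml has the usual `<update>` entries with an `<id>` child and an `<issued date="...">` attribute starting with YYYY-MM-DD (the zgrep pattern above suggests that). `errata_issued_after` is my own helper name, and the sample file it builds is synthetic, not taken from a real repo:

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET
from datetime import date


def errata_issued_after(updateinfo_gz, cutoff):
    """Return ids of <update> entries issued strictly after `cutoff`."""
    late = []
    with gzip.open(updateinfo_gz, "rb") as fh:
        for _event, elem in ET.iterparse(fh, events=("end",)):
            if elem.tag != "update":
                continue
            issued = elem.find("issued")
            raw = issued.get("date", "") if issued is not None else ""
            try:
                # Assumption: the attribute starts with an ISO date.
                issued_on = date.fromisoformat(raw[:10])
            except ValueError:
                issued_on = None  # unexpected format: skip, do not guess
            if issued_on is not None and issued_on > cutoff:
                late.append(elem.findtext("id", default="?"))
            elem.clear()  # keep memory low on big updateinfo files
    return late


# Self-contained demo on a tiny synthetic file (ids are placeholders).
SAMPLE = (
    b"<updates>"
    b'<update><id>EXAMPLE-LATE</id><issued date="2021-11-05 00:00:00"/></update>'
    b'<update><id>EXAMPLE-OLD</id><issued date="2021-09-30 00:00:00"/></update>'
    b"</updates>"
)

with tempfile.TemporaryDirectory() as tmpdir:
    sample_path = os.path.join(tmpdir, "updateinfo.xml.gz")
    with gzip.open(sample_path, "wb") as out:
        out.write(SAMPLE)
    demo_result = errata_issued_after(sample_path, date(2021, 10, 1))

print(demo_result)
```

On a Satellite the function would be pointed at the real `*-updateinfo.xml.gz` under the published repo directory. Note that it compares strictly after the cut-off date, while the zgrep pattern above also matches errata issued on 2021-10-01 itself.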
Since 6.9 produces *fewer* errors than 6.8, some 6.8->6.9 improvement is assumed.
Attempting to backport to 6.8 the fixes of:
(particular patch applied: http://pastebin.test.redhat.com/1015289)
that I evaluated as the promising diff between 6.8 and 6.9; this has *not* helped at all (so far, in 1 test run).
I haven't touched libsolv, which differs between the versions (6.8 has 0.7.4-4, 6.9 has 0.7.17-1); that should be tried as well.
The original request turned out to be an "unsupported" CV set-up: demanding contradicting requirements (exclude filters vs. dependency solving that concurrently adds some stuff back) within one CV can lead to generally unpredictable results.
WebUI(*) should prevent using such "unsupported" combinations of settings, where we can't guarantee a proper outcome of the combination of selected features (include filters vs. exclude filters vs. dependency solving). This is requested as a HF over Sat6.8.
(thinking about technical implementation and user experience: it can be hard to "recalculate" the currently set-up configuration whenever the user modifies anything in WebUI - there are multiple places where we would have to run that check. Rather, we should add that check just before the publish or promote action, to warn (or strictly disallow?) the user publishing/promoting content for which the user requests a contradicting/unsupported set of requirements (e.g. exclude filters and depsolving). This approach will also allow the user to "play" with the CV settings like "now I enable depsolving and then remove exclude filters, before I publish".)
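A rough sketch of what such a "check just before publish" could look like, to make the idea concrete. Everything in it is hypothetical (`Filter`, `ContentView`, `publish_warnings` are names I made up, not Katello models or API), and it warns only on the one combination named above (exclude filters together with depsolving); which combinations really deserve a warning or a prohibition is still open:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Filter:
    """Hypothetical stand-in for a CV filter."""
    name: str
    inclusion: bool  # True = include filter, False = exclude filter


@dataclass
class ContentView:
    """Hypothetical stand-in for the CV settings relevant to the check."""
    name: str
    solve_dependencies: bool = False
    filters: List[Filter] = field(default_factory=list)


def publish_warnings(cv):
    """Return warnings to show right before publish (empty list = none)."""
    warnings = []
    excludes = [f.name for f in cv.filters if not f.inclusion]
    if cv.solve_dependencies and excludes:
        warnings.append(
            "Dependency solving is enabled together with exclude filter(s) "
            + ", ".join(excludes)
            + "; the published content may have broken dependencies."
        )
    return warnings


# The check runs once, at publish time, so intermediate edits are free.
cv = ContentView(
    name="example-cv",
    solve_dependencies=True,
    filters=[Filter("errata-after-date", inclusion=False)],
)
for message in publish_warnings(cv):
    print("WARNING:", message)

cv.filters.clear()  # user removes the exclude filter before publishing
print(publish_warnings(cv))
```

Running the check only at publish time means the user can toggle depsolving and filters in any order and is warned about the final state only.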
This HF work should be aligned with the doc change in the same area: https://bugzilla.redhat.com/show_bug.cgi?id=2034804
- should we _warn_ or totally disallow such setups? (I lean towards the warning, just in case a customer really wants to use (e.g. experiment with) some such setup)
- and what about hammer and API? Should the change be in WebUI only, or in our API that WebUI uses?
- is such warning/prohibition required also for CV promote? As that does not calculate a new content, it just copies some already existing stuff
I am changing BZ Subject to reflect this change of BZ scope.
I don't have strong preferences about which combinations should be strictly disallowed (if any) or which combinations should be followed by a warning message. Simply, the current status (barely any warning, no prohibition even when we expect dependency issues) is insufficient. Even after December's lesson of "exclude newest errata and enable depsolving => source of troubles", I don't feel confident enough to state when we shall warn and when/if to disallow something. I am leaving the particular decisions to more knowledgeable devels here.
I don't think I'll add the warning to the API, because the only way to do that would be with a --force option which would be a breaking API change.
Also, per your earlier question, we shouldn't need a warning for promotion either. Content isn't being copied there.
I'll be adding this extra warning first to upstream because dep solving cannot 100% guarantee a successful yum update even for Pulp 3.
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/34127 has been resolved.