Bug 2031096

Summary: Publishing a CV with depsolve=true and filter "exclude errata issued after date" randomly forgets to apply the filter
Product: Red Hat Satellite
Reporter: Pavel Moravec <pmoravec>
Component: Content Views
Assignee: Ian Ballou <iballou>
Status: CLOSED CURRENTRELEASE
QA Contact: Jameer Pathan <jpathan>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.8.0
CC: ahumbe, aruzicka, iballou, ktordeur, kurathod, pdwyer
Target Milestone: Unspecified
Keywords: PrioBumpGSS, Triaged
Target Release: Unused
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-12-29 06:03:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Pavel Moravec 2021-12-10 13:43:27 UTC
Description of problem:
Under unknown circumstances, and occurring _randomly_, the CV filter "exclude errata issued after date .." is ignored during CV publish when dependency solving is enabled.

Particular reproducer (sporadic occurrence, though):
Having a CV with:
- RHEL8 BaseOS
- RHEL8 AppStream
- RHEL8 Supplementary
(probably the more repos, the bigger the chance to reproduce)

Having dependency solving _enabled_ (this is crucial)

Then having two exclude filters:
- Exclude errata issued after 2021-10-01 (applied on all repos)
- Exclude puppet and puppet-agent packages (of all versions, from all repos)
  - this filter *might* be irrelevant

and publishing the CV, it *sporadically* leaves RHEL8.5 errata with *some* content in it. E.g.:
device-mapper-1.02.177-10.el8.x86_64
device-mapper-lib-1.02.177-10.el8.x86_64
from
RHBA-2021:4431 (issued on 2021-11-05)
are in, but e.g. device-mapper-event-1.02.177-10.el8.x86_64 (which depends on the above ones) is removed (which sounds correct).

Thus, CV content - despite dependency solving set - has inconsistent content with broken dependencies among packages.

The cause is that pulp repo for CV _version_ is published (DistributorPublish step) _before_ the unwanted content is unassociated (PurgeEmptyContent). And the "versioned" repo is then used for the "CV in Library" repo during that repo publish.

Anyway, the above sequence of steps happens every time, but we can reproduce the problem only sporadically, so there must be some other factor contributing to it (or just an unlucky race condition?)


Version-Release number of selected component (if applicable):
Sat 6.8.6
(newer Satellites probably affected the same)


How reproducible:
sporadically


Steps to Reproduce:
See above steps


Actual results:
CV content with broken dependencies and unwanted packages


Expected results:
CV content with "dependency closure" and without unwanted packages


Additional info:
- for the "versioned BaseOS repo", the _errata_ RHBA-2021:4431 is in published metadata, BUT NOT in mongo (repo_content_units)
- the device-mapper -177 version is in published metadata, not sure if in mongo (repo_content_units) - PROBABLY it is there

- a workaround of a force_full publish of the "versioned" repo followed by a force_full publish of the "Library" repo removes the errata from the repos' metadata, BUT keeps the package in (hence mongo probably keeps the association)

- I *think* ordering of dynflow steps:

34: Actions::Pulp::Repository::DistributorPublish (success) [ 390.51s / 7.15s ]
39: Actions::Katello::Repository::PurgeEmptyContent (success) [ 1.53s / 0.49s ]

is wrong. We must call the Publish (of CV-versioned repo) _after_ PurgeEmptyContent

- BUT just the above change will probably not purge away the packages..? (as the workaround idea above failed)
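
The ordering concern above can be illustrated with a toy model. This is only a sketch in Python, not Katello or Pulp code: the function names, the `is_unwanted` callback and the set-based "repository" are all made up for illustration, and it assumes that the publish step snapshots whatever is associated with the repo at that moment.

```python
def publish_then_purge(units, is_unwanted):
    """Toy model of the observed order: publish first, purge afterwards."""
    # Stand-in for DistributorPublish: metadata is a snapshot of current units.
    published_metadata = set(units)
    # Stand-in for PurgeEmptyContent: unwanted units are unassociated later.
    associated_units = {u for u in units if not is_unwanted(u)}
    return published_metadata, associated_units


def purge_then_publish(units, is_unwanted):
    """Toy model of the proposed order: purge first, publish afterwards."""
    associated_units = {u for u in units if not is_unwanted(u)}
    published_metadata = set(associated_units)
    return published_metadata, associated_units


if __name__ == "__main__":
    # "older-erratum" is a placeholder name, not a real erratum ID.
    units = {"RHBA-2021:4431", "older-erratum"}
    unwanted = lambda unit: unit == "RHBA-2021:4431"

    metadata, associated = publish_then_purge(units, unwanted)
    # Units present in the metadata but no longer associated with the repo.
    print(sorted(metadata - associated))

    metadata, associated = purge_then_publish(units, unwanted)
    print(sorted(metadata - associated))
```

In this model the first ordering leaves the erratum in the published metadata while it is gone from the associated units, which resembles the "in published metadata, BUT NOT in mongo" observation. The model does not explain why the problem shows up only sporadically, nor why packages stay associated.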

Comment 2 Adam Ruzicka 2021-12-10 13:47:30 UTC
Created redmine issue https://projects.theforeman.org/issues/34127 from this bug

Comment 6 Pavel Moravec 2021-12-11 21:28:25 UTC
I can reproduce very reliably now, on a few different sets of repos (every time BaseOS+App+Suppl(+some others)).

When I run *one* CV publish, I can reproduce easily. When I run *multiple* similar CV publishes, I can *not* reproduce.

That is an interesting clue (and a horrifying one: how can one parallel CV publish affect another???). Concurrent CV publishes sound like a workaround (a very strange one).

Comment 7 Bryan Kearney 2021-12-13 04:05:58 UTC
Upstream bug assigned to aruzicka

Comment 9 Pavel Moravec 2021-12-13 08:07:24 UTC
I can *not* reproduce it on Sat6.9. Tried 50ish times, on "the customer reproducer setup", with "reordered" repos in the CV (cf https://bugzilla.redhat.com/show_bug.cgi?id=1917076#c16), no way.

Unless the reproducer also depends on some config (that is set on my 6.8 but not on my 6.9), or there is a tricky race condition elsewhere, it seems that 6.9 fixes the bug.

I tried to spot a bugfix in 6.9 or 6.9.z errata that might affect this, but none of the BZs sounds really applicable.

Most "applicable" BZs:
- https://pulp.plan.io/issues/7898
  - this improves speed in 6.9, which can impact the suspected concurrency bug. But on its own, no real change in behaviour. Still maybe worth trying to patch 6.8 and test?
- https://bugzilla.redhat.com/show_bug.cgi?id=1952609
  - Caps Sync speed improvement / similar to previous, if it ever applies to Sat

Comment 13 Pavel Moravec 2021-12-13 15:39:41 UTC
(In reply to Pavel Moravec from comment #9)
> I can *not* reproduce it on Sat6.9. Tried 50ish times, on "the customer
> reproducer setup", with "reordered" repos in the CV (cf
> https://bugzilla.redhat.com/show_bug.cgi?id=1917076#c16), no way.

We *can* reproduce on 6.9.6, but *different* errata and their content appear in the CV.

Minimal reproducer so far (on 6.9.6):
- RHEL8 BaseOS + AppStream + Supplementary repos
- exclude errata issued after 2021-10-01

Publish the CV, and see e.g.:

zgrep -c "issued date=\"2021-1[0-2]" /var/lib/pulp/published/yum/master/yum_distributor/1-cv_03094967_BaseOS_App_Suppl-v10_0-a667af9c-7005-4ca0-9d4d-8d3434c51803/1639171203.43/repodata/43e53782a597d447acb29a7c6f7f00115f5662e8a68854b838af2f99aabb0d97-updateinfo.xml.gz

returns 20 errata issued after the cut-off date :-/
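
The same check can be scripted so that it lists the offending errata IDs instead of only counting matching lines. A minimal sketch in Python, assuming the updateinfo.xml layout implied by the zgrep pattern above (an `issued` element with a `date` attribute starting with YYYY-MM-DD inside each `update` element, next to an `id` element); the function name is made up for illustration.

```python
import gzip
import xml.etree.ElementTree as ET
from datetime import date


def errata_issued_after(updateinfo_gz, cutoff):
    """Return IDs of errata in an updateinfo.xml.gz issued after `cutoff`.

    `updateinfo_gz` may be a path or an open binary file object.
    """
    late = []
    with gzip.open(updateinfo_gz, "rb") as fh:
        # Stream the file; updateinfo can be large for RHEL repos.
        for _event, elem in ET.iterparse(fh, events=("end",)):
            if elem.tag != "update":
                continue
            issued = elem.find("issued")
            raw_date = issued.get("date") if issued is not None else None
            if raw_date:
                # Assumption: the attribute starts with an ISO date (YYYY-MM-DD).
                issued_on = date.fromisoformat(raw_date[:10])
                if issued_on > cutoff:
                    late.append(elem.findtext("id"))
            elem.clear()  # free memory of the processed <update>
    return late


if __name__ == "__main__":
    import sys

    # Usage: script.py <path to ...-updateinfo.xml.gz>
    for erratum_id in errata_issued_after(sys.argv[1], date(2021, 10, 1)):
        print(erratum_id)
```

Note that this compares strictly "after 2021-10-01", whereas the zgrep pattern also matches errata issued on 2021-10-01 itself, so the two counts may differ slightly.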

Comment 14 Pavel Moravec 2021-12-13 17:09:48 UTC
Since 6.9 produces *fewer* errors than 6.8, some 6.8->6.9 improvement is assumed.

I attempted to backport to 6.8 the fixes of:

https://projects.theforeman.org/issues/30827
https://projects.theforeman.org/issues/30828
https://bugzilla.redhat.com/show_bug.cgi?id=1952609
https://pulp.plan.io/issues/7898

(particular patch applied: http://pastebin.test.redhat.com/1015289)

which I evaluated as the promising differences between 6.8 and 6.9. This has *not* helped at all (so far, in 1 test run).


I haven't touched libsolv, which differs between the versions (6.8 has 0.7.4-4, 6.9 has 0.7.17-1); that should be tried as well.

Comment 16 Pavel Moravec 2021-12-22 11:05:51 UTC
The original request turned out to be an "unsupported" CV set-up: demanding contradicting requirements (exclude filters vs. dependency solving that concurrently adds content) within one CV can lead to generally unpredictable results.

The WebUI(*) should protect against using such "unsupported" combinations of settings, where we can't guarantee a proper outcome of the combination of selected features (include filters vs. exclude filters vs. dependency solving). This is requested as a HF over Sat6.8.

(Thinking about the technical implementation and user experience: it can be hard to "recalculate" the currently configured set-up whenever the user modifies anything in the WebUI, since there are multiple places where we would have to run that check. Instead, we should add the check just before the publish or promote action, to warn the user about (or strictly disallow?) publishing/promoting content for which a contradicting/unsupported set of requirements is requested (e.g. exclude filters and depsolving). This approach will also allow the user to "play" with the CV settings, like "now I enable depsolving and then remove exclude filters, before I publish".)
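
The "check just before publish" idea could look roughly like the following. This is a hypothetical sketch in Python, not Katello code (Katello itself is written in Ruby): the `ContentViewSettings` class, its fields and `publish_warnings` are invented names, and the only rule encoded is the combination named above (exclude filters together with depsolving).

```python
from dataclasses import dataclass, field


@dataclass
class ContentViewSettings:
    """Hypothetical, simplified view of a CV's publish-relevant settings."""
    solve_dependencies: bool
    exclude_filters: list = field(default_factory=list)
    include_filters: list = field(default_factory=list)


def publish_warnings(cv):
    """Return warnings to show right before publish (empty list if none)."""
    warnings = []
    if cv.solve_dependencies and cv.exclude_filters:
        warnings.append(
            "Dependency solving is enabled together with exclude filters; "
            "the published content may be unpredictable."
        )
    return warnings


if __name__ == "__main__":
    cv = ContentViewSettings(
        solve_dependencies=True,
        exclude_filters=["errata issued after 2021-10-01"],
    )
    # The caller decides whether to only warn or to refuse the publish.
    for message in publish_warnings(cv):
        print("WARNING:", message)
```

Running the check only at publish time keeps intermediate edits unrestricted: the user can enable depsolving first and remove the exclude filters afterwards without being interrupted in between.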

This HF work should be aligned with the doc change in the same area: https://bugzilla.redhat.com/show_bug.cgi?id=2034804

Unanswered questions:
- should we _warn_ or totally disallow such setups? (I lean towards the warning, just in case a customer really wants to use (e.g. experiment with) some setup)
- and what about hammer and the API? Should the change be in the WebUI only, or in the API that the WebUI uses?
- is such a warning/prohibition also required for CV promote? That does not calculate new content, it just copies already existing content


I am changing BZ Subject to reflect this change of BZ scope.

Comment 19 Pavel Moravec 2022-01-05 11:48:18 UTC
I don't have a strong preference for which combinations should be strictly disallowed (if any) and which should be followed by a warning message. Simply put, the current state (barely any warning, no prohibition even when we expect dependency issues) is insufficient. Even after December's lesson of "exclude newest errata and enable depsolving => source of trouble", I don't feel confident enough to state when we should warn and when/if to disallow something. I am leaving the particular decisions to the more knowledgeable developers here.

Comment 20 Ian Ballou 2022-01-05 16:39:21 UTC
I don't think I'll add the warning to the API, because the only way to do that would be with a --force option which would be a breaking API change.

Also, per your earlier question, we shouldn't need a warning for promotion as well.  Content isn't being copied there.

I'll be adding this extra warning first to upstream because dep solving cannot 100% guarantee a successful yum update even for Pulp 3.

Comment 22 Bryan Kearney 2022-01-20 20:05:12 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/34127 has been resolved.

Comment 27 Ian Ballou 2022-12-29 06:03:40 UTC
This issue is specific to Satellite 6.9 and below, which are now EOL. As such, I am closing this issue as fixed by the current release.