1004001 – [RFE] Allow whitelist/blacklist for filtering yum repo syncs


Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Satellite
Classification:	Red Hat
Component:	Pulp
Sub Component:
Version:	Unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	Unspecified
Assignee:	satellite6-bugs
QA Contact:	Katello QA List
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1157857
Depends On:	1003999
Blocks:	260381 754576 sat6-pulp-future 1293641

Reported:	2013-09-03 16:57 UTC by Mike McCune
Modified:	2019-12-16 04:26 UTC
CC List:	10 users
Fixed In Version:
Doc Type:	Enhancement
Doc Text:
Clone Of:	1003999
Environment:
Last Closed:	2016-10-06 12:27:10 UTC
Target Upstream Version:
Embargoed:

Description Mike McCune 2013-09-03 16:57:51 UTC

+++ This bug was initially created as a clone of Bug #1003999 +++

Filtering in Pulp currently only affects published Content Views; there is no way to apply a filtering algorithm to a yum repository feed.

There are often cases where repositories contain large sets of packages that are never used (architectures, subdirectories, etc.), e.g.:

http://linux.dell.com/repo/hardware/latest/platform_independent/rh60_64/

where the user may only want a certain subset of the large repo.

The customer should be able to specify a set of filters on the repo that prevent packages from being synced, thus saving disk space and sync time.

Comment 1 Christina Plummer 2013-10-22 13:53:04 UTC

Another example is Oracle - the public repos for Oracle Linux include source RPMs in the same directory as the binary/noarch RPMs, e.g.

http://public-yum.oracle.com/repo/OracleLinux/OL5/latest/x86_64/

When I synced this repo, almost half of the space consumed (about 47.5%) was from the source RPMs:

$ sudo du -Lks /var/lib/pulp/published/http/repos/ol5/x86_64/
7759976 /var/lib/pulp/published/http/repos/ol5/x86_64/
$ sudo du -Lks /var/lib/pulp/published/http/repos/ol5/x86_64/*.src.rpm | awk '{SUM+=$1} END{print SUM}'
3686144
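
For reference, the same measurement can be sketched in Python. This is a minimal illustration and not part of Pulp: the function name rpm_space_share is hypothetical, it walks subdirectories (the shell glob above only matches the top level), and it sums apparent file sizes rather than disk blocks, so its totals can differ slightly from du. The final lines redo the arithmetic on the two du figures quoted above, which works out to about 47.5%.

```python
import os


def rpm_space_share(repo_dir, suffix=".src.rpm"):
    """Return (total_bytes, matching_bytes) for files under repo_dir.

    Symlinks are followed, mirroring `du -L`. Unreadable or dangling
    entries are skipped.
    """
    total = matching = 0
    for root, _dirs, files in os.walk(repo_dir, followlinks=True):
        for name in files:
            try:
                size = os.path.getsize(os.path.join(root, name))
            except OSError:
                continue
            total += size
            if name.endswith(suffix):
                matching += size
    return total, matching


# Share implied by the du figures quoted above (du -k reports 1 KiB blocks).
total_kb, src_kb = 7759976, 3686144
print(f"{src_kb / total_kb:.1%}")  # prints 47.5%
```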

Comment 2 Christina Plummer 2013-10-22 14:50:02 UTC

The other use that I would like to see is for cloning local repos - i.e. "syncing" from one repo to another in order to create a "promote to production" process.

One model developed using Pulp v1 is described in a Usenix paper from 2011 [1]. This would enable "less risky" (i.e. the majority of) packages from an upstream distributor to be promoted automatically to the internal production repositories, but packages that require additional testing (e.g. kernel, or applications like mysql or httpd) can be filtered from the automatic sync. That way, the majority of updates can be pushed automatically to clients in a timely manner (or pulled via a standard "yum update"), while reducing the risk of introducing unexpected issues. This model enables target package sets to be managed in one place, in the repository itself, rather than through excludes on each individual client. For example, you might have 3 repositories:

"Live" = repo synced daily from upstream distributor
"Unstable" = repo synced daily from "Live", excluding $risky_pkgs
"Stable" = repo synced daily from "Unstable", excluding $risky_pkgs

$risky_pkgs would be manually promoted from Live -> Unstable -> Stable, for example weekly.
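
The three-repository model above can be sketched as follows. This is an illustrative sketch only and does not use Pulp's API: repositories are modeled as plain dicts mapping a package file name to a package name, and sync_repo, promote, RISKY_PKGS, and the sample file names are all hypothetical, with RISKY_PKGS standing in for $risky_pkgs.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for $risky_pkgs: shell-style name patterns.
RISKY_PKGS = ["kernel*", "mysql*", "httpd*"]


def is_risky(pkg_name, patterns=RISKY_PKGS):
    """True if the package name matches any of the exclude patterns."""
    return any(fnmatchcase(pkg_name, pattern) for pattern in patterns)


def sync_repo(source, target, exclude=RISKY_PKGS):
    """Automatic sync: copy every package whose name is not excluded."""
    for filename, pkg_name in source.items():
        if not is_risky(pkg_name, exclude):
            target[filename] = pkg_name


def promote(source, target, pkg_name):
    """Manual promotion: copy all builds of one package, ignoring excludes."""
    for filename, name in source.items():
        if name == pkg_name:
            target[filename] = name


# Made-up sample content for the upstream-facing repo.
live = {"bash-4.1.2-15.rpm": "bash", "kernel-2.6.32-431.rpm": "kernel"}
unstable, stable = {}, {}

sync_repo(live, unstable)           # daily: Live -> Unstable, minus risky
sync_repo(unstable, stable)         # daily: Unstable -> Stable, minus risky
promote(live, unstable, "kernel")   # e.g. weekly, once testing has started
```

After these calls, "Unstable" holds both packages while "Stable" holds only bash. Because the Unstable -> Stable sync applies the same exclude list, the kernel stays out of "Stable" on later automatic syncs until it is promoted by hand. A per-team policy would simply pass a different exclude list.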

Similarly, different teams might choose to set different policies (maybe they want all kernel updates as soon as they are available, but they want to make sure that Python gets tested with their custom app first), so their $risky_pkgs_teamA list would be different. It enables all teams within an organization to set their own policies while still inheriting certain organization-wide policy (e.g. that packages must be at least a day old before being installed anywhere).

It is not clear to me why the features of "cloning" and "sync filters" were removed between Pulp v1 and v2.

[1] https://www.usenix.org/legacy/events/lisa11/tech/full_papers/Pierre.pdf

Comment 3 Christina Plummer 2013-10-22 15:21:47 UTC

Based on the above use cases, I'd like to see sync filters for at least the following criteria:
 - content type (e.g. RPM, source RPM)
 - architecture (e.g. x86_64)
 - package name match (e.g. 'kernel*' or 'kmod*')
 - date package was added to the repo (e.g. "before 20131001" or "before -7days")

Thanks.
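
The four criteria requested above could be combined along these lines. This is a minimal sketch under stated assumptions, not an existing Pulp feature: the Package and SyncFilter classes, their field names, and the older_than helper are hypothetical, source RPMs are identified by an arch value of "src", and the sample packages are made up.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from fnmatch import fnmatchcase
from typing import Optional, Sequence


@dataclass(frozen=True)
class Package:
    name: str
    arch: str    # e.g. "x86_64", "noarch", or "src" for a source RPM
    added: date  # date the package was added to the upstream repo


@dataclass(frozen=True)
class SyncFilter:
    skip_source: bool = False               # content type
    arches: Optional[Sequence[str]] = None  # architecture allow-list
    exclude_names: Sequence[str] = ()       # package name patterns to skip
    added_before: Optional[date] = None     # only sync older packages

    def allows(self, pkg: Package) -> bool:
        """Return True if pkg should be synced under this filter."""
        if self.skip_source and pkg.arch == "src":
            return False
        if self.arches is not None and pkg.arch not in self.arches:
            return False
        if any(fnmatchcase(pkg.name, pat) for pat in self.exclude_names):
            return False
        if self.added_before is not None and pkg.added >= self.added_before:
            return False
        return True


def older_than(days: int, today: date) -> date:
    """Cutoff date for a relative rule such as "before -7days"."""
    return today - timedelta(days=days)


packages = [
    Package("kernel", "x86_64", date(2013, 9, 1)),
    Package("bash", "x86_64", date(2013, 9, 1)),
    Package("bash", "src", date(2013, 9, 1)),
    Package("zsh", "i386", date(2013, 9, 1)),
    Package("vim", "x86_64", date(2013, 10, 20)),
]

sync_filter = SyncFilter(
    skip_source=True,
    arches=("x86_64", "noarch"),
    exclude_names=("kernel*", "kmod*"),
    added_before=date(2013, 10, 1),  # "before 20131001"
)
to_sync = [pkg for pkg in packages if sync_filter.allows(pkg)]
```

With this sample data only the x86_64 build of bash passes all four checks. Each criterion is optional, so a repo that only needs to drop source RPMs would set skip_source and leave the rest at their defaults.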

Comment 4 Randy Barlow 2014-12-02 16:20:12 UTC

*** Bug 1157857 has been marked as a duplicate of this bug. ***

Comment 5 Andrea Giardini 2014-12-16 12:18:38 UTC

Can we settle this and decide how we want to implement it?
I'd like to see this in Pulp, since in my case it is a "blocking" feature.

Should we follow the old v1 approach? (Create a filter, link a filter to a repo)

Comment 6 Brian Bouterse 2015-02-19 01:11:50 UTC

Moved to https://pulp.plan.io/issues/206

Comment 7 RHEL Program Management 2015-03-04 11:24:27 UTC

Since this issue was entered in Red Hat Bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

Comment 8 RHEL Program Management 2015-03-18 19:23:21 UTC

Since this issue was entered in Red Hat Bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

Comment 9 pulp-infra@redhat.com 2015-10-08 14:01:17 UTC

The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

Comment 10 pulp-infra@redhat.com 2016-01-27 14:01:36 UTC

The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 11 pulp-infra@redhat.com 2016-01-28 18:01:37 UTC

The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
