Filtering in Pulp currently only affects published Content Views; there is no way to apply a filter to a yum repository feed. Repositories often contain large sets of packages that are never used (architectures, subdirectories, etc.), e.g. http://linux.dell.com/repo/hardware/latest/platform_independent/rh60_64/, where the user may want only a certain subset of a large repo. The customer should be able to specify a set of filters on the repo that prevent packages from being synced, thus saving disk space and sync time.
I'm interested in thoughts on how this might work. I imagine that some sort of regex matching on the package "name" could get us a lot of the way. The question is, how many regexes is it going to take? Are we interested in a scenario like the Dell repo, where there are tons of RPMs but we only want a few? Is it sufficient to default to syncing nothing when one or more filters are present, so that the user must explicitly match the packages they want? Or do we also need the ability to exclude matched packages? Either way, this is only likely to be useful when the user can get the behavior they want from a small number of simple filters. Thoughts?
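To make the two options concrete, here is a rough sketch of the semantics in question. It is not an existing Pulp API: the function name select_packages, its include/exclude parameters, and the package names in the usage note are all made up for illustration. With no include patterns everything is kept; with one or more include patterns the default becomes "none" and a name must match at least one; exclude patterns always drop matching names.

```python
import re


def select_packages(names, include=(), exclude=()):
    """Return the package names that pass the filters.

    Hypothetical semantics, for discussion only (not a Pulp API):
    - no include patterns: every package is a candidate;
    - one or more include patterns: default-none, a name must match one;
    - exclude patterns always remove matching names.
    """
    include_res = [re.compile(p) for p in include]
    exclude_res = [re.compile(p) for p in exclude]

    selected = []
    for name in names:
        # Default-none applies only when include filters are present.
        if include_res and not any(r.search(name) for r in include_res):
            continue
        # Excludes are applied after includes and always win.
        if any(r.search(name) for r in exclude_res):
            continue
        selected.append(name)
    return selected
```

For example, select_packages(names, include=[r"^dell-"], exclude=[r"-debuginfo$"]) would keep only names that start with "dell-" and do not end in "-debuginfo". Whether two short pattern lists like this are enough for a repo the size of the Dell one is exactly the open question.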
Since this issue was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.
*** This bug has been marked as a duplicate of bug 754576 ***