Pulp should save exactly one copy of each rpm in /var/lib/pulp/packages. The repos in /var/lib/pulp/repos symlink to the packages they use, so a package that is in multiple repos is only stored on disk once.

There seems to be an issue with this somewhere, because the space usage of /var/lib/pulp keeps growing. I also noticed that after a CDS sync, /var/lib/pulp on a CDS was smaller than on the RHUA, sometimes by as much as 150 GB (RHUA) vs 99 GB (CDS).

I did some research and found that the same rpm is saved multiple times in different locations under /var/lib/pulp/packages on the RHUA. Here's an example of a duplicated package:

[root@rhui2 packages]# pwd
/var/lib/pulp/packages
[root@rhui2 packages]# find condor/7.6.1/0.10.el6/src/
condor/7.6.1/0.10.el6/src/
condor/7.6.1/0.10.el6/src/6e2
condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
condor/7.6.1/0.10.el6/src/dc9
condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm
[root@rhui2 packages]# sha1sum condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0  condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
[root@rhui2 packages]# sha1sum condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm
6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0  condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm

Notice the sha1 sums are the same, but one copy ended up in a directory named 6e2 (correct, it matches the start of the sha1 sum) and the other in a directory named dc9 (incorrect). Potentially, this bug could cause the space usage of /var/lib/pulp to grow without bound.
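To see how widespread the duplication is, the manual find/sha1sum check above could be automated. The following is only a rough sketch, not part of Pulp: the helper names `file_digest` and `find_duplicate_rpms` are made up for illustration, and it assumes duplicates are regular files with identical content under the packages directory.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large RPMs are not read into memory at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_rpms(root):
    """Return {content_digest: [paths]} for every .rpm stored more than once under root."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks: only real files take up extra disk space.
            if name.endswith(".rpm") and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}


if __name__ == "__main__":
    # Path taken from the report above; adjust for other installations.
    for digest, paths in sorted(find_duplicate_rpms("/var/lib/pulp/packages").items()):
        print(digest)
        for path in paths:
            print("    " + path)
```

Each group printed is one rpm whose content is stored at more than one path, such as the 6e2 and dc9 copies of the condor source rpm.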
Created attachment 522700
grinder log

Attaching the saved grinder log that shows the download of the condor package I used in the example above.
Could you check whether dc9 is perhaps the start of the sha256sum for condor-7.6.1-0.10.el6.src.rpm? I suspect this repo was first synced with sha256 as the checksum type and then perhaps with sha1.
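One way to run that check, as a minimal sketch: hash the file with both algorithms and see which digest begins with the name of the directory it is stored in. The function name `digest_algorithms_matching_dir` is hypothetical, and the assumption that the directory name is a prefix of the hex digest comes from the 6e2 example, not from Pulp's code.

```python
import hashlib
import os


def digest_algorithms_matching_dir(rpm_path, algorithms=("sha1", "sha256")):
    """Return the algorithms whose hex digest of the file starts with the parent directory name."""
    prefix = os.path.basename(os.path.dirname(rpm_path))
    matches = []
    for algorithm in algorithms:
        digest = hashlib.new(algorithm)
        with open(rpm_path, "rb") as handle:
            # Read in 1 MiB chunks to keep memory use flat for large RPMs.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest().startswith(prefix):
            matches.append(algorithm)
    return matches


if __name__ == "__main__":
    # Path taken from the report above.
    path = "/var/lib/pulp/packages/condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm"
    if os.path.exists(path):
        print(digest_algorithms_matching_dir(path))
```

If the suspicion is right, the copy under dc9 should report sha256 and the copy under 6e2 should report sha1.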
So the issue here seems to be with the metadata on the CDN itself. The metadata for https://cdn.redhat.com/content/dist/rhel/rhui/server/6/6Server/i386/mrg-g/2.0/source/SRPMS/ is generated with a sha1 checksum, and the metadata for https://cdn.redhat.com/content/dist/rhel/rhui/server/6/6Server/x86_64/mrg-g/2.0/source/SRPMS/ is generated with sha256. This causes the packages to be duplicated on the filesystem, because the save path is different for each checksum.

i386:

<name>condor</name>
<arch>src</arch>
<version epoch="0" ver="7.6.1" rel="0.10.el6"/>
<checksum type="sha1" pkgid="YES">6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0</checksum>
<summary>Condor: High Throughput Computing</summary>

x86_64:

<name>condor</name>
<arch>src</arch>
<version epoch="0" ver="7.6.1" rel="0.10.el6"/>
<checksum type="sha256" pkgid="YES">dc97fb79fe1a5668ff31a2b2d67bc3261e2eb4e6e916ddde2dc7a1d8ee84aa14</checksum>
<summary>Condor: High Throughput Computing</summary>

$ find condor/7.6.1/0.10.el6/src/
condor/7.6.1/0.10.el6/src/
condor/7.6.1/0.10.el6/src/6e2
condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
condor/7.6.1/0.10.el6/src/dc9
condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm
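The effect can be illustrated with a small sketch. This is not Pulp's actual code: `package_save_path` is a hypothetical helper, and the layout (name/version/release/arch/first three checksum characters/filename) is inferred from the directory listing above.

```python
import posixpath


def package_save_path(name, version, release, arch, checksum, filename, prefix_len=3):
    """Build a relative save path that includes the first few characters of the metadata checksum."""
    return posixpath.join(name, version, release, arch, checksum[:prefix_len], filename)


# Checksums copied from the two metadata snippets above.
SHA1 = "6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0"
SHA256 = "dc97fb79fe1a5668ff31a2b2d67bc3261e2eb4e6e916ddde2dc7a1d8ee84aa14"
NEVRA = ("condor", "7.6.1", "0.10.el6", "src")
FILENAME = "condor-7.6.1-0.10.el6.src.rpm"

path_from_i386 = package_save_path(*NEVRA, SHA1, FILENAME)
path_from_x86_64 = package_save_path(*NEVRA, SHA256, FILENAME)

print(path_from_i386)    # condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
print(path_from_x86_64)  # condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm
```

Under that assumed layout, the same file gets two different save paths purely because the two repos advertise different checksum types, which matches the 6e2 and dc9 directories seen on disk.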
Based on the information above, closing this as not a bug.