Bug 737535 - Pulp saving multiple copies of the same rpm in /var/lib/pulp/packages
Alias: None
Product: Red Hat Update Infrastructure for Cloud Providers
Classification: Red Hat
Component: RHUA
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Jay Dobies
QA Contact: wes hayutin
Depends On:
Reported: 2011-09-12 13:09 UTC by James Slagle
Modified: 2011-09-12 16:15 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2011-09-12 16:15:51 UTC

Attachments
grinder log (5.44 MB, application/octet-stream)
2011-09-12 13:12 UTC, James Slagle

Description James Slagle 2011-09-12 13:09:52 UTC
Pulp should save exactly one copy of each rpm in /var/lib/pulp/packages. The repos in /var/lib/pulp/repos symlink to the packages they use, so a package that is in multiple repos is stored on disk only once.

There seems to be an issue with this somewhere: the space usage of /var/lib/pulp keeps growing. I also noticed that after a CDS sync, /var/lib/pulp on a CDS was smaller than on the RHUA, with the difference sometimes as large as 150 GB on the RHUA vs 99 GB on the CDS.

I did some research and found that the same rpm is saved multiple times, in different locations under /var/lib/pulp/packages on the RHUA.

Here's an example of a duplicated package:
[root@rhui2 packages]# pwd
[root@rhui2 packages]# find condor/7.6.1/0.10.el6/src/
[root@rhui2 packages]# sha1sum condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0  condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
[root@rhui2 packages]# sha1sum condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm
6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0  condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm

Notice that the sha1 sums are identical, but one copy ended up in a directory named 6e2 (correct, since it matches the first three characters of the sha1 sum) and the other in a directory named dc9 (incorrect).

Potentially, this bug could cause the space usage of /var/lib/pulp to grow without bound.
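
The manual sha1sum comparison above can be automated. What follows is only a minimal sketch, not part of Pulp or grinder: the function names `sha1_of_file` and `find_duplicate_rpms` are hypothetical, and it assumes that every regular `.rpm` file under the packages directory should have unique content.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex sha1 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_rpms(packages_root):
    """Map sha1 digest -> sorted list of paths, only for content stored more than once.

    Hypothetical helper: symlinks are skipped so that only real copies on disk count.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(packages_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not name.endswith(".rpm") or os.path.islink(full_path):
                continue
            by_digest[sha1_of_file(full_path)].append(full_path)
    return {
        digest: sorted(paths)
        for digest, paths in by_digest.items()
        if len(paths) > 1
    }
```

Calling `find_duplicate_rpms("/var/lib/pulp/packages")` on the RHUA would list each group of byte-identical rpms. Hashing every package is slow on a large tree, so this is meant as a one-off diagnostic.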

Comment 1 James Slagle 2011-09-12 13:12:28 UTC
Created attachment 522700
grinder log

Attaching the saved grinder log that shows the download of the condor package I used in the example above.

Comment 2 Pradeep Kilambi 2011-09-12 14:22:36 UTC
Could you check whether dc9 is perhaps the start of the sha256sum for condor-7.6.1-0.10.el6.src.rpm?

I suspect this repo was synced first with sha256 as the checksum type and then perhaps with sha1.
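
One way to run that check is sketched below. The helper name `checksum_prefixes` is hypothetical, and the three-character prefix length is an assumption taken from the 6e2 and dc9 directory names in the description.

```python
import hashlib


def checksum_prefixes(path, algorithms=("sha1", "sha256"), length=3):
    """Return {algorithm: first `length` hex characters of the file's digest}."""
    with open(path, "rb") as handle:
        data = handle.read()  # fine for a single rpm; chunk the read for very large files
    return {
        algo: hashlib.new(algo, data).hexdigest()[:length]
        for algo in algorithms
    }
```

If the `sha256` entry for the condor source rpm comes back as `dc9`, the second directory was created from a sha256 checksum rather than from a corrupted sha1.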

Comment 3 Pradeep Kilambi 2011-09-12 15:31:36 UTC
So the issue here seems to be with the metadata on the CDN itself.

The metadata for https://cdn.redhat.com/content/dist/rhel/rhui/server/6/6Server/i386/mrg-g/2.0/source/SRPMS/ is generated with a sha1 checksum, and

the metadata for https://cdn.redhat.com/content/dist/rhel/rhui/server/6/6Server/i386/mrg-g/2.0/source/SRPMS/ is generated with sha256. This causes the packages to be duplicated on the filesystem, because the save path is different for each checksum type.


  <version epoch="0" ver="7.6.1" rel="0.10.el6"/>
  <checksum type="sha1" pkgid="YES">6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0</checksum>
  <summary>Condor: High Throughput Computing</summary>


  <version epoch="0" ver="7.6.1" rel="0.10.el6"/>
  <checksum type="sha256" pkgid="YES">dc97fb79fe1a5668ff31a2b2d67bc3261e2eb4e6e916ddde2dc7a1d8ee84aa14</checksum>
  <summary>Condor: High Throughput Computing</summary>

$ find condor/7.6.1/0.10.el6/src/condor/7.6.1/0.10.el6/src/
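
To illustrate why the two metadata entries above lead to two copies, here is a rough sketch of the save-path layout as it appears in this report (name/version/release/arch/first three checksum characters/filename). The function `package_save_path` is hypothetical and is inferred from the paths shown in the description, not taken from Pulp's source.

```python
import posixpath


def package_save_path(name, version, release, arch, checksum, filename):
    """Build a relative save path using the first three characters of the checksum.

    Assumed layout, inferred from the paths shown in this report.
    """
    return posixpath.join(name, version, release, arch, checksum[:3], filename)


SHA1 = "6e2a74b19317e2e950d44ebc9f32cdd1a85ca0f0"
SHA256 = "dc97fb79fe1a5668ff31a2b2d67bc3261e2eb4e6e916ddde2dc7a1d8ee84aa14"
RPM = "condor-7.6.1-0.10.el6.src.rpm"

# The same package, described with two checksum types, maps to two directories.
print(package_save_path("condor", "7.6.1", "0.10.el6", "src", SHA1, RPM))
# condor/7.6.1/0.10.el6/src/6e2/condor-7.6.1-0.10.el6.src.rpm
print(package_save_path("condor", "7.6.1", "0.10.el6", "src", SHA256, RPM))
# condor/7.6.1/0.10.el6/src/dc9/condor-7.6.1-0.10.el6.src.rpm
```

Because the checksum value is part of the path, the rpm is byte-identical in both locations while each path is still consistent with the metadata it was synced from.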

Comment 4 Pradeep Kilambi 2011-09-12 16:15:51 UTC
Based on the info above, closing this as not a bug.
