Bug 753940

Summary: inconsistent metadata can cause orphaned packages during sync
Product: [Retired] Pulp
Component: user-experience
Reporter: Pradeep Kilambi <pkilambi>
Assignee: Pradeep Kilambi <pkilambi>
QA Contact: Preethi Thomas <pthomas>
CC: cstpierr, jason.dobies, jmatthew, skarmark
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Keywords: Triaged
Doc Type: Bug Fix
Last Closed: 2012-11-28 23:27:10 UTC

Description Pradeep Kilambi 2011-11-14 22:00:42 UTC
Description of problem:

If the source metadata contains inconsistent information, a sync can orphan packages. For example, in the current case the metadata gives the package arch as i386, but the location href says noarch, and this causes the package to be orphaned.

$ grinder yum -U http://downloads.linux.hp.com/SDR/downloads/ProLiantSupportPack/RedHat/6/x86_64/current/ --label /tmp/testrepo1
...
grinder.RepoFetch: INFO     Cleaning any orphaned packages..
grinder.RepoFetch: INFO     Removing orphan package: /tmp/testrepo1/hp-snmp-agents-8.7.0.23-17.x86_64.rpm
grinder.RepoFetch: INFO     Removing orphan package: /tmp/testrepo1/hp-health-8.7.0.22-17.x86_64.rpm
grinder.RepoFetch: INFO     Removing orphan package: /tmp/testrepo1/hpdiags-8.7.2-7.x86_64.rpm
grinder.RepoFetch: INFO     Removing orphan package: /tmp/testrepo1/cpqacuxe-8.70-9.0.i386.rpm
grinder.RepoFetch: INFO     Removing orphan package: /tmp/testrepo1/hpvca-6.3.0-8.i386.rpm
grinder.RepoFetch: INFO     Removing orphan package: /tmp/testrepo1/hpacucli-8.70-8.0.i386.rpm


but the metadata shows the following:

<package type="rpm">
  <name>cpqacuxe</name>
  <arch>i386</arch>
  <version epoch="0" ver="8.70" rel="9.0"/>
  <checksum type="sha" pkgid="YES">73a26e086f02f241f60f8c526ef37dcfe3f86d0e</checksum>
  <summary>HP Array Configuration Utility</summary>
  <description>The HP Array Configuration Utility is the web-based disk array
configuration program for Array Controllers.</description>
  <packager>Hewlett-Packard Company</packager>
  <url>http://www.hp.com/linux</url>
  <time file="1309362489" build="1296625315"/>
  <size package="4856021" installed="14197723" archive="14243296"/>
  <location href="cpqacuxe-8.70-9.0.noarch.rpm"/>
  <format>
    <rpm:license>See cpqacuxe.license</rpm:license>
    <rpm:vendor>Hewlett-Packard Company</rpm:vendor>
    <rpm:group>Applications/System</rpm:group>
    <rpm:buildhost>Prowl</rpm:buildhost>
    <rpm:sourcerpm>cpqacuxe-8.70-9.0.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="456" end="47917"/>
    <rpm:provides>
      <rpm:entry name="cpqacuxe" flags="EQ" epoch="0" ver="8.70" rel="9.0"/>
      <rpm:entry name="hpacu.so"/>
      <rpm:entry name="libcpqimgr.so"/>
    </rpm:provides>
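
A mismatch like the one above can be spotted before syncing by comparing each package's <arch> element with the arch embedded in its location href. The following is only a minimal sketch, not part of grinder or Pulp: the function name find_arch_mismatches and the trimmed SAMPLE document are made up for illustration, and it assumes primary.xml uses the usual http://linux.duke.edu/metadata/common namespace and that file names follow the name-version-release.arch.rpm convention.

```python
import posixpath
import xml.etree.ElementTree as ET

# Assumption: the default namespace normally declared on the <metadata> root
# of a yum primary.xml file.
COMMON_NS = "{http://linux.duke.edu/metadata/common}"


def find_arch_mismatches(primary_xml):
    """Return (name, declared_arch, href) for packages whose href arch differs from <arch>."""
    root = ET.fromstring(primary_xml)
    mismatches = []
    for pkg in root.iter(COMMON_NS + "package"):
        name = pkg.findtext(COMMON_NS + "name")
        arch = pkg.findtext(COMMON_NS + "arch")
        location = pkg.find(COMMON_NS + "location")
        if arch is None or location is None:
            continue
        href = location.get("href", "")
        # Assumption: file names look like name-version-release.arch.rpm, so the
        # arch is the second-to-last dot-separated field of the base name.
        parts = posixpath.basename(href).split(".")
        if len(parts) < 3 or parts[-1] != "rpm":
            continue  # not an RPM-style file name; nothing to compare
        if parts[-2] != arch:
            mismatches.append((name, arch, href))
    return mismatches


# Hypothetical, trimmed-down version of the entry quoted in this report.
SAMPLE = """<metadata xmlns="http://linux.duke.edu/metadata/common" packages="1">
<package type="rpm">
  <name>cpqacuxe</name>
  <arch>i386</arch>
  <version epoch="0" ver="8.70" rel="9.0"/>
  <location href="cpqacuxe-8.70-9.0.noarch.rpm"/>
</package>
</metadata>"""

print(find_arch_mismatches(SAMPLE))
```

For the sample entry this reports cpqacuxe, since the metadata declares i386 while the href ends in noarch.rpm.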

Expected:
All packages should be downloaded and preserved.

Comment 1 Pradeep Kilambi 2011-11-30 18:41:40 UTC
commit faa3c7102c687c2aa5dc8d700bc4cd1a1bcc6639

Comment 2 Jeff Ortel 2011-12-03 00:00:48 UTC
build: 0.254.

Comment 3 Preethi Thomas 2011-12-15 18:48:34 UTC
verified
[root@preethi ~]# rpm -q pulp
pulp-0.0.254-7.fc15.noarch


[root@preethi ~]# pulp-admin repo create --id=753940 --feed=http://downloads.linux.hp.com/SDR/downloads/ProLiantSupportPack/RedHat/6/x86_64/current/
Successfully created repository [ 753940 ]

[root@preethi ~]# pulp-admin repo sync --id=753940 -F
Sync for repository 753940 started
Sync: Finished
18/18 new items downloaded
0/18 existing items processed

Item Details: 
RPMs: 18/18

[root@preethi ~]# pulp-admin content list --orphaned
No orphaned content on server

Comment 4 Chris St. Pierre 2011-12-16 15:50:59 UTC
This does not appear to have fixed the issue.  Unfortunately, I'm not sure listing orphaned content is a valid test.

% pulp-admin repo list | grep -A 3 generic-6-x86_64-hp
Id                      generic-6-x86_64-hp      
Name                    Generic 6 x86_64 - HP    
Repo URL                https://mirror2.example.com/pulp/repos/generic-6-x86_64-hp/
Feed URL                http://downloads.linux.hp.com/SDR/downloads/ProLiantSupportPack/RedHat/6/x86_64/current/
Feed Type               remote                   
Content Type            yum

% pulp-admin repo sync --id=generic-6-x86_64-hp -F
Sync for repository generic-6-x86_64-hp started
Sync: Finished
0/18 new items downloaded
18/18 existing items processed

Item Details: 
RPMs: 18/18

% pulp-admin repo status --id=generic-6-x86_64-hp   
+------------------------------------------+
       Status for generic-6-x86_64-hp
+------------------------------------------+
Repository: generic-6-x86_64-hp
Number of Packages: 12
Last Sync: 2011-12-16 10:46:36-05:00

% rpm -q pulp                                                
pulp-0.0.254-8.el6.noarch

So if it's not orphaning the packages that's swell, but I'm still winding up with fewer packages in Pulp than were in the upstream repo.

Comment 5 Pradeep Kilambi 2011-12-16 18:06:33 UTC
the packages are definitely not orphaned

$ ls -l /var/lib/pulp/repos/SDR/downloads/ProLiantSupportPack/RedHat/6/x86_64/current/*.rpm |wc -l
18

So grinder is definitely working as expected. Pulp, though, probably has an issue; I'll need to look into what Pulp is doing at the import stage.

Will keep you posted.

Comment 6 Pradeep Kilambi 2011-12-16 18:45:56 UTC
This is just due to poorly constructed metadata from the URL you're using. Pulp is doing things just fine; it's the mismatch between the package information and the href in primary.xml that throws things off at the Pulp level.

The fix I put in grinder prevents the packages from being orphaned, so we're one step ahead. I'll see if there is anything I can do on the Pulp side to work around it. Pulp is just trying to import the metadata downloaded from the repo, and it's just not matching.

Comment 7 Jay Dobies 2012-11-28 23:27:10 UTC
Orphans are handled totally differently in 2.x. Orphan status no longer depends on metadata on the filesystem at all; it's determined from database knowledge of whether the unit is associated with one or more repos. We have no plans for another 1.x release, so I'm closing this bug out.
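
The association-based idea described in this comment can be sketched roughly as below. This is not Pulp 2.x code and does not reflect its actual schema or API; find_orphans, the unit ids, and the repo names are all hypothetical, chosen only to show that a unit counts as an orphan when no repo references it.

```python
def find_orphans(all_unit_ids, repo_associations):
    """Return the ids of units that no repository is associated with."""
    associated = set()
    for unit_ids in repo_associations.values():
        associated.update(unit_ids)
    # A unit is an orphan when it appears in no repo's association set.
    return sorted(set(all_unit_ids) - associated)


# Hypothetical stand-ins for database records.
units = ["cpqacuxe-8.70-9.0", "hpvca-6.3.0-8", "hpacucli-8.70-8.0"]
associations = {
    "repo-a": {"cpqacuxe-8.70-9.0"},
    "repo-b": {"cpqacuxe-8.70-9.0", "hpvca-6.3.0-8"},
}

print(find_orphans(units, associations))
```

Here only hpacucli-8.70-8.0 is reported, because neither repo references it; what the files are called on disk plays no part in the decision.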