Bug 885264 - Repos with an existing xml:base cause pulp to generate bad repodata
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Pulp
Classification: Retired
Component: user-experience
Version: 2.0.6
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: John Matthews
QA Contact: Preethi Thomas
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-12-07 23:58 UTC by Steven Roberts
Modified: 2013-09-09 16:27 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-01-09 17:04:59 UTC
Embargoed:


Description Steven Roberts 2012-12-07 23:58:19 UTC
Description of problem:
If you have a repo feed (RHN and puppetlabs both do this) that specifies xml:base in the location element, that base is included in the published repodata.

This causes a pulp-consumer to try to pull the content from the source instead of from the pulp server.
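
A rough Python sketch of why a leftover xml:base redirects the consumer. This is a simplified model, not yum's actual code; the resolve_package_url helper, the host names, and the package file name are all made up for illustration.

```python
from urllib.parse import urljoin


def resolve_package_url(repo_baseurl, href, xml_base=None):
    """Return the URL a client would fetch for a <location> entry.

    Simplified model (an assumption, not yum's real logic): when the
    location element carries an xml:base, that base wins over the
    baseurl taken from the consumer's .repo file.
    """
    base = xml_base if xml_base else repo_baseurl
    # treat the base as a directory so urljoin appends instead of replacing
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, href)


# hypothetical hosts and package name, for illustration only
pulp = "https://pulp.example.com/pulp/repos/el6/"
upstream = "https://upstream.example.com/el6/"

print(resolve_package_url(pulp, "example-1.0-1.noarch.rpm"))
print(resolve_package_url(pulp, "example-1.0-1.noarch.rpm", xml_base=upstream))
```

With no xml:base the package is fetched relative to the pulp baseurl; with one, the request goes to the upstream host, which is where the consumer hits the cert error.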

Version-Release number of selected component (if applicable):
Pulp: 2.0.6-0.13.beta
Grinder: 0.1.9

How reproducible:
always

Steps to Reproduce:
1. sync a repo from RHN CDN (like rhels6)
2. have a consumer bind to it
3. try to install an rpm
  
Actual results:
get a cert error as the consumer doesn't have an RHN CDN cert

Expected results:
install the rpm from the pulp server

Additional info:

Comment 1 Pradeep Kilambi 2012-12-08 00:06:04 UTC
fixed! commit 9a71a64d69628f7e594f092d6c4f479fdfeafbce

Yum dumps the xml data with the base url from metadata or uses the download url directly. This causes issues in pulp as pulp shares a package between multiple repos and we can't have one base url for it. Override this to None when it's preserved in pulp and let yum handle constructing the download url from the .repo file.

Steven applied this patch and tested the workflow on his end as well.

Comment 2 Pradeep Kilambi 2012-12-08 00:11:13 UTC
tagged: grinder-0.1.10-1

Comment 3 Steven Roberts 2012-12-08 01:11:52 UTC
used the puppetlabs el6 products repo: http://yum.puppetlabs.com/el/6/products/x86_64

- applied the patch
- restarted httpd 
- delete'd the repo
- purged the rpm's for it in /var/lib/pulp/content just to be sure
- create'd the repo again
- purged the yum cache on the consumer
- re-bound the consumer to the repo (since it was removed with the delete)
- did a manual yum install of facter.  confirmed pulled from the pulp server by looking in the pulp server's httpd logs
- rpm -e facter, purged the yum cache on the consumer
- did the install via 'pulp-admin rpm consumer'.  confirmed pulled from the pulp server by looking in the pulp server's httpd logs

Comment 4 Jeff Ortel 2012-12-10 16:22:28 UTC
pulp: 2.0.6-0.14.beta
grinder: 0.1.10

Comment 5 Steven Roberts 2012-12-10 22:00:32 UTC
did a yum update.  confirmed the patch is in the new grinder and looks good.

FYI, I tossed together a quick and dirty script to patch the current location xml:base references.  seems to have worked.  My first mongo interaction scripting, so a bit messy as I was learning the basics of its API.  But it seems to have done the trick

===================================
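#! /usr/bin/perl
# Blank out xml:base in the stored primary XML snippet of every RPM unit.
# Usage: pass the pulp database name as the first argument.

use strict;
use warnings;
use MongoDB;

my $client = MongoDB::MongoClient->new();
#my $db = $client->get_database('pulp_database');
my $db = $client->get_database(shift);
my $col = $db->get_collection('units_rpm');

# walk every rpm unit in the collection
my $cur = $col->find();
#$cur->limit(20);

my $i = 0;
while (my $objRef = $cur->next()) {
   my $id = $objRef->{'_id'};
   print $i++,qq(   id => $id\n);
   my $primary = $objRef->{'repodata'}{'primary'};
   # skip units that have no primary snippet stored
   next unless defined $primary;
   # empty the xml:base value; write back only if something changed
   if ($primary =~ s/(location xml:base)="[^"]+"/$1=""/m) {
      $objRef->{'repodata'}{'primary'} = $primary;
      $col->update({'_id' => $id}, $objRef);
   }
}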

Comment 6 Steven Roberts 2012-12-11 08:09:45 UTC
just got some weird behavior on the xml:base field when pulling down rhels6 "optional" packages feed.  it ended up with:
xml:base=" /Packages"
(puppet3 from the puppetlabs rpms needs rubygems RPM which is in the optional's feed on RHN for rhels6)

I'm having trouble getting a cdn cert to work from my laptop (at home) so haven't been able to see what is in the upstream primary.xml yet.  I'll take a look tomorrow when I get into the office.

Comment 7 Steven Roberts 2012-12-11 18:24:57 UTC
The upstream primary.xml has location tags like this in it:
<location href="Packages/389-ds-base-devel-1.2.10.2-15.el6.i686.rpm"/>

they get converted to this in pulp (using latest code from beta):
<location xml:base=" /Packages" href="389-ds-base-devel-1.2.10.2-15.el6.i686.rpm"/>
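
One way to spot affected entries in a published primary.xml is to parse it and list the location elements that carry xml:base. The sketch below is only an illustration: the find_xml_base_locations helper is hypothetical, and the sample document is trimmed down to the two location forms quoted in this bug.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved xml: prefix to this namespace
XML_BASE_ATTR = "{http://www.w3.org/XML/1998/namespace}base"


def find_xml_base_locations(primary_xml_text):
    """Return (href, xml:base) pairs for location elements that carry xml:base."""
    root = ET.fromstring(primary_xml_text)
    hits = []
    for elem in root.iter():
        # tags are namespace-qualified, e.g. "{ns}location"
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "location" and XML_BASE_ATTR in elem.attrib:
            hits.append((elem.get("href"), elem.get(XML_BASE_ATTR)))
    return hits


# trimmed sample: real primary.xml packages carry many more child elements
sample = """<metadata xmlns="http://linux.duke.edu/metadata/common" packages="2">
  <package type="rpm">
    <location xml:base=" /Packages" href="389-ds-base-devel-1.2.10.2-15.el6.i686.rpm"/>
  </package>
  <package type="rpm">
    <location href="ruby-augeas-0.4.1-1.el6.x86_64.rpm"/>
  </package>
</metadata>"""

for href, base in find_xml_base_locations(sample):
    print(href, repr(base))
```

An empty result means no location element in the file carries xml:base.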

Comment 8 Preethi Thomas 2012-12-11 22:30:43 UTC
Fails_qa

[root@preethi-el6-pulp ~]# rpm -q pulp-server
pulp-server-2.0.6-0.14.beta.noarch
[root@preethi-el6-pulp ~]# 


I synced 2 rhel6 & rhel5 repos
bound a rhel5 & rhel 6 client to the repos

On the clients

When I do a repo list I can see that there are only 500 packages in the published repo.

Also, package install failed with a package not found error.


On my rhel6 client

[root@candidate-client ~]# yum repolist
Loaded plugins: product-id, subscription-manager

repo id                             repo name                             status
el6-optional                        el6-optional                          142
pulp-v2-candidate                   Pulp v2 Beta Builds                    36
rhel6_3                             rhel6_3                               500
repolist: 678
[root@candidate-client ~]# 


[root@rhel5-pulp ~]# yum repolist
repo id                                   repo name                       status
el6-optional                                             | 2.3 kB     00:00     
rhel5                                                    | 3.5 kB     00:00     
repo id                                   repo name                       status
el6-optional                              el6-optional                       142
epel                                      Extra Packages for Enterprise L  7,226
pulp-v2-candidate                         Pulp v2 Beta Builds                 22
rhel-x86_64-server-5                      Red Hat Enterprise Linux (v. 5  14,202
rhel-x86_64-server-5-mrg-messaging-1      MRG Messaging v. 1 (for RHEL 5     242
rhel-x86_64-server-5-mrg-messaging-base-1 MRG Messaging Base v. 1 (for RH    151
rhel5                                     rhel5                              500
repolist: 22,485
[root@rhel5-pulp ~]#

Comment 9 Steven Roberts 2012-12-12 03:59:46 UTC
did some debugging and more testing with Preethi on this (coordinating over IRC).

looks like there are at least two types of repodata inputs that we are hitting:
1) href has a plain package name in it like puppetlabs:
<location href="ruby-augeas-0.4.1-1.el6.x86_64.rpm"/>

the patch in comment #1 handles this case and is what he had tried during dev testing since the puppet repo isn't nearly as large as the RHN ones are.

2) href includes a relative path, like from RHN:
<location href="Packages/389-ds-base-devel-1.2.10.2-15.el6.i686.rpm"/>
as noted in comment #7

both tests were done against a fresh pulp-server install.  latest beta, so:
pulp-server: 2.0.6-0.14.beta
grinder: 0.1.10-1.el6

I'm a novice when it comes to python, but I still may take a stab at looking at the code to see if I can see what is going on.

Comment 10 Steven Roberts 2012-12-12 18:01:18 UTC
FYI, I changed my perl fixup script regex to:
if ($primary =~ s/(location) xml:base="[^"]*"/$1/) {

basically it just nukes the xml:base from the mongodb.  that seems to be giving me good publish results.
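
For readers more at home in Python, the same substitution can be sketched with the re module. This is an illustration only, not part of the script above: the strip_xml_base name is made up, and unlike the Perl one-shot substitution it replaces every match in the string. Like the Perl regex, it assumes xml:base is the first attribute of the location tag, as in the samples in this bug.

```python
import re

# matches the xml:base attribute (and the whitespace before it) in a location tag
XML_BASE_RE = re.compile(r'(<location)\s+xml:base="[^"]*"')


def strip_xml_base(primary_xml):
    """Drop xml:base from every <location> tag in a primary XML snippet."""
    return XML_BASE_RE.sub(r"\1", primary_xml)


snippet = '<location xml:base=" /Packages" href="rpm-apidocs-4.8.0-32.el6.noarch.rpm"/>'
print(strip_xml_base(snippet))
```

Tags without xml:base pass through unchanged, so running it again on already-cleaned data is harmless.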

Comment 11 John Matthews 2012-12-12 22:03:24 UTC
comment #8 was related to an error with pagination which broke all publishing of repos with more than 500 packages.

Comment 12 John Matthews 2012-12-12 22:52:27 UTC
Steven,

I replicated the issue:
1) Created a pulp repo with feed
 https://cdn.redhat.com/content/beta/rhel/rhui/server/6/6Server/x86_64/optional/os/ 

2) Sync'd content
3) Created a dummy yum repo and tried to install a package from it.


# cat /etc/yum.repos.d/jwm.repo 
[jwm]
name=jwm
baseurl=https://127.0.0.1/pulp/repos/content/beta/rhel/rhui/server/6/6Server/x86_64/optional/os/
enabled=1
skip_if_unavailable=1
gpgcheck=0
sslverify=0


# yum --disablerepo=* --enablerepo="jwm" install rpm-apidocs
Loaded plugins: security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package rpm-apidocs.noarch 0:4.8.0-32.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=================================================================================================================================================================================================================================
 Package                                                  Arch                                                Version                                                     Repository                                        Size
=================================================================================================================================================================================================================================
Installing:
 rpm-apidocs                                              noarch                                              4.8.0-32.el6                                                jwm                                              1.4 M

Transaction Summary
=================================================================================================================================================================================================================================
Install       1 Package(s)

Total download size: 1.4 M
Installed size: 5.6 M
Is this ok [y/N]: y
Downloading Packages:


Error Downloading Packages:
  rpm-apidocs-4.8.0-32.el6.noarch: failed to retrieve rpm-apidocs-4.8.0-32.el6.noarch.rpm from jwm
error was [Errno 2] Local file does not exist: /root/ /Packages/rpm-apidocs-4.8.0-32.el6.noarch.rpm


<package type="rpm">
  <name>rpm-apidocs</name>
  <arch>noarch</arch>
  <version epoch="0" ver="4.8.0" rel="32.el6"/>
  <checksum type="sha" pkgid="YES">942c32494bfc268ba7f209ae9721ffdbfcf9d562</checksum>
  <summary>API documentation for RPM libraries</summary>
  <description>This package contains API documentation for developing applications
that will manipulate RPM packages and databases.</description>
  <packager>Red Hat, Inc. &lt;http://bugzilla.redhat.com/bugzilla&gt;</packager>
  <url>http://www.rpm.org/</url>
  <time file="1353351362" build="1352989518"/>
  <size package="1429468" installed="5827210" archive="5899672"/>
<location xml:base=" /Packages" href="rpm-apidocs-4.8.0-32.el6.noarch.rpm"/>
  <format>
    <rpm:license>GPLv2+</rpm:license>
    <rpm:vendor>Red Hat, Inc.</rpm:vendor>
    <rpm:group>Documentation</rpm:group>
    <rpm:buildhost>ppc-004.build.bos.redhat.com</rpm:buildhost>
    <rpm:sourcerpm>rpm-4.8.0-32.el6.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="1384" end="88968"/>
    <rpm:provides>
      <rpm:entry name="rpm-apidocs" flags="EQ" epoch="0" ver="4.8.0" rel="32.el6"/>
    </rpm:provides>
  </format>
</package>

Comment 13 John Matthews 2012-12-12 22:56:17 UTC
To confirm, file is present and available on pulp server.
# wget --no-check-certificate https://127.0.0.1/pulp/repos/content/beta/rhel/rhui/server/6/6Server/x86_64/optional/os/rpm-apidocs-4.8.0-32.el6.noarch.rpm
--2012-12-12 14:58:26--  https://127.0.0.1/pulp/repos/content/beta/rhel/rhui/server/6/6Server/x86_64/optional/os/rpm-apidocs-4.8.0-32.el6.noarch.rpm
Connecting to 127.0.0.1:443... connected.
WARNING: cannot verify 127.0.0.1’s certificate, issued by “/C=--/ST=SomeState/L=SomeCity/O=SomeOrganization/OU=SomeOrganizationalUnit/CN=preethi-el6-pulp.usersys.redhat.com/emailAddress=root.redhat.com”:
  Self-signed certificate encountered.
WARNING: certificate common name “preethi-el6-pulp.usersys.redhat.com” doesn’t match requested host name “127.0.0.1”.
HTTP request sent, awaiting response... 200 OK
Length: 1429468 (1.4M) [application/x-rpm]
Saving to: “rpm-apidocs-4.8.0-32.el6.noarch.rpm”

100%[=======================================================================================================================================================================================>] 1,429,468   --.-K/s   in 0.1s    

2012-12-12 14:58:26 (10.5 MB/s) - “rpm-apidocs-4.8.0-32.el6.noarch.rpm” saved [1429468/1429468]


I will look into the xml:base issue and see if we can get a patch for it tomorrow.

Comment 14 John Matthews 2012-12-13 20:47:16 UTC
Below patch will fix this issue:
http://git.fedorahosted.org/cgit/grinder.git/commit/?id=435f7f081d51b26d024c047ffbd83322a68095e8

This is my understanding of the problem.
We recently moved from using 'createrepo' to generate metadata to using "snippets" of XML generated by yum.

When yum generates the primary_xml snippet it constructs the <location> tag in a manner different than createrepo.  Yum pays attention to where the RPM is being transferred from and preserves that information in the location tag.

I attempted to override a few attributes of the yum package object, which is a YumAvailablePackageSqlite instance.  This class mixes in behavior of YumAvailablePackage yet does not allow modifying "_remote_url()", which was needed to change how <location> gets generated.

The fix we have is:
 1) Use a simple string substitution after getting the primary xml snippet from yum
 2) Modify <location /> to only have a 'href' value
 3) Set the href value to just the RPM name
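
The three steps above can be sketched roughly as follows. This is not the actual grinder patch (see the commit link for that); the normalize_location name and both regexes are assumptions made for illustration.

```python
import posixpath
import re

# a whole self-closing <location .../> element, whatever attributes it carries
LOCATION_RE = re.compile(r"<location\s[^>]*?/>")
HREF_RE = re.compile(r'href="([^"]*)"')


def normalize_location(primary_xml):
    """Rewrite each <location/> so it has only an href holding the bare RPM name."""

    def _rewrite(match):
        href = HREF_RE.search(match.group(0))
        if href is None:
            # leave tags without an href untouched
            return match.group(0)
        # drop any relative path such as "Packages/"
        rpm_name = posixpath.basename(href.group(1))
        return '<location href="%s"/>' % rpm_name

    return LOCATION_RE.sub(_rewrite, primary_xml)


print(normalize_location(
    '<location xml:base=" /Packages" href="rpm-apidocs-4.8.0-32.el6.noarch.rpm"/>'))
print(normalize_location(
    '<location href="Packages/389-ds-base-devel-1.2.10.2-15.el6.i686.rpm"/>'))
```

Both input forms seen in this bug (plain file name, and href with a relative path) end up as a location with only the RPM file name, so yum builds the download URL from the baseurl in the .repo file.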





Below is a test run installing the new grinder RPM.
Note: 
  The existing units must be deleted from Pulp prior to testing this.
  A resync with the updated grinder is not sufficient.
  To delete the units I ran:
  1) pulp-admin rpm repo delete
  2) pulp-admin orphans remove --all




# yum --disablerepo=* --enablerepo="jwm" install rpm-apidocs
Loaded plugins: security
jwm                                                                                                                                                                                                          | 2.3 kB     00:00     
jwm/primary_db                                                                                                                                                                                               | 188 kB     00:00     
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package rpm-apidocs.noarch 0:4.8.0-32.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

====================================================================================================================================================================================================================================
 Package                                                   Arch                                                 Version                                                     Repository                                         Size
====================================================================================================================================================================================================================================
Installing:
 rpm-apidocs                                               noarch                                               4.8.0-32.el6                                                jwm                                               1.4 M

Transaction Summary
====================================================================================================================================================================================================================================
Install       1 Package(s)

Total download size: 1.4 M
Installed size: 5.6 M
Is this ok [y/N]: y
Downloading Packages:
rpm-apidocs-4.8.0-32.el6.noarch.rpm                                                                                                                                                                          | 1.4 MB     00:00     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Warning: RPMDB altered outside of yum.
  Installing : rpm-apidocs-4.8.0-32.el6.noarch                                                                                                                                                                                  1/1 

Installed:
  rpm-apidocs.noarch 0:4.8.0-32.el6                                                                                                                                                                                                 

Complete!

Comment 15 John Matthews 2012-12-13 20:49:35 UTC
Grinder has been tagged with 0.1.11-1 to include this fix.

Comment 16 Jeff Ortel 2012-12-13 22:08:54 UTC
build: 2.0.6-0.17.beta

Comment 17 Preethi Thomas 2012-12-20 14:43:49 UTC
verified

[root@pulp-v2-testing ~]# rpm -q pulp-server
pulp-server-2.0.6-0.19.beta.noarch
[root@pulp-v2-testing ~]# rpm -q grinder
grinder-0.1.12-1.fc17.noarch
[root@pulp-v2-testing ~]# 

[root@pulp-v2-testing ~]# pulp-admin rpm repo create --repo-id rhel63_optional --feed https://cdn.redhat.com/content/beta/rhel/rhui/server/6/6Server/x86_64/optional/os/  --feed-cert ~/CDN/rcm-debug-20130208.crt --feed-key ~/CDN/rcm-debug-20130208.key --feed-ca-cert ~/CDN/cdn.redhat.com-chain.crt --remove-old true --relative-url "rhel63_optional"
Successfully created repository [rhel63_optional]

[root@pulp-v2-testing ~]# pulp-admin rpm repo sync run --repo-id rhel63_optional
+----------------------------------------------------------------------+
               Synchronizing Repository [rhel63_optional]
+----------------------------------------------------------------------+

This command may be exited by pressing ctrl+c without affecting the actual
operation on the server.

Downloading metadata...
[\]
... completed

Downloading repository content...
[==================================================] 100%
RPMs:       439/439 items
Delta RPMs: 0/0 items
Tree Files: 0/0 items
Files:      0/0 items
... completed

Importing errata...
[\]
... completed

Importing package groups/categories...
[-]
... completed

Publishing packages...
[==================================================] 100%
Packages: 439/439 items
... completed

Publishing distributions...
[==================================================] 100%
Distributions: 0/0 items
... completed

Generating metadata
[-]
... completed

Publishing repository over HTTPS
[-]
... completed



[root@pulp-v2-testing ~]# cat /etc/yum.repos.d/preethi.repo 
[preethi]
name=preethi
baseurl=https://127.0.0.1/pulp/repos/rhel63_optional/
enabled=1
skip_if_unavailable=1
gpgcheck=0
sslverify=0
[root@pulp-v2-testing ~]# 

[root@pulp-v2-testing ~]# yum repolist
Loaded plugins: langpacks, presto, refresh-packagekit
preethi                                                  | 2.5 kB     00:00     
preethi/primary_db                                       | 182 kB     00:00     
repo id                       repo name                                   status
fedora                        Fedora 17 - x86_64                          27,033
preethi                       preethi                                        439
pulp-v2-builds                Pulp v2 Beta Builds                             36
updates                       Fedora 17 - x86_64 - Updates                10,729
repolist: 38,237
[root@pulp-v2-testing ~]# yum --disablerepo=* --enablerepo="preethi"  install rpm-apidocs
Loaded plugins: langpacks, presto, refresh-packagekit
Resolving Dependencies
--> Running transaction check
---> Package rpm-apidocs.noarch 0:4.8.0-32.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package             Arch           Version               Repository       Size
================================================================================
Installing:
 rpm-apidocs         noarch         4.8.0-32.el6          preethi         1.4 M

Transaction Summary
================================================================================
Install  1 Package

Total download size: 1.4 M
Installed size: 5.6 M
Is this ok [y/N]: y
Downloading Packages:
rpm-apidocs-4.8.0-32.el6.noarch.rpm                      | 1.4 MB     00:00     
Running Transaction Check
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : rpm-apidocs-4.8.0-32.el6.noarch                              1/1 
  Verifying  : rpm-apidocs-4.8.0-32.el6.noarch                              1/1 

Installed:
  rpm-apidocs.noarch 0:4.8.0-32.el6                                             

Complete!



Through pulp


[root@pulp-v2-testing ~]# pulp-admin rpm consumer list --details
+----------------------------------------------------------------------+
                               Consumers
+----------------------------------------------------------------------+

Id:            f17-client
Display Name:  f17-client
Bindings:      
  Confirmed:   rhel63_optional, local_rhel63
  Unconfirmed: 
Capabilities:  
Certificate:   -----BEGIN CERTIFICATE-----
               MIICGTCCAQECAQMwDQYJKoZIhvcNAQEFBQAwFDESMBAGA1UEAxMJbG9jYWxob3N0
               MB4XDTEyMTIyMDEzMzM0N1oXDTIyMTIxODEzMzM0N1owFTETMBEGA1UEAxMKZjE3
               LWNsaWVudDCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEAtCrgYjKTCwojZxmC
               2wbFVlyh2wN6Tz1sgEL9Ic1GORvuapQUQejqz36vNq34QGG10KAK+BXAgpEUpgqw
               DcnL5S6+NVpkdI15xeAfW1USv83P3t3PDDi6afjOAngFDloTIXzE72N2kPEQRVIC
               kyVNpvi1/ii19qFuXBg/A14vU1sCAwEAATANBgkqhkiG9w0BAQUFAAOCAQEAerkg
               C3xFfqLFcoCwjqjaDsTg4162t/XGLLdH4fe1zKSlmo5XmLfru6h5MQkPiW8lY25h
               75XxMRDwspBoNbsPBsFvC9Z2PRIk2YI9UuaywJXhsr94q3RsVkJA/xD5qqnBq3Wo
               RLPNsY0n/YYwrg19KbxL3kULC5aruPpkiKJG/cGNqE0Ob+X5Fd2svDy4DBUSNgXx
               rCiIa5KweTO/Lu5/7VIRRfdKRQAPQ+pUENMR/6PBrcPERX05W46aW1JU63eGBd9/
               U6079XQJENfMZF88XPbTlje8MaNOSPh/VhJMXm1J/AyywB4E6NuGRmQ+OQxlq9Wj
               onF//E5PKAx0r18M8w== -----END CERTIFICATE-----
Description:   None
Notes:         

 
[root@pulp-v2-testing ~]# pulp-admin rpm consumer package install  run -n rpm-apidocs --consumer-id f17-client
Install task created with id [b2bf28d5-f297-4607-bf34-3db3e49be8b3]

This command may be exited via ctrl+c without affecting the install.

Refresh Repository Metadata             [ OK ]
Downloading Packages                    [ OK ]
Check Package Signatures                [ OK ]
Running Test Transaction                [ OK ]
Running Transaction                     [ OK ]
Install Succeeded

+----------------------------------------------------------------------+
                               Installed
+----------------------------------------------------------------------+

Name:    rpm-apidocs
Version: 4.8.0
Arch:    noarch
Repoid:  rhel63_optional


[root@pulp-v2-testing ~]#

Comment 18 Preethi Thomas 2013-01-09 17:04:59 UTC
Pulp v2.0 released

