Bug 529233

Summary: YumBase.bestPackagesFromList() very slow in yum versions 3.2.22 and 3.2.23
Product: Red Hat Enterprise Linux 5 Reporter: Steve Rotolo <steve.rotolo>
Component: yumAssignee: James Antill <james.antill>
Status: CLOSED ERRATA QA Contact: Jan Hutař <jhutar>
Severity: medium Docs Contact:
Priority: low    
Version: 5.4CC: jhutar, psklenar
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: yum-3.2.22-23.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-03-30 04:30:07 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Attachments:
Description Flags
python script to demostrate poor performance of YumBase.bestPackagesFromList() on RHEL
none
python script to demostrate poor performance of YumBase.bestPackagesFromList() on Fedora
none
F8 yum-3.2.20 results
none
F10 yum-3.2.21 results
none
F10 yum-3.2.23 results
none
RHEL 5.2 yum-3.2.8 results
none
RHEL 5.4 yum-3.2.22 results
none
F10 yum.conf
none
F10 fedora repo
none
F10 updates repo
none
New depsolve.py file ... put in /usr/lib/python2.4/site-packages/yum none

Description Steve Rotolo 2009-10-15 11:56:16 EDT
Created attachment 364946 [details]
python script to demostrate poor performance of YumBase.bestPackagesFromList() on RHEL

An application using YumBase.bestPackagesFromList() shows significant performance degradation in yum versions greater or equal to version 3.2.22, compared to earlier versions.  A quick glance at the code shows the internal implementation changed radically in or around 3.2.22.

I will attach a small python program that demonstrates the problem on both RHEL 5.4 and Fedora 10.  It is useful for comparing results with earlier versions.

I will also attach the output of runs for the following:
   RHEL 5.2 yum 3.2.8    (good)
   RHEL 5.4 yum 3.2.22   (bad)
   FC8      yum 3.2.20   (good)
   F10      yum 3.2.21   (good)
   F10      yum 3.2.23   (bad)
Comment 1 Steve Rotolo 2009-10-15 11:57:07 EDT
Created attachment 364948 [details]
python script to demostrate poor performance of YumBase.bestPackagesFromList() on Fedora
Comment 2 Steve Rotolo 2009-10-15 11:58:27 EDT
Created attachment 364949 [details]
F8 yum-3.2.20 results
Comment 3 Steve Rotolo 2009-10-15 11:59:15 EDT
Created attachment 364950 [details]
F10 yum-3.2.21 results
Comment 4 Steve Rotolo 2009-10-15 12:00:01 EDT
Created attachment 364951 [details]
F10 yum-3.2.23 results
Comment 5 Steve Rotolo 2009-10-15 12:02:03 EDT
Created attachment 364952 [details]
RHEL 5.2 yum-3.2.8 results
Comment 6 Steve Rotolo 2009-10-15 12:02:34 EDT
Created attachment 364953 [details]
RHEL 5.4 yum-3.2.22 results
Comment 7 seth vidal 2009-10-15 12:10:03 EDT
and can you post the yum.conf and .repo files you're using for each of these cases? Just to be sure on a few things.
Comment 8 Steve Rotolo 2009-10-15 12:21:41 EDT
The attached program performs several calls to YumBase.bestPackagesFromList()
with lists of several versions of some package.  The function call is timed and
the result is displayed.  It is easy to diff output from different versions to
see changes in times.  For example:

$ diff yum-demo-F10-3.2.21.out yum-demo-F10-3.2.23.out
1c1
< Yum Version: 3.2.21
---
> Yum Version: 3.2.23
13c13
<     time: 0.000000
---
>     time: 0.360000
20c20
<     time: 0.000000
---
>     time: 0.920000

The times seem to indicate that the new impl of the function does not scale
well with the number of items in the input list.  The old version seemed to be
almost instantaneous.
Comment 9 Steve Rotolo 2009-10-15 12:26:31 EDT
(In reply to comment #7)
> and can you post the yum.conf and .repo files you're using for each of these
> cases? Just to be sure on a few things.  

Sure thing.  Working on it...
Comment 10 James Antill 2009-10-15 12:33:13 EDT
Is the test program representative of what you are doing?

It looks like you are using it as a really slow way of doing:

        packages = packagesNewestByNameArch(packages)

...compare_providers is used more for working out which package(s) that provide(s) "MTA" is/are the best to install.
 Note that on x86_64 it will return an .i?86 and .x86_64 package, so the assert it wrong.

 Also why do you have getAllPackageVersions() walking the entire list of pkgs?
Comment 11 James Antill 2009-10-15 12:35:44 EDT
It's also worth nothing that current HEAD (very close to the just released 3.2.25, and what is in F12 GA) gives me:

    versions: openssl-0.9.8k-1.fc11.i686, openssl-0.9.8k-1.fc11.i586, openssl-0.9.8k-5.fc11.i686, openssl-0.9.8k-1.fc11.x86_64, openssl-0.9.8k-5.fc11.i586, openssl-0.9.8k-5.fc11.x86_64

    best: openssl-0.9.8k-5.fc11.x86_64
    time: 0.000000

...after removing the assert. So this may be a bug we've already fixed.
Comment 12 James Antill 2009-10-15 12:37:24 EDT
 Can you see if this patch fixes it for you:

http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=b81f0b56a68fec5f4b33e28570f3613784e75a7d
Comment 14 Steve Rotolo 2009-10-15 13:47:56 EDT
(In reply to comment #10)
> Is the test program representative of what you are doing?
> 
No not really.  I tried to create a minimal test case that shows the problem.

> It looks like you are using it as a really slow way of doing:
> 
>         packages = packagesNewestByNameArch(packages)
> 
> ...compare_providers is used more for working out which package(s) that
> provide(s) "MTA" is/are the best to install.
>  Note that on x86_64 it will return an .i?86 and .x86_64 package, so the assert
> it wrong.
> 

Oh, interesting.  In the real app I am using pkgSack.getProvides() to get all providers of some dependency, and then using bestPackagesFromList() to find the best provider.  As in:

    for dep in some_package.returnPrco('requires'):
        providers = self.pkgSack.getProvides(*dep).keys()
        best_providers = self.bestPackagesFromList(providers)
        assert(len(best_providers) == 1)
        best_provider = best_providers[0]

Yeah, I guess I have to cope with multiple best providers.  But which is correct to use, bestPackagesFromList or packagesNewestByNameArch?  Or none of the above ;-)?

>  Also why do you have getAllPackageVersions() walking the entire list of pkgs?  

Hmm, yeah that is a bit cheesy.  I do have something like that in the real app but it's hasn't proven to be a performance problem.  Generally though, for what I am doing, I find it *much* faster to carry around lists of package names rather than a sack of package objects.  Occasionally I have to map the name back to an object to something with it but that's ok.
Comment 15 Steve Rotolo 2009-10-15 14:38:44 EDT
(In reply to comment #12)
>  Can you see if this patch fixes it for you:
> 
> http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=b81f0b56a68fec5f4b33e28570f3613784e75a7d  

Well I don't have an F12 system.  The latest version of yum I have is yum-3.2.23-3.fc10.noarch.  So I patched that and manually resolved the reject.  But it made no difference in the timing.
Comment 16 Steve Rotolo 2009-10-15 14:57:36 EDT
(In reply to comment #13)
> Dito.
> 
> http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=332fba3e9c92abe8b62dfce29c458d43087317bc  

This patch applied cleaner to my version.  The numbers are much better, but still not all 0, for example:

    versions: glibc-2.9-3.i386, glibc-2.9-3.i686, glibc-2.9-2.i686, glibc-2.9-2.i386

    best: glibc-2.9-3.i686
    time: 0.230000

    versions: openssl-0.9.8g-11.fc10.i386, openssl-0.9.8g-14.fc10.i686, openssl-0.9.8g-14.fc10.i386, openssl-0.9.8g-11.fc10.i686

    best: openssl-0.9.8g-14.fc10.i686
    time: 0.220000

Not bad though!
Comment 17 Steve Rotolo 2009-10-15 15:06:17 EDT
Created attachment 364977 [details]
F10 yum.conf
Comment 18 Steve Rotolo 2009-10-15 15:07:59 EDT
Created attachment 364978 [details]
F10 fedora repo
Comment 19 Steve Rotolo 2009-10-15 15:08:28 EDT
Created attachment 364979 [details]
F10 updates repo
Comment 20 James Antill 2009-10-15 15:41:19 EDT
I just checked on RHEL-5, and the results I get are:

** RHEL-5.4 == yum-3.2.22-20 ...

Package 'openssl':

    versions: openssl-0.9.8b-10.el5_2.1.i686, openssl-0.9.8b-8.3.el5.i386, openssl-0.9.8b-8.3.el5_0.2.i386, openssl-0.9.8e-7.el5.i686, openssl-0.9.8e-12.el5.i386, openssl-0.9.8b-10.el5_2.1.i386, openssl-0.9.8b-8.3.el5_0.2.i686, openssl-0.9.8b-8.3.el5.i686, openssl-0.9.8b-10.el5.i686, openssl-0.9.8e-7.el5.i386, openssl-0.9.8e-12.el5.i686, openssl-0.9.8b-10.el5.i386
JDBG: 1 [<YumAvailablePackageSqlite : openssl-0.9.8e-12.el5.i686 (0x92723ac)>]

    best: openssl-0.9.8e-12.el5.i686
    time: 0.870000

** Latest upstream (basically yum-3.2.25) ...

Package 'openssl':

    versions: openssl-0.9.8b-10.el5_2.1.i686, openssl-0.9.8b-8.3.el5.i386, openssl-0.9.8b-8.3.el5_0.2.i386, openssl-0.9.8e-7.el5.i686, openssl-0.9.8e-12.el5.i386, openssl-0.9.8b-10.el5_2.1.i386, openssl-0.9.8b-8.3.el5_0.2.i686, openssl-0.9.8b-8.3.el5.i686, openssl-0.9.8b-10.el5.i686, openssl-0.9.8e-7.el5.i386, openssl-0.9.8e-12.el5.i686, openssl-0.9.8b-10.el5.i386
JDBG: 1 [<YumAvailablePackageSqlite : openssl-0.9.8e-12.el5.i686 (0x9747e0c)>]

    best: openssl-0.9.8e-12.el5.i686
    time: 0.020000

** RHEL-5.4 == yum-3.2.22-20 + the two patches ...

Package 'openssl':

    versions: openssl-0.9.8b-10.el5_2.1.i686, openssl-0.9.8b-8.3.el5.i386, openssl-0.9.8b-8.3.el5_0.2.i386, openssl-0.9.8e-7.el5.i686, openssl-0.9.8e-12.el5.i386, openssl-0.9.8b-10.el5_2.1.i386, openssl-0.9.8b-8.3.el5_0.2.i686, openssl-0.9.8b-8.3.el5.i686, openssl-0.9.8b-10.el5.i686, openssl-0.9.8e-7.el5.i386, openssl-0.9.8e-12.el5.i686, openssl-0.9.8b-10.el5.i386
JDBG: 1 [<YumAvailablePackageSqlite : openssl-0.9.8e-12.el5.i686 (0xa42022c)>]

    best: openssl-0.9.8e-12.el5.i686
    time: 0.010000
Comment 21 James Antill 2009-10-15 15:43:08 EDT
Created attachment 364982 [details]
New depsolve.py file ... put in /usr/lib/python2.4/site-packages/yum

 Here is the depsolve.py with all the patches applied to compare_provides, can you double check this?
Comment 22 James Antill 2009-10-15 15:45:26 EDT
 Oh, and...

        providers = self.pkgSack.getProvides(*dep).keys()
        best_providers = self.bestPackagesFromList(providers)

...that's fine, that's pretty much exactly what yum does when you do "yum install MTA".
Comment 23 Steve Rotolo 2009-10-15 16:33:02 EDT
(In reply to comment #22)
>  Oh, and...
> 
>         providers = self.pkgSack.getProvides(*dep).keys()
>         best_providers = self.bestPackagesFromList(providers)
> 
> ...that's fine, that's pretty much exactly what yum does when you do "yum
> install MTA".  

Thanks!  Sorry but what is "MTA"?  I'm trying the new depsolve.py now...
Comment 24 Steve Rotolo 2009-10-15 16:52:30 EDT
(In reply to comment #21)
> Created an attachment (id=364982) [details]
> New depsolve.py file ... put in /usr/lib/python2.4/site-packages/yum
> 
>  Here is the depsolve.py with all the patches applied to compare_provides, can
> you double check this?  

Awesome!  I plunked this down on RHEL 5.4 and it works great.  So should I expect an official update for this on RHEL 5.4 (eventually)?
Comment 25 James Antill 2009-10-15 22:31:50 EDT
I'll propose it for RHEL-5.5, but I can't guarantee anything yet ... obviously if you have a support contact then a push from a paying customer always helps :).
Comment 27 Jan Hutař 2009-12-15 05:21:11 EST
(In reply to comment #23)
> Thanks!  Sorry but what is "MTA"?  I'm trying the new depsolve.py now...  

MTA is Mail Transport Agent. I guess it is one of these fake provides:

# yum whatprovides MTA
[...]
sendmail-8.14.3-3.fc10.x86_64 : A widely used Mail Transport Agent (MTA)
[...]
2:postfix-2.5.6-1.fc10.x86_64 : Postfix Mail Transport Agent
[...]
Comment 31 errata-xmlrpc 2010-03-30 04:30:07 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0254.html