Bug 509932 - cost excludes are much slower in yum 3.2.23 than 3.2.21
Summary: cost excludes are much slower in yum 3.2.23 than 3.2.21
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: yum
Version: 5.4
Hardware: All
OS: Linux
low
medium
Target Milestone: rc
: ---
Assignee: James Antill
QA Contact: BaseOS QE Security Team
URL: http://www.mail-archive.com/fedora-de...
Whiteboard:
Depends On: 508445
Blocks:
TreeView+ depends on / blocked
 
Reported: 2009-07-06 22:56 UTC by James Antill
Modified: 2014-01-21 23:10 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
A recent version of yum was substantially slower than previous versions when calculating cost excludes. The slowdown was caused by variable that expanded to unicode encoding rather than str. The calculation no longer uses unicode, which returns yum to its previous higher speed.
Clone Of: 508445
Environment:
Last Closed: 2009-09-02 07:34:41 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:1419 0 normal SHIPPED_LIVE yum bug fix update 2009-09-02 07:32:56 UTC

Description James Antill 2009-07-06 22:56:21 UTC
+++ This bug was initially created as a clone of Bug #508445 +++

yum 3.2.23 on F10 is a lot slower than 3.2.21.

A simple "yum -C list updates" takes 90 seconds instead of 3 seconds,
if two repositories with the same rpms and different cost are enabled
(typical scenario for local mirroring of updates).

The problem appears to be that the code managing the exclusion of
rpm packages from the expensive repo, which loops on the rpms
and deletes them very inefficiently.
"excluding for cost:" lines are printed quite slowly (option -d 5).

Suspects on this code: (is this O(N^2) ?)

  for pkg in pkgs[1:]:
      if pkg.repo.cost > lowcost:
          msg = _('excluding for cost: %s from %s') % (pkg, pkg.repo.id)
          self.verbose_logger.log(logginglevels.DEBUG_3, msg)
          pkg.repo.sack.delPackage(pkg)


Also see

  http://www.mail-archive.com/fedora-devel-list@redhat.com/msg03366.html

(reproduced by another user with similar conf).

--- Additional comment from james.antill on 2009-07-03 10:57:45 EDT ---

 Can you see what happens if you undo this patch (ie. just delete the function starting at line 284):


diff --git a/yum/yumRepo.py b/yum/yumRepo.py
index afd0877..964f19d 100644
--- a/yum/yumRepo.py
+++ b/yum/yumRepo.py
@@ -281,6 +281,19 @@ class YumRepository(Repository, config.RepoConf):
         self._grabfunc = None
         self._grab = None
 
+    def __cmp__(self, other):
+        """ Sort yum repos. by cost, and then by alphanumeric on their id. """
+        if other is None:
+            return 1
+        if hasattr(other, 'cost'):
+            ocost = other.cost
+        else:
+            ocost = 1000
+        ret = cmp(self.cost, ocost)
+        if ret:
+            return 1
+        return cmp(self.id, other.id)
+
     def _getSack(self):
         # FIXME: Note that having the repo hold the sack, which holds "repos"

--- Additional comment from bugzillaredhat-56f0 on 2009-07-05 05:17:58 EDT ---

It didn't help.
Successive run were as slow as before the change (90 seconds).

Much better results if I remove this line
  self.costExcludePackages()
from __init__.py; it now runs in 7 seconds.
If I undestand correctly, it is now discarding duplicates
in the list of to-be-updated packages, instead of the list
of all packages; so it is faster. A better fix is certainly
possible, I suppose.

--- Additional comment from james.antill on 2009-07-06 14:16:56 EDT ---

Ok, I've confirmed it and I have a patch that "fixes" it:

Author: James Antill <james>
Date:   Mon Jul 6 14:11:58 2009 -0400

    Fix speed regression in costExcludes

diff --git a/yum/__init__.py b/yum/__init__.py
index 2f9723b..268baaf 100644
--- a/yum/__init__.py
+++ b/yum/__init__.py
@@ -1120,6 +1120,7 @@ class YumBase(depsolve.Depsolve):
                 pkgdict[po.pkgtup] = []
             pkgdict[po.pkgtup].append(po)
         
+        verb = self.verbose_logger.isEnabledFor(logginglevels.DEBUG_3)
         for pkgs in pkgdict.values():
             if len(pkgs) == 1:
                 continue
@@ -1129,8 +1130,10 @@ class YumBase(depsolve.Depsolve):
             #print '%s : %s : %s' % (pkgs[0], pkgs[0].repo, pkgs[0].repo.cost)
             for pkg in pkgs[1:]:
                 if pkg.repo.cost > lowcost:
-                    msg = _('excluding for cost: %s from %s') % (pkg, pkg.repo.
-                    self.verbose_logger.log(logginglevels.DEBUG_3, msg)
+                    if verb:
+                        msg = _('excluding for cost: %s from %s') %(pkg,
+                                                                    pkg.repo.id
+                        self.verbose_logger.log(logginglevels.DEBUG_3, msg)
                     pkg.repo.sack.delPackage(pkg)
             
 

...the big question is how/why this fixes it ... and what changed between 3.2.21 and 3.2.22 that makes the above much slower.

--- Additional comment from james.antill on 2009-07-06 14:20:41 EDT ---

My data for this was to use a RHEL-5 box and have multiple EPEL repos. named et1, et2, et3, et4 with different costs. The above patch (for 13,132 delPackage() calls) takes costExcludes() time from 243.10 seconds to 7.25 seconds (which is the same as 3.2.19-18).

--- Additional comment from bugzillaredhat-56f0 on 2009-07-06 15:49:12 EDT ---

Interesting.
I confirm that your fix works here.

I tried other variations of the two logging lines (without your patch).

This one is fast (7s):
  msg = _('excluding for cost: %s from %s') % (pkg.repo.id,pkg.repo.id)
  self.verbose_logger.log(logginglevels.DEBUG_3, msg)

This one is slow (101s):
  msg = _('excluding for cost: %s from %s') % (pkg,pkg.repo.id)
  self.verbose_logger.log(logginglevels.DEBUG_3, msg)

This one is slower (189s):
  msg = _('excluding for cost: %s from %s') % (pkg,pkg)
  self.verbose_logger.log(logginglevels.DEBUG_3, msg)

The slow down happens when the %s is expanded to pkg.

Comment 8 Ruediger Landmann 2009-09-01 22:14:54 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
A recent version of yum was substantially slower than previous versions when calculating cost excludes. The slowdown was caused by variable that expanded to unicode encoding rather than str. The calculation no longer uses unicode, which returns yum to its previous higher speed.

Comment 9 errata-xmlrpc 2009-09-02 07:34:41 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1419.html


Note You need to log in before you can comment on or make changes to this bug.