+++ This bug was initially created as a clone of Bug #508445 +++ yum 3.2.23 on F10 is a lot slower than 3.2.21. A simple "yum -C list updates" takes 90 seconds instead of 3 seconds, if two repositories with the same rpms and different cost are enabled (typical scenario for local mirroring of updates). The problem appears to be that the code managing the exclusion of rpm packages from the expensive repo, which loops on the rpms and deletes them very inefficiently. "excluding for cost:" lines are printed quite slowly (option -d 5). Suspects on this code: (is this O(N^2) ?) for pkg in pkgs[1:]: if pkg.repo.cost > lowcost: msg = _('excluding for cost: %s from %s') % (pkg, pkg.repo.id) self.verbose_logger.log(logginglevels.DEBUG_3, msg) pkg.repo.sack.delPackage(pkg) Also see http://www.mail-archive.com/fedora-devel-list@redhat.com/msg03366.html (reproduced by another user with similar conf). --- Additional comment from james.antill on 2009-07-03 10:57:45 EDT --- Can you see what happens if you undo this patch (ie. just delete the function starting at line 284): diff --git a/yum/yumRepo.py b/yum/yumRepo.py index afd0877..964f19d 100644 --- a/yum/yumRepo.py +++ b/yum/yumRepo.py @@ -281,6 +281,19 @@ class YumRepository(Repository, config.RepoConf): self._grabfunc = None self._grab = None + def __cmp__(self, other): + """ Sort yum repos. by cost, and then by alphanumeric on their id. """ + if other is None: + return 1 + if hasattr(other, 'cost'): + ocost = other.cost + else: + ocost = 1000 + ret = cmp(self.cost, ocost) + if ret: + return 1 + return cmp(self.id, other.id) + def _getSack(self): # FIXME: Note that having the repo hold the sack, which holds "repos" --- Additional comment from bugzillaredhat-56f0 on 2009-07-05 05:17:58 EDT --- It didn't help. Successive run were as slow as before the change (90 seconds). Much better results if I remove this line self.costExcludePackages() from __init__.py; it now runs in 7 seconds. If I undestand correctly, it is now discarding duplicates in the list of to-be-updated packages, instead of the list of all packages; so it is faster. A better fix is certainly possible, I suppose. --- Additional comment from james.antill on 2009-07-06 14:16:56 EDT --- Ok, I've confirmed it and I have a patch that "fixes" it: Author: James Antill <james> Date: Mon Jul 6 14:11:58 2009 -0400 Fix speed regression in costExcludes diff --git a/yum/__init__.py b/yum/__init__.py index 2f9723b..268baaf 100644 --- a/yum/__init__.py +++ b/yum/__init__.py @@ -1120,6 +1120,7 @@ class YumBase(depsolve.Depsolve): pkgdict[po.pkgtup] = [] pkgdict[po.pkgtup].append(po) + verb = self.verbose_logger.isEnabledFor(logginglevels.DEBUG_3) for pkgs in pkgdict.values(): if len(pkgs) == 1: continue @@ -1129,8 +1130,10 @@ class YumBase(depsolve.Depsolve): #print '%s : %s : %s' % (pkgs[0], pkgs[0].repo, pkgs[0].repo.cost) for pkg in pkgs[1:]: if pkg.repo.cost > lowcost: - msg = _('excluding for cost: %s from %s') % (pkg, pkg.repo. - self.verbose_logger.log(logginglevels.DEBUG_3, msg) + if verb: + msg = _('excluding for cost: %s from %s') %(pkg, + pkg.repo.id + self.verbose_logger.log(logginglevels.DEBUG_3, msg) pkg.repo.sack.delPackage(pkg) ...the big question is how/why this fixes it ... and what changed between 3.2.21 and 3.2.22 that makes the above much slower. --- Additional comment from james.antill on 2009-07-06 14:20:41 EDT --- My data for this was to use a RHEL-5 box and have multiple EPEL repos. named et1, et2, et3, et4 with different costs. The above patch (for 13,132 delPackage() calls) takes costExcludes() time from 243.10 seconds to 7.25 seconds (which is the same as 3.2.19-18). --- Additional comment from bugzillaredhat-56f0 on 2009-07-06 15:49:12 EDT --- Interesting. I confirm that your fix works here. I tried other variations of the two logging lines (without your patch). This one is fast (7s): msg = _('excluding for cost: %s from %s') % (pkg.repo.id,pkg.repo.id) self.verbose_logger.log(logginglevels.DEBUG_3, msg) This one is slow (101s): msg = _('excluding for cost: %s from %s') % (pkg,pkg.repo.id) self.verbose_logger.log(logginglevels.DEBUG_3, msg) This one is slower (189s): msg = _('excluding for cost: %s from %s') % (pkg,pkg) self.verbose_logger.log(logginglevels.DEBUG_3, msg) The slow down happens when the %s is expanded to pkg.
Ok, we've found root cause: http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=1f45838e29ba26dac22215dd9c918fb920569fef
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: A recent version of yum was substantially slower than previous versions when calculating cost excludes. The slowdown was caused by variable that expanded to unicode encoding rather than str. The calculation no longer uses unicode, which returns yum to its previous higher speed.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-1419.html