Bug 508445
Summary: | cost excludes are much slower in yum 3.2.23 than 3.2.21 | |||
---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Roberto Ragusa <bugzillaredhat-56f0> | |
Component: | yum | Assignee: | Seth Vidal <skvidal> | |
Status: | CLOSED UPSTREAM | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
Severity: | medium | Docs Contact: | ||
Priority: | low | |||
Version: | 10 | CC: | dmach, eric.tanguy, ffesti, james.antill, pmatilai, rs, tim.lauridsen, tomek | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | All | |||
OS: | Linux | |||
URL: | http://www.mail-archive.com/fedora-devel-list@redhat.com/msg03366.html | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 509932 (view as bug list) | Environment: | ||
Last Closed: | 2009-07-10 16:07:39 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 509932 |
Description
Roberto Ragusa
2009-06-27 09:53:28 UTC
Can you see what happens if you undo this patch (ie. just delete the function starting at line 284): diff --git a/yum/yumRepo.py b/yum/yumRepo.py index afd0877..964f19d 100644 --- a/yum/yumRepo.py +++ b/yum/yumRepo.py @@ -281,6 +281,19 @@ class YumRepository(Repository, config.RepoConf): self._grabfunc = None self._grab = None + def __cmp__(self, other): + """ Sort yum repos. by cost, and then by alphanumeric on their id. """ + if other is None: + return 1 + if hasattr(other, 'cost'): + ocost = other.cost + else: + ocost = 1000 + ret = cmp(self.cost, ocost) + if ret: + return 1 + return cmp(self.id, other.id) + def _getSack(self): # FIXME: Note that having the repo hold the sack, which holds "repos" It didn't help. Successive run were as slow as before the change (90 seconds). Much better results if I remove this line self.costExcludePackages() from __init__.py; it now runs in 7 seconds. If I undestand correctly, it is now discarding duplicates in the list of to-be-updated packages, instead of the list of all packages; so it is faster. A better fix is certainly possible, I suppose. Ok, I've confirmed it and I have a patch that "fixes" it: Author: James Antill <james> Date: Mon Jul 6 14:11:58 2009 -0400 Fix speed regression in costExcludes diff --git a/yum/__init__.py b/yum/__init__.py index 2f9723b..268baaf 100644 --- a/yum/__init__.py +++ b/yum/__init__.py @@ -1120,6 +1120,7 @@ class YumBase(depsolve.Depsolve): pkgdict[po.pkgtup] = [] pkgdict[po.pkgtup].append(po) + verb = self.verbose_logger.isEnabledFor(logginglevels.DEBUG_3) for pkgs in pkgdict.values(): if len(pkgs) == 1: continue @@ -1129,8 +1130,10 @@ class YumBase(depsolve.Depsolve): #print '%s : %s : %s' % (pkgs[0], pkgs[0].repo, pkgs[0].repo.cost) for pkg in pkgs[1:]: if pkg.repo.cost > lowcost: - msg = _('excluding for cost: %s from %s') % (pkg, pkg.repo. - self.verbose_logger.log(logginglevels.DEBUG_3, msg) + if verb: + msg = _('excluding for cost: %s from %s') %(pkg, + pkg.repo.id + self.verbose_logger.log(logginglevels.DEBUG_3, msg) pkg.repo.sack.delPackage(pkg) ...the big question is how/why this fixes it ... and what changed between 3.2.21 and 3.2.22 that makes the above much slower. My data for this was to use a RHEL-5 box and have multiple EPEL repos. named et1, et2, et3, et4 with different costs. The above patch (for 13,132 delPackage() calls) takes costExcludes() time from 243.10 seconds to 7.25 seconds (which is the same as 3.2.19-18). Interesting. I confirm that your fix works here. I tried other variations of the two logging lines (without your patch). This one is fast (7s): msg = _('excluding for cost: %s from %s') % (pkg.repo.id,pkg.repo.id) self.verbose_logger.log(logginglevels.DEBUG_3, msg) This one is slow (101s): msg = _('excluding for cost: %s from %s') % (pkg,pkg.repo.id) self.verbose_logger.log(logginglevels.DEBUG_3, msg) This one is slower (189s): msg = _('excluding for cost: %s from %s') % (pkg,pkg) self.verbose_logger.log(logginglevels.DEBUG_3, msg) The slow down happens when the %s is expanded to pkg. Thanks, I can also confirm that. But even more interesting this is a more minimal patch for the workaround: Author: James Antill <james> Date: Mon Jul 6 14:11:58 2009 -0400 Fix speed regression in costExcludes (magic) diff --git a/yum/__init__.py b/yum/__init__.py index 2f9723b..f2a4bd5 100644 --- a/yum/__init__.py +++ b/yum/__init__.py @@ -1129,7 +1129,7 @@ class YumBase(depsolve.Depsolve): #print '%s : %s : %s' % (pkgs[0], pkgs[0].repo, pkgs[0].repo.cost) for pkg in pkgs[1:]: if pkg.repo.cost > lowcost: - msg = _('excluding for cost: %s from %s') % (pkg, pkg.repo.id) + msg = _('excluding for cost: %s from %s') % (str(pkg), pkg.repo.id) self.verbose_logger.log(logginglevels.DEBUG_3, msg) pkg.repo.sack.delPackage(pkg) ...which is pretty crazy, and implies that logger is doing something weird. Of course it can't be logger, because it's happening in % above ... so it has to be core python. Ok, we have a winner: % time python /usr/lib*/python2.?/timeit.py -n 1 -s 'import yum; yb = yum.YumBase()' 'map(str, yb.pkgSack.returnPackages(patterns="zi*"))' Loaded plugins: Avahi, auto-update-debuginfo, presto Loaded plugins: Avahi, auto-update-debuginfo, presto Loaded plugins: Avahi, auto-update-debuginfo, presto 1 loops, best of 3: 3.03 sec per loop python /usr/lib*/python2.?/timeit.py -n 1 -s 'import yum; yb = yum.YumBase()' 9.29s user 0.44s system 97% cpu 9.980 total % time python /usr/lib*/python2.?/timeit.py -n 1 -s 'import yum; yb = yum.YumBase()' 'map(unicode, yb.pkgSack.returnPackages(patterns="zi*"))' Loaded plugins: Avahi, auto-update-debuginfo, presto Loaded plugins: Avahi, auto-update-debuginfo, presto Loaded plugins: Avahi, auto-update-debuginfo, presto 1 loops, best of 3: 177 sec per loop python /usr/lib*/python2.?/timeit.py -n 1 -s 'import yum; yb = yum.YumBase()' 528.68s user 0.68s system 99% cpu 8:52.93 total ...ie. a change from str(pkg) to unicode(pkg) makes it roughly 50x slower. Fixed upstream in commit: http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=1f45838e29ba26dac22215dd9c918fb920569fef ...still no idea how python hit this going from 3.2.21 to 3.2.22. git bisect shows that this is the commit that broke it: d03389725e3779928d721370aed9e14afb5304a6 ...ha, it's Seth's fault :). Quick test shows that if I do (without the other patches): diff --git a/yum/sqlitesack.py b/yum/sqlitesack.py index 5692c07..94dbc1a 100644 --- a/yum/sqlitesack.py +++ b/yum/sqlitesack.py @@ -192,7 +192,7 @@ class YumAvailablePackageSqlite(YumAvailablePackage, Package } if varname.startswith('__') and varname.endswith('__'): - if not hasattr(self, varname): + if varname == '__unicode__' or not hasattr(self, varname): raise AttributeError, varname dbname = varname ...then everything is fast again. After more thinking this is because we are in __getattr__(self, ...) and call hasattr(self, ...) ... which then calls getattr() which calls __getattr__() again. Until we run out of stack and it all unwinds again. So above fix reverted, and real fix pushed. Can this fix get pushed to F10? I can verify that the simple fix in comment #6 cut a simple 'yum list perl-File-Sync' from 2 minutes to 7 seconds... the changeset from what's in yum's git and f10 is pretty serious. I'm not sure it is a great idea to push it that far back. I just want the 1 line fix from comment #6 in yum/__init__.py... the priorities plugin has a similar issue and fix. (sure would be nice to get a yum-3.2.23-4.f10 package with these one line fixes...) --- /usr/lib/yum-plugins/priorities.py.orig 2009-10-21 11:54:34.000000000 -0400 +++ /usr/lib/yum-plugins/priorities.py 2009-10-21 11:51:04.000000000 -0400 @@ -144,7 +144,7 @@ if delPackage: conduit.delPackage(po) cnt += 1 - conduit.info(3," --> %s from %s excluded (priority)" % (po,po.repoid)) + conduit.info(3," --> %s from %s excluded (priority)" % (str(po),po.repoid)) # If this packages obsoletes other packages, check whether # one of the obsoleted packages is not available through @@ -157,7 +157,7 @@ if pkg_priorities_archless.has_key(pkg_name) and pkg_priorities_archless[pkg_name] < repo.priority: conduit.delPackage(po) cnt += 1 - conduit.info(3," --> %s from %s excluded (priority)" % (po,po.repoid)) + conduit.info(3," --> %s from %s excluded (priority)" % (str(po),po.repoid)) break conduit.info(2, '%d packages excluded due to repository priority protections' % cnt) The new yum has already been pushed to f10, just get it from updates-testing ... there's no reason to change the priorities plugin. |