Bug 360291
Summary: yum.resolveDeps infinite loop when mashing f7-updates-testing for ppc
Product: [Fedora] Fedora
Component: mash
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: urgent
Reporter: Luke Macken <lmacken>
Assignee: Bill Nottingham <notting>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: chris.stone, dcantrell, pfrields, rvokal, tim.lauridsen, tuju
Fixed In Version: 3.2.8-1.fc8
Doc Type: Bug Fix
Last Closed: 2007-12-06 20:52:12 UTC

Description
Luke Macken
2007-10-31 14:30:32 UTC
So I was just able to mash f7-updates, which included python-2.5-14.fc7.ppc{,64}.rpm, so that may not be the culprit.

Took a look at the mash code; there is a lot of os.fork()'ing. Maybe it would be an idea to make a hacked version of mash without the fork'ing. It might take a little longer, but it should be less resource hungry and safer.

Why would forking or not change the depsolver behavior?

FWIW, I'm running my mash tests with fork=False.

I just kicked off a mash of f7-updates-testing using the latest yum from git, which contains some depsolver patches from Florian Festi. We'll see what happens.

(In reply to comment #3)
> Why would forking or not change the depsolver behavior?

It should not, but as far as I can see you are running several depsolves at the same time, each in a different process (one for each arch). Running something in the background can introduce some weird issues in Python, so to make it easier to debug, IMHO it would be better to run it in the foreground and see if it is still possible to make the depsolver go nuts.

(In reply to comment #4)
> FWIW, I'm running my mash tests with fork=False.
>
> I just kicked off a mash of f7-updates-testing using the latest yum from git,
> which contains some depsolver patches from Florian Festi. We'll see what happens.

As far as I can see in the mash git repo, fork=False is not working any more.

The f7-updates-testing mash with the latest yum/mash code produced the same results: two mash processes stuck depsolving multilib forever.

    5052 lmacken 18 0 2158m 1.9g 1392 R 2.3 95.1 107:26.77 mash

I am also having trouble mashing i386/x86_64 updates-testing.

To eliminate the forking issue - is this just mashing one arch at a time?

Yep, I commented out the os.fork() in Mash.doDepSolveAndMultilib.

Created attachment 247151: mash-ppc.out
Attached is the ppc{,64} multilib filelist from another failed attempt, just in case someone can see something blatantly wrong with it.
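
The fork-versus-foreground question raised in the comments above can be illustrated with a small sketch. This is not mash's code: `run_per_arch` and `depsolve_arch` are hypothetical names, and the per-arch work is a placeholder. It only shows the general pattern of running one job per arch either in forked children or sequentially in the foreground, where an exception surfaces with a full traceback.

```python
import os


def depsolve_arch(arch):
    """Hypothetical placeholder for the per-arch depsolve/multilib step."""
    return None


def run_per_arch(arches, work, fork=True):
    """Run work(arch) for every arch and return {arch: succeeded}.

    fork=False runs everything in the current process, one arch at a
    time, so any exception propagates directly to the caller.
    fork=True runs each arch in its own child process (POSIX only).
    """
    results = {}
    if not fork:
        for arch in arches:
            work(arch)  # exceptions propagate, which helps debugging
            results[arch] = True
        return results

    children = {}
    for arch in arches:
        pid = os.fork()
        if pid == 0:
            # Child process: do the work, then exit without returning
            # into the parent's code path.
            status = 0
            try:
                work(arch)
            except Exception:
                status = 1
            os._exit(status)
        children[pid] = arch

    for pid, arch in children.items():
        _, status = os.waitpid(pid, 0)
        results[arch] = (status == 0)
    return results
```

Calling `run_per_arch(["ppc", "ppc64"], depsolve_arch, fork=False)` processes the arches one after another in the calling process, which is the arrangement the commenters used to rule out forking as the cause.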
I am not sure if this is related, but my package does not show up in the testing repos AFAICT: https://admin.fedoraproject.org/updates/F7/FEDORA-2007-2621
This update involved changing a devel subpackage from noarch to arch-specific.

Things that we know:
1. it only happens on ppc
2. it only happens on rawhide/f8 - even though f7 has the same mash/yum versions
3. there appears to be no rhyme or reason to the package set installed in it when it loops
4. we can't seem to get a normal yum depsolving call to blow up like this
5. it didn't happen with yum 3.2.5-3 or 3.2.7-*. It started happening after 2 weeks of mashes on 3.2.5-3. So it clearly isn't anything NEW in the yum code; it's something older that is being triggered by a change in packages.

Does anyone know any other definitive facts?

(In reply to comment #13)
> Things that we know:
> 1. it only happens on ppc
> 2. it only happens on rawhide/f8 - eventhough f7 has the same mash/yum versions

To clarify a bit, it happens *on* FC6 x86_64 when trying to compose a ppc F7 updates-testing repository, using mash-0.2.8-1 and yum-3.2.7-1/yum-3.2.5.

Hmm.. random observation before I head out the door. When my f7-updates-testing.mash file contains:

    arches = i386 x86_64

it doesn't even enter Mash.doDepSolveAndMultilib, and finishes in less than 2 minutes. However, when I change the order of the arches to:

    arches = x86_64 i386

it solves for multilib, and seems to get into the same infinite loop.

So now I am unable to reproduce the ordering issue that I mentioned in my last comment. Regardless of the ordering of x86_64/i386, both cases add files for multilib and never finish solving deps for it.

Here is how you can reproduce this issue locally. I have been able to do so on two different machines running FC6 and F7. broken-repo.tar.bz2 is about 290 MB; we need all of the RPMs because mash uses YumLocalPackage.

    $ wget http://publictest2.fedoraproject.org/broken-repo.tar.bz2
    $ tar -jxvf broken-repo.tar.bz2
    $ cd broken-repo
    $ ./solve.py

The solve.py that is in the tarball is based on the values that we're hitting in mash. You can find it here, for reference: http://publictest2.fedoraproject.org/solve.py

I have done a little testing with the broken-repo, and I can reproduce the error. I have found that removing libselinux-2.0.14-8.fc7.x86_64.rpm makes it work, so the problem is somehow triggered by this package. I have attached a modified solve.py that excludes libselinux-2.0.14-8.fc7.x86_64.rpm.

Created attachment 248921: solve.py with excludes
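
Excluding one package from the input set before depsolving, as the modified script is described as doing, can be sketched as below. This is not the attached solve.py, whose contents are not shown in this report; `filter_packages` is a hypothetical helper that simply drops RPM file names matching a list of shell-style patterns.

```python
import fnmatch


def filter_packages(filenames, excludes):
    """Return the file names that match none of the exclude patterns.

    Patterns use shell-style wildcards and are matched case-sensitively.
    """
    return [
        name for name in filenames
        if not any(fnmatch.fnmatchcase(name, pattern) for pattern in excludes)
    ]


# Example input using package file names mentioned in this report.
rpms = [
    "libselinux-2.0.14-8.fc7.x86_64.rpm",
    "libgcj-devel-4.1.2-18.fc7.x86_64.rpm",
    "yum-utils-1.1.8-1.fc7.noarch.rpm",
]
kept = filter_packages(rpms, ["libselinux-2.0.14-8.fc7.x86_64.rpm"])
```

Passing a wildcard such as `"libselinux-*"` instead of the full file name would exclude every build of the package, which is convenient when narrowing down which package triggers the loop.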
I added some extra debug output to the yum depsolver and found that libgcj-devel-4.1.2-18.fc7.x86_64.rpm is the package causing the endless loop, combined with libselinux-2.0.14-8.fc7.x86_64.rpm in a transaction with a lot of other packages.

The looping takes place in this piece of code in the Depsolve class (yum/depsolve.py):

    # check Requires
    while CheckDeps:
        print "check Requires"
        self.cheaterlookup = {}
        if self.dsCallback: self.dsCallback.tscheck()
        CheckDeps, checkinstalls, checkremoves, missing = self._resolveRequires(errors)
        CheckInstalls |= checkinstalls
        CheckRemoves |= checkremoves

The self._resolveRequires method returns CheckDeps = 1 every time while the endless looping takes place. The interesting code in self._resolveRequires is:

    missing_in_pkg = False
    for po, dep in thisneeds:
        (checkdep, missing, errormsgs) = self._processReq(po, dep)
        if checkdep:
            print checkdep,po,missing,dep
        CheckDeps |= checkdep
        errors += errormsgs
        missing_in_pkg |= missing

I added the

    if checkdep:
        print checkdep,po,missing,dep

lines to see which package cannot be resolved. When the looping occurs I get this output:

    check Requires
    1 libstdc++-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libstdc++.so.6', 0, '')
    1 libgcj-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libgcj.so.8rh', 0, '')
    1 libgcc - 4.1.2-18.fc7.i386 0 ('/usr/sbin/libgcc_post_upgrade', 0, '')
    1 calc-stdrc - 2.12.2.1-9.fc7.x86_64 0 ('/usr/bin/calc', 0, '')
    1 libgcc - 4.1.2-18.fc7.x86_64 0 ('/usr/sbin/libgcc_post_upgrade', 0, '')
    check Requires
    1 libgcj-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libgcj.so.8rh', 0, '')
    check Requires
    1 libgcj-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libgcj.so.8rh', 0, '')
    check Requires
    1 libgcj-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libgcj.so.8rh', 0, '')
    check Requires
    1 libgcj-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libgcj.so.8rh', 0, '')

If I remove some of the packages from the transaction, it will not loop forever.

    removed yum-updatesd-3.2.6-2.fc7.noarch.rpm
    removed yum-utils-1.1.8-1.fc7.noarch.rpm
    removed yum-versionlock-1.1.8-1.fc7.noarch.rpm
    check Requires
    1 libstdc++-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libstdc++.so.6', 0, '')
    1 libgcj-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libgcj.so.8rh', 0, '')
    1 libgcc - 4.1.2-18.fc7.i386 0 ('/usr/sbin/libgcc_post_upgrade', 0, '')
    1 calc-stdrc - 2.12.2.1-9.fc7.x86_64 0 ('/usr/bin/calc', 0, '')
    1 libgcc - 4.1.2-18.fc7.x86_64 0 ('/usr/sbin/libgcc_post_upgrade', 0, '')
    1 yum-refresh-updatesd - 1.1.8-1.fc7.noarch 0 ('yum-updatesd', 0, '')
    check Requires

It could have been packages other than these three; if you remove 3-8 random packages from the transaction, it will work.

This is the output from yum with debuglevel=9 just before each

    1 libgcj-devel - 4.1.2-18.fc7.x86_64 0 ('/usr/lib64/libgcj.so.8rh', 0, '')

line:

    Checking deps for libgcj-devel.x86_64 0-4.1.2-18.fc7 - u
    looking for ('libgcj', 'EQ', ('0', '4.1.2', '18.fc7')) as a requirement of libgcj-devel.x86_64 0-4.1.2-18.fc7 - u
    looking for ('/usr/lib64/libgcj.so.8rh', None, (None, None, None)) as a requirement of libgcj-devel.x86_64 0-4.1.2-18.fc7 - u
    looking for ('zlib-devel', None, (None, None, None)) as a requirement of libgcj-devel.x86_64 0-4.1.2-18.fc7 - u
    looking for ('/usr/lib64/libz.so', None, (None, None, None)) as a requirement of libgcj-devel.x86_64 0-4.1.2-18.fc7 - u
    looking for ('/bin/awk', None, (None, None, None)) as a requirement of libgcj-devel.x86_64 0-4.1.2-18.fc7 - u
    libgcj-devel requires: /usr/lib64/libgcj.so.8rh
    Searching pkgSack for dep: /usr/lib64/libgcj.so.8rh
    skipping reposetup, pkgsack exists
    skipping reposetup, pkgsack exists
    Potential match for /usr/lib64/libgcj.so.8rh from libgcj - 4.1.2-18.fc7.x86_64
    libgcj already in ts, skipping this one
    1 libgcj-devel - 4.1.2-18.fc7.x86_64 libgcj-devel-4.1.2-18.fc7.x86_64.rpm ('/usr/lib64/libgcj.so.8rh', 0, '')

Awesome, thanks for helping us dig into this, Tim.
I also noticed yesterday that when I excluded ustr-1.0.1-5.fc7, I was able to avoid the loop -- but untagging it from updates-testing did not fix the problem. So, in terms of updates-testing, what do you recommend we do to temporarily mitigate this issue so we can get people testing stuff again? Untag gcc-4.1.2-18.fc7 and/or libselinux-2.0.14-8.fc7?

Untagging libselinux, or keeping the number of packages below 200, should also work, but that can be hard to do.

Created attachment 250051: Patch to stop the looping
This patch to yum stops the looping, but I don't know if it breaks the depsolving logic.
Comments please.
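
The kind of loop described in this report can be reproduced with a toy model. The sketch below is not yum's code and not the attached patch: `resolve_requires`, `solve`, and the `flag_when_already_present` switch are hypothetical, and `max_passes` exists only so the demonstration terminates. It shows why a "check again" flag that is raised even when the provider is already in the transaction keeps the outer loop running forever, while raising it only when the transaction actually changes lets the loop reach a fixed point.

```python
def resolve_requires(transaction, requirements, providers, flag_when_already_present):
    """Make one pass over the requirements; return True to request another pass."""
    recheck = False
    for requirement in requirements:
        provider = providers.get(requirement)
        if provider is None:
            continue  # unresolvable here; a real solver would record an error
        if provider in transaction:
            # Nothing is added, so the transaction is unchanged.
            if flag_when_already_present:
                recheck = True  # models the behaviour that never settles
            continue
        transaction.add(provider)
        recheck = True  # the transaction changed, so another pass is justified
    return recheck


def solve(requirements, providers, flag_when_already_present, max_passes=50):
    """Repeat passes until no recheck is requested or max_passes is reached.

    Returns (transaction, passes, converged).
    """
    transaction = set()
    passes = 0
    recheck = True
    while recheck:
        if passes >= max_passes:
            return transaction, passes, False
        recheck = resolve_requires(
            transaction, requirements, providers, flag_when_already_present
        )
        passes += 1
    return transaction, passes, True


requirements = ["/usr/lib64/libgcj.so.8rh"]
providers = {"/usr/lib64/libgcj.so.8rh": "libgcj"}

looping = solve(requirements, providers, flag_when_already_present=True)
settled = solve(requirements, providers, flag_when_already_present=False)
```

In this model `looping` stops only because of the pass limit, whereas `settled` finishes after two passes: one that adds the provider and one that confirms nothing further changed.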
Bill, this is your area, thoughts?

I just looked at the patch. The checkdeps being set to 1 there doesn't make any sense b/c we're not adding anything new to the transaction set.

So, Seth - you think the patch is correct?

Yah, I think it might be. Tim, go ahead and throw it past yum-devel. But unless I'm way off, it looks like it is correct.

I have committed the patch to upstream yum.

I patched yum on releng1 yesterday and was able to successfully mash f7-updates-testing. Thanks for tracking down and patching this issue, Tim!

yum-3.2.8-1.fc7 has been pushed to the Fedora 7 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with

    su -c 'yum --enablerepo=updates-testing update yum'

yum-3.2.8-1.fc8 has been pushed to the Fedora 8 stable repository. If problems still persist, please make note of it in this bug report.