|Summary:||yum upgrade FC2->FC3 hangs/freezes|
|Product:||[Fedora] Fedora||Reporter:||Trevor Cordes <trevor>|
|Component:||yum||Assignee:||Jeremy Katz <katzj>|
|Status:||CLOSED DUPLICATE||QA Contact:|
|Version:||2||CC:||athompso, jbj, katzj, mattdm, pnasrat|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2005-05-18 15:07:36 UTC||Type:||---|
Description Trevor Cordes 2005-01-09 00:08:34 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

Description of problem:
I am upgrading several nearly identical machines from FC1 to FC3 using yum remotely. So far it has worked perfectly on 2 machines. However, on a 3rd similar machine I ran into a situation where "yum upgrade" froze (just yum, not the whole machine) and just sat there doing nothing for 6 hours until I killed it. It froze about 25% of the way through the "completing updates" phase. There were no odd syslog entries I could discern. There was no error or debug output. It just sat there saying it was completing updates for mtools.

Since I assume all actual updates are done before the completing updates phase, I am hoping this system will be usable as is. I am also contemplating doing some sort of rpm force of the packages, but I'm not sure if that's a better idea than just leaving it.

I have about 10 more similar machines to upgrade and I will see if this happens on any others.

Version-Release number of selected component (if applicable):
yum-2.1.11-3 (now)

How reproducible:
Didn't try

Steps to Reproduce:
1. On FC2, get and rpm -Uvh yum-2.1.11-3.noarch.rpm and fedora-release-3-8.i386.rpm
2. yum upgrade

Actual Results:
yum upgrade just sat there doing nothing during the "completing updates" phase.

Expected Results:
yum should have completed to the end.

Additional info:
I wish I could provide more info, but yum/rpm literally left me with no logs or debug info, and since FC2->FC3 is a one-way ticket, I can't reproduce this bug on this machine.
Comment 1 Trevor Cordes 2005-01-09 00:10:14 UTC
I should make it more clear that I first successfully upgraded from FC1->FC2 using yum upgrade. It's the FC2->FC3 part that failed. I was not attempting to do straight FC1->FC3!
Comment 2 Seth Vidal 2005-01-11 06:59:20 UTC
You didn't have cron or anacron still running did you? It could be that your upgrade attempt got hung when something else hit your rpmdb at the same time. It won't be the first time that's happened. You might try doing an anaconda upgrade of this machine and see if it can get fixed up that way.
Comment 3 Trevor Cordes 2005-01-11 11:57:38 UTC
Yes, cron and anacron would have been running as daemons. Since I hit this problem at 4am-ish, it very well could have been caused by some cron job. Not sure what would manipulate rpm in the stock FC cron stuff, but it's a possibility. Next time I will ensure I stop *cron first. I've found lots of yum FC upgrade instruction hints, and none mentioned your idea. I suppose it would be a good idea to shut down *all* services (except sshd!) before trying such a thing again. As always, I will report back whatever I find (though this one may take time).
Comment 4 Trevor Cordes 2005-02-24 21:11:06 UTC
I just ran into this problem again on another machine. It hung at 348/606 at the completing update for hdparm. This time I HAD made sure to stop cron and anacron. When it hung, I verified no cron was running with ps. There has got to be something else going on here.
Comment 5 Trevor Cordes 2005-02-24 21:12:30 UTC
It's sitting there hung and I haven't killed it yet. Is there any signal or anything I can do to kick-start it back into finishing? Perhaps to just skip the one it's on? I'm desperate to not have to go onsite to fix this machine.
Comment 6 Trevor Cordes 2005-02-24 21:13:38 UTC
Also, ps -ef |grep rpm shows that nothing else is running rpm right now. What is yum sitting there waiting for if not an rpm process?
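A note on the check above: ps -ef | grep rpm only matches command lines, so a process that has the rpm database open through a library would not show up. Below is a rough sketch, assuming a Linux /proc filesystem and the usual rpmdb location of /var/lib/rpm (neither is stated in this report), that lists processes holding files open under a given directory. The function name and its parameters are made up for illustration, and it needs to run as root to see other users' processes.

```python
import os


def processes_with_open_files(under, proc_root="/proc"):
    """Map pid -> sorted list of open paths located under the directory `under`.

    Hypothetical helper: it only reads the fd symlinks in a /proc-style tree.
    """
    prefix = under.rstrip("/") + "/"
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited in the meantime, or access was denied
        paths = set()
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were looking
            if target.startswith(prefix):
                paths.add(target)
        if paths:
            holders[int(entry)] = sorted(paths)
    return holders
```

Calling processes_with_open_files("/var/lib/rpm") while yum is hung would show whether anything besides the yum process itself has the database files open.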
Comment 7 Trevor Cordes 2005-02-24 21:44:59 UTC
Created attachment 111407 [details] terminal output from yum up to the point where it has hung
Comment 8 Trevor Cordes 2005-02-24 22:07:20 UTC
more details I thought might help:

r#/tmp/strace -p 18952
Process 18952 attached - interrupt to quit
futex(0xaed6438, FUTEX_WAIT, 1, NULL
Comment 9 Trevor Cordes 2005-02-24 22:08:24 UTC
Created attachment 111410 [details] listing from lsof -p psid of the hung yum process
Comment 10 Trevor Cordes 2005-02-25 00:02:47 UTC
I needed to try to get this machine running, so I had to kill the stalled yum. SIGINT had no effect. SIGHUP had no effect. kill -9 finally killed it.

Here's what I will attempt as a fix for this half-updated machine:

mkdir /tmp/rpm
cp /var/cache/yum/base/packages/*.rpm /tmp/rpm
cp /var/cache/yum/updates-released/*.rpm /tmp/rpm
cd /tmp/rpm
ls -1 > /tmp/ls
edit /tmp/ls and remove the packages that the yum debug output said were completely updated (including post-update tasks) -- perhaps this step would be best left out?
rpm -U --force *

This appears to have worked and hopefully got the packages completed properly. I sure would like to solve the yum hang issue, though, since I don't want to hit this bug again when FC4 rolls around...
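The manual step of editing /tmp/ls could be scripted roughly as follows. This is only a sketch: it assumes the cached files follow the usual name-version-release.arch.rpm naming, and that the set of already-completed package names has been pulled out of the yum output by hand (the format of that output is not shown in this report). Both function names are made up for illustration.

```python
def package_name(rpm_filename):
    """Return the name part of a name-version-release.arch.rpm filename."""
    stem = rpm_filename
    if stem.endswith(".rpm"):
        stem = stem[: -len(".rpm")]
    nvr = stem.rsplit(".", 1)[0]   # drop the trailing .arch
    return nvr.rsplit("-", 2)[0]   # drop -version-release


def remaining_rpms(rpm_files, completed_names):
    """Keep only the files whose package name is not in completed_names."""
    return [f for f in rpm_files if package_name(f) not in completed_names]
```

For example, remaining_rpms(["yum-2.1.11-3.noarch.rpm", "fedora-release-3-8.i386.rpm"], {"yum"}) keeps only the fedora-release file, which is the list that would then be handed to rpm -U --force.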
Comment 11 Adam Thompson 2005-02-25 02:22:09 UTC
FYI - I've seen the same problem on a stock FC3 install, doing a "yum -y upgrade" (which pulls down 300+ packages) does the same thing... sometimes!
Comment 12 Seth Vidal 2005-02-25 07:18:56 UTC
How many kernels did you have installed at the time? adding jbj and nasrat to get any ideas they might have
Comment 13 Trevor Cordes 2005-02-25 20:59:24 UTC
At that moment I had just the latest FC2 2.6 kernel installed. Well, by the time it froze it had already put in the latest FC3 kernel as well, so if you look at it that way, then the answer is 2.
Comment 14 Trevor Cordes 2005-02-25 21:03:14 UTC
I perhaps should also mention that while it was hung, and after I had killed it (but before I rebooted), there were some wacky files/mounts in /tmp that were causing errors when I did a df, mount or ls /tmp. I think those are some sort of shared mem files used by yum/rpm? Anyways, there were a couple that were obviously screwed up. I think this happened before also, but I never gave it too much attention before rebooting. I don't have any mounts of my own that run off /tmp so these must be system/app generated.

Ah, lucky it was still in the terminal scroll-back buffer:

#df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda3             30890364  23443936   5877280  80% /
/dev/hda1               101086      6900     88967   8% /boot
none                    128012         0    128012   0% /dev/shm
df: `/tmp/tmp.FmjnEy8727': No such file or directory
df: `/tmp/tmp.WVkFrU8730': No such file or directory
/tmp/tmp.WVkFrU8730     128012         0    128012   0% /dev/shm
Exit 1
Comment 15 Seth Vidal 2005-02-25 21:16:50 UTC
yum doesn't make these mounts - some packages might in %post - nevertheless - that's your problem - rpm was probably hanging during the install from that.
Comment 16 Trevor Cordes 2005-02-25 22:05:58 UTC
Hmm, possibly. But isn't it strange that in the first stall I had (my first post here), it hung on mtools, but in the recent stall it hung on hdparm? Are you sure these /tmp shm mounts aren't related to the futex yum is using that the strace indicated it was stalled on? Perhaps some python libs or rpm libs are creating them?

There's still the odd fact that I've now upgraded 9 *nearly identical* systems (with regard to installed rpms they are identical) and only 2 have stalled. If it was a %post section hosing it then a) it should stall on the same package in the same place, and b) it should stall on every system.

Regardless of what the cause is, what about the possibility of adding an alarm() timer around the calls that can possibly stall? If yum had timed out on the stalled hdparm "update completion" and then continued to the next package, I'd be in much better shape than with it just stopping and leaving the system in limbo. Or, instead of alarm(), perhaps catch SIGUSR1 so that it kicks yum onto the next iteration.
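To make the alarm() suggestion concrete, here is a minimal sketch of the idea in Python. It is not yum code: run_with_timeout and complete_updates are hypothetical names, and each "step" is just a callable standing in for whatever yum does per package. One caveat worth stating up front: Python only runs signal handlers between bytecode instructions in the main thread, so this would rescue a stall in Python-level code but probably not a call blocked inside a C library (which is what the futex wait in the strace, and SIGINT having no effect, seem to suggest). In that case the work would likely have to run in a child process that the parent can kill.

```python
import signal
import time


class StepTimeout(Exception):
    """Raised by the SIGALRM handler when a guarded step takes too long."""


def _on_alarm(signum, frame):
    raise StepTimeout()


def run_with_timeout(step, seconds):
    """Run step() and return (finished, result); give up after `seconds`."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return True, step()
    except StepTimeout:
        return False, None
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def complete_updates(steps, seconds):
    """Run each (name, callable) step in turn; return names that timed out."""
    skipped = []
    for name, step in steps:
        finished, _ = run_with_timeout(step, seconds)
        if not finished:
            skipped.append(name)
    return skipped
```

With steps such as [("mtools", ...), ("hdparm", ...)], a stalled step would be recorded in the returned list and the loop would move on to the next package instead of sitting there forever. Whether skipping a half-finished package is actually safe is a separate question.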
Comment 17 Matthew Miller 2005-04-26 16:18:44 UTC
Fedora Core 2 is now maintained by the Fedora Legacy project for security updates only. If this problem is a security issue, please reopen and reassign to the Fedora Legacy product. If it is not a security issue and hasn't been resolved in the current FC3 updates or in the FC4 test release, reopen and change the version to match.
Comment 18 Trevor Cordes 2005-05-12 10:33:41 UTC
This bug still exists in FC3. See bug 145021 (duplicate, but perhaps should move discussion there).
Comment 19 Trevor Cordes 2005-05-12 10:42:26 UTC
Anyone watching this bug who has had this issue, please put a note in bug 145021 that you have seen / are still seeing this problem.
Comment 20 Matthew Miller 2005-05-18 02:23:35 UTC
Should we resolve this as a duplicate of #145021? (Even though this one is obviously slightly older...?)
Comment 21 Trevor Cordes 2005-05-18 15:06:03 UTC
Yes, I'd move the discussion to bug 145021. This bug is really just a more pathological / easier-to-reproduce case, but I'm convinced it's the same bug.