Red Hat Bugzilla – Bug 144589
yum upgrade FC2->FC3 hangs/freezes
Last modified: 2014-01-21 17:50:48 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Description of problem:
I am upgrading several near identical machines from FC1 to FC3 using
yum remotely. So far it has worked perfectly on 2 machines. However,
on a 3rd similar machine I ran into a situation where "yum upgrade"
froze (just yum, not the whole machine) and just sat there doing
nothing for 6 hours until I killed it. It froze about 25% of the way
through the "completing updates" phase.
There were no odd syslog entries I could discern. There was no error
debug output. It just sat there saying it was completing updates for
Since I assume all actual updates are done before the completing
updates phase, I am hoping this system will be usable as is. I am
also contemplating doing some sort of rpm force of the packages, but
I'm not sure if that's a better idea than just leaving it.
I have about 10 more similar machines to upgrade and I will see if
this happens on any others.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. On FC2, get an rpm -Uvh yum-2.1.11-3.noarch.rpm and
2. yum upgrade
Actual Results: yum upgrade just sat there doing nothing during
"completing updates" phase.
Expected Results: yum should have completed to the end.
I wish I could provide more info, but yum/rpm literally left me with
no logs or debug info and since FC2->FC3 is a one way ticket, I can't
reproduce this bug on this machine.
I should make it more clear that I first successfully upgraded from
FC1->FC2 using yum upgrade. It's the FC2->FC3 part that failed. I
was not attempting to do straight FC1->FC3!
You didn't have cron or anacron still running, did you?
It could be that your upgrade attempt got hung when something else hit
your rpmdb at the same time.
It won't be the first time that's happened.
You might try doing an anaconda upgrade of this machine and see if it
can get fixed up that way.
Yes, cron and anacron would have been running as daemons. Since I hit
this problem at 4am-ish, it very well could have been caused by some
cron job. Not sure what would manipulate rpm in the stock FC cron
stuff, but it's a possibility.
Next time I will ensure I stop *cron first.
I've found lots of yum FC upgrade instruction hints, and none
mentioned your idea.
I suppose it would be a good idea to shut down *all* services (except
sshd!) before trying such a thing again.
As always, I will report back whatever I find (though this one may
I just ran into this problem again on another machine. It hung at 348/606 at
the completing update for hdparm. This time I HAD made sure to stop cron and
anacron. When it hung, I verified no cron was running with ps. There has got
to be something else going on here.
It's sitting there hung and I haven't killed it yet. Is there any signal or
anything I can do to kick-start it back into finishing? Perhaps to just skip
the one it's on? I'm desperate to not have to go onsite to fix this machine.
Also, ps -ef |grep rpm shows that nothing else is running rpm right now. What
is yum sitting there waiting for if not an rpm process?
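A note on that check: ps | grep only catches processes with "rpm" in their command line. Something else holding the rpm database open would not show up. Below is a minimal sketch of a broader check, assuming a Linux /proc filesystem and the rpm database in /var/lib/rpm. The function name and the proc_root parameter are made up for illustration, and it needs root to see other users' processes.

```python
import os


def pids_with_open_files_under(target_dir, proc_root="/proc"):
    """Return PIDs that hold an open file inside target_dir.

    Walks <proc_root>/<pid>/fd and reads each symlink, which is roughly
    what lsof does for regular files.
    """
    prefix = target_dir.rstrip("/") + "/"
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self'
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if link.startswith(prefix):
                pids.append(int(entry))
                break
    return sorted(pids)
```

Calling pids_with_open_files_under("/var/lib/rpm") while yum is hung should list at least yum's own PID; any other PID in the result would be worth a look.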
Created attachment 111407
terminal output from yum up to the point where it has hung
more details I thought might help:
r#/tmp/strace -p 18952
Process 18952 attached - interrupt to quit
futex(0xaed6438, FUTEX_WAIT, 1, NULL
Created attachment 111410
lsof -p <pid> listing for the hung yum process
I needed to try to get this machine running, so I had to kill the stalled yum.
SIGINT had no effect. SIGHUP had no effect. kill -9 finally killed it.
Here's what I will attempt as a fix for this half-updated machine:
cp /var/cache/yum/base/packages/*.rpm /tmp/rpm
cp /var/cache/yum/updates-released/*.rpm /tmp/rpm
ls -1 > /tmp/ls
edit /tmp/ls and remove the packages that the yum debug output said were
completely updated (including post-update tasks) -- perhaps this step would be
best left out?
rpm -U --force *
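The hand-editing of /tmp/ls could also be scripted. Here is a rough sketch, assuming the cached files follow the usual name-version-release.arch.rpm naming and that the set of already-completed package names is collected by hand from the yum output. Both function names are hypothetical, and the hdparm/mtools file names in the usage note are invented examples.

```python
def package_name(rpm_filename):
    """Extract the package name from 'name-version-release.arch.rpm'."""
    stem = rpm_filename
    if stem.endswith(".rpm"):
        stem = stem[:-len(".rpm")]
    # Drop the trailing '.arch', then split off release and version.
    stem = stem.rsplit(".", 1)[0]
    name, _version, _release = stem.rsplit("-", 2)
    return name


def packages_to_reinstall(rpm_filenames, completed_names):
    """Keep only the files whose package was not reported as completed."""
    return [f for f in rpm_filenames
            if package_name(f) not in completed_names]
```

For example, packages_to_reinstall(["hdparm-5.7-2.i386.rpm", "mtools-3.9.9-9.i386.rpm"], {"mtools"}) keeps only the hdparm file. A file name that does not match the pattern raises ValueError rather than being silently skipped.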
This appears to have worked and hopefully got the packages completed properly.
I sure would like to solve the yum hang issue though, since I don't want to hit this
bug again when FC4 rolls around...
FYI - I've seen the same problem on a stock FC3 install, doing a "yum -y
upgrade" (which pulls down 300+ packages) does the same thing... sometimes!
How many kernels did you have installed at the time?
adding jbj and nasrat to get any ideas they might have
At that moment I had just the latest FC2 2.6 kernel installed. Well, by the
time it froze it had already put in the latest FC3 kernel as well, so if you
look at it that way, then the answer is 2.
I perhaps should also mention that while it was hung, and after I had killed it
(but before I rebooted), there were some wacky files/mounts in /tmp that were
causing errors when I did a df, mount or ls /tmp. I think those are some sort
of shared mem files used by yum/rpm? Anyways, there were a couple that were
obviously screwed up. I think this happened before also, but I never gave it
too much attention before rebooting. I don't have any mounts of my own that run
off /tmp so these must be system/app generated.
Ah, lucky it was still in the terminal scroll-back buffer:
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda3             30890364  23443936   5877280  80% /
/dev/hda1               101086      6900     88967   8% /boot
none                    128012         0    128012   0% /dev/shm
df: `/tmp/tmp.FmjnEy8727': No such file or directory
df: `/tmp/tmp.WVkFrU8730': No such file or directory
/tmp/tmp.WVkFrU8730     128012         0    128012   0% /dev/shm
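For anyone who wants to check for leftovers like these before rebooting, here is a small sketch that compares the mount table against what actually exists on disk. It assumes the Linux /proc/mounts layout (whitespace-separated fields, device first, mount point second); the function name and the prefix parameter are invented for this example. Note that /proc/mounts escapes spaces in paths, which this sketch does not decode.

```python
import os


def stale_mount_paths(mounts_text, prefix="/tmp/", exists=os.path.exists):
    """Return device or mount-point paths under prefix that no longer exist."""
    stale = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        # Check both the device column and the mount-point column, since
        # the df output above shows a /tmp path in the device column.
        for path in fields[:2]:
            if (path.startswith(prefix) and not exists(path)
                    and path not in stale):
                stale.append(path)
    return stale
```

Typical use would be stale_mount_paths(open("/proc/mounts").read()). An empty list means no /tmp entries in the mount table point at missing paths.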
yum doesn't make these mounts, though some packages might in %post. Nevertheless,
that's your problem: rpm was probably hanging during the install because of them.
Hmm, possibly. But isn't it strange that the first stall I had (my first post
here) hung on mtools, while the recent stall hung on hdparm?
Are you sure these /tmp shm mounts aren't related to the futex yum is using that
the strace indicated it was stalled on? Perhaps some python libs or rpm libs
are creating them?
There's still the odd fact that I've now upgraded 9 *nearly identical* systems
(with regards to installed rpms they are identical) and only 2 have stalled. If
it was a %post section hosing it then a) it should stall on the same package in
the same place, and b) it should stall on every system.
Regardless of what the cause is, what about the possibility of adding an alarm()
timer around the calls that can possibly stall? If yum had timed out on the
stalled hdparm "update completion" and then continued to the next package, I'd
be in much better shape than with it just stopping and leaving the system in limbo.
Or, instead of alarm(), perhaps catch SIGUSR1 that will kick it onto the next
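To make the alarm() idea concrete, here is a minimal sketch in Python (the language yum is written in). It is not yum code: run_with_timeout, StepTimeout and fake_post_step are hypothetical names, and fake_post_step only stands in for whatever per-package step might stall. One caveat: Python runs signal handlers in the main thread between bytecode instructions, so a hang deep inside C library code (such as the futex wait in the strace above) might never give the handler a chance to run. That would match SIGINT having had no effect.

```python
import signal
import time


class StepTimeout(Exception):
    """Raised by the SIGALRM handler when a step takes too long."""


def _on_alarm(signum, frame):
    raise StepTimeout()


def run_with_timeout(step, seconds):
    """Run step(); return True if it finished, False if it timed out."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        step()
        return True
    except StepTimeout:
        return False
    finally:
        signal.alarm(0)  # cancel any alarm still pending
        signal.signal(signal.SIGALRM, previous)  # restore old handler


def fake_post_step(delay):
    """Stand-in for a per-package step; sleeps instead of doing work."""
    time.sleep(delay)
```

With this, run_with_timeout(lambda: fake_post_step(3), 1) gives up after about one second and returns False, so a caller could log the package and move on to the next one. Running the risky step in a child process and killing it on timeout would be more robust against hangs in C code, but it is a different design from the alarm() suggestion.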
Fedora Core 2 is now maintained by the Fedora Legacy project for
security updates only. If this problem is a security issue, please
reopen and reassign to the Fedora Legacy product. If it is not a
security issue and hasn't been resolved in the current FC3 updates or
in the FC4 test release, reopen and change the version to match.
This bug still exists in FC3. See bug 145021 (duplicate, but perhaps should
move discussion there).
Anyone watching this bug who has had this issue, please put a note in bug 145021
that you have seen / are still seeing this problem.
Should we resolve this as a duplicate of #145021? (Even though this one is
obviously slightly older...?)
Yes, I'd move the discussion to bug 145021. This bug is really just a more
pathological / easier-to-reproduce case, but I'm convinced it's the same bug.
*** This bug has been marked as a duplicate of 145021 ***