144589 – yum upgrade FC2->FC3 hangs/freezes

Status:	CLOSED DUPLICATE of bug 145021
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	yum
Version:	2
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Jeremy Katz

Reported:	2005-01-09 00:08 UTC by Trevor Cordes
Modified:	2014-01-21 22:50 UTC
CC List:	5 users
Last Closed:	2005-05-18 15:07:36 UTC

Attachments
terminal output from yum up to the point where it has hung (2.07 KB, text/plain) 2005-02-24 21:44 UTC, Trevor Cordes
listing from lsof -p psid of the hung yum process (7.32 KB, text/plain) 2005-02-24 22:08 UTC, Trevor Cordes

Description Trevor Cordes 2005-01-09 00:08:34 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041107 Firefox/1.0

Description of problem:
I am upgrading several near identical machines from FC1 to FC3 using
yum remotely.  So far it has worked perfectly on 2 machines.  However,
on a 3rd similar machine I ran into a situation where "yum upgrade"
froze (just yum, not the whole machine) and just sat there doing
nothing for 6 hours until I killed it.  It froze about 25% of the way
through the "completing updates" phase.

There were no odd syslog entries I could discern.  There was no error
debug output.  It just sat there saying it was completing updates for
mtools.

Since I assume all actual updates are done before the completing
updates phase, I am hoping this system will be usable as is.  I am
also contemplating doing some sort of rpm force of the packages, but
I'm not sure if that's a better idea than just leaving it.

I have about 10 more similar machines to upgrade and I will see if
this happens on any others.


Version-Release number of selected component (if applicable):
yum-2.1.11-3 (now)

How reproducible:
Didn't try

Steps to Reproduce:
1. On FC2, get yum-2.1.11-3.noarch.rpm and fedora-release-3-8.i386.rpm and
install them with rpm -Uvh
2. yum upgrade

Actual Results:  yum upgrade just sat there doing nothing during
"completing updates" phase.

Expected Results:  yum should have completed to the end.

Additional info:

I wish I could provide more info, but yum/rpm literally left me with
no logs or debug info and since FC2->FC3 is a one way ticket, I can't
reproduce this bug on this machine.

Comment 1 Trevor Cordes 2005-01-09 00:10:14 UTC

I should make it more clear that I first successfully upgraded from
FC1->FC2 using yum upgrade.  It's the FC2->FC3 part that failed.  I
was not attempting to do straight FC1->FC3!

Comment 2 Seth Vidal 2005-01-11 06:59:20 UTC

You didn't have cron or anacron still running did you?

It could be that your upgrade attempt got hung when something else hit
your rpmdb at the same time.

It won't be the first time that's happened.

You might try doing an anaconda upgrade of this machine and see if it
can get fixed up that way.

Comment 3 Trevor Cordes 2005-01-11 11:57:38 UTC

Yes, cron and anacron would have been running as daemons.  Since I hit
this problem at 4am-ish, it very well could have been caused by some
cron job.  Not sure what would manipulate rpm in the stock FC cron
stuff, but it's a possibility.

Next time I will ensure I stop *cron first.

I've found lots of yum FC upgrade instruction hints, and none
mentioned your idea.

I suppose it would be a good idea to shutdown *all* services (except
sshd!) before trying such a thing again.

As always, I will report back whatever I find (though this one may
take time).

Comment 4 Trevor Cordes 2005-02-24 21:11:06 UTC

I just ran into this problem again on another machine.  It hung at 348/606 at
the completing update for hdparm.  This time I HAD made sure to stop cron and
anacron.  When it hung, I verified no cron was running with ps.  There has got
to be something else going on here.

Comment 5 Trevor Cordes 2005-02-24 21:12:30 UTC

It's sitting there hung and I haven't killed it yet.  Is there any signal or
anything I can do to kick-start it back into finishing?  Perhaps to just skip
the one it's on?  I'm desperate to not have to go onsite to fix this machine.

Comment 6 Trevor Cordes 2005-02-24 21:13:38 UTC

Also, ps -ef |grep rpm shows that nothing else is running rpm right now.  What
is yum sitting there waiting for if not an rpm process?

Comment 7 Trevor Cordes 2005-02-24 21:44:59 UTC

Created attachment 111407
terminal output from yum up to the point where it has hung

Comment 8 Trevor Cordes 2005-02-24 22:07:20 UTC

more details I thought might help:

r#/tmp/strace -p 18952
Process 18952 attached - interrupt to quit
futex(0xaed6438, FUTEX_WAIT, 1, NULL

Comment 9 Trevor Cordes 2005-02-24 22:08:24 UTC

Created attachment 111410
listing from lsof -p psid of the hung yum process

Comment 10 Trevor Cordes 2005-02-25 00:02:47 UTC

I needed to try to get this machine running, so I had to kill the stalled yum. 
SIGINT had no effect.  SIGHUP had no effect.  kill -9 finally killed it.

Here's what I will attempt as a fix for this half-updated machine:

mkdir /tmp/rpm
cp /var/cache/yum/base/packages/*.rpm /tmp/rpm
cp /var/cache/yum/updates-released/*.rpm /tmp/rpm
cd /tmp/rpm
ls -1 > /tmp/ls
edit /tmp/ls and remove the packages that the yum debug output said were
completely updated (including post-update tasks) -- perhaps this step would be
best left out?
rpm -U --force *

This appears to have worked and hopefully got the packages completed properly.
I sure would like to solve the yum hang issue though since I don't want to hit this
bug again when FC4 rolls around...
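
The "edit /tmp/ls" step above (dropping packages that were already completely
updated) could be scripted roughly as follows.  This is only a sketch: the
function names are hypothetical, and it assumes every file follows the usual
name-version-release.arch.rpm naming, so the name can be split off the file
name without querying the rpm header.

```python
def package_name(rpm_filename):
    """Return the name part of 'name-version-release.arch.rpm'.

    Assumption: the file follows the conventional rpm naming scheme.
    """
    stem = rpm_filename
    if stem.endswith(".rpm"):
        stem = stem[:-len(".rpm")]
    name_version_release = stem.rsplit(".", 1)[0]   # drop the arch
    return name_version_release.rsplit("-", 2)[0]   # drop version, release


def remaining_packages(rpm_filenames, completed_names):
    """Keep only the rpm files whose package name is not already completed."""
    done = set(completed_names)
    return [f for f in rpm_filenames if package_name(f) not in done]


if __name__ == "__main__":
    files = ["yum-2.1.11-3.noarch.rpm", "fedora-release-3-8.i386.rpm"]
    # Pretend the yum output showed "yum" as fully completed.
    for f in remaining_packages(files, ["yum"]):
        print(f)
```

The resulting list could then be passed to rpm in place of the "*" glob.
Whether skipping the completed packages is actually safer than forcing all of
them is the same open question raised in the step above.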

Comment 11 Adam Thompson 2005-02-25 02:22:09 UTC

FYI - I've seen the same problem on a stock FC3 install, doing a "yum -y
upgrade" (which pulls down 300+ packages) does the same thing... sometimes!

Comment 12 Seth Vidal 2005-02-25 07:18:56 UTC

How many kernels did you have installed at the time?

adding jbj and nasrat to get any ideas they might have

Comment 13 Trevor Cordes 2005-02-25 20:59:24 UTC

At that moment I had just the latest FC2 2.6 kernel installed.  Well, by the
time it froze it had already put in the latest FC3 kernel as well, so if you
look at it that way, then the answer is 2.

Comment 14 Trevor Cordes 2005-02-25 21:03:14 UTC

I perhaps should also mention that while it was hung, and after I had killed it
(but before I rebooted), there were some wacky files/mounts in /tmp that were
causing errors when I did a df, mount or ls /tmp.  I think those are some sort
of shared mem files used by yum/rpm?  Anyways, there were a couple that were
obviously screwed up.  I think this happened before also, but I never gave it
too much attention before rebooting.  I don't have any mounts of my own that run
off /tmp so these must be system/app generated.

Ah, lucky it was still in the terminal scroll-back buffer:

#df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda3             30890364  23443936   5877280  80% /
/dev/hda1               101086      6900     88967   8% /boot
none                    128012         0    128012   0% /dev/shm
df: `/tmp/tmp.FmjnEy8727': No such file or directory
df: `/tmp/tmp.WVkFrU8730': No such file or directory
/tmp/tmp.WVkFrU8730     128012         0    128012   0% /dev/shm
Exit 1
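
For checking other machines for the same symptom, a rough sketch like the one
below can list mount entries whose mount point no longer exists, which is what
df is complaining about above.  It assumes the usual /proc/mounts layout
(device, mount point, type, options, ...) and does not handle escaped spaces
in paths; the function name is made up for illustration.

```python
import os


def missing_mount_points(mounts_text, exists=os.path.exists):
    """Return mount points listed in mounts_text that are not present on disk.

    mounts_text is expected to look like the contents of /proc/mounts:
    one mount per line, whitespace-separated, mount point in the 2nd field.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue                      # skip blank or malformed lines
        mount_point = fields[1]
        if not exists(mount_point):
            missing.append(mount_point)
    return missing


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for path in missing_mount_points(f.read()):
            print("stale mount entry:", path)
```

This only reports the stale entries.  It does not say which package or script
created them.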

Comment 15 Seth Vidal 2005-02-25 21:16:50 UTC

yum doesn't make these mounts - some packages might in %post - nevertheless -
that's your problem - rpm was probably hanging during the install from that.

Comment 16 Trevor Cordes 2005-02-25 22:05:58 UTC

Hmm, possibly.  But isn't it strange that the first stall I had (my first post
here), it was mtools it hung on, but for the recent stall it was hung on hdparm?

Are you sure these /tmp shm mounts aren't related to the futex yum is using that
the strace indicated it was stalled on?  Perhaps some python libs or rpm libs
are creating them?

There's still the odd fact that I've now upgraded 9 *nearly identical* systems
(with regards to installed rpms they are identical) and only 2 have stalled.  If
it was a %post section hosing it then a) it should stall on the same package in
the same place, and b) it should stall on every system.

Regardless of what the cause is, what about the possibility of adding an alarm()
timer around the calls that can possibly stall?  If yum had timed out on the
stalled hdparm "update completion" and then continued to the next package, I'd
be in much better shape than it just stopping and leaving the system in limbo.
Or, instead of alarm(), perhaps catch a SIGUSR1 that will kick it onto the next
iteration.
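
A minimal sketch of the alarm() idea in Python follows.  It is not yum's
actual code: run_with_timeout, finish_packages and complete_one are
hypothetical names, and the per-package step is passed in as a plain callable.
One caveat: Python runs signal handlers only when control returns to the
interpreter, so a call blocked inside a C library may not be interrupted this
way, which would fit with SIGINT having had no effect on the hung process.

```python
import signal


class StepTimeout(Exception):
    """Raised by the SIGALRM handler when a step runs too long."""


def _on_alarm(signum, frame):
    raise StepTimeout()


def run_with_timeout(step, seconds):
    """Run step(); return (True, result), or (False, None) on timeout.

    Must be called from the main thread, since it installs a signal handler.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)                 # deliver SIGALRM after `seconds`
    try:
        return True, step()
    except StepTimeout:
        return False, None
    finally:
        signal.alarm(0)                   # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)


def finish_packages(packages, complete_one, seconds):
    """Call complete_one(pkg) for each package; return those that timed out."""
    skipped = []
    for pkg in packages:
        ok, _ = run_with_timeout(lambda: complete_one(pkg), seconds)
        if not ok:
            skipped.append(pkg)           # move on instead of hanging forever
    return skipped
```

With something like this, a stalled "update completion" for one package would
be recorded in the skipped list and the loop would continue with the next
package, provided the stalled call is interruptible at all.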

Comment 17 Matthew Miller 2005-04-26 16:18:44 UTC

Fedora Core 2 is now maintained by the Fedora Legacy project for
security updates only. If this problem is a security issue, please
reopen and reassign to the Fedora Legacy product. If it is not a
security issue and hasn't been resolved in the current FC3 updates or
in the FC4 test release, reopen and change the version to match.

Comment 18 Trevor Cordes 2005-05-12 10:33:41 UTC

This bug still exists in FC3.  See bug 145021 (duplicate, but perhaps should
move discussion there).

Comment 19 Trevor Cordes 2005-05-12 10:42:26 UTC

Anyone watching this bug who has had this issue, please put a note in bug 145021
that you have seen / are still seeing this problem.

Comment 20 Matthew Miller 2005-05-18 02:23:35 UTC

Should we resolve this as a duplicate of #145021? (Even though this one is
obviously slightly older...?)

Comment 21 Trevor Cordes 2005-05-18 15:06:03 UTC

Yes, I'd move the discussion to bug 145021.  This bug is really just a more
pathological / easier-to-reproduce case, but I'm convinced it's the same bug.

Comment 22 Matthew Miller 2005-05-18 15:07:36 UTC


*** This bug has been marked as a duplicate of 145021 ***
