170098 – yum hang in useradd when doing update on x86_64

Summary: yum hang in useradd when doing update on x86_64

Keywords:
Status:	CLOSED CANTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	yum
Sub Component:
Version:	4
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Jeremy Katz
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	176882
Depends On:
Blocks:

Reported:	2005-10-07 08:27 UTC by Eric Smith
Modified:	2014-01-21 22:52 UTC
CC List:	6 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-04-19 20:11:10 UTC
Type:	---
Embargoed:
Dependent Products:

Description Eric Smith 2005-10-07 08:27:13 UTC

Description of problem:
On x86_64 systems, "yum update" hangs in useradd.

How reproducible:
100%

Steps to Reproduce:
1.  Do a fresh install of FC4 x86_64 on an Opteron or Athlon 64
2.  Do a "yum update"
  
Actual results:
System downloads hundreds of packages, installs some of them, then hangs.
ps shows hang in useradd.

Expected results:
Packages should be installed successfully

Additional info:

I did a fresh install of FC4 on an x86_64 system (Tyan S2892 motherboard,
2G RAM, 3ware RAID controller).  The install went fine.  After a reboot,
I logged in, did an su, then "yum update".  It downloaded all the packages,
installed a bunch of them, then hung.  Killing the yum yielded a system
that was broken in mysterious ways.

Did another fresh install, then yum update.  Same thing happened.  Poked
at it some more, found that it was hanging in an invocation of useradd.

Tried doing an install of FC4 x86_64 inside a VMware 5.5 beta virtual
machine on an FC4 system at home.  Install went fine, yum update failed
in the same way.

Sometimes when it fails there is an /etc/passwd.LOCK or some such, but
sometimes not.  On the most recent attempt, there is an /etc/.pwd.lock

At first I thought maybe one of these lock files was getting left over,
and it was waiting to acquire the lock.  But deleting the lock file
doesn't get it going again, so I don't think that's the problem.
Furthermore, I can't kill or even "kill -9" the useradd process; even
after "kill -9" it's still hanging around in "D+" state (whatever that
is):

[root@localhost etc]# ps -C useradd -f
UID        PID  PPID  C STIME TTY          TIME CMD
root     25933 25931  0 00:46 pts/1    00:00:00 /usr/sbin/useradd -c Network Crash Dump user -r -u 34 -g netdump -s /bin/bash -r -d /var/crash netdump
[root@localhost etc]# 
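
For reference, "D" in the ps STAT column means uninterruptible sleep (usually
the process is blocked inside the kernel), and the "+" means it is in the
foreground process group.  A process in that state does not act on signals,
including SIGKILL, until it leaves the kernel wait, which matches the
"kill -9" behaviour described above.  A minimal sketch for spotting such
processes follows; the helper names filter_dstate and list_dstate are
hypothetical, and the wchan column is only a hint about where the process
is waiting:

```shell
# filter_dstate: hypothetical helper, not part of procps.
# Reads "PID STAT WCHAN COMMAND..." lines on stdin and keeps the header
# line plus every row whose STAT field starts with "D".
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# list_dstate: hypothetical wrapper that feeds ps output to the filter.
# wchan shows the kernel function the process is sleeping in, when known.
list_dstate() {
    ps -eo pid,stat,wchan:32,args | filter_dstate
}
```

Running list_dstate as root while the update is stuck should show the
useradd process if it is in the state reported here.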

I assume that useradd is being invoked as part of an RPM postinstall
script (or maybe preinstall).  The netdump user already exists before
the yum update.  The last time I did this it hung in useradd for a
different user, presumably as part of the installation of a different
package.

This seems *very* reproducible, and not specific to the configuration
of the first machine I encountered the problem on.

Comment 1 Jeremy Katz 2005-10-07 12:58:11 UTC

Can you try updating the kernel first, rebooting and then seeing if it works?  I
think you're hitting a case where audit has changed in a completely incompatible
way :-/

Comment 2 Eric Smith 2005-10-09 09:09:31 UTC

Upgrading to kernel-2.6.13-1.1526_FC4 first solved the problem.  Thanks!
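
The order that worked here (kernel first, reboot, then everything else) can
be sketched as below.  This is only an illustration under assumptions: the
helper names step and staged_update are hypothetical, the package is assumed
to be called "kernel" (SMP installs may need kernel-smp instead), and nothing
is executed unless RUN=1 is set:

```shell
# step: hypothetical helper. Prints the command unless RUN=1 is set,
# in which case it actually runs it.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# staged_update: hypothetical helper for the two stages of the workaround.
#   kernel  update only the kernel package, then reboot
#   rest    update all remaining packages (run this after the reboot)
staged_update() {
    case "$1" in
        kernel) step yum -y update kernel && step reboot ;;
        rest)   step yum -y update ;;
        *)      echo "usage: staged_update kernel|rest" >&2; return 2 ;;
    esac
}
```

Call "staged_update kernel" on the fresh install, and "staged_update rest"
once the machine is back up on the new kernel.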

Comment 3 Oli Wade 2005-10-17 16:35:07 UTC

I've had the exact same experience with a new install on a Gigabyte board
(http://www.giga-byte.com/MotherBoard/Products/Products_GA-K8NXP-SLI.htm).

Updating the kernel first avoided the problem.

Comment 4 Ben Youngdahl 2005-11-16 03:42:22 UTC

Glad to see this info here.  Hit this issue today, finally figured out it was
related to this.

Given how nasty this is, and the level of experience it might take to figure
this out, is there some way that package dependencies could be tweaked to "help"
the user figure out they need to be running a more recent kernel to safely
update certain packages?

I first saw bug #170087 before seeing this.

Thanks!

Comment 5 Chris Rainey 2005-11-16 22:17:50 UTC

Confirming this bug on an eMachines T3104 made by Gateway with an AMD Sempron
3100+ 64-bit CPU. Updating yum, then the kernel, and then rebooting works here
as well, as described above.

Comment 6 Vadim Israilevich 2005-12-21 01:42:25 UTC

Having same problems with HP DL-145 G2.  Tried updating the kernel via
kickstart, to kernel-smp-2.6.13-1.1532_FC4.x86_64.rpm without updating any other
packages - system hangs upon reboot - unable to find /.
Tried updating to kernel-smp-2.6.14-1.1644_FC4.x86_64.rpm with the same results.
So the kernel fix does not work in my world.

Doing a complete yum update seems to mess up audit, as described here by
others.  A soft reboot following the entire system update hangs.  Have to do a hard
reboot.  The good thing, if one can call it that, is after a hard reboot
everything works fine.  

I am trying to automate system deployment using PXE and kickstart.  Everything
works, save for this.

Thoughts?

Comment 7 Seth Vidal 2006-01-04 00:16:03 UTC

*** Bug 176882 has been marked as a duplicate of this bug. ***

Comment 8 Jinesh Choksi 2006-01-29 11:47:00 UTC

I did a minimal install of FC4 and can confirm that yum updating the kernel and
then rebooting does indeed allow yum update to work properly.

Comment 9 Jeremy Katz 2006-04-19 20:11:10 UTC

This is a kernel bug where the old kernel can't handle some of the new
userspace.  There's not much we can do about it within yum :(
