116485 – udev sometimes uses 80-100% of processor

Summary: udev sometimes uses 80-100% of processor

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	hal
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Harald Hoyer
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	130739
Depends On:
Blocks:

Reported:	2004-02-21 17:41 UTC by Alexander Farley
Modified:	2007-11-30 22:10 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2004-09-21 17:05:19 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
~600 lines of output from cmd $hald --daemon=no --verbose=yes (40.08 KB, text/plain) 2004-09-19 22:04 UTC, Mickey Stein
Output of the 3 things David asked for (386.89 KB, text/plain) 2004-09-19 23:43 UTC, Mickey Stein

Description Alexander Farley 2004-02-21 17:41:17 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040217

Description of problem:
Udev seems to use 80-100% of the processing power. This used to not be
a consistent occurrence, but as of 02/19/2004 it happens with every
boot-up. Killing the udev processes of course solves the problem.

Version-Release number of selected component (if applicable):
udev-016-2

How reproducible:
Always

Steps to Reproduce:
1. Boot up into system with udev activated
2. Udev uses a lot of processing power
3. System becomes sluggish
    

Expected Results:  Udev would use a sane amount of processing power.

Additional info:

Comment 1 Alexander Farley 2004-02-21 17:42:11 UTC

Once the udev processes reach 80-100% CPU usage, they stay there
indefinitely.

Comment 2 Alexander Farley 2004-02-22 00:08:13 UTC

With 02/21/2004 updated packages the problem does not occur. However,
this has happened in the past, so the problem may still exist. It has
come and gone 4 or 5 times since Fedora released a udev package.

Comment 3 Dennis Gilmore 2004-08-14 21:58:06 UTC

I have it now and I am unable to kill the process.

top shows the commands udev vc and /usr/bin/udevinfo -r -q name
-p /block/ram4,
and between them they use every little piece of CPU time they can get.

This is on a fully updated rawhide system as of 8/14/04.

Comment 4 Wilbert Ho 2004-08-23 05:02:49 UTC

I have it also. Fully updated as of 8/22/04. Killing the udev process
only respawns a new udev process that continues to take up 100% CPU.
Killing the udevd process fixes it until the next reboot.

Comment 5 Harald Hoyer 2004-08-24 08:38:44 UTC

*** Bug 130739 has been marked as a duplicate of this bug. ***

Comment 6 Dennis Gilmore 2004-08-24 11:12:10 UTC

Harald, can you please reopen this bug? This is a major show stopper
for me. I'm running current rawhide on a Dell Inspiron 4150 (a P4 1.8
with 512 MB RAM), and it's not a sometimes bug; it happens every time I
reboot. To narrow it down further, it seems limited to udevinfo
and udevstart. Last night I started udevd from the command line and
it's been running for nearly 12 hours without a problem, but if I kill
udevd and run udevstart, the process starts using 100% CPU. And on
every reboot with udev disabled, I still get a udevinfo process using
all available CPU time.

Comment 7 Dennis Gilmore 2004-08-24 11:31:09 UTC

Further, I just ran udevinfo -d and it just kept on running and
running. I killed it and had a screen full of:
P: /class/input/mouse2 
N: input/mouse2 
T: c 
M: 020644 
S: 
O: root 
G: root 
F: /etc/udev/rules.d/50-udev.rules 
L: 24 
U: 123835 
 
I do have 2 mice: the built-in touchpad and an external Logitech MX700
connected via USB.

Comment 8 Harald Hoyer 2004-08-24 11:59:20 UTC

Does anyone have selinux enabled?

Comment 9 Dennis Gilmore 2004-08-24 23:11:12 UTC

I don't, but I can enable it and relabel my filesystem.

Comment 10 Mickey Stein 2004-09-11 23:32:35 UTC

I ran into this same thing ("udev vc" at -15 priority using 99% cpu),
after doing a yum update to the latest development tree. 

I rebooted into init 1 (>b 2.6.9-rc1-xx 1) which got udev going, but
wasn't such a horribly long process until I got to the maintenance
prompt. Then I reniced (#renice +15 pid(udev)) and was able to poke
around. 

The most recent udev overwrote my udev.conf & well, all the other ones
like rules.d/50* adding in a batch of rules about vc devices that I
"suspect" are causing a conflict tossing udev into a loop. 

These are the 'vc' rules in udev.conf/50*.rules:

# vc devices
KERNEL="tty[0-9]*",  NAME="vc/%n",  SYMLINK="%k"
KERNEL="vcs",        NAME="vcc/0",   SYMLINK="%k"
KERNEL="vcs[0-9]*",  NAME="vcc/%n",  SYMLINK="%k"
KERNEL="vcsa",       NAME="vcc/a0",  SYMLINK="%k"
KERNEL="vcsa[0-9]*", NAME="vcc/a%n", SYMLINK="%k"

and I've no definitions in any 10* files in /etc/hotplug.. to override
these. The above defs aren't the problem, because I removed them and
am still getting the same 'udev vc' hangup. I guess my suspicions lie
in the /dev directory where there's quite a few /dev/vc* devices
already defined, but only a single symbolic link one: 

ls -al /dev/vcs* | more
crw--w----  1 vcsa tty  7,   0 Sep  7 16:03 /dev/vcs
lrwxr-xr-x  1 root root      5 Sep 10 08:31 /dev/vcs1 -> vcc/1

And the time stamp on the /dev/vcs2 device (sorry, see below) is
exactly when I was last hung up in the 'udev vc' process. Apparently
there's enough population in the /dev dir for me to disable udev by
going into /etc/udev/udev.conf and setting :

# set USE_UDEV to yes, if you want to use udev
USE_UDEV="no"

which of course eliminates the problem but gets us no further. 

--- 

Another thing I did to get by this mess was to just go to the line in
/sbin/start_udev where udev is cranked up and started it using a
'$nice -n 15 blah blah' command so it can flail all day long without
my being hassled with it. 

I'm suspecting that there's at least two problems, the first being
that udev of course, shouldn't get stuck in loops and should detect
whatever is wrong, and the 2nd being that my switching over from the
last version of udev to this current one which overwrote my config
files has caused a conflict. 

I'll try a few things (like cleaning out all the /dev/vc* files,
removing them from the 50rules*.rules file & restarting), but I need to
make a system backup before I have yet another non-booting situation
caused by this sort of thing ;)

I read enough of the .html file in udev-031/docs to get the drift on
writing a simple rule, but I can't yet see that one is required or a
part of this particular problem. I suppose Greg-K-H could answer this
in a flash (in case he happens by). 

Things someone might want to know about the state my system's in when
this occurs: 

$ls /sys/class/vc
vcs   vcs2  vcs4  vcs6  vcsa   vcsa2  vcsa4  vcsa6
vcs1  vcs3  vcs5  vcs7  vcsa1  vcsa3  vcsa5  vcsa7

which seems as if it'd be a 'good' sign. 

$more /sys/class/vc/vcs/dev
7:0

$udevinfo -a -p /sys/class/vc/vcs

udevinfo starts with the device the node belongs to and then walks up the
device chain, to print for every device found, all possibly useful attributes
in the udev key format.
Only attributes within one device section may be used together in one rule,
to match the device for which the node will be created.

  looking at class device '/sys/class/vc/vcs':
    SYSFS{dev}="7:0"

$ls -al /dev/vcs[1-9]
lrwxr-xr-x  1 root root    5 Sep 10 08:31 /dev/vcs1 -> vcc/1
crw-------  1 vcsa tty  7, 2 Sep 11 11:50 /dev/vcs2
crw-------  1 vcsa tty  7, 7 Sep  7 16:03 /dev/vcs7
crw--w----  1 vcsa tty  7, 8 Sep  7 16:03 /dev/vcs8

Hm.. This has got to be a clue. I was referring to this above and
couldn't recall the correct thing to print. /dev/vcs2 has the only
valid timestamp that matches the last period when I ran udev and was
hung up in 'udev vc'. It 'appears' that it was perhaps attempting to
create linkage from vcs2 -> vcc/2 (which is what the rule above was
implying) but never got past /dev/vcs2. Note no /dev/vcs[3-6] either.
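
For reference, the naming that the vc rules quoted earlier in this comment imply can be sketched in shell. This is only an illustration under assumptions: vc_rule_name is a made-up helper, not part of udev; it reads %k as the kernel name and %n as the name's trailing digits, and tries the five quoted rules in the order they were listed.

```shell
#!/bin/sh
# Hypothetical helper (not part of udev): emulate the five vc rules quoted
# earlier in this comment for one kernel device name.
# Prints "<symlink> -> <node>" on a match; returns 1 if no rule matches.
vc_rule_name() {
    k=$1                    # %k: kernel name, e.g. vcs2
    n=${k##*[!0-9]}         # %n: trailing digits of the name (may be empty)
    case $k in
        tty[0-9]*)  name="vc/$n" ;;
        vcs)        name="vcc/0" ;;
        vcs[0-9]*)  name="vcc/$n" ;;
        vcsa)       name="vcc/a0" ;;
        vcsa[0-9]*) name="vcc/a$n" ;;
        *)          return 1 ;;
    esac
    printf '/dev/%s -> %s\n' "$k" "$name"
}

# Walk through a few of the names found under /sys/class/vc.
for dev in vcs vcs1 vcs2 vcsa vcsa1; do
    vc_rule_name "$dev"
done
```

Under that reading, vcs2 would end up as the node vcc/2 with a /dev/vcs2 symlink pointing at it, the same shape as the /dev/vcs1 -> vcc/1 link in the listing above, whereas the /dev/vcs2 actually present is a plain character device.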

It appears I've got a mishmosh of junk in my dev dir. I'll have to
pinch off this comment, try some things and return (if I can lol). 

Mickey

Comment 11 Mickey Stein 2004-09-11 23:36:54 UTC

Sorry for writing "War and Peace" up there. This little box can be
deceptive. Anyway: I noticed that this bug started quite a time back
(Feb) & that I hadn't included the version of udev I'm using:

Vers: udev-030-24 (as well as having tried yesterday udev-031 from the
tarball on the kernel.org/utils area)

Comment 12 Mickey Stein 2004-09-12 04:42:00 UTC

Tried a few more things. Got rid of all references I could find in the
udev tree & /dev/vc* (to vc[1-xx], vcs[1-xx]) and reenabled udev.
After restart, I still had "udev vc" pegged at 98% cpu at priority -15. 

I did a grep on vc(s) "grep -i vc /etc -r" only to find tons of
selinux, security, & various seemingly unrelated, possibly obsolete
references.

A fairly big part of this is that I don't even see what the need for
the vc* devices really is. I can use virtual consoles galore with or
without them. Any illumination on this would be great. There are many
conflicts when it comes to permissions, groups & owners on vc files in
existence & how the various pieces think they should be. Thanks very
much, Mick

Comment 13 Dennis Gilmore 2004-09-13 03:52:14 UTC

OK, I have a second HDD for my laptop, so I did a clean install
tonight of the current rawhide tree, and udev is working fine.
Previously the system had been a clean install of FC2 updated to
rawhide, so the problem is that something left over from the upgrade
is causing it. Can anyone else test whether a clean
rawhide install on the same hardware works as expected? I enabled
selinux on my new install. I had switched on selinux with the old
install, but it made no difference.

Comment 14 Harald Hoyer 2004-09-13 08:04:05 UTC

btw, in the original udev rules for FC there are no rules for vcc
whatsoever!!

Comment 15 Dennis Gilmore 2004-09-13 11:38:54 UTC

On the new install, /dev had only a handful of entries. On my
upgraded system, /dev has 7557 entries, so there seems to be some
problem with the files the dev package creates. I removed dev and
reinstalled it, but I still get the huge number of /dev entries. While
dev is not installed, /dev shows only a handful of entries, which are
the ones I added for my modem.

Comment 16 Mickey Stein 2004-09-13 15:38:43 UTC

Harald: I noticed that as well, but in other places, like makedev.d,
there are vc entries, and I don't know how they're supposed to interact
or even be used by udev. If I remove all rules pertaining to vc* or
even to tty[1-x], I still wind up hung by "udev vc".

Dennis: Could you please try a couple of commands on your newly built
system:

1) "ls -l /dev/vcs[1-9] /dev/tty[1-9] /sys/class/vc"
2) attach your /etc/udev/rules.d/50*.rules file
3) "ps -ef | grep udev"

If I disable udev (setting USE_UDEV="no" in udev.conf), *and* delete
all the /dev/vc* entries and symlinks *and* remove all lines with 'vc'
from the 50*.rules file, all my virtual consoles and all else I can
try works fine, and of course there's no udev running so that's not an
issue then. 

When I enable udev at this point and reboot (or just start it) I wind
up in the original state we're all in here "udev vc" etc. 

From what you've seen, It just seems that we're caught in a
configurational chaos of 'old' (pre-udev or pre-this-version-udev) and
'new' devices and device creation attempts. 

If this is a configuration problem due in part to switching over from
dev to udev, then the udev process, regardless of vintage, should run
into the problem and detect it and spit out an error (or
recommendation) and move along. I think this is more important than
the specifics of some arbitrary config problem.

I sent off an email to the author of udev (greg), who has been very
helpful in the past on some other things he's authored, asking about
this and mentioned the bug, although I've no idea if he's connected
with fedora's implementation. Since I tried it with his kernel.org/utils
new version of udev, I thought that would be perhaps enough to remove
it from the realm of pure fedora-ness (fedora-ness? ;)

I don't really want to do a clean rawhide install because it will
obscure the problem (which will just go away). I think you've proven
that already, and would rather people have a working path from an
older dev implementation to a newer udev implementation that just works
and takes old trash into account. Thanks much.

Comment 17 Mickey Stein 2004-09-13 17:21:25 UTC

Update: Made rawhide tree up to date (bind*, glibc*, * ) as of today,
and also did a make on the kernel du jour (2.6.9-rc2).

I wanted to see if the very first rule in 50*.rules was being
executed, so I added my own rule for test. KERNEL="tty[8-9]*", 
NAME="vc/%n",  SYMLINK="%k" was what I added, and there were no other
tty[8-9] rules nor were any 'vc' rules left in my 50*.rules file. 

I then deleted all /dev/tty's > 7, and all /dev/vc* and set udev.conf
to enable udev and rebooted into level 1 (to minimize the misery of
waiting if 'udev vc' stuck again). 

It stuck as usual at 'udev vc' so I reniced it to 15 & did an ls -al
/dev/tty[8-9] (nothing was there), as well as an ls -l /dev/vc*, which
showed that it'd attempted (perhaps successfully) to create /dev/vcsa1:

--

$ls -l /dev/vc* 

crw-------  1 vcsa tty 7, 129 Sep 13 09:27 /dev/vcsa1

--

I wonder where this determination to create the 'vc' device above
originates? I don't want or need it, but udev insists upon creating it.

Doing a full grep on vcsa, I find a few hits, the possibly interesting
ones being /etc/makedev.d/linux-2.6.x,
/etc/udev/permissions.d/50-udev.permissions, /dev/.udev.tdb. 

And it suddenly becomes kind of obvious that I should probably stick
with the udev-031 (newest generic udev source), build it in debug,
install and boot, crank it up with gdb or strace and see what this
thing is doing at the time of the 'udev vc'. It'll probably take a
little while to figure this out due to my employer actually wanting me
to do some work (the nerve lol).

Comment 18 Mickey Stein 2004-09-17 19:19:19 UTC

There was a new udev release on kernel.org (by Greg K-H) today that I
installed and built, which creates all the devices it can fine, and
still leaves ~22 "udev vc" processes out there (although they're no
longer using cpu that I can notice). 

I can see by comparing the rules in 50*.rules that all /dev/vc*
devices are being made now with the correct time & date, major/minor
device stamps. I built it with debug enabled (debug messages to log
anyway) and the log is filled with minutiae about creations, but also
with a ton of "udev: /etc/udev/udev.conf:80:24: unknown key ''"
messages (where 80:24 can be just about anything; I'm guessing that's
udev for charpos:line#).

Since the /dev/vc* devices are all looking as if they've been created
as they should be, I'm more baffled than before. 

One thing I noted in a newsgroup was Greg K-H (the author) telling
someone that making udev using 'klibc=true' was recommended (rather
than using glibc which is default). I can make it using glibc without
a problem, but when using klibc and kernel 2.6.9-rc2, I run into some
'endian.h' header file errors that I can't get past. 

Has anyone here made any progress on the basic udev vc using 80+% cpu
problem? Also: Does anyone know if Greg KH is involved in the fedora
project (or just at the kernel utils level)? I suppose my next move
will be to post this on the kernel list since I can't seem to find any
valid links upstream to the udev lists. 

PS: I also tried the new udev-030-26 from rawhide which, for some
reason, managed to delete (well, it's supposed to) my entire /dev
directory and then be unable to recreate it. I had to boot from linux
rescue & create the needed devices myself, which wasn't difficult
since I'd downloaded Greg's udev-32 version which worked for that.

Comment 19 Mickey Stein 2004-09-18 01:17:41 UTC

Apparently, after posting on the linux.kernel list about this, I was
informed by GKH that it's not so easy to reproduce (it is for
us, just not for all), but that it's being worked on.

I happened upon a work-around that might only be good for those of us
not using many hotplug (camera, etc) type devices. 

I killed haldaemon (the hardware abstraction layer daemon) using a
kill -10 pid(hald) command. Immediately all 26 of my stuck 'udev vc'
processes vanished. I did this because after doing a $top command, I
realized that hald was churning away at the top of the list. A normal
"service haldaemon stop" or kill -9 pid(hald) won't do it. Only the
kill -10. 

Anyway, I then checked it out of the start config for now using
"chkconfig --del haldaemon". The system now boots using udev without
any problem. I'm just taking a wild guess that there's some
conflict/race condition / etc between the two. 

If you've got hald running, you might want to give this a try, and
it's also handy if it works because now I can recreate the problem
anytime for debugging purposes. Just run "chkconfig --add haldaemon"
to get yourself back into trouble.
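
To confirm that hald and the udev helpers are the ones burning CPU before trying this, something like the following could be used. It is a rough sketch rather than a tested diagnostic: busy_hotplug_procs is a hypothetical helper, and it assumes input in the three-column layout of ps -eo pid,pcpu,args (a header line, then PID, %CPU and the command line).

```shell
#!/bin/sh
# Hypothetical helper: read `ps -eo pid,pcpu,args` style output on stdin and
# print PID, %CPU and command name for udev/hald processes whose CPU usage
# is at least the percentage given as $1.
busy_hotplug_procs() {
    awk -v min="$1" '
        # Skip the header line; compare %CPU numerically; match on the
        # first word of the command line only.
        NR > 1 && $2 + 0 >= min + 0 && $3 ~ /udev|hald/ { print $1, $2, $3 }
    '
}

# Example: ps -eo pid,pcpu,args | busy_hotplug_procs 80
```

Note that the %CPU column from ps is averaged over the lifetime of each process, so it will not always agree with what top shows at a given moment.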

Comment 20 David Zeuthen 2004-09-19 19:42:07 UTC

Hi, 

a few questions:

1. do you have a link to your posting on lkml about this? I can't seem
to find it

2. when booting up with haldaemon disabled, try starting hald manually
in a root shell like this '/usr/sbin/hald --daemon=no --verbose=yes'
and attach the output to this bug report

Comment 21 Mickey Stein 2004-09-19 22:04:02 UTC

Created attachment 104001
~600 lines of output from cmd $hald --daemon=no --verbose=yes

Not sure if the comment I wrote was added, if so it'll be there twice. 

Where is the lkml post? I tried to find it using google, but they're not quite
up to date. It was 09/17/2004 12:38 PM (time of post, might be GMT-7) and
titled '"udev vc" processes taking over PC' or something very close to that. You
can just search the list in any usenet reader for my name (Mickey Stein).

Thanks. I had to cut the 100MB or so of 30 seconds of output from that command
down to 600 lines for a start. If you'd like more, it only takes a moment to
produce.

Thanks much,

Mickey

Comment 22 David Zeuthen 2004-09-19 22:19:09 UTC

Hi, thanks for the traces - it seems hald is looping for some reason -
it might be a duplicate of 132768 (which was fixed upstream a few days
ago but not yet packaged in Rawhide).

Is it possible you can attach some of the last lines from hald output
and also some of the last lines from the /var/log/messages file? The
reason for the latter is that I want to find out whether there's a
stream of hotplug add/remove events.

Also, what kind of hardware do you have attached? A brief description
and the output of 'tree /sys' will suffice.

Thanks,
David

Comment 23 Mickey Stein 2004-09-19 23:43:38 UTC

Created attachment 104003
Output of the 3 things David asked for

David: Hope this helps a bit. The things you asked for are (hopefully) all in
the attached file. 

What I've got attached: Not very much, but it's an nforce2 (abit) motherboard,
very standard mouse/kbd/monitor, a GF4200 nvidia card, and a turtle beach santa
cruz (cs46xx) soundcard. Network is on the motherboard and I use the reverse
engineered forcedeth driver from the kernel. Only closed binary driver is the
6111 nvidia. Kernel is 2.6.9-rc2-bk5 (today's). 

All attached h/w is working fine, and performance with preempt enabled is
excellent (unless I've hald enabled).

I'll look into that bug you mentioned as well.

Thanks, Mick

Comment 24 David Zeuthen 2004-09-21 07:32:03 UTC

This should be fixed in hal-0.2.98-4 which is available in Rawhide
soon - alternatively you can download the SRPM from 

 http://people.redhat.com/davidz/hal-0.2.98-4.src.rpm

Comment 25 Mickey Stein 2004-09-21 16:50:01 UTC

David,

The new 'hal' appears to work well. So far, I've only installed that
from the dev tree, and will try to work in the latest fedora udev &
hotplug as well to see how this entire set works. 

Thanks very much,

Mick

Comment 26 David Zeuthen 2004-09-21 17:05:19 UTC

Excellent, good to hear.

Thanks,
David

Comment 27 Mickey Stein 2004-09-21 17:21:18 UTC

I did a yum update hotplug, and rebooted. Here's the type of error I
get from hotplug in /var/log/messages upon reboot:

-------- from 'messages log' --------
Sep 21 09:49:55 Kathaldo hal.hotplug[15594]: DEVPATH is not set
Sep 21 09:49:56 Kathaldo hal.hotplug[15617]: DEVPATH is not set
Sep 21 09:49:56 Kathaldo hal.hotplug[15647]: DEVPATH is not set
-------------------------------------
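
A quick way to see how often these messages arrive is to tally them per timestamp. This is a minimal sketch, assuming the syslog line layout shown in the excerpt above (month, day, time, host, tag); devpath_errors_per_second is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print, per timestamp,
# how many "DEVPATH is not set" messages hal.hotplug logged.
devpath_errors_per_second() {
    awk '/hal\.hotplug\[[0-9]+\]: DEVPATH is not set/ { count[$3]++ }
         END { for (t in count) print t, count[t] }' | sort
}

# Example: devpath_errors_per_second < /var/log/messages
```

The third whitespace-separated field is used as the key, so messages from different days that share a time of day would be counted together.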

The good news: "udev vc" is no longer a problem. The udev processes
fade away as they should (or as the vc devices are actually created)
over the course of a couple minutes.

The bad news: Hotplug exits after a few DEVPATH errors. I know where
I've seen DEVPATH set, but it appears to only be set in Greg KH's
kernel.org version of udev(032 is a decent example). 

I'm not sure it really applies to fedora because one of the methods he
uses for various distros (probably other than fedora) is to use a
script named 'udev' that actually goes where most service scripts go
(rather than the start_udev technique). His script uses DEVPATH by
appending paths under the /sys tree that locate individual
devices, which, apparently, hotplug wants to use.

Still not clear on how this pieces together, and although it's an
improvement, the three processes (hald, hotplug & udev) aren't yet
playing nicely together on my system.

I'll try the latest fedora udev next and see if it's of any help.

If you've any idea where fedora's udev (or hald or hotplug) is
supposed to setup /sys/$DEVPATH, I'd love to know that. 

Mick

Comment 28 Mickey Stein 2004-09-22 03:14:02 UTC

I've updated all packages to the latest in 2.91 & development now,
including udev. 

So far as the problem in this bug goes, it looks pretty good. 

But (there's always one of those), using yum (or rpm -Uvh I suppose)
to install udev always appears to overwrite udev.conf and pretty much
any other file in the .rpm package (like /etc/udev/rules.d/50*.rules).

This obliterates my tty0 creation statement, and doesn't by default
create another on the next boot, thus making the system unbootable
until you use a cd / linux rescue, and chroot etc etc, and either copy
your old 50*.rules over the new one, or better yet, read all the
documentation and create the rules in a place where they won't be
touched (like 10**.rules or maybe in /etc/dev.d/* although I'm not
comfortable modifying that one yet). 

Anyway, it seems good enough, and if I really think this other thing
is a bug and not just my so-so configuration setup, I'll start another
bug. 

Thanks David,

Mick
