Bug 413831

Summary: udev timeout at boot; vol_id process stuck when /sbin/start_udev is run
Product: Red Hat Enterprise Linux 5
Reporter: James Hogarth <j.hogarth>
Component: udev
Assignee: Harald Hoyer <harald>
Status: CLOSED ERRATA
Severity: high
Priority: low
Version: 5.1
CC: bugzilla
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
URL: https://bugzilla.redhat.com/show_bug.cgi?id=213476
Doc Type: Bug Fix
Last Closed: 2009-01-20 20:46:58 UTC
Attachments:
Strace of expected behaviour (from different machine) and actual behaviour (flags: none)

Description James Hogarth 2007-12-06 12:44:23 UTC

Description of problem:
On boot, udev-095-14 fails to start in a timely manner. The server then takes over
10 minutes to come up, and other errors are generated because vol_id processes hog
the CPU. This makes the server effectively unusable and stops any application from running.



Version-Release number of selected component (if applicable):
udev-095-14.9.el5

How reproducible:
Sometimes


Steps to Reproduce:
1. Update RHEL5 to RHEL5.1
2. Error occurs on all reboots.
3. Commenting out /sbin/start_udev in /etc/rc.d/rc.sysinit allows for fast
restart, but causes other apps to fail and many device nodes not to be created.

Actual Results:
/dev fills with .tmp files and vol_id processes hog CPU.

Expected Results:
A 4-minute reboot rather than 15, and an idle CPU after boot rather than 100% usage by udev/vol_id.

Additional info:
Sending kill -SIGTERM to the udev process and the vol_id processes makes them exit cleanly and returns the CPU to idle.
Running /sbin/start_udev causes vol_id to hang again.

It also hangs if /lib/udev/vol_id is called directly.
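
Because the direct call hangs, it can help to probe it under a timeout rather than from an interactive shell. The following is a minimal Python sketch of that idea, not something taken from this report: the helper name `run_with_timeout` and the 10-second default are illustrative assumptions.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s=10.0):
    """Run argv and return (status, output).

    status is "ok" (exit code 0), "failed" (non-zero exit or killed by a
    signal) or "hung" (no exit within timeout_s seconds).
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so a spinning
        # process is not left behind to hog the CPU.
        return "hung", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout.decode(errors="replace")
```

On an affected machine, `run_with_timeout(["/lib/udev/vol_id", "--export", "/dev/sda1"])` would be expected to come back as `"hung"` instead of blocking; the device path is only an example and needs adjusting for the system under test.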

/lib/udev/vol_id is statically linked to 2.6.9 kernel.

It appears to have exactly the same symptoms as bug 213476, which was raised against FC6. Unfortunately, we can't just upgrade RHEL5 to FC7 to fix it!

Comment 1 James Hogarth 2007-12-06 12:46:23 UTC
Created attachment 279681 [details]
Strace of expected behaviour (from different machine) and actual behaviour

Comment 2 James Hogarth 2007-12-17 11:08:21 UTC
Working with Red Hat engineers, we narrowed the problem down to this:

The /etc/passwd file has the line '#foo' in it - blame the QA guys testing
tripwire....

An upstream patch with more robust error checking is being considered for
inclusion in an updated udev release. In the meantime, since the file differed
from the published syntax... NOTABUG.

Comment 3 Toralf 2008-03-28 12:36:38 UTC
I just encountered the same problem, after I upgraded to RHEL 5.1 a couple of
days ago. The passwd file has been unchanged across several upgrades between
Linux versions (I've been using the machine for several years, with the same
user setup.)

Just thought I might add that it looks like the problem occurs if, and only if,
the first line in the file starts with a '#'. "#" lines at other locations in
the file, i.e. after the first real user entry, work just fine.
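
For anyone wanting to check a machine before rebooting, the observation above can be turned into a quick scan of the passwd file. This is a rough Python sketch, assuming the seven colon-separated fields described in passwd(5); the function name `find_suspect_passwd_lines` is made up for illustration, and the check is stricter than what actually triggers the hang (it flags every comment or blank line, not only the first).

```python
def find_suspect_passwd_lines(text):
    """Return (line_number, reason) pairs for lines that do not look like
    a regular passwd entry of seven colon-separated fields."""
    suspects = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.strip() == "":
            suspects.append((lineno, "blank line"))
        elif line.lstrip().startswith("#"):
            suspects.append((lineno, "comment line"))
        elif line.count(":") != 6:
            # Seven fields means exactly six separators.
            suspects.append((lineno, "expected 7 colon-separated fields"))
    return suspects
```

Calling it as `find_suspect_passwd_lines(open("/etc/passwd").read())` lists the lines worth looking at; a result whose first entry is line 1 matches the case described in this comment.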

Also, while fixing up the file I came across another, related issue. To reproduce:

1. Insert a blank line at the beginning of the /etc/passwd file.
2. /lib/udev/vol_id --export /dev/sda1

Actual result:
*** buffer overflow detected ***: /lib/udev/vol_id terminated
======= Backtrace: =========
[0x8058e90]
[0x8058900]
[0x8048b5c]
[0x80483ea]
[0x804cc08]
[0x8048131]
======= Memory map: ========
00986000-00987000 r-xp 00986000 00:00 0          [vdso]
08048000-080c0000 r-xp 00000000 08:05 4398802    /lib/udev/vol_id
080c0000-080c1000 rw-p 00077000 08:05 4398802    /lib/udev/vol_id
080c1000-080c3000 rw-p 080c1000 00:00 0 
09c72000-09c94000 rw-p 09c72000 00:00 0 
b7f50000-b7f52000 r--s 00000000 08:05 2392       /etc/passwd
bf923000-bf938000 rw-p bf923000 00:00 0          [stack]
Abort

Expected result:
ID_FS_USAGE=filesystem
ID_FS_TYPE=ext3
ID_FS_VERSION=1.0
ID_FS_UUID=45ad209e-5be9-11d5-8e49-000103bb126d
ID_FS_LABEL=/boot
ID_FS_LABEL_SAFE=boot
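
To script the comparison between the actual and expected results, the KEY=value output can be read into a dictionary. A small Python sketch follows; the name `parse_export` is hypothetical, and the only assumption about the format is the one-pair-per-line layout shown above.

```python
def parse_export(output):
    """Turn KEY=value lines into a dict, ignoring lines with no key."""
    result = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        # Lines without '=' or with an empty key (such as the
        # "======= Backtrace" banner from a crash) are skipped.
        if sep and key.strip():
            result[key.strip()] = value
    return result
```

With this, a crash report like the one above parses to an empty dict, while a healthy run yields entries such as `ID_FS_TYPE` mapped to `ext3`.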



I really don't get this. Aren't '#' comment lines, or blank lines, allowed in
the passwd file? I've always thought they were, and have had a number of them in
the passwd file on many different Linux versions and also other Unix platforms,
without ever encountering any problems.

Also, this seems to indicate vol_id has its own passwd parser. Why?

Comment 4 Phil Knirsch 2008-04-28 12:28:41 UTC
The man page for /etc/passwd doesn't mention the possibility of empty
or comment lines (see man 5 passwd).

Nonetheless, I agree that the parser in /lib/udev/vol_id should be more robust.

Proposing for RHEL-5.3; holding the final ACK pending developer review.

Read ya, Phil 
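
As an illustration of what "more robust" could mean here, the sketch below looks up a UID while skipping blank lines, comment lines, over-long lines and malformed entries instead of looping or overrunning a buffer. It is written in Python for brevity and is not the upstream udev patch; `lookup_uid` and the `MAX_LINE` cap are hypothetical names and values chosen for the example.

```python
MAX_LINE = 1024  # hypothetical cap on line length, for illustration only


def lookup_uid(passwd_text, username):
    """Return the numeric UID for username, or None if it is absent
    or its entry cannot be parsed."""
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip anything that is not a plausible entry.
        if not line or line.startswith("#") or len(line) > MAX_LINE:
            continue
        fields = line.split(":")
        if len(fields) != 7 or fields[0] != username:
            continue
        try:
            return int(fields[2])
        except ValueError:
            # Non-numeric UID field: treat the entry as malformed.
            continue
    return None
```

The design choice is that every unexpected line is ignored and the loop always advances, so a leading '#' or blank line cannot stall or crash the lookup.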


Comment 5 RHEL Program Management 2008-06-02 20:27:14 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Phil Knirsch 2008-06-13 13:20:55 UTC
This bug needs a review from the component owner before granting Devel ACK.

Thanks,

Read ya, Phil


Comment 7 Harald Hoyer 2008-06-19 15:06:38 UTC
dev ack+

Comment 8 Phil Knirsch 2008-06-19 15:21:44 UTC
Granting Devel ACK after review by the package maintainer.

Thanks,

Read ya, Phil


Comment 13 errata-xmlrpc 2009-01-20 20:46:58 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0076.html