Created attachment 474339 [details] boot.log Description of problem: At boot, the system hangs for 3 minutes at "Starting udev". The boot.log contains the message, "udevadm settle - timeout of 180 seconds reached, the event queue contains" followed by a big long list (which I will attach). However, if I run "udevadm settle" manually after boot, it finishes in 0.004 seconds. Version-Release number of selected component (if applicable): udev-161-8.fc14.x86_64 How reproducible: Always or nearly always. Please let me know if there is any other information that would be helpful. Thanks.
So I've had this problem with a large number of new machines, but it looks like the problem goes away after a few boots. I'm starting to think that it might have something to do with the new motherboards arriving with the date set in the future. Does this seem plausible? Anaconda doesn't seem to run an ntp client, so lots of timestamps get set in the future.
can you add to the kernel command line "udev.log-priority=debug udev.children-max=1" and without "rhgb quiet" and see, if it hangs on a special udev event?
I would be happy to add the kernel command-line options, but I won't be able to visually check whether it hangs on a special udev event (most of the machines are headless). Will there be a log file that will show when the hang happens? Unfortunately, the problem is unpredictable enough that I can't easily plug a monitor into a machine and force the problem to happen, but I can reboot a few dozen machines and note which of them have the problem. Thanks.
So I've continued to see this error message (so its apparent correlation with the clock described in comment #1 turns out to have been purely chance). Since I added the "udev.log-priority=debug udev.children-max=1" to the kernel command line in grub.conf, I haven't seen the "udevadm settle - timeout" error. I'll keep on rebooting 25 machines a few times and see if I can reproduce it.
After about 100 reboots, I haven't been able to reproduce the problem when "udev.log-priority=debug udev.children-max=1" is in the kernel command line. However, for the machines that do not have this line I see the "udevadm settle - timeout" error about 30% of the time. Is there anything else I can do to produce useful information? Was there anything in the original boot.log that would be helpful?
you could try to increase the children-max or decrease the log level: next steps: 1. "udev.log-priority=debug udev.children-max=2" 2. "udev.log-priority=info udev.children-max=1" 3. "udev.log-priority=info udev.children-max=2" 4. "udev.children-max=1" 5. "udev.children-max=2"
I've tried increasing children-max and decreasing log level, and in all cases the timeout message did not appear (out of 25 boots on each setting). When I then removed the children-max and log-priority settings, the message appeared very frequently (15 out of 25 boots). What else should I try?
Any suspicious kernel messages in /var/log/messages?
Created attachment 487093 [details] /var/log/messages from a boot with a hang
Created attachment 487094 [details] /var/log/messages from a boot without hanging I've attached /var/log/messages (trimmed to the appropriate time interval) for a boot with a hang and a boot without hanging. The machines have essentially identical hardware. I didn't see anything suspicious, but I may easily have missed something important.
Mar 22 14:14:15 19potato kernel: [ 4.362669] udev[451]: starting version 161 Mar 22 14:14:15 19potato kernel: [ 304.765206] Clocksource tsc unstable (delta = 299936213078 ns) Mar 22 14:14:15 19potato kernel: [ 304.770030] Switching to clocksource hpet Mar 22 14:14:15 19potato kernel: [ 306.047584] ACPI: WMI: Mapper loaded versus Mar 22 14:09:40 18potato kernel: [ 2.014453] udev[162]: starting version 161 Mar 22 14:09:40 18potato kernel: [ 2.037836] [drm] Initialized drm 1.1.0 20060810 .... Maybe "Clocksource tsc unstable" is a problem. You can tell it to use the HPET clocksource from the start by adding "clocksource=hpet" to your kernel command line. Hope that helps.
I just tried adding "clocksource=hpet" to the kernel command line. Unfortunately, the timeout error is still appearing on most boots.
does "Clocksource tsc unstable" appear for every timeout error?
In the latest boot of 25 machines, 14 had the timeout, and 15 had the "Clocksource tsc unstable" error. All 14 machines that timed out had the "Clocksource tsc unstable" error. One machine timed out with giving the "Clocksource tsc unstable" error. The correlation isn't 100%, but it's extraordinarily close.
Sorry, I had that last sentence backwards. One machine had the "Clocksource tsc unstable" error but did not time out.
I have the same problem, but with udevd[453]: worker [457] unexpectedly returned with status 0x0100 udevd[453]: worker [457] failed while handling '/devices/pci0000:00/0000:00:1e.0/0000:04:01.2/video4linux/video1' A more detailed description can be found on http://www.fedoraforum.org/forum/showthread.php?p=1467388&posted=1#post1467388 Often the timeout vanishes if I try booting two or three times, using the reset button during waiting for udev. Hope to find a solution here, thanks and please ask for more information if you need it! Andreas
I also encountered this problem with my Fedora 14 workstation (2.6.35.13-91.fc14.x86_64) I think I have found the issue in my case. I get all the same issues listed in the ticket (e.g. 'udevadm settle - timeout...', 'Clocksource tsc unstable', etc.) with the exception in my case that after finally booting, a 'udevadm settle' command reproduces the issue. In addition, my machine's load average stays in the 8.0-10.0 range until reboot (extremely abnormal, usually 0.05 or less), and a 'top' and 'iotop' show almost no activity in the system. In addition to several other devices (USB and IEEE1394), I have a cheap Logitech webcam attached, the output from hot-plugging this device is below (serial number obscured on purpose): [ 1298.615502] usb 1-3.2: new high speed USB device using ehci_hcd and address 5 [ 1298.829132] usb 1-3.2: New USB device found, idVendor=046d, idProduct=0802 [ 1298.829142] usb 1-3.2: New USB device strings: Mfr=0, Product=0, SerialNumber=2 [ 1298.829150] usb 1-3.2: SerialNumber: XXXXXXXXXX [ 1299.102907] Linux video capture interface: v2.00 [ 1299.404793] uvcvideo: Found UVC 1.00 device <unnamed> (046d:0802) [ 1299.405092] usbcore: registered new interface driver snd-usb-audio [ 1299.422466] input: UVC Camera (046d:0802) as /devices/pci0000:00/0000:00:0b.1/usb1/1-3/1-3.2/1-3.2:1.0/input/input4 [ 1299.422687] usbcore: registered new interface driver uvcvideo [ 1299.422693] USB Video Class driver (v1.0.0) When the device is plugged in before boot, the errors and timeouts occur. When removed *before* boot, the machine boots normally. Removal during boot seems to not make a difference. Plugging it in after booting completely allows the camera and system to function normally. I honestly cannot remember the exact model number (and it's not printed on the unit), but the device is extremely similar to this one: http://www.logitech.com/en-us/webcam-communications/webcams/devices/5865 (Logitech - Webcam C200) I acquired this device while still running F11, and the problem has remained through all Fedora versions till the current one I am running. I don't mind removing the camera when not needed (to facilitate booting automatically after e.g. a power outage), but I just hope this info helps someone.
This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX. (Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.) Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version. Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping