On boot, udev-095-14 fails to start in a timely manner. The server then takes over 10 minutes to load, and other errors are generated because vol_id processes hog the CPU.

1. The error occurs on every reboot.
2. Commenting out /sbin/start_udev in /etc/rc.d/rc.sysinit allows a fast restart, but causes other applications to fail.

Actual results: /dev fills with .tmp files and vol_id processes hog the CPU.

Expected results: a 4-minute reboot, not 15!

Additional info: Server hardware is Dell PowerEdge 1850, BIOS A05. Each server has two hard drives: server 1 uses dmraid (software RAID); server 2 uses hardware RAID. Server 1 has 3 dual-port NICs; server 2 has one dual-port NIC.

Sending SIGTERM to the udev processes (kill -SIGTERM) makes them exit cleanly and returns the CPU to idle. Running /sbin/start_udev causes vol_id to hang again.

/lib/udev/vol_id is statically linked to the 2.6.9 kernel. File size is 526k on x86_64. Replacing the file with /sbin/vol_id from FC5 (udev-084-13.fc5.2) stops the hang, and "Starting udev" returns quickly with OK.
Timeout error reproduced on a Dell PowerEdge 850. Temporarily fixed with an older copy of 'vol_id'. As this is not a plug-and-play environment, there is no issue at present.
Output of "ps aux | grep udev" showing hung vol_id processes:

root 3067  0.2 0.0 12564 668 ? S<s 13:58 0:00 /sbin/udevd -d
root 3829 91.6 0.0   760 128 ? R<  13:58 3:07 /lib/udev/vol_id --export /dev/.tmp-9-0
root 3866 92.5 0.0   764 132 ? R<  13:58 3:09 /lib/udev/vol_id --export /dev/.tmp-8-0
root 3907  0.0 0.0 12564 640 ? S<  14:01 0:00 /sbin/udevd -d
root 3908  0.0 0.0 12564 640 ? S<  14:01 0:00 /sbin/udevd -d
root 3909  0.0 0.0 12564 640 ? S<  14:01 0:00 /sbin/udevd -d
root 3910 49.8 0.0   760 124 ? R<  14:01 0:12 /lib/udev/vol_id --export /dev/.tmp-8-3
root 3911 33.6 0.0   760 128 ? R<  14:01 0:08 /lib/udev/vol_id --export /dev/.tmp-8-2
root 3912 31.7 0.0   760 128 ? R<  14:01 0:07 /lib/udev/vol_id --export /dev/.tmp-8-1
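For anyone triaging a similar hang, the listing above can be narrowed to just the spinning vol_id processes. The following is a minimal sketch, not part of the original report: the function name busy_vol_id and the 50% default threshold are illustrative assumptions, and it relies on the standard "ps aux" column order (PID in column 2, %CPU in column 3, command in column 11).

```shell
# Print PID, %CPU and device node for each vol_id process at or above a CPU threshold.
# Reads "ps aux"-style lines on stdin. The function name and the 50% default
# threshold are assumptions made for illustration.
busy_vol_id() {
    awk -v min="${1:-50}" '$11 ~ /vol_id$/ && $3 + 0 >= min { print $2, $3, $NF }'
}

# Usage: ps aux | busy_vol_id 50
```

Each output line carries the PID first, so it can be handed to kill -SIGTERM, which the report notes makes the processes exit cleanly.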
Created attachment 140131: Output of strace to where vol_id processes hang.

The strace output ends at "wait4(-1". I then pressed Ctrl+C to exit, as 'vol_id' had hung.
Please run strace with the -F and -f flags:

# strace -Ff
Created attachment 140134: Output of 'strace -Ff /sbin/start_udev'

Re-run of strace with the -F and -f flags.
There is no vol_id in the last strace. You may run vol_id alone, not the whole start_udev :)

# for i in /dev/hd* /dev/sd*; do /lib/udev/vol_id --export $i; done
Running 'for i in /dev/hd* /dev/sd*; do /lib/udev/vol_id --export $i; done' produces:

/dev/hda: error open volume

And "ps aux | grep vol_id" reports /lib/udev/vol_id --export /dev/sda running at 100% CPU. Killing the process with -SIGTERM moves on to /dev/sda1, which hogs the processor again. /dev/sda2 and /dev/sda3 produce the same results. Killing all processes returns the CPU to normal.

Testing was carried out on a Dell PowerEdge 850 with LSI MegaRAID SCSI RAID and two 73GB drives in RAID 1. This is a production server, but I'll not be fired if I kill it :D

Output of 'strace /lib/udev/vol_id --export /dev/sda':

8534 execve("/lib/udev/vol_id", ["/lib/udev/vol_id", "--export", "/dev/sda"], [/* 18 vars */]) = 0
8534 uname({sys="Linux", node="theoline.aminocom.com", ...}) = 0
8534 brk(0) = 0x686000
8534 brk(0x686f20) = 0x686f20
8534 arch_prctl(ARCH_SET_FS, 0x686850) = 0
8534 brk(0x6a7f20) = 0x6a7f20
8534 brk(0x6a8000) = 0x6a8000
8534 open("/dev/sda", O_RDONLY) = 3
8534 ioctl(3, BLKGETSIZE64, 0x7fffe4ac2218) = 0
8534 open("/etc/passwd", O_RDONLY) = 4
8534 fstat(4, {st_mode=S_IFREG|0644, st_size=2654, ...}) = 0
8534 mmap(NULL, 2654, PROT_READ, MAP_SHARED, 4, 0) = 0x2aaaaaaab000
8534 close(4) = 0
8534 --- SIGTERM (Terminated) @ 0 (0) ---
8534 +++ killed by SIGTERM +++
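Because each hung vol_id had to be killed by hand before the loop moved on to the next device, a time-limited variant of the same loop may be easier to run on a production machine. This is a sketch under assumptions rather than something used in this report: it assumes the GNU coreutils "timeout" command is installed (it may not be on older releases), and the names probe_devices, PROBE and LIMIT, as well as the 5-second limit, are hypothetical.

```shell
# Run the probe against each device, killing any probe that exceeds the limit.
# PROBE defaults to the vol_id path from this report; LIMIT (seconds) is an
# arbitrary assumption. Requires GNU coreutils "timeout".
PROBE="${PROBE:-/lib/udev/vol_id}"
LIMIT="${LIMIT:-5}"

probe_devices() {
    for dev in "$@"; do
        [ -e "$dev" ] || continue          # skip unmatched globs such as /dev/hd*
        timeout "$LIMIT" "$PROBE" --export "$dev" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then     # GNU timeout exits with 124 when the limit is hit
            echo "$dev: HUNG (killed after ${LIMIT}s)"
        else
            echo "$dev: exit $status"
        fi
    done
}

# Usage: probe_devices /dev/hd* /dev/sd*
```

By default timeout sends SIGTERM, the same signal that was observed above to stop vol_id cleanly.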
still a problem with latest kernel/udev/fedora versions?
(In reply to comment #8)
> still a problem with latest kernel/udev/fedora versions?

No longer a problem with Fedora 7. Updated all Dell servers with this issue some time ago and they are all fine. Problem solved and closed. :)