Recent versions of udev introduced a limit on the number of processes forked to handle udev events. This was done to avoid another problem where timeouts occurred on a system with numerous devices activated/changed at once (see bug #918511 for more info).

However, with such a limit we can hit a deadlock when LVM2 is used with lvmetad and LVM autoactivation takes place. The LVM autoactivation is triggered directly in udev rules as part of the pvscan --cache -aay call (/lib/udev/rules.d/69-dm-lvm-metad.rules). In addition, the LVM activation is itself synchronized with udev: pvscan waits for the additional udev processing of the activated volumes and does not finish until that processing is done.

In this situation we can easily reach a deadlock: pvscan --cache -aay autoactivates LVM volumes and needs to wait for udev processing of the just-activated volumes, while at the same time the udev process limit is already reached. The pvscan --cache -aay call keeps waiting, consuming a udev worker slot and not releasing it until its autoactivation finishes. But that can never finish, because finishing the autoactivation requires udev to process the newly autoactivated LVM volumes, and that processing can't happen as we're at the limit -> DEADLOCK! The limit can be reached quite easily, as on a system with 1 CPU the limit is at most 10 udev processes in parallel.

This is already patched upstream:
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=008c33a21ba0e0fb27319b63c7bad8fb8136f804

And accompanying patches:
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=5face2010dfb4e9fef43b867b4736a90abe51c00
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=96880102a3b792cadbfc97e44c09c5714a96ca9e
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=10a5838a600bbadfe9e20f686f86ffcd9d0c473e
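For reference, this is the rule line in question, as quoted in the QA steps below. The first form is the problematic one, where pvscan runs synchronously inside the udev worker; the second is the fixed form, where --background forks the autoactivation so the worker is released before the activation's own udev processing needs to run:

Without the fix (pvscan occupies the udev worker until the activation completes):
RUN+="$env{LVM_SBIN_PATH}/lvm pvscan --cache --activate ay --major $major --minor $minor", ENV{LVM_SCANNED}="1"

With the fix (--background detaches the activation from the udev worker):
RUN+="$env{LVM_SBIN_PATH}/lvm pvscan --cache --background --activate ay --major $major --minor $minor", ENV{LVM_SCANNED}="1"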
To QA: to test this issue, the following steps should prove the fix:

-> change /lib/udev/rules.d/69-dm-lvm-metad.rules and replace:

RUN+="$env{LVM_SBIN_PATH}/lvm pvscan --cache --background --activate ay --major $major --minor $minor", ENV{LVM_SCANNED}="1"

with

RUN+="$env{LVM_SBIN_PATH}/lvm pvscan --cache --activate ay --major $major --minor $minor", ENV{LVM_SCANNED}="1"

(iow, remove the --background option)

-> kill the existing udevd and run a new one with debug log

[root@rhel6-a ~]# killall udevd
[root@rhel6-a ~]# udevd --debug

-> change max children to 1 (to create the tightest udev processing bottleneck)

[root@rhel6-a ~]# udevadm control --max-childs=1

-> check in the udev log that max children is really 1

1383224102.414987 [2088] handle_ctrl_msg: udevd message (SET_MAX_CHILDS) received, max_childs=1

-> create a VG/LV

[root@rhel6-a ~]# vgcreate vg /dev/sda
  Volume group "vg" successfully created
[root@rhel6-a ~]# lvcreate -l1 vg
  Logical volume "lvol0" created
[root@rhel6-a ~]# lvchange -an vg

-> check there are no dm udev_sync cookies in the system

[root@rhel6-a ~]# dmsetup udevcookies
Cookie       Semid      Value      Last semop time           Last change time

-> trigger the add event (like it happens on device hotplug or coldplug)

[root@rhel6-a ~]# echo add > /sys/block/sda/uevent

-> now the LV gets activated via the pvscan --cache called in udev rules, BUT there's a dangling udev_sync cookie...

[root@rhel6-a ~]# dmsetup udevcookies
Cookie       Semid      Value      Last semop time           Last change time
0xd4d08b2    589831     1          Thu Oct 31 13:50:51 2013  Thu Oct 31 13:50:51 2013

-> also, there's a blocked pvscan --cache call (from the udev rule) because udev can't run more than one process at a time (the pvscan --cache waits for udev to notify it about the LV activation completion, but the LV activation is *another* event and that can't be handled because of the udev limit)

[root@rhel6-a ~]# ps aux | grep lvm
root      1028  0.0  1.5 184144 16296 ?        Ssl  13:41   0:00 lvmetad
root      2036  0.0  1.7  36148 17396 pts/0    S<+  13:50   0:00 /sbin/lvm pvscan --cache --activate ay --major 8 --minor 0
root      2062  0.0  0.0   6392   664 pts/1    S+   13:52   0:00 grep lvm

-> after ~3 minutes, the timeout happens (code 0x100 is timeout)

1383224031.389869 [1921] handle_signal: worker [1927] exit
1383224031.389889 [1921] handle_signal: worker [1927] unexpectedly returned with status 0x0100
1383224031.389895 [1921] handle_signal: worker [1927] failed while handling '/devices/pci0000:00/0000:00:07.0/host2/target2:0:0/2:0:0:0/block/sda'
1383224031.390329 [1921] event_queue_delete: seq 1516 done with -32
1383224031.390339 [1921] worker_unref: worker [1927] cleaned up

-> leaving dangling cookies in the system as a consequence and as evidence

Cookie       Semid      Value      Last semop time           Last change time
0xd4d08b2    589831     1          Thu Oct 31 13:50:51 2013  Thu Oct 31 13:50:51 2013

[root@rhel6-a ~]# rpm -q udev
udev-147-2.51.el6.x86_64

Now, to test the fix, do the same with the --background option back in the udev rule mentioned at the beginning. This time there should be no dangling dm udev_sync cookie in the dmsetup udevcookies output (not taking into account any old ones from the previous test) and no timeout in udev...
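For convenience, the post-trigger checks above can be bundled into a small helper script. This is only a sketch, not part of the original reproducer; it simply re-runs the same commands shown above to report whether the deadlock symptoms (a dangling dm udev_sync cookie and a pvscan --cache stuck from the udev rule) are present:

#!/bin/sh
# Report the deadlock symptoms after "echo add > /sys/block/sda/uevent":
# a dangling dm udev_sync cookie and a blocked pvscan --cache process.
echo "=== dm udev_sync cookies ==="
dmsetup udevcookies
echo "=== pvscan processes spawned from udev rules ==="
# The [p] trick keeps grep from matching its own command line.
ps aux | grep '[p]vscan --cache'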
(or you can also do the same with echo 1 > /sys/block/sda/device/delete and then scsi-rescan --forcescan for the device to be plugged "more officially" than just issuing the "echo add" artificially; see the command form below)
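In command form, using /dev/sda as in the steps above:

[root@rhel6-a ~]# echo 1 > /sys/block/sda/device/delete
[root@rhel6-a ~]# scsi-rescan --forcescan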
Testing with and without the fix produced the expected results.

Without the fix, a background process is stuck and a semaphore/cookie is still present and waiting:

[root@virt-008 ~]# dmsetup udevcookies
Cookie       Semid      Value      Last semop time           Last change time
0xd4d1bdb    294916     4          Thu Oct 31 15:12:16 2013  Thu Oct 31 15:12:16 2013

[root@virt-008 ~]# ps -ef | grep lvm
root       743     1  0 14:56 ?        00:00:00 lvmetad
root      2371     1  0 15:12 ttyS0    00:00:00 /sbin/lvm pvscan --cache --activate ay --major 8 --minor 209

After a few mins (udev debug log):

1383228916.230809 [2310] handle_signal: worker [2329] unexpectedly returned with status 0x0100

With the fix applied, it works differently:

[root@virt-008 ~]# dmsetup udevcookies
Cookie       Semid      Value      Last semop time           Last change time

[root@virt-008 ~]# ps -ef | grep lvm
root       743     1  0 14:56 ?        00:00:00 lvmetad

Marking VERIFIED with lvm2-2.02.100-8.el6.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1704.html