Description of problem:

When running multiple copies of the lvmdbustest.py unit test concurrently
against the lvmdbusd service, everything hangs quite quickly. Looking at
the process list to see what's going on, I discovered:

root 27603  1681  4 13:23 pts/0 00:00:01 python3 ./lvmdbusd --debug
root 27613  1801  0 13:23 pts/1 00:00:00 python3 ./lvmdbustest.py -v
root 27806 20917  0 13:23 pts/2 00:00:00 python3 ./lvmdbustest.py -v
root 27826  2395  0 13:23 pts/3 00:00:00 python3 ./lvmdbustest.py -v
root 29428 27613  0 13:23 pts/1 00:00:00 /usr/sbin/lvm pvcreate /dev/sdc
root 29430 27603  1 13:23 pts/0 00:00:00 /usr/sbin/lvm vgremove -f qtapifkq_vg --config global/notify_dbus=0

Then I did:

# kill -2 29430 29428

The pvcreate (executed directly from the unit test case) exited with:

pvcreate failed with stdout= , stderr=
  Interrupted...
  Giving up waiting for lock.
  /run/lock/lvm/V_qtapifkq_vg: flock failed: Interrupted system call
  Can't get lock for qtapifkq_vg
  Cannot process volume group qtapifkq_vg
  Interrupted...
  Interrupted...
  Device /dev/sdc not found (or ignored by filtering).

And the vgremove (executed in the context of the lvm dbus service) exited with:

27603:27609 - EC= 5 for ['/usr/sbin/lvm', 'vgremove', '-f', 'qtapifkq_vg', '--config', 'global/notify_dbus=0']
STARTED: 1494008624.489977, ENDED: 1494008665.708364
STDOUT=   Logical volume "alheatvy_lv" successfully removed
STDERR=   Interrupted...
  Giving up waiting for lock.
  /run/lock/lvm/P_orphans: flock failed: Interrupted system call
  Can't get lock for orphan PVs

It appears that pvcreate is waiting for the lock /run/lock/lvm/V_qtapifkq_vg,
and vgremove is waiting for the lock /run/lock/lvm/P_orphans. My best guess
is that pvcreate is holding /run/lock/lvm/P_orphans and vgremove is holding
/run/lock/lvm/V_qtapifkq_vg, so they are deadlocked on one another.
Processes which need to acquire multiple locks need to acquire them in the
same order.
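The classic ABBA deadlock described above goes away if every process acquires its locks in one canonical order. A minimal sketch (illustrative Python with threads, not lvm code; lvm itself uses flock() on files under /run/lock/lvm, and the lock names are taken from the report):

```python
import threading

# Two locks standing in for /run/lock/lvm/P_orphans and
# /run/lock/lvm/V_qtapifkq_vg (illustration only).
locks = {
    "P_orphans": threading.Lock(),
    "V_qtapifkq_vg": threading.Lock(),
}

def acquire_in_order(names):
    """Acquire all named locks in one canonical (sorted) order,
    so two workers can never hold them in opposite orders."""
    ordered = sorted(names)
    for n in ordered:
        locks[n].acquire()
    return ordered

def release(names):
    # Release in reverse acquisition order.
    for n in reversed(names):
        locks[n].release()

results = []

def worker(names):
    held = acquire_in_order(names)
    results.append(held)
    release(held)

# A pvcreate-like and a vgremove-like worker want the same two locks
# but name them in opposite orders; canonical ordering makes this safe.
t1 = threading.Thread(target=worker, args=(["P_orphans", "V_qtapifkq_vg"],))
t2 = threading.Thread(target=worker, args=(["V_qtapifkq_vg", "P_orphans"],))
t1.start(); t2.start()
t1.join(); t2.join()
```

With ad-hoc per-caller ordering and unlucky timing, the same two workers could each grab one lock and block forever on the other, which is exactly the pvcreate/vgremove hang seen here.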
Version-Release number of selected component (if applicable):

# lvm version
  LVM version:     2.02.168(2) (2016-11-30)
  Library version: 1.02.137 (2016-11-30)
  Driver version:  4.35.0

How reproducible:
100%

Steps to Reproduce:
1. Start the lvmdbusd daemon:
   # ./lvmdbusd --debug
2. Fire up a few lvmdbustest.py instances:
   # LVM_DBUSD_PV_DEVICE_LIST=/dev/sdb:/dev/sdc:/dev/sdd:/dev/sde ./lvmdbustest.py -v > /tmp/unit1 2>&1 &
   # LVM_DBUSD_PV_DEVICE_LIST=/dev/sdf:/dev/sdg:/dev/sdh:/dev/sdi ./lvmdbustest.py -v > /tmp/unit2 2>&1 &
   # LVM_DBUSD_PV_DEVICE_LIST=/dev/sdj:/dev/sdk:/dev/sdl:/dev/sdm ./lvmdbustest.py -v > /tmp/unit3 2>&1 &

Actual results:
Everything comes to a grinding halt.

Expected results:
Unit tests run to completion.

Additional info:
Likely only 2 concurrent unit tests are needed to re-create this, but I
haven't tested that scenario.
This commit is partially responsible. I don't know about the "policy of
non-blocking second locks" that is referred to. We should probably move
the orphan lock back to the top level regardless, since that is the right
place. The liblvm concept which motivated this commit didn't work out,
and we are fixing problems that it introduced to the rest of lvm.

https://sourceware.org/git/?p=lvm2.git;a=commit;h=03660b30454ec07c09c164f7fd23af278677324a

commit 03660b30454ec07c09c164f7fd23af278677324a
Author: Dave Wysochanski <dwysocha>
Date:   Fri Jul 10 20:08:37 2009 +0000

    Move orphan lock inside vg_remove_single.

    Move the vg orphan lock inside vg_remove_single, now a complete liblvm
    function.  Note that this changes the order of the locks - originally
    VG_ORPHAN was obtained first, then the vgname lock.  With the current
    policy of non-blocking second locks, this could mean we get a failure
    obtaining the orphan lock.  In the case of a vg with lvs being removed,
    this could result in the lvs being removed but not the vg.  Such a
    scenario could have happened prior though with a different failure.
    Other tools were examined for side-effects, and no major problems
    were noted.
The "policy of non-blocking second locks" refers to the fact that if one
lock is already held and a second lock is requested, the code never blocks
waiting for that second lock: if it can't obtain it immediately, the
request fails and the caller drops the first lock before either aborting
the operation or retrying.

Secondly, the locks were supposed to be ordered: the orphans lock was to
be acquired consistently either before or after the VG locks (I think the
order got swapped at one point in the code's history), and the VG locks
are always obtained in alphabetical order of VG name.

We need to audit the code again to determine what the current situation
has become and decide how to reimpose consistency.
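The non-blocking second lock policy described above can be sketched as a trylock-with-backoff loop. This is an illustrative Python sketch of the policy, not lvm code; the lock names and retry count are invented for the example:

```python
import threading
import time

first = threading.Lock()    # stands in for the VG name lock
second = threading.Lock()   # stands in for the orphan lock

def acquire_both(max_retries=5):
    """Block on the first lock, but only *try* the second; on failure,
    drop the first and retry, so a process never blocks while already
    holding a lock -- the policy the comment describes."""
    for _ in range(max_retries):
        first.acquire()                      # blocking is safe: we hold nothing yet
        if second.acquire(blocking=False):   # never block on a second lock
            return True
        first.release()                      # drop first before retrying
        time.sleep(0.01)
    return False                             # caller aborts the operation

ok = acquire_both()
if ok:
    # ... critical section would go here ...
    second.release()
    first.release()
```

The cost of this policy is visible in the 2009 commit message: a trylock on the second lock can fail mid-operation, leaving work half done (LVs removed but not the VG). Its benefit is that it cannot deadlock, regardless of ordering.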
Some quick tests suggest the code is now breaking both of those rules.
I'm seeing examples where the ordering is going wrong and examples where
blocking locks are requested that should have been non-blocking, so it's
not surprising that a deadlock was discovered.

The current rules to which the code is supposed to adhere are documented
in locking.h:

 * If more than one lock needs to be held simultaneously, they must be
 * acquired in alphabetical order of 'vol' (to avoid deadlocks), with
 * VG_ORPHANS last.
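The locking.h rule quoted above amounts to a simple sort key: alphabetical on 'vol', with the orphan lock forced to the end. A hedged sketch of that ordering (Python for illustration; the orphan lock's name here is a placeholder, not necessarily lvm's internal identifier):

```python
VG_ORPHANS = "#orphans"  # placeholder; the real identifier is lvm-internal

def lock_order(vols):
    """Sort lock names alphabetically, keeping the orphan lock last,
    per the rule documented in locking.h."""
    return sorted(vols, key=lambda v: (v == VG_ORPHANS, v))

# Regardless of the order in which the locks are requested,
# the orphan lock always comes last:
print(lock_order([VG_ORPHANS, "vg_b", "vg_a"]))
```

Auditing then reduces to checking that every code path that takes multiple locks takes them in this computed order, and that any lock taken while another is held is either compliant with the order or non-blocking.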
So we should begin by finding all the places that breach the current rules and then understand why the rules weren't followed and reassess whether they can all be brought back into line or whether the rules need to be modified again.
The last time the code was audited, the ordering was adhered to
everywhere, so the non-blocking second lock code got removed, specifically
because the orphan lock was by then being taken last (it used to be taken
first) and had no retry code, leading to avoidable failures.

commit d7cbaae1fd5bdf1e013d769a0a828ef9d04fb8ef
Author: Milan Broz <mbroz>
Date:   Wed Mar 31 17:23:56 2010 +0000

    Always use blocking lock for VGs and orphan locks.

    Because we have now strong rule for lock ordering:
      - VG locks must be taken in alphabetical order
      - ORPHAN locks must be the last

    vgs_locked() is now not needed.

So:

1) Find all the places that break the current lock ordering rules in
   locking.h.
2) Work out whether it's reasonable to fix them to adhere to those rules.
3) If not, temporarily reinstate the "2nd lock is non-blocking" code,
   albeit reintroducing the problem its removal fixed, until a new
   long-term solution is found.
This is a design problem in lvm locking: it uses two different "global"
locks (GLOBAL and ORPHAN) and uses them inconsistently. It is fixed in

https://sourceware.org/git/?p=lvm2.git;a=commit;h=8c87dda195ffadcce1e428d3481e8d01080e2b22

That change is not suitable for backporting to the 2.02 releases.