Description of problem:

There appears to be a potential deadlock between two processes, caused by the lock ordering of a mapped device inode's I_LOCK i_state bits and the mapped device's r/w semaphore lock. I think the potential for this deadlock exists any time the dm.c code calls bdget_disk() or bdget() to lock the block device inode of a mapped device for which it already holds the r/w semaphore lock, whether for reading (shared) or for writing (exclusive). This appears to happen in the call to __unlock_fs() from dm_suspend() and in the call to __set_size() from __bind() from dm_swap_table() in dm.c. The deadlock is possible because the page writeback code can call dm_request() to acquire the mapped device's lock for reading while already owning the I_LOCK i_state bits of the mapped device's inode. It is not clear why dm_suspend() acquires the mapped device's lock for reading while calling __lock_fs() yet acquires the same lock for writing while calling __unlock_fs().

I have hit several actual deadlocks between multipath(8) trying to swap in a new table and a dd(1) performing page writeback, using 2.6.11-rc3. I do not see the problem fixed in the Red Hat AS 4 Update 1 kernel code.

Multipath owns the multipath mapped device's r/w semaphore lock for writing, obtained in dm_swap_table(), and is blocked trying to obtain the I_LOCK i_state bits of the mapped device's inode in __set_size(), called from __bind(), while setting the inode size of the mapped device as part of binding a new mapping table to the device.

The dd(1) owns the I_LOCK i_state bits of the mapped device's inode, obtained in __sync_single_inode() as part of page writeback, and is trying to submit an i/o to the mapped device, but is blocked in dm_request() trying to obtain the r/w semaphore lock of the mapped device for reading.
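The lock-order inversion reported above can be modelled in a few lines of plain C. This is only an illustrative sketch, not kernel code: the enum values, struct claim, struct task, and the helpers conflicts(), would_deadlock(), reported_case_deadlocks() and swap_without_md_lock_deadlocks() are hypothetical names invented for the example. The model looks only at which lock each process holds and which one it is waiting for, and it treats I_LOCK as a simple exclusive lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two locks named in the report. */
enum lock_id { NO_LOCK, MD_RWSEM, INODE_I_LOCK };
enum lock_mode { SHARED, EXCLUSIVE };

struct claim {
    enum lock_id lock;
    enum lock_mode mode;
};

/* One process: the lock it already holds and the lock it is blocked on. */
struct task {
    const char *name;
    struct claim held;
    struct claim wanted;
};

/* A held claim blocks a wanted claim on the same lock unless both are shared. */
bool conflicts(struct claim held, struct claim wanted)
{
    if (held.lock == NO_LOCK || held.lock != wanted.lock)
        return false;
    return held.mode == EXCLUSIVE || wanted.mode == EXCLUSIVE;
}

/* Two tasks deadlock when each one holds what the other is waiting for. */
bool would_deadlock(const struct task *a, const struct task *b)
{
    assert(a != NULL && b != NULL);
    return conflicts(a->held, b->wanted) && conflicts(b->held, a->wanted);
}

/* The reported case: multipath in __set_size() versus dd in dm_request(). */
bool reported_case_deadlocks(void)
{
    const struct task multipath = {
        "multipath", { MD_RWSEM, EXCLUSIVE }, { INODE_I_LOCK, EXCLUSIVE } };
    const struct task dd = {
        "dd", { INODE_I_LOCK, EXCLUSIVE }, { MD_RWSEM, SHARED } };
    return would_deadlock(&multipath, &dd);
}

/* Assumed variant: the table swap waits for the inode without holding the md lock. */
bool swap_without_md_lock_deadlocks(void)
{
    const struct task multipath = {
        "multipath", { NO_LOCK, SHARED }, { INODE_I_LOCK, EXCLUSIVE } };
    const struct task dd = {
        "dd", { INODE_I_LOCK, EXCLUSIVE }, { MD_RWSEM, SHARED } };
    return would_deadlock(&multipath, &dd);
}
```

In this model reported_case_deadlocks() returns true, because each process holds the lock the other is waiting for. swap_without_md_lock_deadlocks() returns false, because the waiting process no longer holds anything the writeback path needs. The second case is an assumption made for contrast and does not describe any particular patch.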
Is there any update on this issue? Thanks.
This looks tractable: the difficulty is in not introducing new race conditions whilst solving it. First draft of a patch (unfinished) is in 'editing' dir (00020).
1. lock/unlock fs should not hold the md lock any more.
2. suspend/swap_table/resume must still never be capable of interfering with each other.
Patches aimed at achieving these goals are at: ftp://sources.redhat.com/pub/dm/patches/2.6-unstable/editing/patches/ Please can people review them and try them out? My main concern is how many new race conditions I've introduced while attempting to fix the existing ones...
Alisdair - have you had any feedback on the patches or on the request in general? Thanks. Heather
Ed - have you reviewed the patches that Alisdair posted? If so, do you have any feedback that you can share? Thanks. Heather
I believe these fixes made it into RHEL4 U2.
This item can be closed. Andrius, please close this issue.
Closing issue, as notabug, as this has been resolved in RHEL4 U2.