Description of problem:
When using the 'crypt' target with parallel encryption (an enhancement introduced in the 4.0 kernel), we lost the ability to use 'dmsetup suspend --noflush' to suspend a device without blocking.

As of 4.3-rc1, when a crypt device is in use (busy) and 'suspend --noflush' is executed, the suspend communicates with the layer below. Such a target therefore can't be replaced if the layer below is 'frozen'.

An easy reproducer:
- create an LV
- run luksFormat on top of this LV
- open, mkfs and mount it
- create some load on the mounted volume (e.g. while : ; do echo 1 ; sleep 1; done >/mnt/test/write)

With a table like this:

  vg-lvol0: 0 106496 linear 7:0 2048
  cryptdev: 0 102400 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 253:0 4096

run:

  dmsetup suspend vg-lvol0
  dmsetup suspend --noflush --nolockfs cryptdev   --> gets frozen, while it should have passed

(To unblock: resume the LV, then resume the crypt device.)

As a workaround, the user can open the LUKS device with:

  cryptsetup --perf-submit_from_crypt_cpus --perf-same_cpu_crypt luksOpen

In this case 'suspend --noflush' always works (so it is possible to replace the table line).

Version-Release number of selected component (if applicable):
cryptsetup-1.6.8-2.fc24.x86_64

Expected results:
suspend --noflush always works
This is a problem inside the dm-crypt kernel module...
Suspend must be done from the top device to the bottom. Suspending in reverse order isn't supposed to work: it is racy, because if any bio is blocked in the lower device, the suspend of the higher device gets stuck. The dm-crypt parallelization probably changes the timing so that the race condition is triggered.
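The ordering described above could be sketched roughly as follows. This is an illustration only, not a tested fix: the device names are taken from the report, the helpers run(), suspend_stack() and resume_stack() are hypothetical, and applying --noflush --nolockfs to every device in the stack is a simplifying assumption. By default the script only prints the dmsetup commands it would run.

```shell
#!/bin/sh
# Sketch only: run(), suspend_stack() and resume_stack() are hypothetical
# helpers for illustration; they are not part of dmsetup.

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Suspend devices in the order given: topmost device first.
# Assumption: the same flags are used for every device in the stack.
suspend_stack() {
    for dev in "$@"; do
        run dmsetup suspend --noflush --nolockfs "$dev"
    done
}

# Resume in the opposite order: lowest device first.
# Device names are assumed to contain no whitespace.
resume_stack() {
    reversed=""
    for dev in "$@"; do
        reversed="$dev $reversed"
    done
    for dev in $reversed; do
        run dmsetup resume "$dev"
    done
}

# Stack from the report, listed top to bottom.
suspend_stack cryptdev vg-lvol0
resume_stack cryptdev vg-lvol0
```

Setting DRY_RUN=0 would execute the commands for real (root required). The resume order matches the unblock sequence from the report: the LV first, then the crypt device.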