1264076 – crypt target is not properly handling 'suspend --noflush'


Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Mikuláš Patočka
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-09-17 12:54 UTC by Zdenek Kabelac
Modified:	2015-09-21 16:46 UTC (History)
CC List:	11 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-09-21 16:46:43 UTC
Type:	Bug
Embargoed:
Dependent Products:


Description Zdenek Kabelac 2015-09-17 12:54:35 UTC

Description of problem:


When using the 'crypt' target with parallel encryption (an enhancement from the 4.0 kernel),
we lost the ability to use 'dmsetup suspend --noflush' to suspend the device without blocking.

As of 4.3-rc1, when the crypt device is in use (busy) and 'suspend --noflush' is executed, it communicates with the layer below, so such a target can't be replaced if the layer below is 'frozen'.

As an easy reproducer -

create LV
create luksFormat on top of this LV
open & mkfs & mount
create some load on this mounted volume
(i.e. while : ; do echo 1 ; sleep 1; done >/mnt/test/write)

So with tables like this:
vg-lvol0: 0 106496 linear 7:0 2048
cryptdev: 0 102400 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 253:0 4096

suspend vg-lvol0
suspend --noflush --nolockfs  cryptdev

--> gets frozen, while it should have passed.
(to unblock: resume the LV, then resume the crypt device)
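
The reproducer above can be sketched as a script. This is only an illustration under assumptions: the 52M size is derived from the 106496-sector linear table, the volume group name "vg" is inferred from the device name vg-lvol0, ext4 is an arbitrary filesystem choice, and the run() helper and DRY_RUN variable are hypothetical. By default nothing is executed; the commands are only printed.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 only as root on a disposable test machine.
DRY_RUN=${DRY_RUN:-1}

VG=vg            # assumed VG name, inferred from "vg-lvol0"
LV=lvol0
CRYPT=cryptdev
MNT=/mnt/test
SIZE=52M         # 106496 sectors * 512 bytes = 52 MiB

# run CMD ARGS...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    run lvcreate -L "$SIZE" -n "$LV" "$VG"
    run cryptsetup luksFormat "/dev/$VG/$LV"       # interactive: asks for a passphrase
    run cryptsetup luksOpen "/dev/$VG/$LV" "$CRYPT"
    run mkfs.ext4 "/dev/mapper/$CRYPT"             # filesystem type is an assumption
    run mkdir -p "$MNT"
    run mount "/dev/mapper/$CRYPT" "$MNT"
    # Steady write load on the mounted volume, as in the report.
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ (background) while : ; do echo 1 ; sleep 1 ; done > $MNT/write"
    else
        while : ; do echo 1 ; sleep 1 ; done > "$MNT/write" &
    fi
    # Lower device first, then the crypt device: the order used in the report.
    run dmsetup suspend "$VG-$LV"
    run dmsetup suspend --noflush --nolockfs "$CRYPT"   # reported to hang here
}

reproduce
```

If the last command hangs when run for real, the devices have to be resumed from another terminal (dmsetup resume vg-lvol0, then dmsetup resume cryptdev).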

As a workaround, the user can open the LUKS device with:

cryptsetup --perf-submit_from_crypt_cpus --perf-same_cpu_crypt luksOpen

In this case 'suspend --noflush' always works
(thus it's possible to replace the table line)
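
To check whether a crypt device was opened with the workaround, one can look for the corresponding dm-crypt optional parameters (same_cpu_crypt and submit_from_crypt_cpus) in its table line. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the second sample line shows how the table from this report would presumably look with both options set; it was not captured from a real system.

```shell
#!/bin/sh
# has_crypt_flag TABLE_LINE FLAG: succeed if FLAG appears as a whole word.
has_crypt_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# workaround_active TABLE_LINE: succeed only if both options are present.
workaround_active() {
    has_crypt_flag "$1" same_cpu_crypt &&
        has_crypt_flag "$1" submit_from_crypt_cpus
}

# Table line from the report (no optional parameters).
plain='0 102400 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 253:0 4096'
# Assumed shape of the same line with the workaround applied (hypothetical sample).
tuned="$plain 2 same_cpu_crypt submit_from_crypt_cpus"

for line in "$plain" "$tuned"; do
    if workaround_active "$line"; then
        echo "workaround active"
    else
        echo "workaround not active"
    fi
done
```

On a live system the line to test would come from 'dmsetup table cryptdev' instead of the sample strings.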

Version-Release number of selected component (if applicable):
cryptsetup-1.6.8-2.fc24.x86_64

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:
suspend --noflush  always works

Additional info:

Comment 1 Milan Broz 2015-09-17 13:00:27 UTC

This is a problem inside the dm-crypt kernel module...

Comment 2 Mikuláš Patočka 2015-09-21 16:46:43 UTC

Suspend must be done from the top device to the bottom. Suspend in reverse order isn't supposed to work: it is racy, because if any bio is blocked in the lower device, suspend of the higher device gets stuck. dm-crypt parallelization probably changes the timing so that it triggers the race condition.
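
A minimal sketch of the top-to-bottom ordering, using the device names from the report. The helper functions and the DM_RUN variable are hypothetical, and DM_RUN defaults to echo so that the commands are only printed. Resuming in the opposite order (bottom first) is an assumption of this sketch, not something stated above.

```shell
#!/bin/sh
# DM_RUN=echo (default) only prints the commands; set DM_RUN= to execute them.
DM_RUN=${DM_RUN-echo}

# suspend_stack TOP ... BOTTOM: suspend each device, topmost first.
suspend_stack() {
    for dev in "$@"; do
        $DM_RUN dmsetup suspend "$dev" || return 1
    done
}

# resume_stack TOP ... BOTTOM: resume in reverse order, bottommost first.
# Device names are assumed to contain no whitespace.
resume_stack() {
    reversed=""
    for dev in "$@"; do
        reversed="$dev $reversed"
    done
    for dev in $reversed; do
        $DM_RUN dmsetup resume "$dev" || return 1
    done
}

# cryptdev sits on top of vg-lvol0, so it is listed first.
suspend_stack cryptdev vg-lvol0
resume_stack cryptdev vg-lvol0
```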
