242985 – kernel dm-crypt: OOM and lockup when using PAE kernel

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	i386
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Milan Broz
QA Contact:	Corey Marthaler
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-06-06 20:01 UTC by Daphne Shaw
Modified:	2013-03-01 04:05 UTC (History)
CC List:	15 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2008-03-30 02:50:35 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments:
Output of the crasher script (3.93 KB, text/plain), 2007-06-21 22:22 UTC, Daphne Shaw

Description Daphne Shaw 2007-06-06 20:01:05 UTC

Description of problem:

Running mke2fs on a dm-crypt device causes a flurry of OOMs on the console and
finally a general lockup.

Version-Release number of selected component (if applicable):

kernel-PAE-2.6.20-1.2952.fc6
device-mapper-1.02.13-1.fc6

How reproducible:

This is a SATA system, and the first SATA disk is partitioned as follows:

Disk /dev/sda: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

/dev/sda1               1          13      104391   83  Linux
/dev/sda2              14          26      104422+  83  Linux
/dev/sda3              27         400     3004155   82  Linux swap / Solaris
/dev/sda4             401       19452   153035190    5  Extended
/dev/sda5             401         712     2506108+  83  Linux
/dev/sda6             713        1024     2506108+  83  Linux
/dev/sda7            1025        2332    10506478+  83  Linux
/dev/sda8            2333        3640    10506478+  83  Linux
/dev/sda9            3641       19452   127009858+  83  Linux

Creating a dm-crypt device for /dev/sda9:

 echo "0 254017661 crypt aes-cbc-essiv:sha256 01234567890123456789012345678901 0 /dev/sda9 2056" | dmsetup create crypt-device
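
The table line above follows the dm-crypt layout "<start> <length> crypt <cipher> <key> <iv_offset> <device> <offset>". As a minimal sketch of where the length comes from, assuming /dev/sda9 is 254019717 sectors of 512 bytes (which matches the 127009858+ one-kilobyte blocks in the listing), the length is the partition size minus the 2056-sector offset. The helper name crypt_table is hypothetical and not part of dmsetup:

```shell
# Hypothetical helper: print a dm-crypt table line for one device.
# Arguments: total sectors, offset in sectors, cipher spec, hex key, device.
crypt_table() {
    total_sectors=$1
    offset=$2
    cipher=$3
    key=$4
    device=$5
    # The mapped length is whatever remains after skipping the offset.
    length=$((total_sectors - offset))
    # Field order: start length "crypt" cipher key iv_offset device offset
    printf '0 %s crypt %s %s 0 %s %s\n' "$length" "$cipher" "$key" "$device" "$offset"
}

# Possible usage (needs root and a real device, so it is left commented out):
# crypt_table "$(blockdev --getsz /dev/sda9)" 2056 aes-cbc-essiv:sha256 \
#     01234567890123456789012345678901 /dev/sda9 | dmsetup create crypt-device
```

With the assumed size, 254019717 - 2056 gives the 254017661 sectors used in the original command.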

Make a filesystem on it:

 mke2fs /dev/mapper/crypt-device

For me, if I repeat the mke2fs step 3-4 times, the system will start to OOM over
and over for a few seconds, and then freeze up completely.

Comment 1 Daphne Shaw 2007-06-07 00:00:54 UTC

Also note that creating a dm-crypt device using a whole unpartitioned disk on
the same machine (e.g. "/dev/sdb" instead of "/dev/sda9") works just fine.

Comment 2 Milan Broz 2007-06-07 08:43:28 UTC

Please attach system info: memory size and syslog messages.

Is it reproducible using the standard (non-PAE) kernel?
You can use lvmdump (from the lvm2 package) to collect some info about the system
and attach it to this bz.

Does it help if you run sync between the repeated mke2fs runs?

Comment 3 Daphne Shaw 2007-06-07 12:57:12 UTC

There were no lines at all logged in syslog (once the OOMing starts, things go
bad very quickly).  There were some OOM reports on the console speeding by, but
none in syslog.

Here's memory info:
             total       used       free     shared    buffers     cached
Mem:       4142464     192156    3950308          0      76096      56992
-/+ buffers/cache:      59068    4083396
Swap:      3004144          0    3004144

Note that no swap is used, and the machine has plenty of free RAM as well when
it starts to OOM.
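
The free output above does not split memory into low and high zones. The following is a minimal sketch, assuming the kernel is built with highmem support so that /proc/meminfo carries the LowTotal, LowFree, HighTotal and HighFree fields. Whether low memory is what actually runs out here is not established in this report, and lowmem_summary is a hypothetical helper name:

```shell
# Hypothetical helper: print only the low/high memory lines of a
# meminfo-style file. Defaults to /proc/meminfo when no file is given.
lowmem_summary() {
    awk '/^(LowTotal|LowFree|HighTotal|HighFree):/ {
             printf "%s %s kB\n", $1, $2
         }' "${1:-/proc/meminfo}"
}

# Possible usage while mke2fs is running in another terminal:
# while sleep 1; do lowmem_summary; done
```

On kernels without highmem support these fields are absent and the function prints nothing.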

I will test your other questions shortly.

Comment 4 Daphne Shaw 2007-06-12 17:42:32 UTC

I just tested kernel-PAE-2.6.20-1.2952.fc6 against kernel-2.6.20-1.2952.fc6.

I was able to repeat the failure using kernel-PAE-2.6.20-1.2952.fc6
I was NOT able to repeat the failure using kernel-2.6.20-1.2952.fc6

That is, it only fails with the PAE kernel.  Sync-ing after each run did not
make a difference on either PAE or non-PAE: PAE always failed, and non-PAE
always succeeded.

Comment 5 Chuck Ebbert 2007-06-20 21:01:27 UTC

Can you get the contents of /proc/vmstat:

(1) before running mke2fs (or is that what's above?)
(2) after each run of mke2fs that succeeds

Comment 6 Daphne Shaw 2007-06-21 22:22:54 UTC

Created attachment 157581
Output of the crasher script

Using this script, I can get the failure to happen within 3-4 runs.  The
attachment is the output.  Note that run #3 didn't complete.

 # Run as root. Snapshots /proc/vmstat before and after each mke2fs pass.
 for i in $(seq 1 10)
 do
     echo "Pass $i" >> output
     echo BEFORE >> output
     cat /proc/vmstat >> output
     sync
     mke2fs /dev/mapper/crypt-device
     echo AFTER >> output
     cat /proc/vmstat >> output
     sync
 done
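
To make the BEFORE and AFTER snapshots easier to compare, the counters that changed can be listed with their deltas. This is a minimal sketch, assuming each snapshot is saved to its own file in the "name value" format of /proc/vmstat. The helper name vmstat_delta and the file names are hypothetical:

```shell
# Hypothetical helper: compare two /proc/vmstat snapshots and print
# "name delta" for every counter whose value changed.
# Arguments: snapshot taken before, snapshot taken after.
vmstat_delta() {
    awk 'NR == FNR { before[$1] = $2; next }          # first file: remember values
         ($1 in before) && ($2 + 0) != (before[$1] + 0) {
             printf "%s %.0f\n", $1, $2 - before[$1]  # second file: print changes
         }' "$1" "$2"
}

# Possible usage inside the loop above:
# cat /proc/vmstat > "before.$i"; mke2fs /dev/mapper/crypt-device
# cat /proc/vmstat > "after.$i";  vmstat_delta "before.$i" "after.$i"
```

Counters that did not change are omitted, so only the activity caused by the mke2fs pass remains.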

Comment 7 Chuck Ebbert 2007-06-22 19:17:57 UTC

Kernel 2962 has dm-crypt bugfixes from 2.6.22 applied. Can you test that?
It's in the updates-testing repo.

Comment 8 Daphne Shaw 2007-06-22 19:56:48 UTC

I tested kernel 2962, and there is still a problem.  It shows up in a slightly
different fashion in that the machine freezes without first showing the OOMs on
the console, but the end result is the same.

Comment 9 Milan Broz 2007-11-28 07:09:49 UTC

I cannot reproduce this on a 2.6.24-rc rawhide kernel
(kernel-PAE-2.6.24-0.42.rc3.git1.fc9, using 6GB RAM).

The 2.6.24-rc kernel includes some changes (per-BDI limits, dm-crypt bugfixes)
which, I think, should prevent this.
(I expect the problem was related to committing too much work to the internal
crypt threads.)
Could you please verify that it works with a 2.6.24 test kernel?

[changed fc6 -> fc-devel]
