Red Hat Bugzilla – Bug 242985
kernel dm-crypt: OOM and lockup when using PAE kernel
Last modified: 2013-02-28 23:05:47 EST
Description of problem:
Running mke2fs on a dm-crypt device causes a flurry of OOMs on the console and
finally a general lockup.
Version-Release number of selected component (if applicable):
This is a SATA system, and the first SATA disk is partitioned as follows:
Disk /dev/sda: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
/dev/sda1 1 13 104391 83 Linux
/dev/sda2 14 26 104422+ 83 Linux
/dev/sda3 27 400 3004155 82 Linux swap / Solaris
/dev/sda4 401 19452 153035190 5 Extended
/dev/sda5 401 712 2506108+ 83 Linux
/dev/sda6 713 1024 2506108+ 83 Linux
/dev/sda7 1025 2332 10506478+ 83 Linux
/dev/sda8 2333 3640 10506478+ 83 Linux
/dev/sda9 3641 19452 127009858+ 83 Linux
Creating a dm-crypt device for /dev/sda9:
echo "0 254017661 crypt aes-cbc-essiv:sha256 01234567890123456789012345678901 0 /dev/sda9 2056" | dmsetup create crypt-device
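As a sanity check on the table line, the length of 254017661 sectors appears to be the size of /dev/sda9 minus the 2056-sector offset. This is only a sketch: it assumes fdisk reports 1 KiB blocks and that the trailing "+" on 127009858+ marks one extra 512-byte sector. The variable names are illustrative.

```shell
#!/bin/sh
# Values taken from the fdisk listing and the dmsetup table above.
blocks=127009858   # 1 KiB blocks reported for /dev/sda9
extra=1            # assumption: the trailing "+" is one extra 512-byte sector
offset=2056        # start offset within /dev/sda9, in 512-byte sectors

sectors=$(( blocks * 2 + extra ))   # partition size in 512-byte sectors
length=$(( sectors - offset ))      # sectors left for the crypt mapping

echo "partition sectors: $sectors"
echo "mapping length:    $length"
```

Under those assumptions the result matches the 254017661 used in the table, so the mapping covers the partition from the offset to its end.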
Make a filesystem on it:
mke2fs /dev/mapper/crypt-device
For me, if I repeat the mke2fs step 3-4 times, the system will start to OOM over
and over for a few seconds, and then freeze up completely.
Also note that creating a dm-crypt device using a whole unpartitioned disk on
the same machine (e.g. "/dev/sdb" instead of "/dev/sda9") works just fine.
Please attach system info - memory size, syslog messages.
Is it reproducible using the standard (non-PAE) kernel?
You can try using lvmdump (from the lvm2 package) to collect some info about the
system and attach it to this bz.
Does it help if you run sync between the repeated mke2fs runs?
There were no lines at all logged in syslog (once the OOMing starts, things go
bad very quickly). There were some OOM reports on the console speeding by, but
none in syslog.
Here's memory info:
             total       used       free     shared    buffers     cached
Mem:       4142464     192156    3950308          0      76096      56992
-/+ buffers/cache:      59068    4083396
Swap:      3004144          0    3004144
Note that no swap is used, and the machine has plenty of free RAM as well when
it starts to OOM.
I will test your other questions shortly.
I just tested kernel-PAE-2.6.20-1.2952.fc6 against kernel-2.6.20-1.2952.fc6.
I was able to repeat the failure using kernel-PAE-2.6.20-1.2952.fc6
I was NOT able to repeat the failure using kernel-2.6.20-1.2952.fc6
That is, it only fails with the PAE kernel. Sync-ing after each run did not
make a difference on either PAE or non-PAE: PAE always failed, and non-PAE never did.
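When comparing the two kernels it helps to record which one is actually booted for each test. A minimal sketch, assuming the PAE build carries "PAE" somewhere in its release string as the kernel-PAE package name suggests. The function name is_pae_release is hypothetical.

```shell
#!/bin/sh
# Returns success if the given kernel release string looks like a PAE build.
# Assumption: PAE kernels include "PAE" in the string printed by `uname -r`.
is_pae_release() {
    case "$1" in
        *PAE*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Report on the kernel that is currently running.
if is_pae_release "$(uname -r)"; then
    echo "running a PAE kernel: $(uname -r)"
else
    echo "running a non-PAE kernel: $(uname -r)"
fi
```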
Can you get the contents of /proc/vmstat:
(1) before running mke2fs (or is that what's above?)
(2) after each run of mke2fs that succeeds
Created attachment 157581 [details]
Output of the crasher script
Using this script, I can get the failure to happen within 3-4 runs. The
attachment is the output. Note that run #3 didn't complete.
# for i in `seq 1 10`
> do
> echo "Pass $i" >> output
> echo BEFORE >> output
> cat /proc/vmstat >> output
> mke2fs /dev/mapper/crypt-device
> echo AFTER >> output
> cat /proc/vmstat >> output
> done
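To make the BEFORE/AFTER snapshots easier to compare, the counters that changed can be listed with their deltas. This is only a sketch: it assumes each snapshot has been saved to its own file (the script above appends everything to one file), and the function name vmstat_delta is hypothetical.

```shell
#!/bin/sh
# Print every /proc/vmstat counter whose value differs between two snapshots.
# $1: snapshot taken before the run, $2: snapshot taken after the run.
# Both files are expected to be non-empty "name value" listings.
vmstat_delta() {
    awk 'NR == FNR { before[$1] = $2; next }          # first file: remember values
         ($1 in before) && ($2 != before[$1]) {       # second file: changed counters
             print $1, $2 - before[$1]
         }' "$1" "$2"
}

# Example use (file names are illustrative):
#   cat /proc/vmstat > before.txt
#   mke2fs /dev/mapper/crypt-device
#   cat /proc/vmstat > after.txt
#   vmstat_delta before.txt after.txt
```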
Kernel 2962 has dm-crypt bugfixes from 2.6.22 applied. Can you test that?
It's in the updates-testing repo.
I tested kernel 2962, and there is still a problem. It shows up in a slightly
different fashion in that the machine freezes without first showing the OOMs on
the console, but the end result is the same.
I cannot reproduce this on the 2.6.24-rc rawhide kernel
(kernel-PAE-2.6.24-0.42.rc3.git1.fc9, using 6GB RAM)
There were some changes (per-BDI limits, dm-crypt bugfixes) in the 2.6.24-rc
kernel which, I think, should prevent that.
(I expect that problem was related to committing too much work for internal
Could you please verify that it works with a 2.6.24 test kernel?
[changed fc6 -> fc-devel]