Description of problem:
Running mke2fs on a dm-crypt device causes a flurry of OOMs on the console and finally a general lockup.

Version-Release number of selected component (if applicable):
kernel-PAE-2.6.20-1.2952.fc6
device-mapper-1.02.13-1.fc6

How reproducible:
This is a SATA system, and the first SATA disk is partitioned as follows:

Disk /dev/sda: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

/dev/sda1         1       13      104391   83  Linux
/dev/sda2        14       26      104422+  83  Linux
/dev/sda3        27      400     3004155   82  Linux swap / Solaris
/dev/sda4       401    19452   153035190    5  Extended
/dev/sda5       401      712     2506108+  83  Linux
/dev/sda6       713     1024     2506108+  83  Linux
/dev/sda7      1025     2332    10506478+  83  Linux
/dev/sda8      2333     3640    10506478+  83  Linux
/dev/sda9      3641    19452   127009858+  83  Linux

Create a dm-crypt device for /dev/sda9:

    echo "0 254017661 crypt aes-cbc-essiv:sha256 01234567890123456789012345678901 0 /dev/sda9 2056" | dmsetup create crypt-device

Make a filesystem on it:

    mke2fs /dev/mapper/crypt-device

If I repeat the mke2fs step 3-4 times, the system starts to OOM over and over for a few seconds and then freezes up completely.
Also note that creating a dm-crypt device using a whole unpartitioned disk on the same machine (e.g. "/dev/sdb" instead of "/dev/sda9") works just fine.
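For anyone reproducing this on a partition of a different size, the length field in the dmsetup table above appears to be the partition size in 512-byte sectors minus the start offset (2056). The sketch below only redoes that arithmetic; `crypt_table_length` is a hypothetical helper name, and it assumes the "+" in the fdisk listing marks one extra 512-byte sector beyond the reported 1K blocks. On a live system the sector count could be read with `blockdev --getsz /dev/sda9` instead.

```shell
#!/bin/sh
# Hypothetical helper: length (in 512-byte sectors) for a dm-crypt table line,
# given the size of the backing device in sectors and the start offset.
crypt_table_length() {
    total_sectors=$1
    offset=$2
    echo $(( total_sectors - offset ))
}

# fdisk lists /dev/sda9 as 127009858+ blocks of 1K each;
# assumption: the "+" means one additional 512-byte sector.
sda9_sectors=$(( 127009858 * 2 + 1 ))

# Prints the length used in the table from the report.
crypt_table_length "$sda9_sectors" 2056
```

With these numbers the result is 254017661, which matches the table in the report.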
Please attach system info: memory size and syslog messages. Is it reproducible with the standard (non-PAE) kernel? You can use lvmdump (from the lvm2 package) to collect information about the system and attach it to this bz. Does it help if you run sync between the repeated mke2fs runs?
There were no lines at all logged in syslog (once the OOMing starts, things go bad very quickly). There were some OOM reports on the console speeding by, but none in syslog. Here's memory info:

             total       used       free     shared    buffers     cached
Mem:       4142464     192156    3950308          0      76096      56992
-/+ buffers/cache:      59068    4083396
Swap:      3004144          0    3004144

Note that no swap is used, and the machine has plenty of free RAM as well when it starts to OOM. I will test your other questions shortly.
I just tested kernel-PAE-2.6.20-1.2952.fc6 against kernel-2.6.20-1.2952.fc6. I was able to repeat the failure using kernel-PAE-2.6.20-1.2952.fc6. I was NOT able to repeat the failure using kernel-2.6.20-1.2952.fc6. That is, it only fails with the PAE kernel. Syncing after each run made no difference on either kernel: PAE always failed, and non-PAE always succeeded.
Can you get the contents of /proc/vmstat (1) before running mke2fs (or is that what's above?) and (2) after each run of mke2fs that succeeds?
Created attachment 157581
Output of the crasher script

Using this script, I can get the failure to happen within 3-4 runs. The attachment is the output. Note that run #3 didn't complete.

    for i in `seq 1 10`
    do
        echo "Pass $i" >> output
        echo BEFORE >> output
        cat /proc/vmstat >> output
        sync
        mke2fs /dev/mapper/crypt-device
        echo AFTER >> output
        cat /proc/vmstat >> output
        sync
    done
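To make the BEFORE/AFTER snapshots easier to compare, something like the following could list only the counters that changed across a run. This is a minimal sketch, not part of the attached script: `vmstat_delta` is a hypothetical helper, and it assumes each snapshot was saved to its own file (the script above appends everything to a single `output` file instead).

```shell
#!/bin/sh
# Hypothetical helper: print "<counter> <after - before>" for every
# /proc/vmstat counter whose value differs between two snapshot files.
# $1 = snapshot taken before the run, $2 = snapshot taken after it.
vmstat_delta() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && $2 != before[$1] { print $1, $2 - before[$1] }' "$1" "$2"
}
```

One way to use it around a single run (the file names are placeholders):

```shell
cat /proc/vmstat > vmstat.before
mke2fs /dev/mapper/crypt-device
cat /proc/vmstat > vmstat.after
vmstat_delta vmstat.before vmstat.after
```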
Kernel 2962 has dm-crypt bugfixes from 2.6.22 applied. Can you test that? It's in the updates-testing repo.
I tested kernel 2962, and there is still a problem. It shows up in a slightly different fashion in that the machine freezes without first showing the OOMs on the console, but the end result is the same.
I cannot reproduce this on a 2.6.24-rc rawhide kernel (kernel-PAE-2.6.24-0.42.rc3.git1.fc9, using 6GB RAM). There were some changes in the 2.6.24-rc kernel (per-BDI limits, dm-crypt bugfixes) which, I think, should prevent it. (I expect the problem was related to committing too much work to the internal crypt threads.) Could you please verify that it works with a 2.6.24 test kernel? [changed fc6 -> fc-devel]