+++ This bug was initially created as a clone of Bug #834896 +++

If you try to mount an extended partition directly, previously it would give an error (which seems like the correct thing to do, since an extended partition is never a filesystem, and so cannot be mounted). However with recent kernels it goes into an infinite loop using 100% of CPU.

Here is a simple reproducer using libguestfs.

  guestfish -x <<EOF
  sparse test1.img 100M
  run
  part-init /dev/sda mbr
  part-add /dev/sda p 32 127
  part-add /dev/sda e 128 -32
  part-add /dev/sda l 140 499
  part-add /dev/sda l 501 -64
  part-list /dev/sda
  mount /dev/sda2 /
  EOF

It hangs at the last (mount) line where it's trying to mount the extended partition. You can get additional debug information by adding the '-v' flag to the guestfish command line.

guestfsd is just executing this command:

  mount -o "" /dev/vda2 /sysroot/

So it appears to be a kernel bug.

Affected systems:

  Distro      Kernel                              Affected?
  Fedora 16   3.1.0-7.fc16.x86_64                 No
  Fedora 16   3.4.2-1.fc16.x86_64                 Yes
  Fedora 17   3.4.0-1.fc17.x86_64                 Yes
  Rawhide     3.5.0-0.rc2.git0.1.fc18.x86_64      Yes
  Rawhide     3.5.0-0.rc3.git0.2.fc18.x86_64      Yes
  RHEL 6      2.6.32-221.el6.x86_64               No

So it appears to be a bug that has been introduced to the kernel between 3.1.0 and 3.4.2 (unfortunately rather a large range of versions!)
Created attachment 594138
test1.img.xz

Here is a way to reproduce this without libguestfs, using a virtual machine.

Take the attached disk image and uncompress it. Then add it as an extra disk to a virtual machine. Boot the virtual machine, and inside run the following command (assumes that you added the disk image as /dev/vdb):

  mkdir /tmp/mnt
  mount -o '' /dev/vdb2 /tmp/mnt

The mount command will spin in a loop using 100% of CPU, apparently forever (or at least for many minutes). Also the mount command is unkillable, even with -9.
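Because the hung mount cannot be killed, a wrapper that tries to kill the command on expiry would probably end up blocked waiting for it. A minimal sketch of a watchdog that only checks whether the command is still alive after a deadline, and never signals it, might look like the following. The function name run_with_watchdog and the exit code 124 are illustrative choices, not something from this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report).
# Usage: run_with_watchdog SECONDS COMMAND [ARGS...]
# Runs COMMAND in the background and polls once per second.
# Returns COMMAND's exit status if it finishes in time,
# or 124 if it is still running when the deadline expires.
run_with_watchdog() {
    deadline=$1
    shift
    "$@" &
    pid=$!
    elapsed=0
    while [ "$elapsed" -lt "$deadline" ]; do
        # kill -0 delivers no signal; it only tests that the process exists.
        if ! kill -0 "$pid" 2>/dev/null; then
            wait "$pid"
            return $?
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        echo "still running after ${deadline}s: $*" >&2
        return 124
    fi
    wait "$pid"
}
```

Inside the test VM this could be called as run_with_watchdog 30 mount -o '' /dev/vdb2 /tmp/mnt, treating a return value of 124 as "the mount is spinning". The 30 second deadline is an arbitrary assumption; the stuck mount process itself stays behind until the VM is rebooted.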
Possibly this bug? http://www.spinics.net/lists/linux-ext4/msg32567.html
Stack trace from 'mount' command (captured using sysrq + t):

[ 8.073005] mount R running task 0 134 133 0x00000000
[ 8.073005] ffff88001d6e3aa8 0000000000000082 ffff88001d768000 ffff88001d6e3fd8
[ 8.073005] ffff88001d6e3fd8 ffff88001d6e3fd8 ffff88001d769700 ffff88001d768000
[ 8.073005] 0000000000000000 ffff88001d6e2000 0000000000000000 ffff88001dc26c60
[ 8.073005] Call Trace:
[ 8.073005] [<ffffffff8108671a>] __cond_resched+0x2a/0x40
[ 8.073005] [<ffffffff815ef820>] _cond_resched+0x30/0x40
[ 8.073005] [<ffffffff8111d2eb>] find_lock_page+0x3b/0x80
[ 8.073005] [<ffffffff8111d9df>] find_or_create_page+0x3f/0xb0
[ 8.073005] [<ffffffff811acf12>] __getblk+0xf2/0x2a0
[ 8.073005] [<ffffffff811ad113>] __bread+0x13/0xb0
[ 8.073005] [<ffffffff8121b4e7>] ext4_fill_super+0x207/0x2a50
[ 8.073005] [<ffffffff8118055b>] mount_bdev+0x1cb/0x210
[ 8.073005] [<ffffffff8121b2e0>] ? ext4_remount+0x5d0/0x5d0
[ 8.073005] [<ffffffff8116b611>] ? __kmalloc_track_caller+0x51/0x180
[ 8.073005] [<ffffffff8120a7f5>] ext4_mount+0x15/0x20
[ 8.073005] [<ffffffff81181063>] mount_fs+0x43/0x1b0
[ 8.073005] [<ffffffff8113de80>] ? __alloc_percpu+0x10/0x20
[ 8.073005] [<ffffffff81199bc7>] vfs_kern_mount+0x67/0xf0
[ 8.073005] [<ffffffff8119a6e4>] do_kern_mount+0x54/0x110
[ 8.073005] [<ffffffff8119bf4a>] do_mount+0x26a/0x840
[ 8.073005] [<ffffffff8113832b>] ? strndup_user+0x5b/0x80
[ 8.073005] [<ffffffff8119c65d>] sys_mount+0x8d/0xe0
[ 8.073005] [<ffffffff815f8ae9>] system_call_fastpath+0x16/0x1b
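To read a trace like this at a glance, or to compare two of them, it can help to reduce it to bare function names. The following is a rough sketch, assuming a grep that supports -o (GNU and BSD grep do); the name trace_symbols is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report).
# Reads a kernel call trace on stdin and prints one function name per line,
# dropping timestamps, addresses, offsets and the "?" marker that flags
# frames the unwinder is unsure about.
trace_symbols() {
    # Pick out every "symbol+0xOFFSET/0xLENGTH" token, then cut at "+0x".
    grep -o '[A-Za-z0-9_.]*+0x[0-9a-f]*/0x[0-9a-f]*' | sed 's/+0x.*//'
}
```

With two traces saved to files (trace1.txt and trace2.txt are placeholder names), running trace_symbols on each and passing the two outputs to diff would show where the call chains differ.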
Here's an even simpler way to reproduce the bug. Simply create a 1024 byte device (empty) and try to mount it:

  guestfish -x -v <<EOF
  sparse test1.img 1024
  run
  mount /dev/sda /
  EOF

The stack trace from this one is substantially the same:

[ 7.476010] mount R running task 0 109 108 0x00000000
[ 7.476010] ffff88001d783aa8 0000000000000082 ffff88001d6cc500 ffff88001d783fd8
[ 7.476010] ffff88001d783fd8 ffff88001d783fd8 ffff88001d430000 ffff88001d6cc500
[ 7.476010] ffffea0000722ddc ffff88001d782000 0000000000000000 ffff88001dc248a0
[ 7.476010] Call Trace:
[ 7.476010] [<ffffffff8108671a>] __cond_resched+0x2a/0x40
[ 7.476010] [<ffffffff8111d2f2>] ? find_lock_page+0x42/0x80
[ 7.476010] [<ffffffff815ef820>] _cond_resched+0x30/0x40
[ 7.476010] [<ffffffff8111d2eb>] find_lock_page+0x3b/0x80
[ 7.476010] [<ffffffff8111d9df>] find_or_create_page+0x3f/0xb0
[ 7.476010] [<ffffffff811acf12>] __getblk+0xf2/0x2a0
[ 7.476010] [<ffffffff811ad113>] __bread+0x13/0xb0
[ 7.476010] [<ffffffff8121b4e7>] ext4_fill_super+0x207/0x2a50
[ 7.476010] [<ffffffff8118055b>] mount_bdev+0x1cb/0x210
[ 7.476010] [<ffffffff8121b2e0>] ? ext4_remount+0x5d0/0x5d0
[ 7.476010] [<ffffffff8116b611>] ? __kmalloc_track_caller+0x51/0x180
[ 7.476010] [<ffffffff8120a7f5>] ext4_mount+0x15/0x20
[ 7.476010] [<ffffffff81181063>] mount_fs+0x43/0x1b0
[ 7.476010] [<ffffffff8113de80>] ? __alloc_percpu+0x10/0x20
[ 7.476010] [<ffffffff81199bc7>] vfs_kern_mount+0x67/0xf0
[ 7.476010] [<ffffffff8119a6e4>] do_kern_mount+0x54/0x110
[ 7.476010] [<ffffffff8119bf4a>] do_mount+0x26a/0x840
[ 7.476010] [<ffffffff8113832b>] ? strndup_user+0x5b/0x80
[ 7.476010] [<ffffffff8119c65d>] sys_mount+0x8d/0xe0
[ 7.476010] [<ffffffff815f8ae9>] system_call_fastpath+0x16/0x1b
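For anyone without libguestfs, the same empty 1024 byte image can probably be made with coreutils alone and then attached to a test VM as an extra disk, as in the earlier virtual machine reproducer. This is only a sketch: it assumes GNU truncate and stat, and the function name make_tiny_image is invented here. Whether mounting the image through a loop device on the host also triggers the hang was not checked in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report).
# Creates an empty sparse image of exactly 1024 bytes, the size used by
# the guestfish "sparse test1.img 1024" command, and prints its size.
# Assumes GNU coreutils (truncate, stat -c).
make_tiny_image() {
    img=${1:-test1.img}
    truncate -s 1024 "$img" && stat -c %s "$img"
}
```

Calling make_tiny_image with no argument writes test1.img in the current directory. The mount itself should only be attempted inside a disposable VM, since an affected kernel leaves an unkillable process behind.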
Thanks to Jeff Moyer who suggested the following patch: https://lkml.org/lkml/2012/6/25/306 which fixes this bug.
*** Bug 835084 has been marked as a duplicate of this bug. ***
Patch committed to Fedora git. Will be in the next build.
kernel-3.4.4-3.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/kernel-3.4.4-3.fc17
kernel-3.4.4-3.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/kernel-3.4.4-3.fc16
Package kernel-3.4.4-3.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.

Update it with:

  # su -c 'yum update --enablerepo=updates-testing kernel-3.4.4-3.fc17'

as soon as you are able to, then reboot.

Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-9988/kernel-3.4.4-3.fc17
then log in and leave karma (feedback).
kernel-3.4.4-3.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.4.4-4.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/kernel-3.4.4-4.fc16
kernel-3.4.4-4.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
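A quick way to check whether a machine is already running at least the build named above is to compare version strings. The sketch below assumes GNU sort with the -V (version sort) option, and version_ge is an illustrative name. Version sort only approximates RPM's own ordering rules, so treat the result as a hint rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report).
# version_ge A B: succeeds when version string A sorts at or after B
# under GNU "sort -V". This approximates, but is not identical to,
# RPM version comparison.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, version_ge "$(uname -r)" 3.4.4-4.fc16 && echo "kernel is at or past the Fedora 16 update" would print the message on a kernel that sorts at or after that build.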
Reopening on the basis of this email: https://lkml.org/lkml/2012/8/21/692 "[PATCH] block: replace __getblk_slow misfix by grow_dev_page fix" I am now testing the alternate fix proposed there.
The first patch (comment 14) caused a regression. A second version of the patch went upstream and is already included in kernel-3.6.0-0.rc3.git2.1.fc18.x86_64.rpm. I wasn't able to test this until now. However I have just tested it, and the regression has gone. Therefore I am closing this bug again.