Bug 835019
Summary: trying to mount an empty 1K partition causes a hang in ext4 driver, using 100% CPU
Product: [Fedora] Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Status: CLOSED RAWHIDE
Severity: unspecified
Priority: unspecified
Keywords: Reopened
Reporter: Richard W.M. Jones <rjones>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mbooth, rjones, sdake, virt-maint, walkerrichardj
Doc Type: Bug Fix
Type: Bug
Clone Of: 834896
Duplicates: 835084
Bug Blocks: 834896
Last Closed: 2012-08-28 11:05:13 UTC

Description
Richard W.M. Jones
2012-06-25 09:18:42 UTC
Created attachment 594138: test1.img.xz

Here is a way to reproduce this without libguestfs, using
a virtual machine.
Take the attached disk image and uncompress it.
Then add it as an extra disk to a virtual machine.
Boot the virtual machine, and inside it run the following
commands (this assumes that you added the disk image as /dev/vdb):
mkdir /tmp/mnt
mount -o '' /dev/vdb2 /tmp/mnt
The mount command will spin in a loop using 100% of CPU,
apparently forever (or at least for many minutes).
Also, the mount command cannot be killed, even with kill -9.

Possibly this bug?
http://www.spinics.net/lists/linux-ext4/msg32567.html

Stack trace from 'mount' command (captured using sysrq + t):

[ 8.073005] mount R running task 0 134 133 0x00000000
[ 8.073005] ffff88001d6e3aa8 0000000000000082 ffff88001d768000 ffff88001d6e3fd8
[ 8.073005] ffff88001d6e3fd8 ffff88001d6e3fd8 ffff88001d769700 ffff88001d768000
[ 8.073005] 0000000000000000 ffff88001d6e2000 0000000000000000 ffff88001dc26c60
[ 8.073005] Call Trace:
[ 8.073005] [<ffffffff8108671a>] __cond_resched+0x2a/0x40
[ 8.073005] [<ffffffff815ef820>] _cond_resched+0x30/0x40
[ 8.073005] [<ffffffff8111d2eb>] find_lock_page+0x3b/0x80
[ 8.073005] [<ffffffff8111d9df>] find_or_create_page+0x3f/0xb0
[ 8.073005] [<ffffffff811acf12>] __getblk+0xf2/0x2a0
[ 8.073005] [<ffffffff811ad113>] __bread+0x13/0xb0
[ 8.073005] [<ffffffff8121b4e7>] ext4_fill_super+0x207/0x2a50
[ 8.073005] [<ffffffff8118055b>] mount_bdev+0x1cb/0x210
[ 8.073005] [<ffffffff8121b2e0>] ? ext4_remount+0x5d0/0x5d0
[ 8.073005] [<ffffffff8116b611>] ? __kmalloc_track_caller+0x51/0x180
[ 8.073005] [<ffffffff8120a7f5>] ext4_mount+0x15/0x20
[ 8.073005] [<ffffffff81181063>] mount_fs+0x43/0x1b0
[ 8.073005] [<ffffffff8113de80>] ? __alloc_percpu+0x10/0x20
[ 8.073005] [<ffffffff81199bc7>] vfs_kern_mount+0x67/0xf0
[ 8.073005] [<ffffffff8119a6e4>] do_kern_mount+0x54/0x110
[ 8.073005] [<ffffffff8119bf4a>] do_mount+0x26a/0x840
[ 8.073005] [<ffffffff8113832b>] ? strndup_user+0x5b/0x80
[ 8.073005] [<ffffffff8119c65d>] sys_mount+0x8d/0xe0
[ 8.073005] [<ffffffff815f8ae9>] system_call_fastpath+0x16/0x1b

Here's an even simpler way to reproduce the bug.
Simply create a 1024 byte device (empty) and try to mount it:

guestfish -x -v <<EOF
sparse test1.img 1024
run
mount /dev/sda /
EOF

The stack trace from this one is substantially the same:

[ 7.476010] mount R running task 0 109 108 0x00000000
[ 7.476010] ffff88001d783aa8 0000000000000082 ffff88001d6cc500 ffff88001d783fd8
[ 7.476010] ffff88001d783fd8 ffff88001d783fd8 ffff88001d430000 ffff88001d6cc500
[ 7.476010] ffffea0000722ddc ffff88001d782000 0000000000000000 ffff88001dc248a0
[ 7.476010] Call Trace:
[ 7.476010] [<ffffffff8108671a>] __cond_resched+0x2a/0x40
[ 7.476010] [<ffffffff8111d2f2>] ? find_lock_page+0x42/0x80
[ 7.476010] [<ffffffff815ef820>] _cond_resched+0x30/0x40
[ 7.476010] [<ffffffff8111d2eb>] find_lock_page+0x3b/0x80
[ 7.476010] [<ffffffff8111d9df>] find_or_create_page+0x3f/0xb0
[ 7.476010] [<ffffffff811acf12>] __getblk+0xf2/0x2a0
[ 7.476010] [<ffffffff811ad113>] __bread+0x13/0xb0
[ 7.476010] [<ffffffff8121b4e7>] ext4_fill_super+0x207/0x2a50
[ 7.476010] [<ffffffff8118055b>] mount_bdev+0x1cb/0x210
[ 7.476010] [<ffffffff8121b2e0>] ? ext4_remount+0x5d0/0x5d0
[ 7.476010] [<ffffffff8116b611>] ? __kmalloc_track_caller+0x51/0x180
[ 7.476010] [<ffffffff8120a7f5>] ext4_mount+0x15/0x20
[ 7.476010] [<ffffffff81181063>] mount_fs+0x43/0x1b0
[ 7.476010] [<ffffffff8113de80>] ? __alloc_percpu+0x10/0x20
[ 7.476010] [<ffffffff81199bc7>] vfs_kern_mount+0x67/0xf0
[ 7.476010] [<ffffffff8119a6e4>] do_kern_mount+0x54/0x110
[ 7.476010] [<ffffffff8119bf4a>] do_mount+0x26a/0x840
[ 7.476010] [<ffffffff8113832b>] ? strndup_user+0x5b/0x80
[ 7.476010] [<ffffffff8119c65d>] sys_mount+0x8d/0xe0
[ 7.476010] [<ffffffff815f8ae9>] system_call_fastpath+0x16/0x1b

Thanks to Jeff Moyer who suggested the following patch:
https://lkml.org/lkml/2012/6/25/306
which fixes this bug.

*** Bug 835084 has been marked as a duplicate of this bug. ***

Patch committed to Fedora git. Will be in the next build.

kernel-3.4.4-3.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.4.4-3.fc17

kernel-3.4.4-3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.4.4-3.fc16

Package kernel-3.4.4-3.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.4.4-3.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-9988/kernel-3.4.4-3.fc17
then log in and leave karma (feedback).

kernel-3.4.4-3.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.

kernel-3.4.4-4.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.4.4-4.fc16

kernel-3.4.4-4.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.

Reopening on the basis of this email:
https://lkml.org/lkml/2012/8/21/692
"[PATCH] block: replace __getblk_slow misfix by grow_dev_page fix"
I am now testing the alternate fix proposed there.

The first patch (comment 14) caused a regression. A second version of the patch went upstream and is already included in kernel-3.6.0-0.rc3.git2.1.fc18.x86_64.rpm.
I wasn't able to test this until now. However I have just tested it, and the regression has gone. Therefore I am closing this bug again.
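
For anyone without guestfish, the empty 1024 byte device from the simpler reproducer in this report could probably be prepared with ordinary tools. The following is only a sketch under assumptions: the file name test1.img comes from the report, but the try_mount helper, the use of a loop device instead of a virtual disk, and forcing -t ext4 are illustrative choices that have not been verified against an affected kernel. The mount step is deliberately not called, since on an affected kernel it spins at 100% CPU and cannot be killed.

```shell
#!/bin/sh
# Sketch: build the empty 1024-byte image without guestfish.
# "test1.img" matches the report; everything else is an assumption.
img=test1.img

# Create a sparse, all-zero file of exactly 1024 bytes.
rm -f "$img"
truncate -s 1024 "$img"

# Print the size so it can be checked before going any further.
size=$(wc -c < "$img")
echo "created $img ($size bytes)"

# Hypothetical helper, NOT called here. Run it only as root inside a
# disposable virtual machine: on an affected kernel the mount never returns.
try_mount() {
    # Attach the image to the first free loop device and print its name.
    dev=$(losetup --find --show "$img") || return 1
    mkdir -p /tmp/mnt
    # Background the mount so the shell stays usable if it hangs.
    mount -t ext4 "$dev" /tmp/mnt &
    sleep 5
    # Dump all task states to the kernel log (the sysrq + t capture).
    echo t > /proc/sysrq-trigger
    dmesg | tail -n 60
}
```

If try_mount is run on a fixed kernel, the mount should simply fail quickly because the device contains no filesystem; the loop device can then be released with losetup -d.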