Bug 642243
Field | Value
---|---
Summary | guest kernel panic when transferring file from host to guest during migration
Product | Red Hat Enterprise Linux 6
Component | qemu-kvm
Version | 6.0
Hardware | All
OS | Linux
Status | CLOSED DUPLICATE
Severity | high
Priority | high
Target Milestone | beta
Target Release | 6.1
Reporter | Amos Kong <akong>
Assignee | Michael S. Tsirkin <mst>
QA Contact | Virtualization Bugs <virt-bugs>
CC | ailan, amit.shah, bcao, jasowang, llim, mkenneth, mst, plyons, tburke, virt-maint
Doc Type | Bug Fix
Clones | 658437
Bug Blocks | 580951, 658437
Last Closed | 2011-01-31 22:23:34 UTC
Description (Amos Kong, 2010-10-12 12:46:39 UTC)
Comment 1:

Looks like it's related to virtio-block:

```
BUG: unable to handle kernel paging request at ffffea000dfa72f0
IP: [<ffffffff81157b63>] free_block+0xe3/0x180
PGD 20a2067 PUD 20a3067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/virtio-pci/virtio1/block/vda/dev
CPU 0
Modules linked in: virtio_balloon ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport virtio_net i2c_piix4 i2c_core sg ext4 mbcache jbd2 sr_mod cdrom virtio_blk pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mod [last unloaded: speedstep_lib]
Pid: 303, comm: kdmflush Tainted: G W ---------------- 2.6.32-70.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff81157b63>]  [<ffffffff81157b63>] free_block+0xe3/0x180
RSP: 0018:ffff880001e03ab0  EFLAGS: 00010002
RAX: ffffea0001abaac8 RBX: ffff8800377825c0 RCX: 000000000313b1fe
RDX: dead000000100100 RSI: 0000000000000000 RDI: ffffea0001abaac8
RBP: ffff880001e03b00 R08: ffffea0001abaac8 R09: 0000000000000000
R10: 00000000ffffffff R11: 000000000000003d R12: 000000000000003c
R13: ffff880037626908 R14: 000000000000001e R15: ffffea0000000000
FS:  0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffea000dfa72f0 CR3: 000000007a279000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kdmflush (pid: 303, threadinfo ffff880079636000, task ffff88007c8020c0)
Stack:
 ffff8800375c20c0 ffffea0001abaac8 000000000000100c ffff88007b13cfb0
<0> ffffea0000cc3d68 ffff880037626800 0000000000000082 ffff88007b470ba0
<0> ffff8800377825c0 ffff880037626818 ffff880001e03b70 ffffffff81157988
Call Trace:
 <IRQ>
 [<ffffffff81157988>] kmem_cache_free+0x248/0x2b0
 [<ffffffff811a2084>] ? bio_free+0x64/0x70
 [<ffffffff8110e617>] mempool_free_slab+0x17/0x20
 [<ffffffff8110e6d5>] mempool_free+0x95/0xa0
 [<ffffffffa0001520>] dec_pending+0xc0/0x1e0 [dm_mod]
 [<ffffffffa00018df>] clone_endio+0x9f/0xd0 [dm_mod]
 [<ffffffff811a0d3d>] bio_endio+0x1d/0x40
 [<ffffffff8123f7fb>] req_bio_endio+0xab/0x110
 [<ffffffff8124083f>] blk_update_request+0xff/0x440
 [<ffffffff81240ba7>] blk_update_bidi_request+0x27/0x80
 [<ffffffff812418fe>] __blk_end_request_all+0x2e/0x60
 [<ffffffffa004c125>] blk_done+0x35/0xe0 [virtio_blk]
 [<ffffffffa002519c>] vring_interrupt+0x3c/0xd0 [virtio_ring]
```

---

(In reply to comment #1)
> Looks like it's related to virtio-block:

I can also reproduce this bug with an ide/e1000 configuration.

host kernel: 2.6.32-71.3.1.el6_0.x86_64
qemu-kvm-0.12.1.2-2.113.el6_0.1.x86_64

Panic msg:

```
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/irq
CPU 1
Modules linked in: ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport e1000 i2c_piix4 i2c_core sg ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 1438, comm: sshd Not tainted 2.6.32-71.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff81268a77>]  [<ffffffff81268a77>] __list_add+0x17/0xa0
RSP: 0018:ffff88007a161a88  EFLAGS: 00010086
RAX: ffffea0001aa9f88 RBX: ffffea0001aa9fb0 RCX: ffffea0001aa9fb0
RDX: dead000000100100 RSI: ffffea0001aa9fb0 RDI: ffffea0001aa9fb0
RBP: ffff88007a161aa8 R08: ffffea0001aa9fb0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000001000 R12: 0000000000000000
R13: ffff8800000126c0 R14: 0000000000000001 R15: ffffea0001aa9fb0
FS:  00007f498c65f7c0(0000) GS:ffff880001f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa756381000 CR3: 0000000037f6a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sshd (pid: 1438, threadinfo ffff88007a160000, task ffff880037be6a70)
Stack:
 0000000000000000 000000000000001f 0000000000000000 ffff8800000126c0
<0> ffff88007a161bd8 ffffffff8111d186 ffffea0001aa9fb0 ffffea0001aa9f88
<0> 0000000100000001 ffff88007ccbcec0 000000000000000f 00000040ffffffff
Call Trace:
 [<ffffffff8111d186>] get_page_from_freelist+0x5c6/0x820
 [<ffffffff8111e1c6>] __alloc_pages_nodemask+0xf6/0x810
 [<ffffffff811502a7>] alloc_pages_current+0x87/0xd0
 [<ffffffff8117636c>] pipe_write+0x36c/0x650
 [<ffffffff8116c51a>] do_sync_write+0xfa/0x140
 [<ffffffff81091ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8120bf0f>] ? selinux_file_permission+0xbf/0x150
 [<ffffffff811ff3b6>] ? security_file_permission+0x16/0x20
 [<ffffffff8116c818>] vfs_write+0xb8/0x1a0
 [<ffffffff810830a1>] ? sigprocmask+0x71/0x110
 [<ffffffff8116d251>] sys_write+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
Code: ff 48 8b 03 eb 92 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 <4c> 8b 42 08 49 89 f5 49 89 d4 49 39 f0 75 27 4d 8b 45 00 4d 39
RIP  [<ffffffff81268a77>] __list_add+0x17/0xa0
 RSP <ffff88007a161a88>
---[ end trace 9cd7f43d04294bd4 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1438, comm: sshd Tainted: G D ---------------- 2.6.32-71.el6.x86_64 #1
Call Trace:
 [<ffffffff814c7b23>] panic+0x78/0x137
 [<ffffffff814cbbf4>] oops_end+0xe4/0x100
 [<ffffffff8101733b>] die+0x5b/0x90
 [<ffffffff814cb742>] do_general_protection+0x152/0x160
 [<ffffffff814caf15>] general_protection+0x25/0x30
 [<ffffffff81268a77>] ? __list_add+0x17/0xa0
 [<ffffffff8111d186>] get_page_from_freelist+0x5c6/0x820
 [<ffffffff8111e1c6>] __alloc_pages_nodemask+0xf6/0x810
 [<ffffffff811502a7>] alloc_pages_current+0x87/0xd0
 [<ffffffff8117636c>] pipe_write+0x36c/0x650
 [<ffffffff8116c51a>] do_sync_write+0xfa/0x140
 [<ffffffff81091ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8120bf0f>] ? selinux_file_permission+0xbf/0x150
 [<ffffffff811ff3b6>] ? security_file_permission+0x16/0x20
 [<ffffffff8116c818>] vfs_write+0xb8/0x1a0
 [<ffffffff810830a1>] ? sigprocmask+0x71/0x110
 [<ffffffff8116d251>] sys_write+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
```

---

Comment 3:

This looks like random memory allocator freelist corruption. I don't think the path that we reach it with has anything to do with the root cause.

Does it happen with different hardware and different configurations?

---
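The freelist-corruption reading is backed by the register dumps themselves: in both oopses RDX holds dead000000100100, which matches the kernel's LIST_POISON1 pattern, the value list_del() stores in a deleted entry's ->next on x86_64. In the second oops the fault is a general protection fault with RIP inside __list_add, which is what dereferencing that non-canonical poison pointer produces. The following userspace sketch is illustrative only, a simplified shape of the 2.6.32-era list code using the stock x86_64 poison constants, and is not part of the original report:

```c
#include <stdio.h>

/* Poison values from include/linux/poison.h as configured on x86_64
 * (POISON_POINTER_DELTA = 0xdead000000000000 on 2.6.32-era kernels).
 * list_del() writes them into the deleted entry so stale users fault. */
#define LIST_POISON1 ((void *) 0xdead000000100100ULL)	/* old ->next */
#define LIST_POISON2 ((void *) 0xdead000000200200ULL)	/* old ->prev */

struct list_head {
	struct list_head *next, *prev;
};

/* Simplified shape of the kernel's __list_add(): linking "entry" in
 * between prev and next touches next->prev.  If the page allocator's
 * free list hands back an entry that still holds LIST_POISON1, this
 * access hits a non-canonical address and the CPU raises exactly the
 * general protection fault seen above (RIP in __list_add,
 * RDX == dead000000100100). */
static void list_add_between(struct list_head *entry,
			     struct list_head *prev,
			     struct list_head *next)
{
	next->prev = entry;	/* faults if next is a poison pointer */
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

int main(void)
{
	struct list_head head = { &head, &head };	/* empty list */
	struct list_head node;

	list_add_between(&node, &head, head.next);	/* sane pointers: fine */

	/* The value sitting in RDX in both register dumps: */
	printf("LIST_POISON1 = %p\n", LIST_POISON1);
	return 0;
}
```

Because the bad pointer is a recognizable poison pattern rather than a wild value, the corruption most likely happened earlier (a double free, or an overwrite of list linkage while the entry sat on a free list); the kdmflush and sshd paths above are just the first users to trip over it.

---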
(In reply to comment #3)
> This looks like random memory allocator freelist corruption. I don't think
> the path that we reach it with has anything to do with the root cause.
>
> Does it happen with different hardware and different configurations?

Yes, it can be reproduced on different hosts and with different configurations.

---

Which makes it pretty clear it's not related to virtio. Any chance you could check whether there is an older guest kernel and/or qemu that does not show these issues?

---

I can reproduce this bug with the following version combinations:

guest kernel | qemu-kvm
---|---
2.6.32-76 | 0.12.1.2-2.113.el6_0.3
2.6.32-70 | 0.12.1.2-2.113.el6_0.3
2.6.32-66 | 0.12.1.2-2.113.el6_0.3
2.6.32-70 | 0.12.1.2-2.104.el6
2.6.32-70 | 0.12.1.2-2.97.el6

host kernel: 2.6.32-71.3.1

---

Could be a duplicate of: https://bugzilla.redhat.com/show_bug.cgi?id=647367

Can you check with host 2.6.32-83.el6, please?

---

Comment 12:

Is this fixed? Can we close?

---

(In reply to comment #12)
> Is this fixed? Can we close?

Hi mst, I am working on it and will check whether this issue is a duplicate of bug #647367.

---

(In reply to comment #12)
> Is this fixed? Can we close?

I've verified bug 658437 (https://bugzilla.redhat.com/show_bug.cgi?id=658437#c6) with qemu-kvm-0.12.1.2-2.129.el6.x86_64 and host kernel kernel-2.6.32-94.el6. So can we close this bug as a duplicate of 658437?

More detail:

- vhost on:
  - kernel-2.6.32-71.el6: host crash of bz #623915
  - kernel-2.6.32-94.el6: not reproduced
- vhost off:
  - kernel-2.6.32-71.el6: reproduced
  - kernel-2.6.32-94.el6: not reproduced

---

I think this is not a duplicate of 658437, as that bug deals with a vhost-net issue only. This one happens without vhost, so I think it is the same as https://bugzilla.redhat.com/show_bug.cgi?id=647367

*** This bug has been marked as a duplicate of bug 647367 ***