Bug 455097
Summary: | 2.6.26-0.124.rc9.git5.fc10.x86_64 oops in new_slab in kvm guest | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Roland Dreier <rolandd>
Component: | kvm | Assignee: | Glauber Costa <gcosta>
Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high | Priority: | medium
Version: | 10 | CC: | bashton, berrange, clalance, gcosta, kernel-maint, markmc, mtosatti, quintela, virt-maint, xen-maint
Hardware: | x86_64 | OS: | Linux
Doc Type: | Bug Fix | Last Closed: | 2009-03-04 21:50:37 UTC
Description
Roland Dreier
2008-07-11 22:17:14 UTC
Got a similar looking crash on 2.6.26-136.fc10.x86_64 (running in a VM on the same host system):

BUG: unable to handle kernel paging request at ffff8100375c0000
IP: [<ffffffff810b17d4>] new_slab+0x279/0x2f1
PGD 8063 PUD 9063 PMD 38222163 PTE 80000000375c0160
Oops: 0002 [1] SMP DEBUG_PAGEALLOC
CPU 0
Modules linked in: bridge bnep rfcomm l2cap bluetooth fuse sunrpc ipt_REJECT nf_conntrack_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 loop dm_multipath ppdev sr_mod cdrom snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm virtio_net parport_pc parport floppy snd_timer ata_generic snd soundcore snd_page_alloc pcspkr ata_piix pata_acpi i2c_piix4 i2c_core dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Pid: 2916, comm: sendmail Not tainted 2.6.26-136.fc10.x86_64 #1
RIP: 0010:[<ffffffff810b17d4>] [<ffffffff810b17d4>] new_slab+0x279/0x2f1
RSP: 0018:ffffffff816b0990 EFLAGS: 00010016
RAX: 002000000000205a RBX: ffffe200014c2800 RCX: 0000000000008000
RDX: 0000000000000003 RSI: 0000000000008000 RDI: ffff8100375c0000
RBP: ffffffff816b09c0 R08: ffffffff816b0760 R09: 0000000000000086
R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffffff
R13: 0000000000004020 R14: ffff8100375c0000 R15: ffffffff81531860
FS: 00007f70e567a7a0(0000) GS:ffffffff81492000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff8100375c0000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sendmail (pid: 2916, threadinfo ffff8100341fe000, task ffff8100318025b0)
Stack: ffffffff816b09a0 ffffe20001553400 0000000000000000 ffff81000107bf80
 ffffffff81531860 0000000000000020 ffffffff816b0a20 ffffffff810b1df9
 ffffffff81255863 00000020ffffffff ffffffff816b0a10 ffffffff810595dd
Call Trace:
 <IRQ> [<ffffffff810b1df9>] __slab_alloc+0x273/0x490
 [<ffffffff81255863>] ? __netdev_alloc_skb+0x31/0x4e
 [<ffffffff810595dd>] ? mark_held_locks+0x5c/0x77
 [<ffffffff810b2e33>] __kmalloc_node_track_caller+0x9f/0x103
 [<ffffffff81255863>] ? __netdev_alloc_skb+0x31/0x4e
 [<ffffffff81254dec>] __alloc_skb+0x6f/0x135
 [<ffffffff81255863>] __netdev_alloc_skb+0x31/0x4e
 [<ffffffffa01111c4>] :virtio_net:try_fill_recv+0x53/0x10b
 [<ffffffffa0111d15>] :virtio_net:virtnet_poll+0x22a/0x2e0
 [<ffffffff81258592>] ? net_rx_action+0x73/0x22d
 [<ffffffff81258606>] net_rx_action+0xe7/0x22d
 [<ffffffff8103e7ea>] __do_softirq+0x77/0x101
 [<ffffffff8100d64c>] call_softirq+0x1c/0x28
 [<ffffffff8100e955>] do_softirq+0x4d/0xb0
 [<ffffffff8103e2af>] irq_exit+0x4e/0x8f
 [<ffffffff8100ec75>] do_IRQ+0x147/0x169
 [<ffffffff8100c732>] ret_from_intr+0x0/0x1e
 <EOI> [<ffffffff81097438>] ? unmap_vmas+0x3e7/0x876
 [<ffffffff810973d1>] ? unmap_vmas+0x380/0x876
 [<ffffffff8109ba16>] ? exit_mmap+0x7c/0xf3
 [<ffffffff81036bda>] ? mmput+0x42/0x9e
 [<ffffffff8103acb5>] ? exit_mm+0xe6/0xef
 [<ffffffff8103c7e8>] ? do_exit+0x27b/0x8d4
 [<ffffffff8107c26b>] ? audit_syscall_entry+0x126/0x15a
 [<ffffffff8107bf3c>] ? audit_syscall_exit+0x331/0x353
 [<ffffffff8103ceba>] ? do_group_exit+0x79/0xa9
 [<ffffffff8103cefc>] ? sys_exit_group+0x12/0x14
 [<ffffffff8100c2c2>] ? tracesys+0xd0/0xd5
Code: 10 49 8b 07 f6 c4 08 74 24 48 8b 03 31 d2 f6 c4 20 74 06 8b 93 b8 00 00 00 88 d1 be 00 10 00 00 b0 5a 48 d3 e6 4c 89 f7 48 89 f1 <f3> aa 4d 89 f5 4d 89 f4 eb 21 4c 89 ea 48 89 de 4c 89 ff e8 9a
RIP [<ffffffff810b17d4>] new_slab+0x279/0x2f1
RSP <ffffffff816b0990>
CR2: ffff8100375c0000
---[ end trace 42efba9b37ce41f3 ]---

Roland: seen this lately with more recent rawhide kernels? Can you confirm you weren't seeing it with the stock F9 kernel? Haven't seen any reports of this upstream ...
Both oops seem to show slab corruption when virtio_net tries to allocate more skbs from an interrupt that preempted an exiting process while it was freeing its vmas.

It'd be interesting to see if it triggers with e.g. DEBUG_PAGEALLOC or SLUB_DEBUG_ON disabled.

Sorry for the slow response. Anyway, I tried updating to 2.6.27-0.226.rc1.git5.fc10.x86_64 and I've found that it's difficult to even get the VM to boot reliably (same host -- kvm 72 with kernel post-2.6.27-rc2 latest git, including upstream kvm module). For example I just got this early in boot:

BUG: unable to handle kernel paging request at ffff8800335d1f58
IP: [<ffffffff8100ed1f>] copy_thread+0x47/0x1ae
PGD 202063 PUD 206063 PMD 33aca163 PTE 335d1160
Oops: 0002 [1] SMP DEBUG_PAGEALLOC
CPU 0
Modules linked in: dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 801, comm: udevd Tainted: G S 2.6.27-0.226.rc1.git5.fc10.x86_64 #1
RIP: 0010:[<ffffffff8100ed1f>] [<ffffffff8100ed1f>] copy_thread+0x47/0x1ae
RSP: 0018:ffff88002f9f5da8 EFLAGS: 00010286
RAX: ffff8800335d2000 RBX: ffff88002f9fcb20 RCX: 000000000000002a
RDX: 00007fff16698fe0 RSI: ffff88002f9f5f58 RDI: ffff8800335d1f58
RBP: ffff88002f9f5dd8 R08: ffff88002f9fcb20 R09: ffff88002f9f5f58
R10: 0000000000000046 R11: ffff88002f9f5c58 R12: ffff88002f9f8000
R13: ffff8800335d1f58 R14: 0000000001200011 R15: 0000000001200011
FS: 00007fe70e671780(0000) GS:ffffffff814f7380(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff8800335d1f58 CR3: 000000002f8ee000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process udevd (pid: 801, threadinfo ffff88002f9f4000, task ffff88002f9f8000)
Stack: ffff88002f988568 0000000000000000 0000000000000000 ffff88002f9fcb20
 0000000000000000 ffff88002f9fcdb8 ffff88002f9f5e78 ffffffff8104289e
 ffff88002f9f5f58 0000000000000000 ffffffffff5fc0b0 ffffffff81010bbe
Call Trace:
 [<ffffffff8104289e>] copy_process+0xc6d/0x13d7
 [<ffffffff81010bbe>] ? restore_args+0x0/0x30
 [<ffffffff81043111>] do_fork+0x109/0x250
 [<ffffffff810cd120>] ? fd_install+0x5b/0x64
 [<ffffffff810d57c4>] ? do_pipe_flags+0xb5/0x110
 [<ffffffff8101034a>] ? system_call_fastpath+0x16/0x1b
 [<ffffffff8100e622>] sys_clone+0x28/0x2a
 [<ffffffff81010867>] ptregscall_common+0x67/0xb0

I booted 2.6.27-0.226.rc1.git5.fc10.x86_64 with "slub_debug=FPZ" and got the following oops after leaving the VM idle for a while:

BUG: unable to handle kernel paging request at ffff880012c08000
IP: [<ffffffff810c86c9>] new_slab+0x158/0x1cb
PGD 202063 PUD 206063 PMD 19d06163 PTE 12c08160
Oops: 0002 [1] SMP DEBUG_PAGEALLOC
CPU 0
Modules linked in: bridge stp rfcomm bnep l2cap bluetooth fuse sunrpc ipt_REJECT nf_conntrack_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 loop dm_multipath sr_mod cdrom ppdev snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device virtio_net floppy snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ata_generic parport_pc parport i2c_piix4 ata_piix i2c_core pata_acpi dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Pid: 2689, comm: 0logwatch Tainted: G S 2.6.27-0.226.rc1.git5.fc10.x86_64 #1
RIP: 0010:[<ffffffff810c86c9>] [<ffffffff810c86c9>] new_slab+0x158/0x1cb
RSP: 0018:ffffffff816e3980 EFLAGS: 00010016
RAX: 000000000000005a RBX: ffffe20000708300 RCX: 0000000000008000
RDX: 000000000000005a RSI: 0000000000008000 RDI: ffff880012c08000
RBP: ffffffff816e39b0 R08: ffffffff816e3760 R09: ffffffff816e37b0
R10: 0000000000000046 R11: 0000000000000001 R12: 0000000000004020
R13: 000000000003000f R14: ffffffff814f4708 R15: ffff880012c08000
FS: 00007f94bb3956f0(0000) GS:ffffffff814f7380(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff880012c08000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process 0logwatch (pid: 2689, threadinfo ffff88002d984000, task ffff88002ec1cac0)
Stack: ffffffff816e39b0 ffffe200011ab100 0000000000000000 ffff88000107dd50
 ffffffff814f4708 0000000000000020 ffffffff816e3a10 ffffffff810c8cd1
 ffffffff812814a4 00000020ffffffff ffff88002ec1cac0 ffffffff814f4708
Call Trace:
 <IRQ> [<ffffffff810c8cd1>] __slab_alloc+0x267/0x3eb
 [<ffffffff812814a4>] ? __netdev_alloc_skb+0x36/0x52
 [<ffffffff810c9c9e>] __kmalloc_node_track_caller+0xa4/0x108
 [<ffffffff812814a4>] ? __netdev_alloc_skb+0x36/0x52
 [<ffffffff812809f6>] __alloc_skb+0x74/0x13a
 [<ffffffff812814a4>] __netdev_alloc_skb+0x36/0x52
 [<ffffffffa014328a>] try_fill_recv+0x5f/0x1c5 [virtio_net]
 [<ffffffffa014408f>] virtnet_poll+0x2c4/0x386 [virtio_net]
 [<ffffffff812845c4>] net_rx_action+0xff/0x246
 [<ffffffff81048ff1>] __do_softirq+0x83/0x10e
 [<ffffffff81011d8c>] call_softirq+0x1c/0x28
 [<ffffffff81013051>] do_softirq+0x52/0xb5
 [<ffffffff81048b99>] irq_exit+0x53/0xa2
 [<ffffffff81013380>] do_IRQ+0x14c/0x16e
 [<ffffffff81010a93>] ret_from_intr+0x0/0x2e
 <EOI> [<ffffffff81026d5a>] ? native_flush_tlb_global+0x47/0x56
 [<ffffffff8102d4c1>] ? kernel_map_pages+0x11a/0x12d
 [<ffffffff810a16f9>] ? free_hot_cold_page+0xb6/0x193
 [<ffffffff810a1804>] ? __pagevec_free+0x2e/0x42
 [<ffffffff810a4de6>] ? release_pages+0x183/0x1ef
 [<ffffffff810b85a8>] ? free_pages_and_swap_cache+0x5c/0x77
 [<ffffffff810acba4>] ? unmap_vmas+0x5fb/0x87c
 [<ffffffff810b0f4e>] ? exit_mmap+0x91/0x10a
 [<ffffffff8104153f>] ? mmput+0x47/0xa3
 [<ffffffff810453d4>] ? exit_mm+0x10d/0x118
 [<ffffffff8104704b>] ? do_exit+0x2a0/0x904
 [<ffffffff8104772d>] ? do_group_exit+0x7e/0xae
 [<ffffffff81047774>] ? sys_exit_group+0x17/0x19
 [<ffffffff8101034a>] ? system_call_fastpath+0x16/0x1b
Code: c1 e0 0c 4c 8d 3c 10 49 8b 06 f6 c4 08 74 1e 48 89 df e8 6d d3 ff ff be 00 10 00 00 89 c1 b2 5a 48 d3 e6 4c 89 ff 88 d0 48 89 f1 <f3> aa 4d 89 fd 4d 89 fc eb 21 4c 89 ea 48 89 de 4c 89 f7 e8 30
RIP [<ffffffff810c86c9>] new_slab+0x158/0x1cb
RSP <ffffffff816e3980>
CR2: ffff880012c08000
---[ end trace 39747af2df17e80b ]---

Re-assigning kvm.ko bugs to the kvm package for easier tracking.

This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

*** This bug has been marked as a duplicate of bug 480822 ***
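
For readers following the oops headers above, here is a minimal sketch of how the "Oops: 0002" error code, the PTE values, and the RCX fill count can be read. It assumes the architectural x86 page-fault error-code bits, the standard x86-64 PTE flag bits, and 4 KiB pages. The helper names (decode_error_code, decode_pte, fill_order) are hypothetical and exist only for illustration; they are not part of any kernel or distribution tool.

```python
# Hypothetical helpers for reading the oops headers quoted in this report.
# Assumptions: architectural x86 page-fault error-code bits, standard
# x86-64 PTE flag bits, and 4 KiB base pages.

PAGE_SIZE = 4096  # assumed base page size on x86_64

# (bit position, flag name) for the PTE bits that matter here.
PTE_FLAGS = [
    (0, "PRESENT"),
    (1, "RW"),
    (2, "USER"),
    (5, "ACCESSED"),
    (6, "DIRTY"),
    (8, "GLOBAL"),
    (63, "NX"),
]


def decode_error_code(code):
    """Describe the low three bits of an x86 page-fault error code."""
    return [
        "protection violation" if code & 0x1 else "page not present",
        "write access" if code & 0x2 else "read access",
        "user mode" if code & 0x4 else "kernel mode",
    ]


def decode_pte(pte):
    """Return the names of the flag bits that are set in a PTE value."""
    return [name for bit, name in PTE_FLAGS if (pte >> bit) & 1]


def fill_order(byte_count):
    """Page order of a fill covering a power-of-two number of pages."""
    pages = byte_count // PAGE_SIZE
    return pages.bit_length() - 1


if __name__ == "__main__":
    print(decode_error_code(0x0002))       # "Oops: 0002" in all three traces
    print(decode_pte(0x80000000375C0160))  # PTE from the 2.6.26-136 oops
    print(decode_pte(0x12C08160))          # PTE from the slub_debug=FPZ oops
    print(fill_order(0x8000))              # RCX in both new_slab oopses
```

Read this way, error code 0002 is a kernel-mode write to a not-present page, and the PRESENT bit is clear in the PTE of every oops quoted here, which is what one would expect if DEBUG_PAGEALLOC had unmapped the page. In both new_slab traces the faulting bytes <f3> aa are a rep stos with 0x5a in AL, RDI equal to CR2, and RCX still at 0x8000, so the fault appears to hit the first byte of an eight-page (order 3) fill. The 0x5a value matches the kernel's POISON_INUSE pattern, which would be consistent with SLUB's poison fill of a freshly allocated slab, though that reading is an inference from the registers rather than something stated in the report.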