I am seeing a kernel oops during tempest runs in net/core/skbuff.c, pasted below This is Fedora 24 (4.6.4-301.fc24.x86_64) running as a guest in a KVM environment on the OSIC cloud. I realise this isn't much help at this point ... tempest is doing a whole bunch of adding interfaces, routers, etc. as neutron goes through it's paces. I'll try and work on getting something more repeatable... Just as another data point, I believe the issue happens on RAX hosts as well (Xen guests) -- however due to their different infrastructure I can not seem to get the oops from the serial log or have it logged via netconsole. But the host dies at a similar point of testing which is very suspicious. --- [ 7162.053779] skbuff: skb_over_panic: text:ffffffff817a3f58 len:40 put:40 head:ffff88004d9b6000 data:ffff88004d9c6010 tail:0x10038 end:0xec0 dev:qg-0d7c6880-b1 [ 7162.053921] ------------[ cut here ]------------ [ 7162.053922] kernel BUG at net/core/skbuff.c:104! [ 7162.053924] invalid opcode: 0000 [#1] SMP [ 7162.053944] Modules linked in: xt_CT xt_mac xt_physdev br_netfilter xt_set ip_set_hash_net ip_set nf_conntrack_netlink nfnetlink veth xt_nat xt_REDIRECT nf_nat_redirect xt_mark xt_connmark ip6table_raw ip6table_mangle xt_comment iptable_raw xfs loop xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 nls_utf8 isofs tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_nat nfsd auth_rpcgss nfs_acl lockd grace sunrpc openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_nat_ipv4 nf_defrag_ipv6 nf_nat libcrc32c nf_log_ipv4 nf_log_common xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack kvm_intel kvm ttm ppdev drm_kms_helper irqbypass drm crct10dif_pclmul crc32_pclmul acpi_cpufreq ghash_clmulni_intel virtio_net parport_pc i2c_piix4 joydev pvpanic parport tpm_tis tpm virtio_balloon virtio_blk crc32c_intel virtio_pci virtio_ring ata_generic virtio pata_acpi serio_raw [ 7162.053956] CPU: 3 PID: 22738 Comm: ovs-vswitchd Not tainted 4.6.4-301.fc24.x86_64 #1 [ 7162.053956] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011 [ 7162.053957] task: ffff880189a89e80 ti: ffff8801ac350000 task.ti: ffff8801ac350000 [ 7162.053978] RIP: 0010:[<ffffffff816ac334>] [<ffffffff816ac334>] skb_panic+0x64/0x70 [ 7162.053988] RSP: 0018:ffff88023fd83c88 EFLAGS: 00010282 [ 7162.053988] RAX: 0000000000000091 RBX: ffff88002d5ca400 RCX: 0000000000000002 [ 7162.053989] RDX: 0000000000000000 RSI: 0000000000000296 RDI: 0000000000000296 [ 7162.053990] RBP: ffff88023fd83ca8 R08: 000000000f99a6c7 R09: 0000000000000091 [ 7162.053990] R10: ffff880237002f00 R11: 000000000000b1e4 R12: ffff88002d4b5e80 [ 7162.053991] R13: ffff88023fd83d20 R14: ffffffff81d07f00 R15: 0000000000000000 [ 7162.053992] FS: 00007f79b25e6a40(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000 [ 7162.053992] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7162.053993] CR2: 000056163abd8008 CR3: 0000000189871000 CR4: 00000000001406e0 [ 7162.053998] Stack: [ 7162.053999] ffff88004d9c6010 0000000000010038 0000000000000ec0 ffff8801477b3000 [ 7162.054003] ffff88023fd83cb8 ffffffff816ac84d ffff88023fd83cf0 ffffffff817a3f58 [ 7162.054004] ffff88002d5ca400 ffff88023fd83d20 ffff88002d4b5e80 ffff8801477b3000 [ 7162.054004] Call Trace: [ 7162.054011] <IRQ> [ 7162.054011] [<ffffffff816ac84d>] skb_put+0x4d/0x50 [ 7162.054019] [<ffffffff817a3f58>] ip6_mc_hdr.constprop.39+0x58/0xd0 [ 7162.054020] [<ffffffff817a4101>] mld_newpack+0x131/0x1c0 [ 7162.054032] [<ffffffff810651ce>] ? kvm_clock_read+0x1e/0x20 [ 7162.054034] [<ffffffff817a421d>] add_grhead.isra.27+0x8d/0xa0 [ 7162.054035] [<ffffffff817a4f10>] add_grec+0x390/0x4b0 [ 7162.054036] [<ffffffff817a5df2>] mld_ifc_timer_expire+0x1f2/0x290 [ 7162.054037] [<ffffffff817a5c00>] ? igmp6_timer_handler+0x70/0x70 [ 7162.054044] [<ffffffff81112eb5>] call_timer_fn+0x35/0x120 [ 7162.054045] [<ffffffff817a5c00>] ? igmp6_timer_handler+0x70/0x70 [ 7162.054046] [<ffffffff811134fe>] run_timer_softirq+0x21e/0x2d0 [ 7162.054051] [<ffffffff817e34e2>] __do_softirq+0x112/0x2e7 [ 7162.054056] [<ffffffff810ad145>] irq_exit+0x105/0x110 [ 7162.054057] [<ffffffff817e32e2>] smp_apic_timer_interrupt+0x42/0x50 [ 7162.054058] [<ffffffff817e141c>] apic_timer_interrupt+0x8c/0xa0 [ 7162.054063] <EOI> [ 7162.054063] [<ffffffff810ffb58>] ? console_unlock+0x348/0x590 [ 7162.054065] [<ffffffff81100037>] vprintk_emit+0x297/0x490 [ 7162.054066] [<ffffffff811003b9>] vprintk_default+0x29/0x40 [ 7162.054073] [<ffffffff811b68cd>] printk+0x57/0x73 [ 7162.054086] [<ffffffff81408f41>] ? __nla_reserve+0x41/0x50 [ 7162.054091] [<ffffffff816c53f6>] __dev_set_promiscuity+0xf6/0x1d0 [ 7162.054092] [<ffffffff81408fe6>] ? nla_put+0x36/0x40 [ 7162.054098] [<ffffffffc0282fa1>] ? ovs_vport_get_upcall_portids+0x71/0x80 [openvswitch] [ 7162.054100] [<ffffffff816c5794>] dev_set_promiscuity+0x24/0x50 [ 7162.054102] [<ffffffffc02832d1>] internal_dev_destroy+0x31/0x60 [openvswitch] [ 7162.054104] [<ffffffffc0282c66>] ovs_vport_del+0x46/0x50 [openvswitch] [ 7162.054106] [<ffffffffc027798b>] ovs_vport_cmd_del+0x12b/0x190 [openvswitch] [ 7162.054112] [<ffffffff816fb04b>] genl_family_rcv_msg+0x1db/0x3c0 [ 7162.054118] [<ffffffff812234c1>] ? __slab_free+0xc1/0x2a0 [ 7162.054119] [<ffffffff816ad04b>] ? __alloc_skb+0x5b/0x1f0 [ 7162.054120] [<ffffffff816fb230>] ? genl_family_rcv_msg+0x3c0/0x3c0 [ 7162.054122] [<ffffffff816fb2b7>] genl_rcv_msg+0x87/0xc0 [ 7162.054123] [<ffffffff816f873b>] ? netlink_lookup+0xdb/0x140 [ 7162.054124] [<ffffffff816fa7b4>] netlink_rcv_skb+0xa4/0xc0 [ 7162.054125] [<ffffffff816fae58>] genl_rcv+0x28/0x40 [ 7162.054126] [<ffffffff816fa1a1>] netlink_unicast+0x191/0x220 [ 7162.054127] [<ffffffff816fa527>] netlink_sendmsg+0x2f7/0x3b0 [ 7162.054130] [<ffffffff816a50b8>] sock_sendmsg+0x38/0x50 [ 7162.054131] [<ffffffff816a5b6e>] ___sys_sendmsg+0x28e/0x2a0 [ 7162.054141] [<ffffffff816a4a3f>] ? sock_destroy_inode+0x2f/0x40 [ 7162.054145] [<ffffffff81264e6b>] ? destroy_inode+0x3b/0x60 [ 7162.054147] [<ffffffff81264fac>] ? evict+0x11c/0x180 [ 7162.054148] [<ffffffff8126045e>] ? dentry_free+0x4e/0x90 [ 7162.054148] [<ffffffff812611dd>] ? __dentry_kill+0x13d/0x1b0 [ 7162.054149] [<ffffffff81261416>] ? dput+0x1c6/0x200 [ 7162.054151] [<ffffffff816a64c4>] __sys_sendmsg+0x54/0x90 [ 7162.054152] [<ffffffff816a6512>] SyS_sendmsg+0x12/0x20 [ 7162.054159] [<ffffffff81003d92>] do_syscall_64+0x62/0x110 [ 7162.054160] [<ffffffff817e0821>] entry_SYSCALL64_slow_path+0x25/0x25 [ 7162.054170] Code: cc 00 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 20 58 ac 81 48 89 04 24 e8 42 a5 b0 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 e8 c7 cb d3 ff [ 7162.054171] RIP [<ffffffff816ac334>] skb_panic+0x64/0x70 [ 7162.054172] RSP <ffff88023fd83c88> [ 7162.054443] ---[ end trace 1314dd437df3d236 ]--- [ 7162.054444] Kernel panic - not syncing: Fatal exception in interrupt [ 7162.054472] Kernel Offset: disabled [ 7162.055439] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ---
Created attachment 1185355 [details] Full log from host Full serial log from host
Created attachment 1185383 [details] Script to reproduce the tempest failure Script to reproduce tempest failure
This replicates with the current rawhide kernel 4.8.0-0.rc0.git3.1.fc26.x86_64 --- [ 2256.904429] ------------[ cut here ]------------ [ 2256.904430] kernel BUG at net/core/skbuff.c:105! [ 2256.904431] invalid opcode: 0000 [#1] SMP [ 2256.904455] Modules linked in: nf_conntrack_netlink xt_CT xt_mac xt_physdev br_netfilter xt_set ip_set_hash_net ip_set nfnetlink veth xt_nat xt_REDIRECT nf_nat_redirect xt_mark xt_connmark ip6table_raw ip6table_mangle xt_comment iptable_raw xfs loop xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 nls_utf8 isofs tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_nat nfsd auth_rpcgss nfs_acl lockd grace sunrpc openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_nat_ipv4 nf_defrag_ipv6 nf_nat libcrc32c nf_log_ipv4 nf_log_common xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev joydev ttm drm_kms_helper virtio_balloon parport_pc acpi_cpufreq pvpanic drm i2c_piix4 [ 2256.904461] tpm_tis parport tpm virtio_net virtio_blk crc32c_intel serio_raw virtio_pci virtio_ring ata_generic pata_acpi virtio [ 2256.904464] CPU: 3 PID: 6510 Comm: httpd Not tainted 4.8.0-0.rc0.git3.1.fc26.x86_64 #1 [ 2256.904464] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011 [ 2256.904465] task: ffffa01bea8eb240 task.stack: ffffa01bea8fc000 [ 2256.904482] RIP: 0010:[<ffffffff8377ed64>] [<ffffffff8377ed64>] skb_panic+0x64/0x70 [ 2256.904483] RSP: 0000:ffffa01c37403c18 EFLAGS: 00010286 [ 2256.904484] RAX: 0000000000000091 RBX: ffffa01a899d4300 RCX: ffffa01bea8eb240 [ 2256.904484] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000286 [ 2256.904485] RBP: ffffa01c37403c38 R08: 0000000000000001 R09: 0000000000000001 [ 2256.904485] R10: ffffa01a0a0be000 R11: 000000000001cbe4 R12: ffffa01a45845480 [ 2256.904486] R13: ffffa01c37403cb0 R14: ffffffff83f75ca0 R15: 0000000000000000 [ 2256.904487] FS: 00007f265a42c700(0000) GS:ffffa01c37400000(0000) knlGS:0000000000000000 [ 2256.904488] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2256.904489] CR2: 00007f264dd13468 CR3: 00000001ea3ee000 CR4: 00000000001406e0 [ 2256.904493] Stack: [ 2256.904496] ffffa01a0a0ce010 0000000000010038 0000000000000ec0 ffffa01c354a4000 [ 2256.904497] ffffa01c37403c48 ffffffff8377efdd ffffa01c37403c80 ffffffff838a4348 [ 2256.904498] ffffa01a899d4300 ffffa01c37403cb0 ffffa01a45845480 ffffa01c354a4000 [ 2256.904498] Call Trace: [ 2256.904504] <IRQ> [ 2256.904504] [<ffffffff8377efdd>] skb_put+0x4d/0x50 [ 2256.904513] [<ffffffff838a4348>] ip6_mc_hdr.constprop.42+0x58/0xd0 [ 2256.904514] [<ffffffff838a44f1>] mld_newpack+0x131/0x1c0 [ 2256.904516] [<ffffffff838a460d>] add_grhead.isra.30+0x8d/0xa0 [ 2256.904518] [<ffffffff838a5ff0>] add_grec+0x390/0x4b0 [ 2256.904520] [<ffffffff838a6533>] mld_ifc_timer_expire+0x203/0x2a0 [ 2256.904521] [<ffffffff838a6330>] ? mld_dad_timer_expire+0x60/0x60 [ 2256.904533] [<ffffffff831390df>] call_timer_fn+0xaf/0x340 [ 2256.904534] [<ffffffff83139035>] ? call_timer_fn+0x5/0x340 [ 2256.904536] [<ffffffff838a6330>] ? mld_dad_timer_expire+0x60/0x60 [ 2256.904537] [<ffffffff8313a00e>] run_timer_softirq+0x26e/0x660 [ 2256.904544] [<ffffffff83037e89>] ? sched_clock+0x9/0x10 [ 2256.904549] [<ffffffff830ea307>] ? sched_clock_local+0x17/0x80 [ 2256.904555] [<ffffffff838f5846>] __do_softirq+0xd6/0x4c5 [ 2256.904559] [<ffffffff830b5fd1>] irq_exit+0xf1/0x100 [ 2256.904560] [<ffffffff838f5502>] smp_apic_timer_interrupt+0x42/0x50 [ 2256.904573] [<ffffffff838f3566>] apic_timer_interrupt+0x96/0xa0 [ 2256.904575] <EOI> [ 2256.904587] Code: cc 00 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 88 17 d1 83 48 89 04 24 e8 1a 5d a7 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 e8 b7 42 cf ff [ 2256.904589] RIP [<ffffffff8377ed64>] skb_panic+0x64/0x70 [ 2256.904589] RSP <ffffa01c37403c18> ---
Here's what I think is going on [ 2061.740016] skbuff: skb_over_panic: text:ffffffff817a3f58 len:40 put:40 head:ffff880022d53000 data:ffff880022d63010 tail:0x10038 end:0xec0 dev:tap07c424ec-90 So tail is greater than end, by 0x10038 - 0xec0 = 0xf178 That is a rather suspicious number. This seems most likely to be set by skb_reserve(skb, hlen), where hlen = LL_RESERVED_SPACE(dev) Where LL_RESERVED_SPACE comes from #define LL_RESERVED_SPACE(dev) \ 257 ((((dev)->hard_header_len+(dev)->needed_headroom)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD) I pulled the skb reference out of the registers and checked out dev->needed_headroom crash> struct sk_buff.dev ffff880152d43600 dev = 0xffff880138885000 crash> struct net_device.needed_headroom 0xffff880138885000 needed_headroom = 65535 So there's our 0xffff Now I notice in 871b642adebe300be2e50aa5f65a418510f636ec [1] we introduced ndo_set_rx_headroom to net_device with the comment + * void (*ndo_set_rx_headroom)(struct net_device *dev, int needed_headroom); + * This function is used to specify the headroom that the skb must + * consider when allocation skb during packet reception. Setting + * appropriate rx headroom value allows avoiding skb head copy on + * forward. Setting a negative value reset the rx headroom to the + * default value. So I suspect we're seeing a negative value here being propagated into net_device->needed_headroom (incorectly?). This is a tap device crash> struct net_device.name 0xffff880138885000 name = "tap07c424ec-90\000" But most certainly attached to an ovs bridge (not exactly sure how to go from a net_device to see what it's attached to) The only place I can see this happening is in net/openvswitch/vport-internal_dev.c (via 3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7 [2]) where ndo_set_rx_headroom has been implemented with +void internal_set_rx_headroom(struct net_device *dev, int new_hr) +{ + dev->needed_headroom = new_hr; +} + So I'm wondering, should that be new_hr < 0 ? 0 : new_hdr So that a negative value (i.e. the default mentioned in ndo_set_rx_headroom) doesn't get propagated? Or, is this a miscalculation in mld_newpack? [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=871b642adebe300be2e50aa5f65a418510f636ec [2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7
Thanks to Ben for pointing out to me I should have followed the device ops which leads back to it being an ovs port. this makes even more sense --- struct net_device 0xffff880138885000 name = "tap07c424ec-90\000", [...] netdev_ops = 0xffffffffc0251be0, struct net_device_ops 0xffffffffc024e0e0 ndo_open = 0xffffffffc024e0e0 <internal_dev_open>, ndo_set_rx_headroom = 0xffffffffc024e140 <internal_set_rx_headroom> kmem 0xffffffffc024e0e0 ffffffffc024e0e0 (t) internal_dev_open [openvswitch] ---
See http://marc.info/?l=linux-netdev&m=147018227805298&w=2 https://patchwork.ozlabs.org/patch/655184/
Fix included with https://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=5ef9f289c4e698054e5687edb54f0da3cdc9173a
I dropped the patch in rawhide, should show up in a few days. I'm going to leave this bug open until verified that it is fixed.
kernel-4.6.7-300.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-5e24d8c350
kernel-4.6.7-300.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-5e24d8c350
kernel-4.6.7-300.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.