Bug 1361414
| Summary: | kernel BUG at net/core/skbuff.c:104 during OpenStack tempest run | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Ian Wienand <iwienand> | ||||||
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
| Severity: | unspecified | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 24 | CC: | gansalmon, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | kernel-4.6.7-300.fc24 | Doc Type: | If docs needed, set a value | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2016-08-23 12:48:02 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
|
Description
Ian Wienand
2016-07-29 04:21:24 UTC
Created attachment 1185355 [details]
Full log from host
Full serial log from host
Created attachment 1185383 [details]
Script to reproduce the tempest failure
Script to reproduce tempest failure
This replicates with the current rawhide kernel 4.8.0-0.rc0.git3.1.fc26.x86_64 --- [ 2256.904429] ------------[ cut here ]------------ [ 2256.904430] kernel BUG at net/core/skbuff.c:105! [ 2256.904431] invalid opcode: 0000 [#1] SMP [ 2256.904455] Modules linked in: nf_conntrack_netlink xt_CT xt_mac xt_physdev br_netfilter xt_set ip_set_hash_net ip_set nfnetlink veth xt_nat xt_REDIRECT nf_nat_redirect xt_mark xt_connmark ip6table_raw ip6table_mangle xt_comment iptable_raw xfs loop xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 nls_utf8 isofs tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_nat nfsd auth_rpcgss nfs_acl lockd grace sunrpc openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_nat_ipv4 nf_defrag_ipv6 nf_nat libcrc32c nf_log_ipv4 nf_log_common xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev joydev ttm drm_kms_helper virtio_balloon parport_pc acpi_cpufreq pvpanic drm i2c_piix4 [ 2256.904461] tpm_tis parport tpm virtio_net virtio_blk crc32c_intel serio_raw virtio_pci virtio_ring ata_generic pata_acpi virtio [ 2256.904464] CPU: 3 PID: 6510 Comm: httpd Not tainted 4.8.0-0.rc0.git3.1.fc26.x86_64 #1 [ 2256.904464] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011 [ 2256.904465] task: ffffa01bea8eb240 task.stack: ffffa01bea8fc000 [ 2256.904482] RIP: 0010:[<ffffffff8377ed64>] [<ffffffff8377ed64>] skb_panic+0x64/0x70 [ 2256.904483] RSP: 0000:ffffa01c37403c18 EFLAGS: 00010286 [ 2256.904484] RAX: 0000000000000091 RBX: ffffa01a899d4300 RCX: ffffa01bea8eb240 [ 2256.904484] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000286 [ 2256.904485] RBP: ffffa01c37403c38 R08: 0000000000000001 R09: 0000000000000001 [ 2256.904485] R10: ffffa01a0a0be000 R11: 000000000001cbe4 R12: ffffa01a45845480 [ 2256.904486] R13: ffffa01c37403cb0 R14: ffffffff83f75ca0 R15: 0000000000000000 [ 2256.904487] FS: 00007f265a42c700(0000) GS:ffffa01c37400000(0000) knlGS:0000000000000000 [ 2256.904488] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2256.904489] CR2: 00007f264dd13468 CR3: 00000001ea3ee000 CR4: 00000000001406e0 [ 2256.904493] Stack: [ 2256.904496] ffffa01a0a0ce010 0000000000010038 0000000000000ec0 ffffa01c354a4000 [ 2256.904497] ffffa01c37403c48 ffffffff8377efdd ffffa01c37403c80 ffffffff838a4348 [ 2256.904498] ffffa01a899d4300 ffffa01c37403cb0 ffffa01a45845480 ffffa01c354a4000 [ 2256.904498] Call Trace: [ 2256.904504] <IRQ> [ 2256.904504] [<ffffffff8377efdd>] skb_put+0x4d/0x50 [ 2256.904513] [<ffffffff838a4348>] ip6_mc_hdr.constprop.42+0x58/0xd0 [ 2256.904514] [<ffffffff838a44f1>] mld_newpack+0x131/0x1c0 [ 2256.904516] [<ffffffff838a460d>] add_grhead.isra.30+0x8d/0xa0 [ 2256.904518] [<ffffffff838a5ff0>] add_grec+0x390/0x4b0 [ 2256.904520] [<ffffffff838a6533>] mld_ifc_timer_expire+0x203/0x2a0 [ 2256.904521] [<ffffffff838a6330>] ? mld_dad_timer_expire+0x60/0x60 [ 2256.904533] [<ffffffff831390df>] call_timer_fn+0xaf/0x340 [ 2256.904534] [<ffffffff83139035>] ? call_timer_fn+0x5/0x340 [ 2256.904536] [<ffffffff838a6330>] ? mld_dad_timer_expire+0x60/0x60 [ 2256.904537] [<ffffffff8313a00e>] run_timer_softirq+0x26e/0x660 [ 2256.904544] [<ffffffff83037e89>] ? sched_clock+0x9/0x10 [ 2256.904549] [<ffffffff830ea307>] ? sched_clock_local+0x17/0x80 [ 2256.904555] [<ffffffff838f5846>] __do_softirq+0xd6/0x4c5 [ 2256.904559] [<ffffffff830b5fd1>] irq_exit+0xf1/0x100 [ 2256.904560] [<ffffffff838f5502>] smp_apic_timer_interrupt+0x42/0x50 [ 2256.904573] [<ffffffff838f3566>] apic_timer_interrupt+0x96/0xa0 [ 2256.904575] <EOI> [ 2256.904587] Code: cc 00 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 88 17 d1 83 48 89 04 24 e8 1a 5d a7 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 e8 b7 42 cf ff [ 2256.904589] RIP [<ffffffff8377ed64>] skb_panic+0x64/0x70 [ 2256.904589] RSP <ffffa01c37403c18> --- Here's what I think is going on
[ 2061.740016] skbuff: skb_over_panic: text:ffffffff817a3f58 len:40 put:40 head:ffff880022d53000 data:ffff880022d63010 tail:0x10038 end:0xec0 dev:tap07c424ec-90
So tail is greater than end, by
0x10038 - 0xec0 = 0xf178
That is a rather suspicious number. This seems most likely to be set by skb_reserve(skb, hlen), where
hlen = LL_RESERVED_SPACE(dev)
Where LL_RESERVED_SPACE comes from
#define LL_RESERVED_SPACE(dev) \
257 ((((dev)->hard_header_len+(dev)->needed_headroom)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
I pulled the skb reference out of the registers and checked out
dev->needed_headroom
crash> struct sk_buff.dev ffff880152d43600
dev = 0xffff880138885000
crash> struct net_device.needed_headroom 0xffff880138885000
needed_headroom = 65535
So there's our 0xffff
Now I notice in 871b642adebe300be2e50aa5f65a418510f636ec [1] we
introduced ndo_set_rx_headroom to net_device with the comment
+ * void (*ndo_set_rx_headroom)(struct net_device *dev, int needed_headroom);
+ * This function is used to specify the headroom that the skb must
+ * consider when allocation skb during packet reception. Setting
+ * appropriate rx headroom value allows avoiding skb head copy on
+ * forward. Setting a negative value reset the rx headroom to the
+ * default value.
So I suspect we're seeing a negative value here being propagated
into net_device->needed_headroom (incorectly?).
This is a tap device
crash> struct net_device.name 0xffff880138885000
name = "tap07c424ec-90\000"
But most certainly attached to an ovs bridge (not exactly sure how to go from a net_device to see what it's attached to)
The only place I can see this happening is in net/openvswitch/vport-internal_dev.c (via 3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7 [2]) where
ndo_set_rx_headroom has been implemented with
+void internal_set_rx_headroom(struct net_device *dev, int new_hr)
+{
+ dev->needed_headroom = new_hr;
+}
+
So I'm wondering, should that be
new_hr < 0 ? 0 : new_hdr
So that a negative value (i.e. the default mentioned in ndo_set_rx_headroom) doesn't get propagated?
Or, is this a miscalculation in mld_newpack?
[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=871b642adebe300be2e50aa5f65a418510f636ec
[2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7
Thanks to Ben for pointing out to me I should have followed the device ops which leads back to it being an ovs port. this makes even more sense --- struct net_device 0xffff880138885000 name = "tap07c424ec-90\000", [...] netdev_ops = 0xffffffffc0251be0, struct net_device_ops 0xffffffffc024e0e0 ndo_open = 0xffffffffc024e0e0 <internal_dev_open>, ndo_set_rx_headroom = 0xffffffffc024e140 <internal_set_rx_headroom> kmem 0xffffffffc024e0e0 ffffffffc024e0e0 (t) internal_dev_open [openvswitch] --- See http://marc.info/?l=linux-netdev&m=147018227805298&w=2 https://patchwork.ozlabs.org/patch/655184/ Fix included with https://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=5ef9f289c4e698054e5687edb54f0da3cdc9173a I dropped the patch in rawhide, should show up in a few days. I'm going to leave this bug open until verified that it is fixed. kernel-4.6.7-300.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-5e24d8c350 kernel-4.6.7-300.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-5e24d8c350 kernel-4.6.7-300.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report. |