Bug 1361414

Summary: kernel BUG at net/core/skbuff.c:104 during OpenStack tempest run
Product: [Fedora] Fedora Reporter: Ian Wienand <iwienand>
Component: kernelAssignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 24CC: gansalmon, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-4.6.7-300.fc24 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-23 12:48:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Full log from host
none
Script to reproduce the tempest failure none

Description Ian Wienand 2016-07-29 04:21:24 UTC
I am seeing a kernel oops during tempest runs in net/core/skbuff.c, pasted below

This is Fedora 24 (4.6.4-301.fc24.x86_64) running as a guest in a KVM environment on the OSIC cloud.  I realise this isn't much help at this point ... tempest is doing a whole bunch of adding interfaces, routers, etc. as neutron goes through it's paces.  I'll try and work on getting something more repeatable...

Just as another data point, I believe the issue happens on RAX hosts as well (Xen guests) -- however due to their different infrastructure I can not seem to get the oops from the serial log or have it logged via netconsole.  But the host dies at a similar point of testing which is very suspicious.

---
[ 7162.053779] skbuff: skb_over_panic: text:ffffffff817a3f58 len:40 put:40 head:ffff88004d9b6000 data:ffff88004d9c6010 tail:0x10038 end:0xec0 dev:qg-0d7c6880-b1
[ 7162.053921] ------------[ cut here ]------------
[ 7162.053922] kernel BUG at net/core/skbuff.c:104!
[ 7162.053924] invalid opcode: 0000 [#1] SMP 
[ 7162.053944] Modules linked in: xt_CT xt_mac xt_physdev br_netfilter xt_set ip_set_hash_net ip_set nf_conntrack_netlink nfnetlink veth xt_nat xt_REDIRECT nf_nat_redirect xt_mark xt_connmark ip6table_raw ip6table_mangle xt_comment iptable_raw xfs loop xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 nls_utf8 isofs tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_nat nfsd auth_rpcgss nfs_acl lockd grace sunrpc openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_nat_ipv4 nf_defrag_ipv6 nf_nat libcrc32c nf_log_ipv4 nf_log_common xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack kvm_intel kvm ttm ppdev drm_kms_helper irqbypass drm crct10dif_pclmul crc32_pclmul acpi_cpufreq ghash_clmulni_intel virtio_net parport_pc i2c_piix4 joydev pvpanic parport tpm_tis tpm virtio_balloon virtio_blk crc32c_intel virtio_pci virtio_ring ata_generic virtio pata_acpi serio_raw
[ 7162.053956] CPU: 3 PID: 22738 Comm: ovs-vswitchd Not tainted 4.6.4-301.fc24.x86_64 #1
[ 7162.053956] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[ 7162.053957] task: ffff880189a89e80 ti: ffff8801ac350000 task.ti: ffff8801ac350000
[ 7162.053978] RIP: 0010:[<ffffffff816ac334>]  [<ffffffff816ac334>] skb_panic+0x64/0x70
[ 7162.053988] RSP: 0018:ffff88023fd83c88  EFLAGS: 00010282
[ 7162.053988] RAX: 0000000000000091 RBX: ffff88002d5ca400 RCX: 0000000000000002
[ 7162.053989] RDX: 0000000000000000 RSI: 0000000000000296 RDI: 0000000000000296
[ 7162.053990] RBP: ffff88023fd83ca8 R08: 000000000f99a6c7 R09: 0000000000000091
[ 7162.053990] R10: ffff880237002f00 R11: 000000000000b1e4 R12: ffff88002d4b5e80
[ 7162.053991] R13: ffff88023fd83d20 R14: ffffffff81d07f00 R15: 0000000000000000
[ 7162.053992] FS:  00007f79b25e6a40(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
[ 7162.053992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7162.053993] CR2: 000056163abd8008 CR3: 0000000189871000 CR4: 00000000001406e0
[ 7162.053998] Stack:
[ 7162.053999]  ffff88004d9c6010 0000000000010038 0000000000000ec0 ffff8801477b3000
[ 7162.054003]  ffff88023fd83cb8 ffffffff816ac84d ffff88023fd83cf0 ffffffff817a3f58
[ 7162.054004]  ffff88002d5ca400 ffff88023fd83d20 ffff88002d4b5e80 ffff8801477b3000
[ 7162.054004] Call Trace:
[ 7162.054011]  <IRQ> 
[ 7162.054011]  [<ffffffff816ac84d>] skb_put+0x4d/0x50
[ 7162.054019]  [<ffffffff817a3f58>] ip6_mc_hdr.constprop.39+0x58/0xd0
[ 7162.054020]  [<ffffffff817a4101>] mld_newpack+0x131/0x1c0
[ 7162.054032]  [<ffffffff810651ce>] ? kvm_clock_read+0x1e/0x20
[ 7162.054034]  [<ffffffff817a421d>] add_grhead.isra.27+0x8d/0xa0
[ 7162.054035]  [<ffffffff817a4f10>] add_grec+0x390/0x4b0
[ 7162.054036]  [<ffffffff817a5df2>] mld_ifc_timer_expire+0x1f2/0x290
[ 7162.054037]  [<ffffffff817a5c00>] ? igmp6_timer_handler+0x70/0x70
[ 7162.054044]  [<ffffffff81112eb5>] call_timer_fn+0x35/0x120
[ 7162.054045]  [<ffffffff817a5c00>] ? igmp6_timer_handler+0x70/0x70
[ 7162.054046]  [<ffffffff811134fe>] run_timer_softirq+0x21e/0x2d0
[ 7162.054051]  [<ffffffff817e34e2>] __do_softirq+0x112/0x2e7
[ 7162.054056]  [<ffffffff810ad145>] irq_exit+0x105/0x110
[ 7162.054057]  [<ffffffff817e32e2>] smp_apic_timer_interrupt+0x42/0x50
[ 7162.054058]  [<ffffffff817e141c>] apic_timer_interrupt+0x8c/0xa0
[ 7162.054063]  <EOI> 
[ 7162.054063]  [<ffffffff810ffb58>] ? console_unlock+0x348/0x590
[ 7162.054065]  [<ffffffff81100037>] vprintk_emit+0x297/0x490
[ 7162.054066]  [<ffffffff811003b9>] vprintk_default+0x29/0x40
[ 7162.054073]  [<ffffffff811b68cd>] printk+0x57/0x73
[ 7162.054086]  [<ffffffff81408f41>] ? __nla_reserve+0x41/0x50
[ 7162.054091]  [<ffffffff816c53f6>] __dev_set_promiscuity+0xf6/0x1d0
[ 7162.054092]  [<ffffffff81408fe6>] ? nla_put+0x36/0x40
[ 7162.054098]  [<ffffffffc0282fa1>] ? ovs_vport_get_upcall_portids+0x71/0x80 [openvswitch]
[ 7162.054100]  [<ffffffff816c5794>] dev_set_promiscuity+0x24/0x50
[ 7162.054102]  [<ffffffffc02832d1>] internal_dev_destroy+0x31/0x60 [openvswitch]
[ 7162.054104]  [<ffffffffc0282c66>] ovs_vport_del+0x46/0x50 [openvswitch]
[ 7162.054106]  [<ffffffffc027798b>] ovs_vport_cmd_del+0x12b/0x190 [openvswitch]
[ 7162.054112]  [<ffffffff816fb04b>] genl_family_rcv_msg+0x1db/0x3c0
[ 7162.054118]  [<ffffffff812234c1>] ? __slab_free+0xc1/0x2a0
[ 7162.054119]  [<ffffffff816ad04b>] ? __alloc_skb+0x5b/0x1f0
[ 7162.054120]  [<ffffffff816fb230>] ? genl_family_rcv_msg+0x3c0/0x3c0
[ 7162.054122]  [<ffffffff816fb2b7>] genl_rcv_msg+0x87/0xc0
[ 7162.054123]  [<ffffffff816f873b>] ? netlink_lookup+0xdb/0x140
[ 7162.054124]  [<ffffffff816fa7b4>] netlink_rcv_skb+0xa4/0xc0
[ 7162.054125]  [<ffffffff816fae58>] genl_rcv+0x28/0x40
[ 7162.054126]  [<ffffffff816fa1a1>] netlink_unicast+0x191/0x220
[ 7162.054127]  [<ffffffff816fa527>] netlink_sendmsg+0x2f7/0x3b0
[ 7162.054130]  [<ffffffff816a50b8>] sock_sendmsg+0x38/0x50
[ 7162.054131]  [<ffffffff816a5b6e>] ___sys_sendmsg+0x28e/0x2a0
[ 7162.054141]  [<ffffffff816a4a3f>] ? sock_destroy_inode+0x2f/0x40
[ 7162.054145]  [<ffffffff81264e6b>] ? destroy_inode+0x3b/0x60
[ 7162.054147]  [<ffffffff81264fac>] ? evict+0x11c/0x180
[ 7162.054148]  [<ffffffff8126045e>] ? dentry_free+0x4e/0x90
[ 7162.054148]  [<ffffffff812611dd>] ? __dentry_kill+0x13d/0x1b0
[ 7162.054149]  [<ffffffff81261416>] ? dput+0x1c6/0x200
[ 7162.054151]  [<ffffffff816a64c4>] __sys_sendmsg+0x54/0x90
[ 7162.054152]  [<ffffffff816a6512>] SyS_sendmsg+0x12/0x20
[ 7162.054159]  [<ffffffff81003d92>] do_syscall_64+0x62/0x110
[ 7162.054160]  [<ffffffff817e0821>] entry_SYSCALL64_slow_path+0x25/0x25
[ 7162.054170] Code: cc 00 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 20 58 ac 81 48 89 04 24 e8 42 a5 b0 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 e8 c7 cb d3 ff 
[ 7162.054171] RIP  [<ffffffff816ac334>] skb_panic+0x64/0x70
[ 7162.054172]  RSP <ffff88023fd83c88>
[ 7162.054443] ---[ end trace 1314dd437df3d236 ]---
[ 7162.054444] Kernel panic - not syncing: Fatal exception in interrupt
[ 7162.054472] Kernel Offset: disabled
[ 7162.055439] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
---

Comment 1 Ian Wienand 2016-07-29 04:22:32 UTC
Created attachment 1185355 [details]
Full log from host

Full serial log from host

Comment 2 Ian Wienand 2016-07-29 05:53:45 UTC
Created attachment 1185383 [details]
Script to reproduce the tempest failure

Script to reproduce tempest failure

Comment 3 Ian Wienand 2016-08-01 05:29:42 UTC
This replicates with the current rawhide kernel 4.8.0-0.rc0.git3.1.fc26.x86_64

---
[ 2256.904429] ------------[ cut here ]------------
[ 2256.904430] kernel BUG at net/core/skbuff.c:105!
[ 2256.904431] invalid opcode: 0000 [#1] SMP
[ 2256.904455] Modules linked in: nf_conntrack_netlink xt_CT xt_mac xt_physdev br_netfilter xt_set ip_set_hash_net ip_set nfnetlink veth xt_nat xt_REDIRECT nf_nat_redirect xt_mark xt_connmark ip6table_raw ip6table_mangle xt_comment iptable_raw xfs loop xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 nls_utf8 isofs tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_nat nfsd auth_rpcgss nfs_acl lockd grace sunrpc openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_nat_ipv4 nf_defrag_ipv6 nf_nat libcrc32c nf_log_ipv4 nf_log_common xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev joydev ttm drm_kms_helper virtio_balloon parport_pc acpi_cpufreq pvpanic drm i2c_piix4
[ 2256.904461]  tpm_tis parport tpm virtio_net virtio_blk crc32c_intel serio_raw virtio_pci virtio_ring ata_generic pata_acpi virtio
[ 2256.904464] CPU: 3 PID: 6510 Comm: httpd Not tainted 4.8.0-0.rc0.git3.1.fc26.x86_64 #1
[ 2256.904464] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[ 2256.904465] task: ffffa01bea8eb240 task.stack: ffffa01bea8fc000
[ 2256.904482] RIP: 0010:[<ffffffff8377ed64>]  [<ffffffff8377ed64>] skb_panic+0x64/0x70
[ 2256.904483] RSP: 0000:ffffa01c37403c18  EFLAGS: 00010286
[ 2256.904484] RAX: 0000000000000091 RBX: ffffa01a899d4300 RCX: ffffa01bea8eb240
[ 2256.904484] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000286
[ 2256.904485] RBP: ffffa01c37403c38 R08: 0000000000000001 R09: 0000000000000001
[ 2256.904485] R10: ffffa01a0a0be000 R11: 000000000001cbe4 R12: ffffa01a45845480
[ 2256.904486] R13: ffffa01c37403cb0 R14: ffffffff83f75ca0 R15: 0000000000000000
[ 2256.904487] FS:  00007f265a42c700(0000) GS:ffffa01c37400000(0000) knlGS:0000000000000000
[ 2256.904488] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2256.904489] CR2: 00007f264dd13468 CR3: 00000001ea3ee000 CR4: 00000000001406e0
[ 2256.904493] Stack:
[ 2256.904496]  ffffa01a0a0ce010 0000000000010038 0000000000000ec0 ffffa01c354a4000
[ 2256.904497]  ffffa01c37403c48 ffffffff8377efdd ffffa01c37403c80 ffffffff838a4348
[ 2256.904498]  ffffa01a899d4300 ffffa01c37403cb0 ffffa01a45845480 ffffa01c354a4000
[ 2256.904498] Call Trace:
[ 2256.904504]  <IRQ> 
[ 2256.904504]  [<ffffffff8377efdd>] skb_put+0x4d/0x50
[ 2256.904513]  [<ffffffff838a4348>] ip6_mc_hdr.constprop.42+0x58/0xd0
[ 2256.904514]  [<ffffffff838a44f1>] mld_newpack+0x131/0x1c0
[ 2256.904516]  [<ffffffff838a460d>] add_grhead.isra.30+0x8d/0xa0
[ 2256.904518]  [<ffffffff838a5ff0>] add_grec+0x390/0x4b0
[ 2256.904520]  [<ffffffff838a6533>] mld_ifc_timer_expire+0x203/0x2a0
[ 2256.904521]  [<ffffffff838a6330>] ? mld_dad_timer_expire+0x60/0x60
[ 2256.904533]  [<ffffffff831390df>] call_timer_fn+0xaf/0x340
[ 2256.904534]  [<ffffffff83139035>] ? call_timer_fn+0x5/0x340
[ 2256.904536]  [<ffffffff838a6330>] ? mld_dad_timer_expire+0x60/0x60
[ 2256.904537]  [<ffffffff8313a00e>] run_timer_softirq+0x26e/0x660
[ 2256.904544]  [<ffffffff83037e89>] ? sched_clock+0x9/0x10
[ 2256.904549]  [<ffffffff830ea307>] ? sched_clock_local+0x17/0x80
[ 2256.904555]  [<ffffffff838f5846>] __do_softirq+0xd6/0x4c5
[ 2256.904559]  [<ffffffff830b5fd1>] irq_exit+0xf1/0x100
[ 2256.904560]  [<ffffffff838f5502>] smp_apic_timer_interrupt+0x42/0x50
[ 2256.904573]  [<ffffffff838f3566>] apic_timer_interrupt+0x96/0xa0
[ 2256.904575]  <EOI> 
[ 2256.904587] Code: cc 00 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 88 17 d1 83 48 89 04 24 e8 1a 5d a7 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 e8 b7 42 cf ff 
[ 2256.904589] RIP  [<ffffffff8377ed64>] skb_panic+0x64/0x70
[ 2256.904589]  RSP <ffffa01c37403c18>
---

Comment 4 Ian Wienand 2016-08-02 04:44:28 UTC
Here's what I think is going on

[ 2061.740016] skbuff: skb_over_panic: text:ffffffff817a3f58 len:40 put:40 head:ffff880022d53000 data:ffff880022d63010 tail:0x10038 end:0xec0 dev:tap07c424ec-90

So tail is greater than end, by 

  0x10038 - 0xec0 = 0xf178

That is a rather suspicious number.  This seems most likely to be set by skb_reserve(skb, hlen), where

  hlen = LL_RESERVED_SPACE(dev)

Where LL_RESERVED_SPACE comes from

  #define LL_RESERVED_SPACE(dev) \
  257         ((((dev)->hard_header_len+(dev)->needed_headroom)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)

I pulled the skb reference out of the registers and checked out
dev->needed_headroom

 crash> struct sk_buff.dev ffff880152d43600
   dev = 0xffff880138885000
 crash> struct net_device.needed_headroom 0xffff880138885000
   needed_headroom = 65535

So there's our 0xffff

Now I notice in 871b642adebe300be2e50aa5f65a418510f636ec [1] we
introduced ndo_set_rx_headroom to net_device with the comment

 + * void (*ndo_set_rx_headroom)(struct net_device *dev, int needed_headroom);
 + *     This function is used to specify the headroom that the skb must
 + *     consider when allocation skb during packet reception. Setting
 + *     appropriate rx headroom value allows avoiding skb head copy on
 + *     forward. Setting a negative value reset the rx headroom to the
 + *     default value.

So I suspect we're seeing a negative value here being propagated
into net_device->needed_headroom (incorectly?).

This is a tap device 

 crash> struct net_device.name 0xffff880138885000
 name = "tap07c424ec-90\000"

But most certainly attached to an ovs bridge (not exactly sure how to go from a net_device to see what it's attached to)

The only place I can see this happening is in net/openvswitch/vport-internal_dev.c (via 3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7 [2]) where
ndo_set_rx_headroom has been implemented with

 +void internal_set_rx_headroom(struct net_device *dev, int new_hr)
 +{
 +       dev->needed_headroom = new_hr;
 +}
 +

So I'm wondering, should that be

 new_hr < 0 ? 0 : new_hdr

So that a negative value (i.e. the default mentioned in ndo_set_rx_headroom) doesn't get propagated?

Or, is this a miscalculation in mld_newpack?

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=871b642adebe300be2e50aa5f65a418510f636ec
[2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7

Comment 5 Ian Wienand 2016-08-02 23:22:15 UTC
Thanks to Ben for pointing out to me I should have followed the device ops which leads back to it being an ovs port.  this makes even more sense

---
struct net_device 0xffff880138885000
  name = "tap07c424ec-90\000",
  [...]
  netdev_ops = 0xffffffffc0251be0,
 
struct net_device_ops 0xffffffffc024e0e0
  ndo_open = 0xffffffffc024e0e0 <internal_dev_open>,
  ndo_set_rx_headroom = 0xffffffffc024e140 <internal_set_rx_headroom>
 
kmem 0xffffffffc024e0e0
ffffffffc024e0e0 (t) internal_dev_open [openvswitch]
---

Comment 8 Laura Abbott 2016-08-12 22:05:09 UTC
I dropped the patch in rawhide, should show up in a few days. I'm going to leave this bug open until verified that it is fixed.

Comment 9 Fedora Update System 2016-08-18 15:43:38 UTC
kernel-4.6.7-300.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-5e24d8c350

Comment 10 Fedora Update System 2016-08-19 00:57:17 UTC
kernel-4.6.7-300.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-5e24d8c350

Comment 11 Fedora Update System 2016-08-23 12:47:34 UTC
kernel-4.6.7-300.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.