Bug 1266601 - "hw csum failure"
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Hardware: x86_64 Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2015-09-25 15:49 EDT by Stefan Ring
Modified: 2016-03-28 08:19 EDT
Doc Type: Bug Fix
Last Closed: 2016-03-28 08:19:56 EDT
Type: Bug


Description Stefan Ring 2015-09-25 15:49:44 EDT
Description of problem:

I recently updated my kernel from 3.17.8 to 4.2.1. I could not upgrade earlier because I use ZFS on Linux, which had a long-standing incompatibility with 3.18 and later, so I don't know whether this also happens with the versions in between. There is one other report, though, where someone started seeing this with 4.2 as well: https://github.com/raspberrypi/linux/issues/1083

This is an unmodified kernel from http://pkgs.fedoraproject.org/cgit/kernel.git/commit/?h=f23&id=1dedebfc98c5866fcf6a3b65bed15fb594d957d1 rebuilt on/for F22.

Since the upgrade, I get this whenever another machine on the network sends DHCP requests:

[  192.721601] p35p1: hw csum failure
[  192.722550] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P           OE   4.2.1-300.str.fc22.x86_64 #1
[  192.722588] Hardware name: System manufacturer System Product Name/P5K WS, BIOS 1201    06/26/2008
[  192.722588]  0000000000000000 09b50d4da9aa776f ffff88022fc03998 ffffffff8177170a
[  192.722588]  0000000000000000 ffff8800365fa000 ffff88022fc039b8 ffffffff8165d250
[  192.722588]  ffffffff8164b190 ffff88021e62c600 ffff88022fc039e8 ffffffff81653dcc
[  192.722588] Call Trace:
[  192.722588]  <IRQ>  [<ffffffff8177170a>] dump_stack+0x45/0x57
[  192.722588]  [<ffffffff8165d250>] netdev_rx_csum_fault+0x40/0x50
[  192.722588]  [<ffffffff8164b190>] ? reqsk_fastopen_remove+0x160/0x160
[  192.722588]  [<ffffffff81653dcc>] __skb_checksum_complete+0xbc/0xd0
[  192.722588]  [<ffffffff81754278>] ipv6_mc_validate_checksum+0x98/0x150
[  192.722588]  [<ffffffff8164fffe>] skb_checksum_trimmed+0x9e/0x190
[  192.722588]  [<ffffffff81754459>] ipv6_mc_check_mld+0x129/0x340
[  192.722588]  [<ffffffffa0866447>] br_multicast_rcv+0x87/0xcc0 [bridge]
[  192.722588]  [<ffffffffa085d4d2>] br_handle_frame_finish+0x2a2/0x5f0 [bridge]
[  192.722588]  [<ffffffffa085d98a>] br_handle_frame+0x16a/0x290 [bridge]
[  192.722588]  [<ffffffff816602b4>] __netif_receive_skb_core+0x384/0xa00
[  192.722588]  [<ffffffff81752be0>] ? ipv6_gro_receive+0x230/0x320
[  192.722588]  [<ffffffff81660948>] __netif_receive_skb+0x18/0x60
[  192.722588]  [<ffffffff816609d0>] netif_receive_skb_internal+0x40/0xb0
[  192.722588]  [<ffffffff816615d5>] napi_gro_receive+0xb5/0xf0
[  192.722588]  [<ffffffffa0006373>] sky2_poll+0x613/0xda0 [sky2]
[  192.722588]  [<ffffffff81777b5e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
[  192.722588]  [<ffffffff814a3c28>] ? credit_entropy_bits+0x258/0x320
[  192.722588]  [<ffffffff814a3006>] ? __mix_pool_bytes+0x36/0x80
[  192.722588]  [<ffffffff81660ebc>] net_rx_action+0x20c/0x310
[  192.722588]  [<ffffffff810a281b>] __do_softirq+0xfb/0x290
[  192.722588]  [<ffffffff810a2bc9>] irq_exit+0x119/0x120
[  192.722588]  [<ffffffff8177ad28>] do_IRQ+0x58/0xe0
[  192.722588]  [<ffffffff81778c2b>] common_interrupt+0x6b/0x6b
[  192.722588]  <EOI>  [<ffffffff8101f07c>] ? mwait_idle+0x8c/0x140
[  192.722588]  [<ffffffff8101f61f>] arch_cpu_idle+0xf/0x20
[  192.722588]  [<ffffffff810dfc3a>] default_idle_call+0x2a/0x40
[  192.722588]  [<ffffffff810dff79>] cpu_startup_entry+0x2c9/0x320
[  192.722588]  [<ffffffff81767e4c>] rest_init+0x7c/0x80
[  192.722588]  [<ffffffff81d5702d>] start_kernel+0x49d/0x4be
[  192.722588]  [<ffffffff81d56120>] ? early_idt_handler_array+0x120/0x120
[  192.722588]  [<ffffffff81d56339>] x86_64_start_reservations+0x2a/0x2c
[  192.722588]  [<ffffffff81d56485>] x86_64_start_kernel+0x14a/0x16d
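
To see at a glance which subsystems the fault passes through, the trace can be reduced to function names and owning modules. The following is a minimal Python sketch, assuming the `[<address>] symbol+0xoff/0xsize [module]` line layout shown above. `parse_frames` is a hypothetical helper written for this illustration, not part of any kernel tool, and the sample contains only a few lines copied from the trace.

```python
import re

# Matches one frame of the call-trace layout used in this report, e.g.
#   [  192.722588]  [<ffffffffa0866447>] br_multicast_rcv+0x87/0xcc0 [bridge]
# A leading "?" marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(log_text):
    """Return (function, module, reliable) tuples for each frame found."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((
                m.group("func"),
                m.group("module") or "vmlinux",  # no [module] tag: built in
                m.group("guess") is None,
            ))
    return frames

# A few frames copied from the trace in this report.
sample = """\
[  192.722588]  <IRQ>  [<ffffffff8177170a>] dump_stack+0x45/0x57
[  192.722588]  [<ffffffff8164b190>] ? reqsk_fastopen_remove+0x160/0x160
[  192.722588]  [<ffffffff81754459>] ipv6_mc_check_mld+0x129/0x340
[  192.722588]  [<ffffffffa0866447>] br_multicast_rcv+0x87/0xcc0 [bridge]
[  192.722588]  [<ffffffffa0006373>] sky2_poll+0x613/0xda0 [sky2]
"""

# Print only the frames that are not marked with "?".
for func, module, reliable in parse_frames(sample):
    if reliable:
        print(f"{module:8} {func}")
```

Read bottom to top, the reliable frames show the packet arriving through the `sky2` driver, entering the `bridge` module, and reaching the MLD check in the IPv6 code before the checksum fault is reported.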

Version-Release number of selected component (if applicable):

4.2.1-300

How reproducible:

Boot the Fedora machine, let it connect to the network and remain idle, then plug a network cable into my MacBook.

Steps to Reproduce:
1. Start the Fedora machine. Let it connect to the network and remain idle.
2. Plug a network cable into the MacBook.

Actual results:

Stack trace shown above

Expected results:

No checksum failure message or stack trace in the kernel log.

Additional info:

This is a simple bridge configuration. The physical Ethernet card is the only member attached to it.
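
A bridge setup like this can be inspected through sysfs. The sketch below is Python and assumes the standard Linux layout, where member ports appear under `/sys/class/net/<bridge>/brif` and the snooping flag lives in `/sys/class/net/<bridge>/bridge/multicast_snooping`. The bridge name `br0` is an assumption (the report does not name the bridge), and `describe_bridge` is a hypothetical helper. Because the trace passes through `br_multicast_rcv`, the snooping state may be useful context when comparing machines.

```python
import os

def describe_bridge(name, sysfs_root="/sys/class/net"):
    """Return the member ports and multicast snooping flag of a bridge."""
    base = os.path.join(sysfs_root, name)
    # Each enslaved interface appears as an entry under <bridge>/brif.
    ports = sorted(os.listdir(os.path.join(base, "brif")))
    # The file holds "1" when snooping is enabled, "0" when disabled.
    with open(os.path.join(base, "bridge", "multicast_snooping")) as fh:
        snooping = fh.read().strip() == "1"
    return {"ports": ports, "multicast_snooping": snooping}

if __name__ == "__main__":
    # "br0" is an assumed name; substitute the actual bridge interface.
    if os.path.isdir("/sys/class/net/br0/brif"):
        print(describe_bridge("br0"))
```

For the configuration described here, the `ports` list would be expected to contain a single entry, the physical card (`p35p1` in the log above).
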
Comment 1 Stefan Ring 2015-11-19 14:49:01 EST
Still there with 4.2.6-300.fc23.x86_64, this time using the binaries from the Fedora update mirror.
Comment 2 Stefan Ring 2015-11-30 15:05:10 EST
I just tried a 4.1.13 kernel for an experiment unrelated to this bug, and I can confirm that the faulty behavior does not occur with that version. It really seems to have started with 4.2.

I can see some activity regarding MLD message validation and bridges in the git log leading up to v4.2. My first guess is that these changes are responsible.
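
For context on what MLD validation has to check: MLD messages are ICMPv6, and the ICMPv6 checksum is the 16-bit ones' complement sum over an IPv6 pseudo-header plus the message itself. The Python sketch below only illustrates that arithmetic; it is not the kernel's implementation. The addresses, the query contents, and the helper names are made up for the example.

```python
import ipaddress
import struct

def ones_complement_sum(data):
    """16-bit ones' complement sum (RFC 1071), before the final inversion."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total

def icmpv6_checksum(src, dst, message):
    """Checksum over the IPv6 pseudo-header and an ICMPv6 message."""
    pseudo = (ipaddress.IPv6Address(src).packed
              + ipaddress.IPv6Address(dst).packed
              # upper-layer length, 3 zero bytes, next header 58 (ICMPv6)
              + struct.pack("!I3xB", len(message), 58))
    return ~ones_complement_sum(pseudo + message) & 0xFFFF

def build_mld_query(src, dst):
    """Build an illustrative MLD general query (ICMPv6 type 130)."""
    # type, code, checksum placeholder, max response delay, reserved, group
    body = struct.pack("!BBHHH16s", 130, 0, 0, 10000, 0, bytes(16))
    csum = icmpv6_checksum(src, dst, body)
    return body[:2] + struct.pack("!H", csum) + body[4:]

# Hypothetical addresses, chosen only for the example.
SRC, DST = "fe80::1", "ff02::1"
query = build_mld_query(SRC, DST)

# Recomputing over a message that already carries its checksum yields 0
# when the message is intact, and a non-zero value when a bit is flipped.
intact_ok = icmpv6_checksum(SRC, DST, query) == 0
corrupted = query[:-1] + bytes([query[-1] ^ 0x01])
corrupted_ok = icmpv6_checksum(SRC, DST, corrupted) == 0
print(intact_ok, corrupted_ok)
```

The point of the example is that the result depends on exactly which bytes are summed, so a checksum computed over a different span than the validator expects will not verify.
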
Comment 3 Stefan Ring 2016-03-25 14:56:50 EDT
This is fixed by the following mainline kernel commits:

9b368814b336b0a1a479135eb2815edbc00efd3c
f8ffad69c9f8b8dfb0b633425d4ef4d2493ba61a
fdc5432a7b44ab7de17141beec19d946b9344e91
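
Whether a given mainline ref already contains these commits can be checked with `git merge-base --is-ancestor`, which exits with status 0 when the first commit is an ancestor of the second. The following Python sketch wraps that call. The repository path and the tag `v4.5` in the usage comment are assumptions, and `ancestor_check_cmd` and `contains` are hypothetical helpers.

```python
import subprocess

# Commit IDs as listed in this comment.
FIX_COMMITS = [
    "9b368814b336b0a1a479135eb2815edbc00efd3c",
    "f8ffad69c9f8b8dfb0b633425d4ef4d2493ba61a",
    "fdc5432a7b44ab7de17141beec19d946b9344e91",
]

def ancestor_check_cmd(commit, ref):
    """Build the git command; it exits 0 if `commit` is an ancestor of `ref`."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]

def contains(repo, commit, ref):
    """Return True if `ref` in the checkout at `repo` contains `commit`."""
    result = subprocess.run(ancestor_check_cmd(commit, ref), cwd=repo)
    return result.returncode == 0

# Example usage, assuming a mainline clone at a path of your choosing:
#   for c in FIX_COMMITS:
#       print(c[:12], contains("/path/to/linux", c, "v4.5"))
```

This only works against refs that share history with mainline. Stable and distribution trees usually cherry-pick fixes under different commit IDs, so a negative result there does not prove the fix is missing.
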
Comment 4 Josh Boyer 2016-03-28 08:19:56 EDT
Those are all in the 4.5 upstream kernel release.  F23 will be rebased to 4.5 around the 4.5.2 timeframe.
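
A quick way to tell whether a machine is already on a 4.5-based kernel is to compare the release string reported by `uname -r`. The Python sketch below is a rough heuristic under that assumption: it looks only at the version number and cannot detect fixes backported into an older release. `kernel_version` and `at_least` are hypothetical helpers, and the `4.5.2` string is an invented example of what a rebased package might report.

```python
import re

def kernel_version(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. "4.5") is treated as 0.
    return tuple(int(g or 0) for g in m.groups())

def at_least(release, minimum=(4, 5, 0)):
    """Return True if the release is at or above the given version."""
    return kernel_version(release) >= minimum

# The first two strings appear in this report; the third is hypothetical.
for release in ("4.2.1-300.str.fc22.x86_64",
                "4.2.6-300.fc23.x86_64",
                "4.5.2-300.fc23.x86_64"):
    print(release, at_least(release))
```

On a running system the input could come from `platform.release()` in the Python standard library.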
