Bug 753890

Summary: sky2 module problem
Product: [Fedora] Fedora
Reporter: Sergei LITVINENKO <sergei.litvinenko>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 16
CC: fireblade1230, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, nhorman, sergei.litvinenko
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-14 12:01:47 UTC Type: ---

Description Sergei LITVINENKO 2011-11-14 19:05:06 UTC
Description of problem:

Nov 14 20:42:00 homedesk kernel: [   55.681845] eth0: hw csum failure.
Nov 14 20:42:00 homedesk kernel: [   55.681851] Pid: 0, comm: kworker/0:0 Tainted: P            3.1.1-1.fc16.i686.PAE #1
Nov 14 20:42:00 homedesk kernel: [   55.681854] Call Trace:
Nov 14 20:42:00 homedesk kernel: [   55.681862]  [<c082001f>] ? printk+0x2d/0x2f
Nov 14 20:42:00 homedesk kernel: [   55.681868]  [<c0763bc3>] netdev_rx_csum_fault+0x36/0x3b
Nov 14 20:42:00 homedesk kernel: [   55.681872]  [<c075e787>] __skb_checksum_complete_head+0x4c/0x5f
Nov 14 20:42:00 homedesk kernel: [   55.681876]  [<c075e7aa>] __skb_checksum_complete+0x10/0x12
Nov 14 20:42:00 homedesk kernel: [   55.681888]  [<f7e7ab80>] br_multicast_rcv+0x7d2/0xc44 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681893]  [<c043abe3>] ? check_preempt_curr+0x3c/0x66
Nov 14 20:42:00 homedesk kernel: [   55.681898]  [<c0827c43>] ? _raw_spin_unlock_irqrestore+0x13/0x15
Nov 14 20:42:00 homedesk kernel: [   55.681902]  [<c0440c98>] ? try_to_wake_up+0x15f/0x169
Nov 14 20:42:00 homedesk kernel: [   55.681910]  [<f7e7899b>] ? br_nf_pre_routing+0x24/0x346 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681915]  [<c06baa31>] ? ahci_qc_issue+0x15/0xc6
Nov 14 20:42:00 homedesk kernel: [   55.681919]  [<c06a8f1b>] ? ata_qc_issue+0x278/0x296
Nov 14 20:42:00 homedesk kernel: [   55.681924]  [<c0781da2>] ? nf_iterate+0x3b/0x61
Nov 14 20:42:00 homedesk kernel: [   55.681931]  [<f7e74192>] ? NF_HOOK.constprop.0+0x52/0x52 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681939]  [<f7e72fe5>] ? br_fdb_update+0x6a/0xdd [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681942]  [<c0781e22>] ? nf_hook_slow+0x5a/0xfc
Nov 14 20:42:00 homedesk kernel: [   55.681950]  [<f7e74207>] br_handle_frame_finish+0x75/0x1d7 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681958]  [<f7e74192>] ? NF_HOOK.constprop.0+0x52/0x52 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681966]  [<f7e7418b>] NF_HOOK.constprop.0+0x4b/0x52 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681973]  [<f7e74192>] ? NF_HOOK.constprop.0+0x52/0x52 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681981]  [<f7e744d1>] br_handle_frame+0x168/0x17f [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681988]  [<f7e74192>] ? NF_HOOK.constprop.0+0x52/0x52 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.681996]  [<f7e74369>] ? br_handle_frame_finish+0x1d7/0x1d7 [bridge]
Nov 14 20:42:00 homedesk kernel: [   55.682000]  [<c0762049>] __netif_receive_skb+0x23d/0x371
Nov 14 20:42:00 homedesk kernel: [   55.682009]  [<c0411a24>] ? alternatives_smp_switch+0x168/0x168
Nov 14 20:42:00 homedesk kernel: [   55.682011]  [<c0411a24>] ? alternatives_smp_switch+0x168/0x168
Nov 14 20:42:00 homedesk kernel: [   55.682014]  [<c0764cb9>] netif_receive_skb+0x62/0x68
Nov 14 20:42:00 homedesk kernel: [   55.682016]  [<c0411a24>] ? alternatives_smp_switch+0x168/0x168
Nov 14 20:42:00 homedesk kernel: [   55.682018]  [<c0764d3d>] napi_skb_finish+0x23/0x39
Nov 14 20:42:00 homedesk kernel: [   55.682021]  [<c07650bf>] napi_gro_receive+0x25/0x29
Nov 14 20:42:00 homedesk kernel: [   55.682032]  [<f8372ca7>] sky2_poll+0x6f6/0x8c6 [sky2]
Nov 14 20:42:00 homedesk kernel: [   55.682035]  [<c0412779>] ? sched_clock+0x8/0xb
Nov 14 20:42:00 homedesk kernel: [   55.682039]  [<c0464a9f>] ? sched_clock_cpu+0x134/0x144
Nov 14 20:42:00 homedesk kernel: [   55.682041]  [<c07651c6>] net_rx_action+0x95/0x18c
Nov 14 20:42:00 homedesk kernel: [   55.682044]  [<c044cb44>] __do_softirq+0xa2/0x17b
Nov 14 20:42:00 homedesk kernel: [   55.682046]  [<c044caa2>] ? ftrace_define_fields_irq_handler_entry+0x71/0x71
Nov 14 20:42:00 homedesk kernel: [   55.682048]  <IRQ>  [<c044cda0>] ? irq_exit+0x45/0x98
Nov 14 20:42:00 homedesk kernel: [   55.682052]  [<c040ef16>] ? do_IRQ+0x7e/0x92
Nov 14 20:42:00 homedesk kernel: [   55.682054]  [<c044cdf1>] ? irq_exit+0x96/0x98
Nov 14 20:42:00 homedesk kernel: [   55.682058]  [<c0423a32>] ? smp_apic_timer_interrupt+0x69/0x76
Nov 14 20:42:00 homedesk kernel: [   55.682061]  [<c082dfb0>] ? common_interrupt+0x30/0x38
Nov 14 20:42:00 homedesk kernel: [   55.682063]  [<c041372c>] ? mwait_idle+0x6d/0x97
Nov 14 20:42:00 homedesk kernel: [   55.682067]  [<c040cbed>] ? cpu_idle+0x97/0xb1
Nov 14 20:42:00 homedesk kernel: [   55.682070]  [<c08194ce>] ? start_secondary+0x254/0x259

Version-Release number of selected component (if applicable):

kernel-PAE-3.1.1-1.fc16.i686

How reproducible:
From time to time

Steps to Reproduce:
1. Boot system
  
Actual results:
The kernel logs the "hw csum failure" call trace shown above.

Expected results:
The module manages the device as expected.

Additional info:

[root@homedesk ~]# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.119.100.100
NETMASK=255.255.255.0
NETWORK=10.119.100.0
BROADCAST=10.119.100.255
GATEWAY=10.119.100.254
TYPE=Bridge
DELAY=0
STP=on
NM_CONTROLLED=no
USERCTL=yes
IPV6INIT=no

[root@homedesk ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=none
HWADDR=00:1B:FC:5C:0D:6C
BRIDGE=br0
NM_CONTROLLED=no
USERCTL=yes
IPV6INIT=no

[root@homedesk ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=none
HWADDR=00:1B:FC:5B:EB:48
BRIDGE=br0
NM_CONTROLLED=no
USERCTL=yes
IPV6INIT=no

---

[root@homedesk ~]# ethtool -i eth0
driver: sky2
version: 1.29
firmware-version: N/A
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: yes
supports-register-dump: yes

[root@homedesk ~]# ethtool -i eth1
driver: r8169
version: 2.3LK-NAPI
firmware-version: N/A
bus-info: 0000:04:04.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes

---

[root@homedesk ~]# lspci -nn | grep Eth
02:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller [11ab:4364] (rev 12)
04:04.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit Ethernet [10ec:8167] (rev 10)

---

NetworkManager is not in use.

Comment 1 Rex Dieter 2011-11-14 19:13:38 UTC
Bouncing over to kernel (assuming that's the intended component).

Comment 2 Neil Horman 2011-11-14 19:28:25 UTC
It's not a critical failure; it's just indicating that the hardware isn't checksumming your frames properly.  You can disable hardware checksum offload with:
ethtool -K eth0 rx off tx off

Does this happen with all your sky2 devices, or just this one?  This one may just be broken.
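
Before and after changing the setting it can help to confirm what the driver currently reports. The following is a minimal sketch, assuming that `ethtool -k <dev>` (lowercase `-k`) prints a line of the form `rx-checksumming: on` or `rx-checksumming: off`. The helper name `rx_csum_state` is made up for illustration and is not part of ethtool.

```shell
# Hypothetical helper: read `ethtool -k` output on stdin and print the
# receive-checksum offload state ("on", "off", ...), or "unknown" if the
# expected "rx-checksumming:" line is not present.
rx_csum_state() {
    awk -F': *' '
        $1 ~ /^[ \t]*rx-checksumming$/ { print $2; found = 1; exit }
        END { if (!found) print "unknown" }
    '
}

# Usage on a real system (needs the device to exist; -K needs root):
#   ethtool -k eth0 | rx_csum_state
#   ethtool -K eth0 rx off
#   ethtool -k eth0 | rx_csum_state
```

Note that a setting made with `ethtool -K` does not survive a reboot or a driver reload, so it has to be reapplied if the workaround is still wanted.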

Comment 3 Sergei LITVINENKO 2011-11-20 15:15:23 UTC
>> Does this happen with all your sky2 devices, or just this one?
>> It may just be broken

I have only one sky2 network card.
Usually the card works as expected.

The errors do appear in the log, but not very often:
---
Nov 19 21:34:39 homedesk kernel: [ 5247.406943] Pid: 0, comm: kworker/0:0 Tainted: P            3.1.1-1.fc16.i686.PAE #1
Nov 19 21:34:39 homedesk kernel: [ 5247.406946] Call Trace:
Nov 19 21:34:39 homedesk kernel: [ 5247.406954]  [<c082001f>] ? printk+0x2d/0x2f
Nov 19 21:34:39 homedesk kernel: [ 5247.406959]  [<c0763bc3>] netdev_rx_csum_fault+0x36/0x3b
Nov 19 21:34:39 homedesk kernel: [ 5247.406963]  [<c075e787>] __skb_checksum_complete_head+0x4c/0x5f
Nov 19 21:34:39 homedesk kernel: [ 5247.406967]  [<c075e7aa>] __skb_checksum_complete+0x10/0x12
...
---
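
To put a number on "not very often", the failures can be counted per interface. This is a rough sketch, assuming the kernel messages have the same shape as the `eth0: hw csum failure.` line in the description. The function name `count_csum_failures` is hypothetical, and the log path in the usage comment is an assumption about where syslog writes kernel messages.

```shell
# Hypothetical helper: count "hw csum failure" kernel messages per
# interface in syslog text read from stdin. Prints "<device> <count>".
count_csum_failures() {
    awk '/hw csum failure/ {
             dev = $(NF - 3)        # field just before "hw csum failure."
             sub(/:$/, "", dev)     # "eth0:" -> "eth0"
             n[dev]++
         }
         END { for (d in n) print d, n[d] }' | sort
}

# Usage (log location is an assumption):
#   count_csum_failures < /var/log/messages
```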

Comment 4 ANEZAKI, Akira 2011-11-26 13:56:25 UTC
I'm in a similar situation. I can stop the call trace from being logged with:
ethtool -K eth0 rx off
No call trace has been logged in the 6 days since. But I'm wondering about 4 points:

1.
/sbin/ifconfig eth0
shows neither errors nor dropped packets, even though the call trace is logged every few minutes.

2.
All call traces include [bridge] routines.

3.
Earlier 3.0.x and 2.6.x kernels don't produce the stack trace.

4.
I configured a software bridge with openvpn. After I stop the openvpn bridge service, the call trace is no longer logged.

---
Nov 26 22:38:33 server1 kernel: [555335.265318] eth0: hw csum failure.
Nov 26 22:38:33 server1 kernel: [555335.265327] Pid: 0, comm: swapper Tainted: P            2.6.41.1-1.fc15.i686.PAE #1
Nov 26 22:38:33 server1 kernel: [555335.265333] Call Trace:
Nov 26 22:38:33 server1 kernel: [555335.265344]  [<c0807fa7>] ? printk+0x2d/0x2f
Nov 26 22:38:33 server1 kernel: [555335.265353]  [<c0775c54>] netdev_rx_csum_fault+0x36/0x3b
Nov 26 22:38:33 server1 kernel: [555335.265365]  [<c0770817>] __skb_checksum_complete_head+0x4c/0x5f
Nov 26 22:38:33 server1 kernel: [555335.265375]  [<c077083a>] __skb_checksum_complete+0x10/0x12
Nov 26 22:38:33 server1 kernel: [555335.265406]  [<fa76bb80>] br_multicast_rcv+0x7d2/0xc44 [bridge]
...
---
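
Point 2 above (every trace passing through [bridge] routines) can be checked against a full log rather than by eye. This is only a sketch under stated assumptions: each failure starts with a `hw csum failure` line, and a trace counts as "via bridge" if `br_multicast_rcv` appears before the next failure line. The function name `trace_bridge_summary` is made up for illustration.

```shell
# Hypothetical helper: read syslog text on stdin and report how many
# "hw csum failure" events have br_multicast_rcv in the trace that follows.
trace_bridge_summary() {
    awk '
        /hw csum failure/  { total++; counted = 0; next }
        /br_multicast_rcv/ { if (total > 0 && !counted) { bridged++; counted = 1 } }
        END { printf "total=%d via_bridge=%d\n", total, bridged }
    '
}

# Usage (log location is an assumption):
#   trace_bridge_summary < /var/log/messages
```

Traces that were truncated before reaching the bridge frames would be undercounted, so equal numbers support the observation while unequal numbers do not disprove it.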

Comment 5 ANEZAKI, Akira 2011-11-26 15:32:34 UTC
(Supplemental information to comment #4)

> 4.
> I configured a software bridge by openvpn. But, after I stop openvpn bridge
> service, call trace is not logged.

The openvpn service is running as a server only, with no clients connected. It is only waiting for connections from clients. Even so, call traces are logged every few minutes.

Comment 6 ANEZAKI, Akira 2011-12-14 01:40:34 UTC
I finally upgraded to Fedora 16 and changed the way the bridge device is created at boot. It's the same way that Sergei showed us. Then I tested again, with HW checksum turned on, of course.

With kernel 3.1.4-1.fc16.i686.PAE, call traces are logged every few minutes while openvpn is connected and bridging, but no call traces are logged while openvpn is only waiting for connections. This differs from what I reported before.

With kernel 3.1.5-1.fc16.i686.PAE, no call traces are logged while openvpn is connected and bridging. I tested for more than 12 hours.

On my system, the problem is solved. Thank you.