Bug 825556 - Marvell 88E8056 NIC dies on large data transfer with sky2 driver
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2012-05-27 14:51 EDT by Andy Keep
Modified: 2013-03-28 09:55 EDT
Doc Type: Bug Fix
Last Closed: 2013-03-28 09:55:03 EDT
Type: Bug
Description Andy Keep 2012-05-27 14:51:31 EDT
Description of problem:

The sky2 driver for the onboard Marvell 88E8056 NIC drops the network connection and cannot reliably restart it (even with an ifdown/ifup of the network interface).  The problem seems to happen only on large data transfers: I first noticed it while attempting to upgrade from Fedora 14 to 15, when the connection died while downloading upgrade packages through yum.  I have since done a complete reinstall and update of Fedora 17, and the problem is less pronounced but still occurs.  In /var/log/messages I begin to see:

May 26 13:30:01 firehawk kernel: [18352.907807] net_ratelimit: 4 callbacks suppressed
May 26 13:30:01 firehawk kernel: [18352.907813] sky2 0000:04:00.0: p37p1: rx error, status 0x56e0002 length 1390
May 26 13:30:01 firehawk kernel: [18352.990648] sky2 0000:04:00.0: p37p1: rx error, status 0x56e0002 length 1390
May 26 13:30:02 firehawk kernel: [18353.722948] sky2 0000:04:00.0: p37p1: rx error, status 0x56e0002 length 1390
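
To gauge how often this happens, the messages above can be counted per interface. This is only a rough sketch: the function name is made up for illustration, and it assumes the log lines keep the "sky2 <pci address>: <interface>: rx error" layout shown above and that the log lives at /var/log/messages unless another path is given.

```shell
# Count sky2 "rx error" lines per interface in a syslog-style file.
# Hypothetical helper; defaults to /var/log/messages, or pass a path as $1.
count_sky2_rx_errors() {
    grep 'sky2 ' "${1:-/var/log/messages}" \
        | grep -oE '[^ ]+: rx error' \
        | cut -d: -f1 \
        | sort | uniq -c
}
```

Calling count_sky2_rx_errors with no argument reads /var/log/messages, which usually requires root.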


Version-Release number of selected component (if applicable):
3.3.7-1.fc17.x86_64

How reproducible: Unfortunately, it is difficult to reproduce reliably, but I suspect that downloading several large files in a row should kill it. It has died on me while pulling down package files through yum and while running an svn checkout of a project with about 264 MB of source and binary files.

Steps to Reproduce:
1. Boot system (normal graphical mode boot).
2. Initiate a download of multiple files (through something like an svn checkout or yum update).
3. The network will sometimes hang (seemingly taking X with it, since I've had to drop into a text console and restart X to get it to come back).
  
Actual results: Network dies


Expected results: Files would be downloaded


Additional info:

This bug seems to be related to ongoing problems with the sky2 driver on the Marvell 88E80XX chipset.  Prior to using Fedora 17 I had been using Fedora 13 with no problems, so this seems to be a bug that was fixed in the 2.6 kernel series, but has re-occurred in the 3.3 kernel series.

Here is the lspci with information about the devices installed on the machine:

00:00.0 Host bridge: Intel Corporation 5520/5500/X58 I/O Hub to ESI Port (rev 13)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 13)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 13)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 13)
00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 13)
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1c.2 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 3
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 5
00:1c.5 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 6
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 IDE interface: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller #1
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller #2
02:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GTX+] (rev a2)
04:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
05:00.0 IDE interface: Marvell Technology Group Ltd. 88SE6121 SATA II Controller (rev b2)
06:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
08:02.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller (rev c0)
ff:00.0 Host bridge: Intel Corporation Xeon 5500/Core i7 QuickPath Architecture Generic Non-Core Registers (rev 05)
ff:00.1 Host bridge: Intel Corporation Xeon 5500/Core i7 QuickPath Architecture System Address Decoder (rev 05)
ff:02.0 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Link 0 (rev 05)
ff:02.1 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Physical 0 (rev 05)
ff:03.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller (rev 05)
ff:03.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Target Address Decoder (rev 05)
ff:03.4 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Test Registers (rev 05)
ff:04.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Control Registers (rev 05)
ff:04.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Address Registers (rev 05)
ff:04.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Rank Registers (rev 05)
ff:04.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Thermal Control Registers (rev 05)
ff:05.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Control Registers (rev 05)
ff:05.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Address Registers (rev 05)
ff:05.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Rank Registers (rev 05)
ff:05.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Thermal Control Registers (rev 05)
ff:06.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Control Registers (rev 05)
ff:06.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Address Registers (rev 05)
ff:06.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Rank Registers (rev 05)
ff:06.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Thermal Control Registers (rev 05)
Comment 1 Andy Keep 2012-05-27 17:59:48 EDT
The link has just gone down again, this time without any significant traffic.  I wonder if it is simply related to the total amount of data that can be transferred before the driver dies.

Again, I am unable to restart the device using ifdown/ifup.  I was able to restart it by unloading the sky2 kernel module and then reloading it.
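
For reference, the module reload can be sketched as below. This is an illustration rather than a tested recovery procedure: the function name and the RUN variable are made up here, and it assumes modprobe is available and that nothing else is holding the sky2 module in use.

```shell
# Unload and then reload a kernel module (sky2 by default).
# RUN is a hypothetical override, not part of the report: set RUN=echo to
# print the commands instead of running them through sudo.
reload_module() {
    ${RUN:-sudo} modprobe -r "${1:-sky2}" && ${RUN:-sudo} modprobe "${1:-sky2}"
}
```

Running reload_module sky2 performs the reload; the interface may still need an ifup afterwards.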
Comment 2 Andrew Burgess 2012-07-21 10:06:55 EDT
i get the rx error but nothing dies or hangs. i have a xeon with 88E8057 Rev 10. 

two independent suggestions that i found while googling:

put interface into promiscuous mode:
  sudo tcpdump -i eth0 icmp
  (icmp just reduces amount of printed output)

disable hw checksum:
  sudo ethtool -K eth0 rx off
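
to check that the checksum setting actually took effect, something like this should work (a sketch only: the function name is made up, and it assumes ethtool -k prints a line of the form "rx-checksumming: on" or "rx-checksumming: off"):

```shell
# Print the rx-checksumming state from "ethtool -k" output read on stdin.
# Hypothetical helper; eth0 below is just the interface name used above.
rx_checksum_state() {
    awk -F': ' '$1 == "rx-checksumming" { print $2 }'
}
# Usage (requires the ethtool utility):
#   ethtool -k eth0 | rx_checksum_state
```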

i am trying both now. i have two sky2 interfaces on this mb, one started this about a year ago and i just switched to the second. now the second is acting up. 

Marvell...
Comment 3 Josh Boyer 2013-03-14 14:20:02 EDT
Is this still a problem with the 3.8.2 kernel in updates-testing?
Comment 4 Josh Boyer 2013-03-28 09:55:03 EDT
This bug is being closed with INSUFFICIENT_DATA as there has not been a
response in 2 weeks.  If you are still experiencing this issue,
please reopen and attach the relevant data from the latest kernel you are
running and any data that might have been requested previously.
