Bug 180063 - network device (sky2 driver) stalls
Summary: network device (sky2 driver) stalls
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2006-02-05 10:12 UTC by Charles Lopes
Modified: 2007-11-30 22:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-31 14:31:42 UTC
Type: ---
Embargoed:



Description Charles Lopes 2006-02-05 10:12:59 UTC
After a certain period of time that can go from a couple of hours to a couple of
days, the device stops receiving and sending packets. There's no oops or any
other kernel message generated. tcpdump on the interface still shows local
packets going to this interface, but "ip -s link" shows no change in RX or TX.
The only way I have found to restore network connectivity is to unload and
reload the kernel module.
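
A minimal sketch of how the stall described above could be detected automatically, assuming the per-interface counters exposed under /sys/class/net/<iface>/statistics. The function names, the sampling window, and the idea of reloading automatically are illustrative assumptions, not something from this report. Note that an idle link also shows unchanged counters, so this is only meaningful while traffic is known to be directed at the interface.

```python
from pathlib import Path


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Return (rx_packets, tx_packets) for iface, read from sysfs."""
    stats = Path(sysfs_root) / iface / "statistics"
    rx = int((stats / "rx_packets").read_text().strip())
    tx = int((stats / "tx_packets").read_text().strip())
    return rx, tx


def is_stalled(samples):
    """True if a window of two or more samples shows no RX or TX change."""
    return len(samples) >= 2 and all(s == samples[0] for s in samples[1:])


def reload_commands(module="sky2"):
    """Commands matching the manual workaround: unload, then reload the module."""
    return [["modprobe", "-r", module], ["modprobe", module]]
```

A caller would collect read_counters() results at a fixed interval, pass the recent window to is_stalled(), and, if it chooses to, run the commands from reload_commands() as root (for example with subprocess.run).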
The network device is built into an Asus A8V-E motherboard. "lspci" gives this
information:
05:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit
Ethernet Controller (rev 15)
05:00.0 0200: 11ab:4362 (rev 15)
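
For anyone comparing hardware across reports, the numeric line above can be split into its fields. This is a rough sketch that assumes the line layout shown here (slot, class, vendor:device, optional revision); the function name and the regular expression are illustrative only.

```python
import re

# Matches lines shaped like the numeric lspci output quoted above.
LSPCI_N = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
    r"(?:\s+\(rev (?P<rev>[0-9a-f]{2})\))?$"
)


def parse_lspci_n(line):
    """Split one numeric lspci line into slot, class, vendor, device, rev."""
    m = LSPCI_N.match(line.strip().lower())
    if m is None:
        raise ValueError("unrecognized lspci line: %r" % line)
    return m.groupdict()
```

Applied to the line above, the vendor field is 11ab and the device field is 4362, which is the pair to quote when checking whether another report concerns the same chip.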


Version-Release number of selected component (if applicable):
All development kernels with the sky2 driver up to kernel-2.6.15-1.1884_FC5. I'm
trying out 1907 at the moment. I observed the same problem with versions 0.9 and
0.10 of sky2 applied to rawhide 2.6.14 kernels. I didn't report it before
because I thought the problem could have been due to my patching. I haven't
tried it with a vanilla kernel yet.

Comment 1 Alexandre Oliva 2006-02-13 12:23:41 UTC
Is this an SMP box?  I'm experiencing very similar problems with my A8V Deluxe
motherboard, with Athlon64X2 processor.  Rawhide uses skge, FC4 uses sk98lin,
and both present the same problem.  Booting with maxcpus=1 appears to work
around it, but it's a shame to not be able to use the second core :-(

Comment 2 Charles Lopes 2006-02-13 12:55:43 UTC
Alexandre, if I'm not mistaken, you are using the older SysKonnect Yukon chip,
which is quite different. The Yukon II chip doesn't work with the skge driver.
There is a version of sk98lin that supports it, but it has not been accepted
into the kernel because it was felt that the new chip is too different to be
handled by the same driver. To answer your question, my box is uniprocessor, so
your workaround will not work here. In any case, I'm using an entirely different
driver, sky2, so the issues are not the same.


Comment 3 Alexandre Oliva 2006-02-13 16:52:00 UTC
Ok, thanks, I've filed a separate report now, bug 181347.

Comment 4 Charles Lopes 2006-02-14 08:12:16 UTC
I have found a discussion on the netdev mailing list that could explain the
issue that I'm seeing, as I'm using both netfilter and PPPoE.

http://www.mail-archive.com/netdev@vger.kernel.org/msg06567.html

I'll try that patch in the next couple of days to see if that solves my problem.

Comment 5 John W. Linville 2006-02-20 18:40:15 UTC
FWIW, the above patch is already in rawhide... 

Comment 6 Charles Lopes 2006-02-20 21:40:08 UTC
Yes, and the problem seems to be somewhere else. I've been trying sky2 0.16 with
the latest rawhide kernel over the weekend and it still hangs. I've collected
some data and sent it to Stephen Hemminger. I'll keep this bug updated.

Comment 7 Charles Lopes 2006-02-20 22:24:16 UTC
I'm now testing a pre-release version of sky2 1.0. I'll report the results here
as well.

Comment 8 Charles Lopes 2006-02-24 19:56:29 UTC
Version 1.0-rc1 seems to fix my problem. 80 hours without a stall so far.
Hopefully it'll make it into -rc5 or 2.6.16.



Comment 9 John W. Linville 2006-03-03 03:38:51 UTC
Test kernels w/ sky2 version 1.0-pre1 available here: 
 
   http://people.redhat.com/linville/kernels/fc5/ 
 
Please give those a try and post the results here...thanks! 

Comment 10 Charles Lopes 2006-03-09 21:54:19 UTC
I've been running kernel-2.6.15-1.2009.2.1_FC5.jwltest.12 for over 5 days now
without any problem. So that version fixes my problem too. Thanks.


Comment 11 Kyrre Ness Sjøbæk 2006-03-23 18:07:24 UTC
I have a similar problem with the sky2 driver: the Internet connection works
fine, but if I try to connect to a host on the LAN (ssh, nfs...), the connection
hangs within seconds.

The kernel shipped with fc5-t3(64-bit) was fine, but the one shipped with fc5
(32&64-bit) has the problem, and so has the kernel named 2.6.16-1.2070_FC5smp.

Running an Intel P4 system with hyperthreading.

Comment 12 Kyrre Ness Sjøbæk 2006-03-23 19:56:02 UTC
It seems to work just fine with the jwltest kernel. I have been transmitting a
large amount of data (concurrent transfers, just to be evil) over NFS, using the
Internet, and using evolution through X-over-ssh, creating as much stress as I
can for the network card. It works, or at least it works better. I have seen a
few hiccups, but they went away after a few seconds (instead of minutes as with
other kernels).

Comment 13 Kyrre Ness Sjøbæk 2006-03-25 10:59:27 UTC
Just a small question: Will this patch be in the next update kernel?

Comment 14 Kyrre Ness Sjøbæk 2006-03-31 11:44:49 UTC
Bug is still present in (uname -a):
Linux storeulv 2.6.16-1.2080_FC5smp #1 SMP Tue Mar 28 03:55:15 EST 2006 i686
i686 i386 GNU/Linux

Will it be in the next update kernel then? It's quite obvious that the patch is
doing good: it's replacing a totally dysfunctional driver with a working one.

Comment 15 John W. Linville 2006-03-31 14:31:42 UTC
The patch will filter into the Fedora kernel from upstream.  Please be 
patient...thanks! 

Comment 16 Kyrre Ness Sjøbæk 2006-04-21 21:57:16 UTC
Hi.
I just updated to
Linux storeulv 2.6.16-1.2096_FC5smp #1 SMP Wed Apr 19 05:31:55 EDT 2006 i686
i686 i386 GNU/Linux
And the bug is still here. Any ETA for when it will hit Fedora stable kernels?

