181347 – network device (skge and sk98lin drivers) stalls

Summary: network device (skge and sk98lin drivers) stalls

Keywords:
Status:	CLOSED DUPLICATE of bug 182618
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	John W. Linville
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2006-02-13 16:51 UTC by Alexandre Oliva
Modified:	2007-11-30 22:11 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-03-01 15:46:17 UTC
Type:	---
Embargoed:
Dependent Products:

Description Alexandre Oliva 2006-02-13 16:51:36 UTC

+++ This bug was initially created as a clone of Bug #180063 +++

(Borrowed from Charles Lopes's description, but adjusted for my similar hardware)

After a certain period of time that can go from a couple of hours to a couple of
days, the device stops receiving and sending packets. There's no oops or any
other kernel message generated. tcpdump on the interface still shows local
packets going to this interface, but "ip -s link" shows no change in RX or TX.
The only way I have found to restore network connectivity is to unload and
reload the kernel module.
The network device is built into an Asus A8V Deluxe motherboard. "lspci" gives
this information:
00:0a.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit
Ethernet Controller (rev 13)

The problem happens when using both cores of the Athlon64X2 processor.  Rawhide uses
skge, FC4 uses sk98lin, and both present the same problem.  Booting with
maxcpus=1 appears to work around it, but it's a shame to not be able to use the
second core :-(
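
The stall described above shows up as RX and TX counters that stop advancing
even though traffic is being attempted. A minimal sketch of how one might log
the moment it happens, assuming the kernel exposes per-interface counters
under /sys/class/net/<iface>/statistics/ (the interface name "eth0", the
10-second interval, and the helper names are illustrative assumptions, not
part of the original report):

```python
import time
from pathlib import Path


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Return (rx_packets, tx_packets) for iface, read from sysfs."""
    stats = Path(sysfs_root) / iface / "statistics"
    rx = int((stats / "rx_packets").read_text().strip())
    tx = int((stats / "tx_packets").read_text().strip())
    return rx, tx


def is_stalled(before, after):
    """True when neither RX nor TX advanced between two samples."""
    return before == after


def watch(iface, interval=10.0, sysfs_root="/sys/class/net"):
    """Print a line each time no packets moved during one interval."""
    previous = read_counters(iface, sysfs_root)
    while True:
        time.sleep(interval)
        current = read_counters(iface, sysfs_root)
        if is_stalled(previous, current):
            print(f"{iface}: RX/TX unchanged at {current}")
        previous = current
```

Calling watch("eth0") while a ping to another host is running would print a
line for every interval with no counter movement. Unchanged counters on an
idle link are normal, so the check only means something while traffic is
being generated.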

Comment 1 John W. Linville 2006-02-20 18:41:20 UTC

Have you tried using the fedora-netdev kernels? 
 
   http://people.redhat.com/linville/kernels/fedora-netdev/ 
 
Please do so, and post the results here...thanks!

Comment 2 Alexandre Oliva 2006-02-20 19:41:27 UTC

I don't see fedora-netdev kernels for FC5 or devel.  Are your FC4 kernel builds
supposed to work on FC5T3 as well, or should I go back to FC4 for proper testing?

Comment 3 John W. Linville 2006-02-20 20:09:46 UTC

The FC4 kernels probably work on FC5 userland, but you may have to install the 
RPMs manually.

Comment 4 Alexandre Oliva 2006-02-21 03:33:06 UTC

It does work in general, but it appears to still fail in the same way.  In fact,
in some ways it seems to have got even worse; in others, better.  For some time I couldn't ping
the box at all.  When I got back in front of it, to restart the network card, I
found out the disk subsystem was also dead (as per bug 181310) and, for the
first time, the screen saver had blocked access to the system (probably because
I had an ongoing build and the screen saver code had to be paged in to unlock).
 I managed to switch to VT1, verify that the disk subsystem was dead and then
use Alt-SysRq to try to get some info.  I couldn't collect any useful info, but
after killing all processes and remounting filesystems read-only, I tried to
ping the box again from an external host and that worked, so somehow networking
seems to have recovered.  I can't tell whether that was because of my Alt-SysRq
interactions or because of the elapsed time, unfortunately.  I'll give that
kernel a try again later and try to determine whether I ever manage to trigger
the networking error without an associated disk error.

Nothing useful in /var/log/messages; it had been dead for hours, so maybe the
disk subsystem died before the network.  Too bad I couldn't even log in to
figure out what was going on :-(

Comment 5 John W. Linville 2006-03-01 15:46:17 UTC


*** This bug has been marked as a duplicate of 182618 ***
