Bug 181347

Summary: network device (skge and sk98lin drivers) stalls
Product: [Fedora] Fedora
Reporter: Alexandre Oliva <oliva>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED DUPLICATE
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: davej, oliva, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-03-01 15:46:17 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Alexandre Oliva 2006-02-13 16:51:36 UTC
+++ This bug was initially created as a clone of Bug #180063 +++

(Borrowed from Charles Lopes's description, but adjusted for my similar hardware)

After a period that can range from a couple of hours to a couple of
days, the device stops receiving and sending packets. No oops or any
other kernel message is generated. tcpdump on the interface still shows local
packets going to this interface, but "ip -s link" shows no change in the RX
or TX counters.
The only way I have found to restore network connectivity is to unload and
reload the kernel module.
The network device is built into an Asus A8V Deluxe motherboard. "lspci" gives
this information:
00:0a.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit
Ethernet Controller (rev 13)
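
The frozen-counter symptom described above (no change in RX or TX in
"ip -s link") could be watched for automatically. The following is only a
minimal sketch, not something used in this report: it assumes the kernel
exposes per-interface counters under /sys/class/net/<iface>/statistics, and the
interface name "eth0", the 30-second interval, and all function names are
illustrative assumptions. An idle link also shows unchanged counters, so the
check is only meaningful while traffic is being generated (for example, a
running ping).

```python
import time
from pathlib import Path


def read_counters(iface, root="/sys/class/net"):
    """Return (rx_packets, tx_packets) for iface, read from sysfs."""
    stats = Path(root) / iface / "statistics"
    rx = int((stats / "rx_packets").read_text())
    tx = int((stats / "tx_packets").read_text())
    return rx, tx


def is_stalled(before, after):
    """True when neither the RX nor the TX counter advanced between samples."""
    return after[0] == before[0] and after[1] == before[1]


def watch(iface, interval=30.0):
    """Poll the counters forever and report intervals with no RX/TX progress."""
    previous = read_counters(iface)
    while True:
        time.sleep(interval)
        current = read_counters(iface)
        if is_stalled(previous, current):
            print(f"{iface}: RX/TX counters unchanged for {interval:.0f}s, "
                  "possible stall")
        previous = current


if __name__ == "__main__":
    # "eth0" is an assumed interface name; substitute the affected device.
    watch("eth0")
```

Logging a timestamp alongside each report would make it easier to correlate a
stall with other events, such as the disk problems mentioned later in this
bug.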

Problem happens when using both cores of the Athlon64X2 processor.  Rawhide uses
skge, FC4 uses sk98lin, and both present the same problem.  Booting with
maxcpus=1 appears to work around it, but it's a shame to not be able to use the
second core :-(
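
When comparing runs with and without the maxcpus=1 workaround, it may help to
record which mode the running kernel was actually booted in. The sketch below
is an illustration only: it assumes the boot parameters are readable from
/proc/cmdline, and the function name is hypothetical.

```python
def maxcpus_from_cmdline(cmdline):
    """Return the integer value of maxcpus= in a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("maxcpus="):
            value = token.split("=", 1)[1]
            # Ignore malformed values rather than raising.
            return int(value) if value.isdigit() else None
    return None


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        limit = maxcpus_from_cmdline(f.read())
    if limit is None:
        print("maxcpus not set; all cores may be in use")
    else:
        print(f"booted with maxcpus={limit}")
```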

Comment 1 John W. Linville 2006-02-20 18:41:20 UTC
Have you tried using the fedora-netdev kernels? 
 
   http://people.redhat.com/linville/kernels/fedora-netdev/ 
 
Please do so, and post the results here...thanks! 

Comment 2 Alexandre Oliva 2006-02-20 19:41:27 UTC
I don't see fedora-netdev kernels for FC5 or devel.  Are your FC4 kernel builds
supposed to work on FC5T3 as well, or should I go back to FC4 for proper testing?

Comment 3 John W. Linville 2006-02-20 20:09:46 UTC
The FC4 kernels probably work on FC5 userland, but you may have to install the 
RPMs manually. 

Comment 4 Alexandre Oliva 2006-02-21 03:33:06 UTC
It does work in general, but it appears to still fail in the same way.  In fact,
in some ways it seems to have got even worse; in others, better.  For some time
I couldn't ping
the box at all.  When I got back in front of it, to restart the network card, I
found out the disk subsystem was also dead (as per bug 181310) and, for the
first time, the screen saver had blocked access to the system (probably because
I had an ongoing build and the screen saver code had to be paged in to unlock).
 I managed to switch to VT1, verify that the disk subsystem was dead and then
use Alt-SysRq to try to get some info.  I couldn't collect any useful info, but
after killing all processes and remounting filesystems read-only, I tried to
ping the box again from an external host and that worked, so somehow networking
seems to have recovered.  I can't tell whether that was because of my Alt-SysRq
interactions or because of the elapsed time, unfortunately.  I'll give that
kernel a try again later and try to determine whether I ever manage to trigger
the networking error without an associated disk error.

Nothing useful in /var/log/messages; it had been dead for hours, so maybe the
disk subsystem died before the network.  Too bad I couldn't even log in to
figure out what was going on :-(

Comment 5 John W. Linville 2006-03-01 15:46:17 UTC

*** This bug has been marked as a duplicate of 182618 ***