Bug 119664 - Hard lock with r8169 NIC module.
Status: CLOSED WORKSFORME
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Jeff Garzik
Depends On:
Blocks: FC2Target
Reported: 2004-04-01 02:51 EST by Alejandro Mota
Modified: 2013-07-02 22:18 EDT (History)
Doc Type: Bug Fix
Last Closed: 2004-04-13 19:14:54 EDT

Description Alejandro Mota 2004-04-01 02:51:42 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6)
Gecko/20040211 Firefox/0.8

Description of problem:
A hard lock occurs when the r8169 NIC module is in use under high
outbound traffic. No lock is observed with other NICs or when all
traffic is inbound. The same behavior is observed in kernel-2.6.4-1.298.

A patch was recently applied to this module:

  http://bugzilla.kernel.org/show_bug.cgi?id=2123

Perhaps this is the source of the problem?

Version-Release number of selected component (if applicable):
kernel-2.6.4-1.300

How reproducible:
Always

Steps to Reproduce:
1. Run any process that generates a large amount of outbound traffic.
For instance:
2. cd /usr/src/
3. scp -r linux-2.6.4-1.300/ othermachine:/tmp
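
For anyone who wants to generate sustained outbound traffic without scp, the following is a minimal sketch in Python. It is not part of the original report: the function name `send_bulk`, the chunk size, and the host and port in the usage note are hypothetical, and it assumes something on the other machine is accepting TCP connections and discarding the data.

```python
import socket


def send_bulk(host: str, port: int, total_bytes: int,
              chunk_size: int = 64 * 1024) -> int:
    """Send total_bytes of zero-filled data to host:port over TCP.

    Returns the number of bytes handed to the socket. The receiver is
    assumed to read and discard whatever arrives.
    """
    payload = bytes(chunk_size)  # zero-filled buffer, reused for every send
    sent = 0
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            # The last chunk may be shorter than chunk_size.
            chunk = payload[: min(chunk_size, total_bytes - sent)]
            sock.sendall(chunk)
            sent += len(chunk)
    return sent
```

A call such as `send_bulk("othermachine", 5001, 10**9)` would push roughly 1 GB out through whichever interface routes to that host; port 5001 is only a placeholder.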


    

Actual Results:  Hard lock occurs shortly after the copying starts.

Expected Results:  Normal transfer of files.

Additional info:

From /var/log/messages:

Mar 31 22:49:38 xx kernel: r8169 Gigabit Ethernet driver 1.2 loaded
Mar 31 22:49:39 xx kernel: eth1: RealTek RTL8169 Gigabit Ethernet at 0x4284a800, 00:90:f5:27:01:e1, IRQ 217
Mar 31 22:49:39 xx kernel: eth1: Auto-negotiation Enabled.
Mar 31 22:49:39 xx kernel: eth1: 100Mbps Full-duplex operation.
Comment 1 Alejandro Mota 2004-04-02 05:07:20 EST
Further testing shows that the problem happens on the SMP kernels only.
The single-processor kernels apparently are not affected by it.
Comment 2 Alejandro Mota 2004-04-13 19:14:54 EDT
Francois Romieu provided a patch that, when applied to
kernel-source-2.6.5-1.319, solves the problem. The patch is available at:
http://www.fr.zoreil.com/people/francois/misc/20040407-2.6.5-r8169.c-stable.patch
Comment 3 Fredrik Noring 2004-09-20 15:06:48 EDT
I too have a Realtek RTL-8169, and it locks up every 2-4 days (Fedora
Core 2, kernel 2.6.8-1.521). On one occasion when it happened, I saw
the console flooded with the message:

   eth0: Too much work on interrupt

The kernel appears to be locked up completely when this happens.
Network traffic has not been very high.

Should I open a new ticket for this?
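
To check whether a saved log contains the flood described above, a small scan can count the message per interface. This is a sketch only: the function name is hypothetical, the syslog line layout is assumed to match the /var/log/messages excerpt in the description, and the pattern accepts both "on interrupt" (as quoted in this comment) and "at interrupt", since the exact wording in a given driver version may differ.

```python
import re

# Assumption: the interface name precedes the message, as in the comment above.
FLOOD_RE = re.compile(r"\b(eth\d+): Too much work (?:on|at) interrupt")


def count_interrupt_floods(lines):
    """Return a dict mapping interface name to the number of matching lines."""
    counts = {}
    for line in lines:
        match = FLOOD_RE.search(line)
        if match:
            iface = match.group(1)
            counts[iface] = counts.get(iface, 0) + 1
    return counts
```

Passing an open file works, for example `count_interrupt_floods(open("/var/log/messages"))`, provided the log is readable by the current user.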
Comment 4 Alejandro Mota 2004-09-20 15:23:03 EDT
I had this problem too on a Pentium 4 HT machine running SMP kernels.
I solved it by adding the noapic option to the kernel command line at
boot time. Since then, the problem has not recurred. I never
experienced this bug when running UP kernels.
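
To confirm that a running system was actually booted with the workaround, the kernel command line can be inspected. The following is a minimal sketch, not something from the original comment: the helper name `has_boot_option` is hypothetical, and it assumes a Linux system where /proc/cmdline is readable.

```python
def has_boot_option(cmdline: str, option: str) -> bool:
    """Return True if option appears as a whole word in the command line."""
    # Splitting on whitespace avoids matching substrings such as "nolapic".
    return option in cmdline.split()


def booted_with(option: str, path: str = "/proc/cmdline") -> bool:
    """Read the kernel command line from path and check for option."""
    with open(path) as f:
        return has_boot_option(f.read(), option)
```

On the affected machine, `booted_with("noapic")` should return True after the option has been added to the boot loader entry and the system rebooted.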
Comment 5 Francois Romieu 2004-09-25 18:20:26 EDT
Please use the patch referenced below to sync your kernel with a
recent vanilla kernel. Among many other things, NAPI could make a
difference. If the symptoms do not disappear, consider opening
a new ticket and CCing me.

Btw, assuming people are not being hit by issues unrelated to r8169,
the driver has already been shown to be quite stable on real SMP systems.

Patch available at:
http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.8-1.521/r8169.c-2.6.8-1.521-to-2.6.9-rc2-dac.patch
