Bug 219085

Summary: Silicon Integrated Systems [SiS 900] network card shuts down
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: David Tonhofer <bughunt>
Assignee: Neil Horman <nhorman>
QA Contact: Brian Brock <bbrock>
CC: davem, jbaron, linville, tgraf
Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Last Closed: 2007-11-15 16:17:02 UTC
Attachments:
patch to reorder refill operations in sis driver (flags: none)

Description David Tonhofer 2006-12-10 17:06:33 UTC
Description of problem:
-----------------------

I have a machine with an integrated network card at *eth1*.
According to 'lspci', this is an:

00:04.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 PCI Fast Ethernet (rev 90)

The machine also has a second network card at *eth2*.
According to 'lspci', this is an:

00:09.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 34)

Finally (and not of consequence here), the machine has a third network card at
*eth0*. According to 'lspci', this is an:

00:0b.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 6c)

I want to bridge two networks across eth1 and eth2 using "brctl".

This succeeds and the bridge works for some time. However, after a few MBytes
have passed through, the following errors show up in the kernel log:

Dec  7 16:25:17 greyhound kernel: eth1: Memory squeeze,deferring packet.
Dec  7 16:25:28 greyhound kernel: eth1: NULL pointer encountered in Rx ring, skipping
Dec  7 16:25:59 greyhound last message repeated 262 times
Dec  7 16:27:00 greyhound last message repeated 239 times
Dec  7 16:28:01 greyhound last message repeated 227 times
Dec  7 16:29:02 greyhound last message repeated 157 times
Dec  7 16:30:03 greyhound last message repeated 165 times
Dec  7 16:30:29 greyhound last message repeated 50 times

...and the bridge shuts down (no packets seem to pass any longer).

This seems to be a known problem with the SiS 900:

http://lkml.org/lkml/2004/7/25/9

I need more time to play around with this...

Version-Release
---------------

Linux greyhound 2.6.9-42.0.3.ELsmp #1 SMP Mon Sep 25 17:28:02 EDT 2006 i686 i686 i386 GNU/Linux

How reproducible
----------------

Didn't try.

Comment 1 Neil Horman 2007-02-09 21:29:17 UTC
The SiS900 driver appears to do things backwards.  In sis900_rx it unmaps the
received buffer, passes it to the network stack, and only then tries to refill
the emptied slot.  Instead, it should first try to allocate a new buffer to
replace the old one.  If that allocation fails, we drop the received packet and
recycle the skbuff we already have.  If it succeeds, we receive the current
packet and fill its slot in the rx_skbuff list with the newly allocated skbuff.
That way we never have to leave a hole in the buffer ring, which appears to be
what causes the problem.  I'll post a patch for this early next week.
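
For readers following along, the ordering described above can be sketched in
plain userspace C. This is only an illustration of the allocate-first idea, not
the attached patch and not the sis900 driver code: the ring size, the buffer
size, the struct layout, and every name here (rx_ring, rx_one, alloc_fn,
failing_alloc, and so on) are hypothetical, and malloc/free stand in for the
kernel's skbuff allocation and DMA mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_RX_DESC 16    /* hypothetical ring size, not the driver's value */
#define RX_BUF_SIZE 1536  /* hypothetical buffer size */

/* Allocation hook so callers can simulate memory pressure. Buffers are
 * always released with free(), so the hook must be malloc-compatible. */
typedef void *(*alloc_fn)(size_t size);

struct rx_ring {
    void *buf[NUM_RX_DESC];   /* one buffer per descriptor slot */
    unsigned int cur;         /* index of the next slot to process */
    unsigned long delivered;  /* packets handed up the stack */
    unsigned long dropped;    /* packets dropped for lack of memory */
};

/* Stand-in for an allocator under memory squeeze: always fails. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

static void rx_ring_free(struct rx_ring *r)
{
    for (size_t i = 0; i < NUM_RX_DESC; i++) {
        free(r->buf[i]);
        r->buf[i] = NULL;
    }
}

/* Fill every slot up front; returns false if any allocation fails. */
static bool rx_ring_init(struct rx_ring *r, alloc_fn alloc)
{
    r->cur = 0;
    r->delivered = 0;
    r->dropped = 0;
    for (size_t i = 0; i < NUM_RX_DESC; i++)
        r->buf[i] = NULL;
    for (size_t i = 0; i < NUM_RX_DESC; i++) {
        r->buf[i] = alloc(RX_BUF_SIZE);
        if (r->buf[i] == NULL) {
            rx_ring_free(r);
            return false;
        }
    }
    return true;
}

static bool rx_ring_has_hole(const struct rx_ring *r)
{
    for (size_t i = 0; i < NUM_RX_DESC; i++)
        if (r->buf[i] == NULL)
            return true;
    return false;
}

/* Process one received packet. Returns the buffer handed to the stack
 * (the caller then owns and frees it), or NULL if the packet was dropped. */
static void *rx_one(struct rx_ring *r, alloc_fn alloc)
{
    unsigned int entry = r->cur % NUM_RX_DESC;
    void *fresh = alloc(RX_BUF_SIZE);  /* allocate the replacement first */
    void *up = NULL;

    if (fresh == NULL) {
        /* No memory: drop the packet and recycle the buffer in place. */
        r->dropped++;
    } else {
        /* Hand the old buffer up and plug the slot with the new one. */
        up = r->buf[entry];
        r->buf[entry] = fresh;
        r->delivered++;
    }

    assert(r->buf[entry] != NULL);  /* the slot is never left empty */
    r->cur++;
    return up;
}
```

Calling rx_one() with failing_alloc models the "Memory squeeze" case: the packet
is counted as dropped, but the slot keeps its old buffer, so rx_ring_has_hole()
stays false and later packets can still be received.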

Comment 2 David Miller 2007-02-09 23:00:16 UTC
Indeed, this is exactly what every network driver should be doing
in this situation.


Comment 3 Neil Horman 2007-02-12 15:15:41 UTC
Created attachment 147898
patch to reorder refill operations in sis driver

This patch refills the rx buffer in the right order to avoid holes in the sis
driver. I'll post a test kernel shortly.

Comment 4 Neil Horman 2007-02-12 18:39:56 UTC
A kernel package with the above patch is available for testing at:
http://people.redhat.com/nhorman/rpms/kernel-smp-2.6.9-46.EL.bz219085.i686.rpm
Please let me know if it fixes your problem.  Thanks.

Comment 5 David Tonhofer 2007-04-19 22:11:59 UTC
Update: "Still waiting for an opportunity to test this". Sorry.

Comment 6 Neil Horman 2007-04-20 13:50:46 UTC
No problem.  I'm sending this upstream, as I think it's pretty straightforward.
If it gets accepted, I'll just post it for RHEL inclusion.  Please let me know
the test results as soon as you get to it, though.  Thanks.

Comment 8 RHEL Program Management 2007-06-19 03:34:09 UTC
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.

Comment 9 Jason Baron 2007-06-19 14:04:11 UTC
Committed in stream U6 build 55.9. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 12 errata-xmlrpc 2007-11-15 16:17:02 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html