Bug 114205 - (3c59x) Network slows to a crawl when interface has received over 2^31 bytes
Summary: (3c59x) Network slows to a crawl when interface has received over 2^31 bytes
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 9
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Jeff Garzik
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-01-23 23:08 UTC by Need Real Name
Modified: 2013-07-03 02:17 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-05-08 21:36:46 UTC
Embargoed:



Description Need Real Name 2004-01-23 23:08:59 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US;
rv:1.5) Gecko/20031007 Firebird/0.7

Description of problem:
Red Hat 7.3 through 9, as well as recent Fedora kernels, have shown that
network traffic beyond the local network slows to a crawl whenever any
given ethernet interface crosses the 2^31 byte boundary. The network
works again as expected if one ifdowns the interface and rmmod/insmod
the driver, followed by an ifup. A few servers at Stanford and a busy
desktop have experienced this in the last few weeks after upgrading to
the latest stable kernel (2.4.20-28.[7,8,9]).

Version-Release number of selected component (if applicable):
kernel-2.4.20-28.x

How reproducible:
Always

Steps to Reproduce:
1. Send large amounts of data over time. Notice traffic beyond the subnet
becoming painfully slow
2. ifdown eth0, rmmod 3c59x/e100/etc, insmod 3c59x, ifup eth0
3. Counters reset, network is fine until threshold is met.
    

Actual Results:  continuing cycle of network hitting a brick wall
until rebooted or manual intervention

Expected Results:  network performance should not degrade so
drastically at any given point, but remain consistent.

Additional info:

(Notes from one admin within Stanford)

We upgraded a cluster of 20 RedHat 7.3 machines from 2.4.18 to 2.4.20,
and today (day 15 of uptime) we have multiple users complaining about
slow IP traffic to off campus sites.  Rebooting nodes fixes the
slow-up, but so does

    ifdown eth0
    rmmod 3c59x
    insmod 3c59x
    ifup eth0

The machines are all using the 3c59x ethernet driver.  This problem
acts like a memory leak; gradual creep until it becomes intolerable.

This is cured by reloading the module.

The following protocols were found to have significant slowdowns

  HTTP
  RTSP
  FTP

Separately, a different user also had the problem with Red Hat 9 and
Fedora, using the latest kernels with an e100 driver.

Comment 1 Leif Harcke 2004-05-08 17:24:12 UTC
This should be marked NOTABUG.  We discovered it was an interaction
with the packet shapers that our central IT office uses to curb file
sharing traffic.  Ports 41000-41999 were marked by the packet shaper
as AudioGalaxy music sharing ports.  The Linux kernel uses 32768-61000
in a linear fashion for unspecified connections.  After a certain
amount of uptime, newly booted machines reached the 41000-41999 range
and appeared to drop off the net.  Other OSes use port ranges above
(MacOS X) or below (Windows) this range, or randomly assign from a large
range (some BSDs) that includes 41000-41999 as a subset.  For OSes
that didn't use the range, the problem never appeared.  For OSes that
randomly used the range, the problem fixed itself on the next randomly
generated port connection.  Since Linux assigns ports in a linear
fashion starting at 32768, once a newly booted kernel reached 41000,
all networking ground to a halt, and a reboot appeared to be the only fix.
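
To make the arithmetic above concrete, here is a minimal Python sketch of the
behaviour as described in this comment. It is a simplified model, not the
kernel's actual allocator: it assumes one new port per outgoing connection,
strictly sequential assignment, an inclusive 32768-61000 range, and wrap-around
at the top. The function names are made up for illustration.

```python
# Values taken from the comment above; treating both ranges as inclusive
# is an assumption of this sketch.
LOCAL_LOW, LOCAL_HIGH = 32768, 61000    # Linux local port range
SHAPED_LOW, SHAPED_HIGH = 41000, 41999  # ports throttled by the packet shaper


def linear_ports(count, start=LOCAL_LOW):
    """Yield `count` ports assigned sequentially, wrapping at LOCAL_HIGH."""
    span = LOCAL_HIGH - LOCAL_LOW + 1
    for i in range(count):
        yield LOCAL_LOW + (start - LOCAL_LOW + i) % span


def is_shaped(port):
    """True if the packet shaper would throttle this source port."""
    return SHAPED_LOW <= port <= SHAPED_HIGH


def connections_before_shaped(start=LOCAL_LOW):
    """Count connections made before the first one lands on a shaped port."""
    span = LOCAL_HIGH - LOCAL_LOW + 1
    for index, port in enumerate(linear_ports(span, start)):
        if is_shaped(port):
            return index
    return None  # the shaped range does not overlap the local range


if __name__ == "__main__":
    print(connections_before_shaped())
    print(sum(is_shaped(p) for p in linear_ports(LOCAL_HIGH - LOCAL_LOW + 1)))
```

Under these assumptions a freshly booted machine makes 8232 outgoing
connections before the first shaped port, and then the next 1000 connections
in a row are all shaped, which would look like the network stalling rather
than one unlucky connection.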

Comment 2 Rik van Riel 2004-05-08 21:35:48 UTC
I believe you can work around firewalls such as yours by restricting
the local port numbers used by Linux, by tweaking
/proc/sys/net/ipv4/ip_local_port_range

I'm not a networking guy so I'm not sure, but this is probably worth a
try...
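
As a rough illustration of this workaround: ip_local_port_range holds a single
contiguous "low high" pair, so avoiding 41000-41999 means picking a range
entirely below or entirely above it. The Python sketch below picks the wider
of the two. The default numbers come from comment 1, and the helper names are
hypothetical.

```python
def widest_unshaped_range(local=(32768, 61000), shaped=(41000, 41999)):
    """Return the largest contiguous (low, high) part of `local` outside `shaped`."""
    low, high = local
    shaped_low, shaped_high = shaped
    if shaped_high < low or shaped_low > high:
        return local  # no overlap, nothing to avoid
    below = (low, shaped_low - 1)
    above = (shaped_high + 1, high)
    candidates = [r for r in (below, above) if r[0] <= r[1]]
    if not candidates:
        raise ValueError("shaped range covers the whole local range")
    return max(candidates, key=lambda r: r[1] - r[0])


def proc_value(port_range):
    """Format a (low, high) pair the way ip_local_port_range expects it."""
    return "%d %d" % port_range


if __name__ == "__main__":
    # Prints the string to write to the file; it does not modify the system.
    print(proc_value(widest_unshaped_range()))
```

With the defaults this prints "42000 61000". Writing that string to
/proc/sys/net/ipv4/ip_local_port_range as root should apply it until the next
reboot; whether the shaper at a given site treats the remaining ports normally
is something to verify locally.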

Comment 3 Rik van Riel 2004-05-08 21:36:46 UTC
As requested.

