Bug 89258 - CPU exhausted by route cache during random-source DoS
Product: Red Hat Linux
Classification: Retired
Component: kernel
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: David Miller
QA Contact: Brian Brock
Reported: 2003-04-21 20:19 EDT by Aaron Hopkins
Modified: 2007-04-18 12:53 EDT
Doc Type: Bug Fix
Last Closed: 2003-06-07 04:35:44 EDT

Description Aaron Hopkins 2003-04-21 20:19:34 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225

Description of problem:
Due to poor defaults, any action which generates a large number of route-cache
entries will cause the CPU usage to hit 100% and the machine to stop responding
to the network.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Enable syncookies.
2. Generate a random-source-IP SYN flood against a service listening on a TCP
port.  (Definitely reproducible with 30000 packets/sec.)
3. Wait a minute.

Actual Results:  CPU usage hits 100% and the machine stops responding to any
network traffic.

Expected Results:  When the route-cache entries are expired aggressively,
network traffic is responded to normally (including syncookies for all packets),
and there is some CPU free.

Additional info:

The machine is hitting the net.ipv4.route.max_size limit, which defaults to 8192
entries.  With "net.ipv4.route.max_size = 262144", the problem goes away.
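To check which limit a given machine is actually running with, the sysctl can
be read through procfs. The following is a minimal sketch, not part of the
original report: the helper names (sysctl_path, read_sysctl_int,
route_cache_headroom) are hypothetical, and it assumes procfs is mounted at
/proc/sys. On later kernels that dropped the IPv4 route cache, this file may be
absent or have no effect.

```python
from pathlib import Path

# Assumption: procfs is mounted at the usual location.
PROC_SYS = Path("/proc/sys")


def sysctl_path(name: str, root: Path = PROC_SYS) -> Path:
    """Map a dotted sysctl name (e.g. net.ipv4.route.max_size) to its file."""
    return root.joinpath(*name.split("."))


def read_sysctl_int(name: str, root: Path = PROC_SYS) -> int:
    """Read an integer-valued sysctl; raises OSError if the file is missing."""
    return int(sysctl_path(name, root).read_text().split()[0])


def route_cache_headroom(current_entries: int, max_size: int) -> float:
    """Fraction of the cache still free; 0.0 means the limit has been hit."""
    return max(0.0, 1.0 - current_entries / max_size)
```

Calling read_sysctl_int("net.ipv4.route.max_size") on the affected machine
would be expected to return 8192 before the change described above; the entry
count passed to route_cache_headroom has to come from whatever cache statistics
the running kernel exposes.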

This is because Linux only starts aggressively expiring route cache entries when
the total is over net.ipv4.inet_peer_threshold, defaulting to 65664.  If the
routing cache is full, however, a garbage collection will be triggered on every
packet, even if nothing is expired.
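
The "garbage collection on every packet" behaviour described above can be
illustrated with a toy model. This is only a sketch under stated assumptions,
not the kernel's algorithm: every packet creates one new cache entry, an entry
becomes reclaimable only after entry_lifetime further packets, and a collection
pass runs whenever an insertion finds the cache full. The function name
simulate and its parameters are hypothetical.

```python
from collections import deque


def simulate(max_size: int, packets: int, entry_lifetime: int):
    """Return (gc_runs, futile_gc_runs, dropped) for the toy model."""
    cache = deque()          # creation time of each entry, oldest first
    gc_runs = 0              # collection passes triggered by a full cache
    futile_gc_runs = 0       # passes that freed nothing
    dropped = 0              # packets that could not get a cache entry

    for now in range(packets):
        if len(cache) >= max_size:
            gc_runs += 1
            freed = 0
            # Only entries older than entry_lifetime may be reclaimed.
            while cache and now - cache[0] >= entry_lifetime:
                cache.popleft()
                freed += 1
            if freed == 0:
                futile_gc_runs += 1
                dropped += 1
                continue
        cache.append(now)

    return gc_runs, futile_gc_runs, dropped
```

In this model a small limit makes most packets pay for a collection pass that
frees nothing, while a limit larger than the number of live entries never
triggers a pass at all, which is consistent with the observation that raising
max_size makes the symptom disappear.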

net.ipv4.inet_peer_threshold should default to half of net.ipv4.route.max_size,
or the code should be fixed to aggressively expire route cache entries when the
limit is hit.

I'm seeing a 30000+ packet/sec SYN flood in the wild against production servers.
 Fixing this is important for high-profile sites that aren't aware of this
Comment 1 David Miller 2003-04-30 02:12:40 EDT
Your analysis of the relationship between inet_peer_threshold
and route cache expiration is incorrect.  These two things are
totally unrelated.

inetpeer entries do not create a reference to a route cache entry
when they are attached to one.  A route cache entry with an attached
inetpeer may be expired immediately.  You can take a look at the
places where inetpeer entries are grafted onto route cache entries
(net/ipv4/route.c:rt_bind_peer()) and you will see that indeed the
rt->peer does not increment the routing cache entry reference count
nor does the presence of a non-NULL rt->peer affect garbage collection
of such a route.
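
The relationship described above can be sketched as a small model. This is an
illustration only, not the kernel code: the classes RouteEntry and InetPeer and
the helpers bind_peer, can_expire and garbage_collect are hypothetical names
chosen to mirror the statement that attaching a peer neither takes a reference
on the route nor influences whether the route can be collected.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InetPeer:
    addr: str


@dataclass
class RouteEntry:
    dst: str
    refcnt: int = 0                 # references held by users of the route
    peer: Optional[InetPeer] = None


def bind_peer(rt: RouteEntry, peer: InetPeer) -> None:
    """Attach a peer to a route without touching the route's refcount."""
    if rt.peer is None:
        rt.peer = peer


def can_expire(rt: RouteEntry) -> bool:
    """Eligibility depends only on the refcount, never on rt.peer."""
    return rt.refcnt == 0


def garbage_collect(cache: List[RouteEntry]) -> List[RouteEntry]:
    """Return the entries that survive a collection pass."""
    return [rt for rt in cache if not can_expire(rt)]
```

In this model a route with a bound peer and refcnt == 0 is collected just like
one without a peer; only an entry whose refcnt is non-zero survives.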

Note how increasing rt_cache max_size helps.  You would find that
decreasing inet_peer_threshold would have no effect on your problem,
because inetpeer and rt_cache settings have no bearing upon each
other, as described above.

What all of this means is that something else is referencing route
cache entries on your system, and thus preventing them from being
reclaimed.  The two main candidates are:

1) The network driver you use. If the driver has very deep transmit
   queues, or defers transmit net buffer reclaim in some way, this
   can cause routing cache entries to be held up for some time.

   What driver/card are you using for your network interfaces on
   this machine?

   If you are using some load balancing (proprietary or otherwise)
   or even netfilter on this machine, please indicate this.

2) Something other than the SYN flood is holding on to routes
   to ~8000 or so destinations (but fewer than 65664).
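
The first candidate above can be illustrated with a toy model of deferred
transmit reclaim. It is a sketch under assumptions, not a description of any
particular driver: each queued packet is assumed to hold one reference on the
route to its destination until the buffer is reclaimed, and reclaim is assumed
to happen in one batch once reclaim_every packets are pending. The function
name pinned_routes is hypothetical.

```python
from collections import Counter, deque
from typing import Iterable


def pinned_routes(destinations: Iterable[str], reclaim_every: int) -> int:
    """Peak number of distinct routes pinned by unreclaimed TX buffers."""
    refs = Counter()      # destination -> references held by queued packets
    pending = deque()     # destinations of packets awaiting reclaim
    peak = 0

    for dst in destinations:
        refs[dst] += 1
        pending.append(dst)
        peak = max(peak, len(refs))

        # Deferred reclaim: release all pending buffers in one batch.
        if len(pending) >= reclaim_every:
            while pending:
                done = pending.popleft()
                refs[done] -= 1
                if refs[done] == 0:
                    del refs[done]

    return peak
```

With replies going to a different destination each time, as in a random-source
flood, the number of pinned routes in this model grows with the reclaim batch
size; with a single repeated destination it stays at one.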
