Bug 973605 - rcu_preempt detected stalls on CPUs/tasks / comm: irq/43-eth0-rx- Not tainted
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 2.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Arnaldo Carvalho de Melo
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-06-12 10:20 UTC by evcz
Modified: 2013-08-13 15:07 UTC
CC: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-08-13 15:07:40 UTC
Target Upstream Version:
Embargoed:


Attachments
messages (26.88 KB, text/plain)
2013-06-12 10:20 UTC, evcz

Description evcz 2013-06-12 10:20:28 UTC
Created attachment 760071
messages

Description of problem:
During a denial-of-service attack, the network "crashed" on an HP DL120 G7.
The attack peaked at 350,000 packets per second while using less than 200 Mbit/s of bandwidth.

The attack was a spoofed SYN flood. SYN cookies were disabled; we run with tcp_abort_on_overflow enabled and rate-limit new connections to specific ports.
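
For reference, a minimal sketch of the sysctl settings described above, assuming the standard Linux TCP knobs (the exact values in use on the box were not captured):

# assumed sysctl equivalents of the configuration described above
sysctl -w net.ipv4.tcp_syncookies=0          # SYN cookies disabled
sysctl -w net.ipv4.tcp_abort_on_overflow=1   # send RST when the accept queue overflows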

The iptables ruleset that was in use (and that perhaps tore the network down?) was this one:

#SYN RATELIMIT TO SPECIFIC DESTINATION PORTS - no more than 100 SYN/sec to dst port (6-second tracking)
iptables -N INSYNPRELIMITER
iptables -A INSYNPRELIMITER -m hashlimit --hashlimit-htable-expire 6000 --hashlimit-htable-size 1048578 --hashlimit-htable-max 1048579 --hashlimit-mode dstport --hashlimit-name synDelimiter --hashlimit 100/s --hashlimit-burst 250 -j RETURN
iptables -A INSYNPRELIMITER -j DROP

iptables -I IN_SANITY -p tcp -m multiport --dports $ALLOWED_IG_TCP_CPORTS_LIST --syn -m state ! --state RELATED,ESTABLISHED -j INSYNPRELIMITER

It makes heavy use of conntrack.
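
Current conntrack table usage can be checked at run time via the standard procfs paths (shown here as an illustration):

# current vs. maximum number of tracked connections
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max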

This server is connected to an HP ProCurve J9147A 2910al-48G switch and configured in an LACP bond (2 ports).

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
        Aggregator ID: 2
        Number of ports: 2
        Actor Key: 17
        Partner Key: 59
        Partner Mac Address: 78:ac:c0:xx:xx:xx

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:b3:cc:xx:xx:xx
Aggregator ID: 2
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:b3:cc:xx:xx:xx
Aggregator ID: 2
Slave queue ID: 0

Version-Release number of selected component (if applicable):
3.6.11.2-rt33.39.el6rt.x86_64

How reproducible:
I'm not sure how to reproduce it; this is only the second time we have seen it happen (the previous time we were still running the 3.2 kernel-rt).

Additional info:
I've attached the messages log to this report.

Once the attack ended, the network came back properly.
I'm not sure whether this whole stack-trace log means the box was simply overloaded or whether it could have been avoided somehow.

Comment 1 Luis Claudio R. Goncalves 2013-07-02 17:49:01 UTC
We are investigating a couple of issues that could be related to this one. My first impulse is to believe that the attack caused the increased load that led to the RCU stalls (and that could have been made worse by the thread priorities in use on the system, if they were modified); however, we have one issue with bonding and one probable issue involving skb_gro_receive that could also be related.

As soon as we have more information on these issues I will post the results here.

Meanwhile, could you please post here the version of rtctl you are using?

Comment 2 evcz 2013-07-02 19:20:27 UTC
rpm -qa rtctl
rtctl-1.11-2.el6rt.noarch

Comment 3 Arnaldo Carvalho de Melo 2013-07-29 18:17:16 UTC
From the backtraces:

Cpu0: irq/43-eth0-rx-  processing RX packet (netfilter)
Cpu1: idle
Cpu2: idle
Cpu3: irq/45-eth0      tick sched timer, updating process times
Cpu4: irq/46-eth1-rx-  processing RX packet (e1000 irq handler)
Cpu5: idle
Cpu6: idle
Cpu7: irq/45-eth0       processing RX packet (e1000 irq handler)

So 50% of the CPUs are idle; three are associated with eth0 and one with eth1. On RT the IRQ handlers run as threads. Was any kind of tuning performed on those threads? You can see the names of the threads above, e.g. "irq/43-eth0-rx-".

What is the CPU topology of this machine? Is irqbalance running?
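
As a reference point, a sketch of how those IRQ-thread priorities and affinities can be inspected with standard tools (the IRQ number 43 is taken from the thread name above):

# scheduling policy and RT priority of the eth0 RX IRQ thread
ps -eo pid,rtprio,policy,comm | grep 'irq/43-eth0'
chrt -p $(pgrep -f 'irq/43-eth0-rx-')
# CPUs the hardware interrupt is allowed to fire on
cat /proc/irq/43/smp_affinity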

Comment 4 evcz 2013-07-30 20:17:42 UTC
irqbalance is not running (it is not installed at all), and no tuning was done.

That machine has this CPU:
http://ark.intel.com/products/52273/
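
The HT sibling layout can be confirmed directly on the box with standard tools, e.g.:

# logical CPUs with their core/socket mapping (util-linux)
lscpu -e
# hyperthread sibling pairs per logical CPU
cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list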

Comment 5 Arnaldo Carvalho de Melo 2013-08-06 14:44:56 UTC
The idle CPUs are HT siblings, as per the machine description, so the default tuning should be enough; i.e., there is no margin for attempting to dedicate specific CPU sockets to each NIC.

The messages appear because the machine is simply too overloaded to do RCU processing; they are not a sign of a bug.

Since the machine comes back to normal after this extreme overload, one way of alleviating the problem might be some further tuning of the conntrack setup. Have you managed to reproduce this problem in a test setup?
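
As an illustration of the kind of conntrack tuning meant here (the values below are assumptions and would need to be sized for the real workload):

# raise the connection-tracking table limit
sysctl -w net.netfilter.nf_conntrack_max=1048576
# age out half-open (SYN_RECV) entries faster so a SYN flood expires sooner
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_syn_recv=30
# resize the hash table backing the conntrack entries
echo 262144 > /sys/module/nf_conntrack/parameters/hashsize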

Comment 6 evcz 2013-08-06 19:59:38 UTC
Hi,

I was not able to reproduce it (actually I did not do many tests with spoofed traffic at that rate).

I tuned the iptables ruleset to add some filtering that greatly reduces the traffic reaching conntrack, so I hope it will not happen again.
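
An illustrative sketch of that kind of pre-conntrack filtering (not the exact rules now in use): the raw table is traversed before connection tracking, so bogus packets can be dropped, or selected traffic exempted, before conntrack ever sees them.

# drop malformed SYN+FIN probes before they reach conntrack
iptables -t raw -A PREROUTING -p tcp --tcp-flags SYN,FIN SYN,FIN -j DROP
# exempt a high-volume stateless service port from tracking entirely (port is hypothetical)
iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK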

I've got the same impression about the overload, so this is probably not really a bug.

Thank you,
best regards

Comment 7 Clark Williams 2013-08-13 15:07:40 UTC
(In reply to evcz from comment #6)
> Hi,
> 
> I was not able to reproduce it (actually I did not do many tests with
> spoofed traffic at that rate).
> 
> I tuned the iptables ruleset to add some filtering that greatly reduces
> the traffic reaching conntrack, so I hope it will not happen again.
> 
> I've got the same impression about the overload, so this is probably not
> really a bug.
> 
> Thank you,
> best regards

We've looked at this a good bit, and the consensus is that the system worked as designed, in that the kernel did recover after your network load dropped back to normal levels. I suspect that adjusting the iptables rules to catch DoS sequences will help, but I'm not sure there's actually a bug here.

Closing as NOTABUG, but if you encounter this again, please reopen and we'll look at it again.

