Bug 453291

Summary: Filters for tc and netem can no longer be defined
Product: [Fedora] Fedora
Reporter: Karl Auerbach <karl>
Component: iproute
Assignee: Marcela Mašláňová <mmaslano>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 9
CC: kernel-maint, mmaslano, rvokal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-12-10 04:36:24 UTC
Attachments (description, flags):
Shell script to trigger the problem. (flags: none)
Script output when things work (on Fedora 8 system) (flags: none)
Script output when things do not work (on a Fedora 9 system) (flags: none)

Description Karl Auerbach 2008-06-29 00:21:57 UTC
Description of problem:

Netem/tc filters have stopped working, or more precisely, they are no longer
accepted into the kernel.  This used to work but it stopped working with recent
kernel releases.  It worked fine on Fedora 8.

Version-Release number of selected component (if applicable):

Linux A192-203-17-213.cavebear.com 2.6.25.6-55.fc9.x86_64 #1 SMP Tue Jun 10
16:05:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

This same problem also occurs on non-Fedora distros that use very recent kernels.

How reproducible:

Always

Steps to Reproduce:
1. See attached shell script
  
Actual results:

See the attached shell script.

Expected results:


Additional info:

Comment 1 Karl Auerbach 2008-06-29 00:21:57 UTC
Created attachment 310524 [details]
Shell script to trigger the problem.

Comment 2 Chuck Ebbert 2008-07-03 04:18:27 UTC
(In reply to comment #1)
> Created an attachment (id=310524)
> Shell script to trigger the problem.
> 

Can you post the output messages from the script?

"no longer accepted into the kernel" is not a useful description of the problem.


Comment 3 Karl Auerbach 2008-07-03 20:07:56 UTC
Created attachment 310957 [details]
Script output when things work (on Fedora 8 system)

This is the output from the test script when run on a system that does not have
the problem (in particular a Fedora 8 system).

I will also attach a copy of the output on a system that has the problem (a
Fedora 9 box.)

Comment 4 Karl Auerbach 2008-07-03 20:08:44 UTC
Created attachment 310958 [details]
Script output when things do not work (on a Fedora 9 system)

Here is the companion output that shows what happens when the script is run on
a misbehaving system (Fedora 9)

Comment 5 Karl Auerbach 2008-07-03 20:16:40 UTC
I have attached two output files.  They show the script output on a good system
and on a bad system.

The script itself contains the expected output and compares it to what it
actually got.

Sorry for not being clearer about what is going on.  Here is a reprise:

The "tc" command uses netlink to interact with the kernel to build various
packet queues and filters.  These queues and filters are used to do traffic
shaping and also things like netem.

There were some recent kernel changes with respect to netlink that broke the
ability to send some parameters to the sch_netem kernel module.  Those were
fixed.  These changes, and subsequent fixes, may be related to the bug I am
reporting now.

At the current time "tc" seems to be able to properly set up and report traffic
queues and to interact with the netem module.

However, the "tc" command seems to no longer have the ability to establish
filters in the kernel.

Until a few kernel versions ago "tc" was able to establish filters; as of the
current time that ability is gone.

The test script builds a basic traffic control framework and then tries to add a
couple of filters.

That script used to work.  But now several of the filter-building commands fail
and report an error from the kernel.

The kernel logs nothing to dmesg, and there's nothing in the normal system log
files.

SELinux is present but in "permissive" mode.  No SELinux messages are emitted
into its logs.
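
For readers without the attachment, the kind of setup described in this comment (a queueing framework, a netem leaf, then a filter) can be sketched roughly as below.  This is not the attached script.  The device name, the 1: and 30: handles, the 100ms delay and the 192.0.2.10 address are all assumptions picked for illustration, and by default the sketch only prints the commands instead of running them.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the attached reproducer.
# Assumed values: DEV, the 1: / 30: handles, the 100ms delay, 192.0.2.10.
DEV=${DEV:-eth0}
TC=${TC:-/sbin/tc}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; 0 = run them (needs root)

# Print a command, and execute it unless we are in dry-run mode.
run() {
    echo "$@"
    [ "$DRY_RUN" = 1 ] || "$@"
}

build_netem_tree() {
    # Root prio qdisc; by default it provides classes 1:1, 1:2 and 1:3.
    run "$TC" qdisc add dev "$DEV" root handle 1: prio
    # Hang netem under class 1:3 so only traffic steered there is delayed.
    run "$TC" qdisc add dev "$DEV" parent 1:3 handle 30: netem delay 100ms
    # u32 filter that steers packets for the assumed address into class 1:3.
    run "$TC" filter add dev "$DEV" protocol ip parent 1:0 prio 3 \
        u32 match ip dst 192.0.2.10/32 flowid 1:3
}

build_netem_tree
```

In this report the two "qdisc add" steps are the part that still works; it is the "filter add" step that gets rejected.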

Comment 6 Chuck Ebbert 2008-07-07 23:47:55 UTC
Seems to work in a 2.6.25.9 kernel...

Comment 7 Karl Auerbach 2008-07-08 09:23:27 UTC
Yes, I'm also seeing it fixed on both 32 and 64-bit platforms, at least with
Fedora 8.

By the way, 2.6.25.10 direct from kernel.org without the RH/Fedora patches still
fails.  I wonder which patch had the silver bullet.

I'll do some further testing later today (Tuesday/July 8) on a Fedora 9/x86-64
box and let you know.


Comment 8 Karl Auerbach 2008-07-08 20:41:01 UTC
Still fails on Fedora 9 on Linux A192-203-17-213.cavebear.com
2.6.25.9-76.fc9.x86_64 #1 SMP Fri Jun 27 15:58:30 EDT 2008 x86_64 x86_64 x86_64
GNU/Linux

So it seems fixed on the latest Fedora 8 kernel but not on the latest Fedora 9
kernel.


Comment 9 Chuck Ebbert 2008-07-09 18:25:51 UTC
Hmm, maybe the bug is in the tc program and not the kernel?


Comment 10 Karl Auerbach 2008-07-09 20:49:23 UTC
Could be.

However, the same problem occurs on a small distro I build for embedded systems
using iproute2-2.6.25 (the latest version of tc).  And there hasn't been a
change to 'tc' on the Fedora side through all of this.

My own guess is that some of the recent kernel changes to netlink and its kernel
macros are involved.  Those changes clobbered some of the parameters being sent
from 'tc' to the sch_netem module (this has been fixed).  It would not be
surprising if the same kind of thing is clobbering the filter parameters
that 'tc' sends via the same mechanisms.

I tend to suspect that the fix is in one of those patches in the Fedora 8 kernel
chain that is not in the Fedora 9 kernel chain.


Comment 11 Karl Auerbach 2008-07-09 23:36:22 UTC
I ran some tests to figure out whether it is the kernel acting up or the 'tc'
command, and got some interesting results.

On an F8 box I ran both the F8 and F9 binaries for 'tc'.  The F8 one worked, the
F9 one showed the problem.

Then on an F9 box I again ran both the F8 and F9 binaries for 'tc'.  Again, the
F8 one worked and the F9 one showed the problem.

On the F8 box it's:
root@klack(44): rpm -qf /sbin/tc
iproute-2.6.22-2.fc8

And on the F9:
iproute-2.6.25-1.fc9.x86_64

Hmmm ... on my embedded system, on which I'm seeing the problem, I'm also using
the 2.6.25 version of iproute2.
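
The cross-check described in this comment (same kernel, same filter command, different 'tc' binaries) can be sketched as below.  The binary paths in the usage line are hypothetical placeholders, and judging success by exit status is an assumption of the sketch; it is worth confirming with "tc filter show" as well, since a command can return without the filter being set up.

```shell
#!/bin/sh
# Sketch: run one 'filter add' under several tc binaries on the same kernel
# and report which ones succeed.  Paths passed in are placeholders.
DEV=${DEV:-eth0}

try_tc() {
    # $1 = path to a tc binary.
    # Best-effort cleanup of earlier prio 3 filters; the result is ignored.
    "$1" filter del dev "$DEV" parent 1:0 prio 3 >/dev/null 2>&1
    # Same 'filter add' form as used elsewhere in this report.
    if "$1" filter add dev "$DEV" parent 1:0 prio 3 protocol ip u32 \
        >/dev/null 2>&1; then
        echo "$1: accepted"
    else
        echo "$1: rejected"
    fi
}

compare_tc() {
    for bin in "$@"; do
        try_tc "$bin"
    done
}

# Usage (needs root and an existing 1:0 qdisc on $DEV):
#   compare_tc /path/to/f8/tc /path/to/f9/tc
```

If one binary is accepted and the other rejected on the same kernel, that points at the 'tc' side, which is what the results above suggest.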





Comment 12 Chuck Ebbert 2008-08-12 02:34:29 UTC
Looks like this is fixed in iproute2 2.6.26.

Both F8 and F9 should get this version since that kernel version is going to be pushed to them soon.

Comment 13 Marcela Mašláňová 2008-08-12 12:54:31 UTC
I tried iproute-2.6.26 on my F-9 box with 2.6.25.11-97.fc9.i686 and had no luck. I think something on the tc side was also fixed in the kernel. I'll push this iproute update into testing and we'll see whether the kernel and iproute together fix this issue.

Comment 14 Fedora Update System 2008-08-12 14:17:57 UTC
iproute-2.6.26-1.fc9 has been submitted as an update for Fedora 9

Comment 15 Karl Auerbach 2008-08-16 00:45:01 UTC
I have not had a chance to check out iproute-2.6.26-1.fc9 (in fact I'm having trouble finding it), but my testing of a vanilla version of iproute2 2.6.26 on a vanilla 2.5.26-2 kernel indicates that the problem still exists on that pair.

If I run the following script right out of the netem page (http://www.linuxfoundation.org/en/Net:Netem) the kernel no longer emits error messages but the filters are not set up.  But if I change the script to use a version of 'tc' compiled using the iproute2-2.6.24-rc7 code base then things seem happy.

#!/bin/sh

DEV=eth1
TC=/usr/sbin/tc

echo ${TC} filter del dev ${DEV} prio 3
${TC} filter del dev ${DEV} prio 3
echo ${TC} filter add dev ${DEV} parent 1:0 prio 3 protocol ip u32
${TC} filter add dev ${DEV} parent 1:0 prio 3 protocol ip u32
echo ${TC} filter add dev ${DEV} parent 1:0 prio 3 handle 2: u32 divisor 1
${TC} filter add dev ${DEV} parent 1:0 prio 3 handle 2: u32 divisor 1
echo ${TC} filter add dev ${DEV} parent 1:0 handle ::1 prio 3 u32 match u8 0x40 0xF0 at 0 offset mask 0x0F00 at 0 shift 6 plus 0 ht 800:: link 2:
${TC} filter add dev ${DEV} parent 1:0 handle ::1 prio 3 u32 match u8 0x40 0xF0 at 0 offset mask 0x0F00 at 0 shift 6 plus 0 ht 800:: link 2:
echo ${TC} filter add dev ${DEV} protocol ip parent 1:0 prio 3 u32 match u8 0x11 0xff at 9 flowid 1:4 ht 2:0:
${TC} filter add dev ${DEV} protocol ip parent 1:0 prio 3 u32 match u8 0x11 0xff at 9 flowid 1:4 ht 2:0:

echo ${TC} qdisc show dev ${DEV}
${TC} qdisc show dev ${DEV}
echo ${TC} class show dev ${DEV}
${TC} class show dev ${DEV}
echo ${TC} filter show dev ${DEV}
${TC} filter show dev ${DEV}

Comment 16 Marcela Mašláňová 2008-08-19 13:14:06 UTC
The problem is also in rawhide: kernel-2.6.27 and latest iproute2 from upstream git.

Comment 17 Fedora Update System 2008-09-10 07:14:35 UTC
iproute-2.6.26-1.fc9 has been pushed to the Fedora 9 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update iproute'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-7583

Comment 18 Marcela Mašláňová 2008-09-16 14:15:13 UTC
Ok, this update didn't fix tc.

Could you please tell me which kernel you have with a working iproute2-2.6.24-rc7? I tried different versions of the kernel and iproute, and now I can't find any functional pair ;-)

/tmp/iproute/devel/iproute-2.6.24/iproute2-2.6.24/tc/tc filter add dev eth0 parent 1:0 prio 3 protocol ip u32
RTNETLINK answers: Invalid argument
We have an error talking to the kernel

Comment 19 Marcela Mašláňová 2008-09-22 12:32:55 UTC
Your scripts are missing the new mandatory argument "protocol".

I changed the error message from "We have an error talking to the kernel" to "protocol is required" and I'll send it upstream soon.
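
Applied to the script from comment 15, that would mean adding "protocol" to the two "filter add" commands that omit it.  A sketch follows, assuming "protocol ip" is the intended value (the other two commands in that script already use it).  It has not been verified against the affected systems, and by default it only prints the commands.

```shell
#!/bin/sh
# Sketch: the comment 15 'filter add' commands with 'protocol ip' on each.
# Assumption: 'ip' is the right protocol for all four commands.
DEV=${DEV:-eth1}
TC=${TC:-/usr/sbin/tc}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; 0 = run them (needs root)

# Print a command, and execute it unless we are in dry-run mode.
run() {
    echo "$@"
    [ "$DRY_RUN" = 1 ] || "$@"
}

add_filters() {
    # Unchanged: already names the protocol.
    run "$TC" filter add dev "$DEV" parent 1:0 prio 3 protocol ip u32
    # 'protocol ip' added before the u32 keyword.
    run "$TC" filter add dev "$DEV" parent 1:0 prio 3 handle 2: \
        protocol ip u32 divisor 1
    # 'protocol ip' added before the u32 keyword.
    run "$TC" filter add dev "$DEV" parent 1:0 handle ::1 prio 3 \
        protocol ip u32 match u8 0x40 0xF0 at 0 \
        offset mask 0x0F00 at 0 shift 6 plus 0 ht 800:: link 2:
    # Unchanged: already names the protocol.
    run "$TC" filter add dev "$DEV" protocol ip parent 1:0 prio 3 \
        u32 match u8 0x11 0xff at 9 flowid 1:4 ht 2:0:
}

add_filters
```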

Comment 20 Marcela Mašláňová 2008-12-04 09:14:43 UTC
This should be fixed in iproute-2.6.27.

Comment 21 Fedora Update System 2008-12-04 10:17:37 UTC
iproute-2.6.27-1.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/iproute-2.6.27-1.fc9

Comment 22 Fedora Update System 2008-12-07 04:32:00 UTC
iproute-2.6.27-1.fc9 has been pushed to the Fedora 9 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing-newkey update iproute'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-10904

Comment 23 Fedora Update System 2008-12-10 04:36:20 UTC
iproute-2.6.27-1.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.