Bug 447812 - Netlink messages from "tc" to sch_netem module are not interpreted correctly
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: x86_64 Linux
Priority: low
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Duplicates: 447809
 
Reported: 2008-05-21 17:38 EDT by Karl Auerbach
Modified: 2008-06-27 02:39 EDT (History)
Fixed In Version: 2.6.25.6-55.fc9
Doc Type: Bug Fix
Last Closed: 2008-06-12 22:27:29 EDT

Description Karl Auerbach 2008-05-21 17:38:24 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.14) Gecko/20080416 Fedora/2.0.0.14-1.fc8 Firefox/2.0.0.14

Description of problem:
When using the tc command to send netem settings to the netem module, the messages cause a kernel error to be emitted into dmesg and part of the data in the netlink message may be lost.

This makes the netem mechanism quietly unreliable when used on a 2.6.25 kernel.

I have tried this on all of the 2.6.25.X kernels and the results are the same.

This problem did not occur with the 2.6.24 kernels.

The problem is the same on x86_64 and i386 architectures.

Version-Release number of selected component (if applicable):
2.6.25.3-18.fc9.x86_64

How reproducible:
Always


Steps to Reproduce:
Here is a shell script (needs to be run as root):

#!/bin/bash

DEV=eth0
TC=/sbin/tc

# Clean out any prior settings.
# This may generate some messages of the form:
#   RTNETLINK answers: No such file or directory
${TC} qdisc del dev ${DEV} root > /dev/null 2>&1
${TC} qdisc del dev ${DEV} ingress > /dev/null 2>&1

${TC} qdisc add dev ${DEV} root handle 1: prio bands 5 priomap 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
${TC} qdisc add dev ${DEV} parent 1:1 handle 10: netem

# If the kernel is acting up this will cause a kernel
# message of the following form to be emitted and visible
# via dmesg | tail
#  netlink: 12 bytes leftover after parsing attributes.
${TC} qdisc change dev ${DEV} parent 1:1 handle 10: netem delay 50ms 5ms 10% corrupt 8%
echo
echo
echo "Does the following contain a netlink message about leftover bytes?"
echo "If so, the kernel code in .../net/netlink/attr.c"
echo "is unhappy with the netlink messages from the tc command."
dmesg | tail -3

# Take a look at the netem status and see whether a corruption
# value has been established or not.
echo
echo
echo "Does the following show a corruption setting or not?"
echo "If not then the kernel module .../net/sched/sch_netem.c"
echo "did not pick up all the pieces from the netlink message"
echo "complained of by .../net/netlink/attr.c"
echo "A GOOD response should look like this:"
echo "    qdisc prio 1: root bands 5 priomap  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4"
echo "    qdisc netem 10: parent 1:1 limit 1000 delay 50.0ms  5.0ms 10% corrupt 8%"

${TC} qdisc show dev ${DEV}

${TC} qdisc show dev ${DEV} | grep netem | grep -q corrupt
RC=$?

if [ "${RC}" != 0 ] ; then
   echo
   echo "I did not see any corruption setting, did you?"
   echo "Seems like there is a bug in the tc-to-netem module netlink."
fi

# Clean up after ourselves.
# This may generate some messages of the form:
#   RTNETLINK answers: No such file or directory
${TC} qdisc del dev ${DEV} root > /dev/null 2>&1
${TC} qdisc del dev ${DEV} ingress > /dev/null 2>&1


Actual Results:
On all 2.6.25 kernels, whether i386 or x86_64, this causes a kernel message to be emitted about unused bytes.  In addition, the netem module does not pick up all of the data that was sent to it, such as the corruption settings.

All of this stuff worked in the 2.6.24 kernels.

Expected Results:
No kernel message should have been emitted.

The data sent by the user via the "tc" command should have been received by the netem module.

Additional info:
I tried to figure out whether the problem is in "tc" or in the kernel.

So I ran some old "tc" binaries on new kernels.  The result was the same as if I had run current "tc" binaries.  This suggests that the problem is in the kernel rather than in the "tc" command.
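
The old-versus-new "tc" comparison can be repeated with a small helper that runs the same netem check through whichever tc binary it is given. This is only a sketch under assumptions: the name check_tc is hypothetical, the qdisc commands are the ones from the script above, and it has to be run as root against a real device.

```sh
#!/bin/sh
# Sketch: run the netem "corrupt" check through a given tc binary.
# check_tc is a hypothetical helper name, not part of iproute2.
# Returns 0 if the corrupt setting survived, 1 if it was lost,
# 2 if the qdiscs could not be set up at all.
check_tc() {
    tc_bin=$1
    dev=$2

    # Start from a clean root qdisc; an error here just means none existed.
    "$tc_bin" qdisc del dev "$dev" root > /dev/null 2>&1 || true

    "$tc_bin" qdisc add dev "$dev" root handle 1: prio bands 5 \
        priomap 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 || return 2
    "$tc_bin" qdisc add dev "$dev" parent 1:1 handle 10: netem || return 2
    "$tc_bin" qdisc change dev "$dev" parent 1:1 handle 10: \
        netem delay 50ms 5ms 10% corrupt 8% || return 2

    # The check passes only if the kernel kept the corrupt setting.
    if "$tc_bin" qdisc show dev "$dev" | grep netem | grep -q corrupt; then
        rc=0
    else
        rc=1
    fi

    "$tc_bin" qdisc del dev "$dev" root > /dev/null 2>&1 || true
    return "$rc"
}
```

For example, `check_tc /sbin/tc eth0` and then the same call with the path of an older tc build: if both return 1 on the same kernel, the tc binary is unlikely to be the cause.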

I also did some simple printk debugging on the way that .../net/netlink/attr.c was parsing the netlink messages.  The number of unused bytes it reported was always the size of the first chunk of netem data that was lost from the netlink message.
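
To line up the reported byte counts with the netem settings that go missing, the counts can be pulled out of the kernel log. The sketch below assumes the message format quoted in the script above ("netlink: 12 bytes leftover after parsing attributes."); the name leftover_bytes is hypothetical.

```sh
#!/bin/sh
# Sketch: print the byte count from each "bytes leftover" kernel message.
# leftover_bytes is a hypothetical helper name; it reads log text on stdin
# and prints one number per matching line. Other lines are ignored.
leftover_bytes() {
    sed -n 's/.*netlink: \([0-9][0-9]*\) bytes leftover after parsing attributes.*/\1/p'
}
```

Typical use would be `dmesg | leftover_bytes | sort -n | uniq -c`, run after trying several different netem option combinations.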
Comment 1 Dave Jones 2008-05-21 17:42:28 EDT
*** Bug 447809 has been marked as a duplicate of this bug. ***
Comment 2 Chuck Ebbert 2008-05-27 21:49:01 EDT
fixed in 2.6.25.4-37
Comment 3 Fedora Update System 2008-06-11 21:38:43 EDT
kernel-2.6.25.6-55.fc9 has been submitted as an update for Fedora 9
Comment 4 Fedora Update System 2008-06-12 22:27:03 EDT
kernel-2.6.25.6-55.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.
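
A quick way to tell whether a machine already runs the updated kernel is to compare `uname -r` against the release named above. This is a rough sketch, not an official check: release_at_least is a hypothetical helper, and it only compares the first five numeric fields of the release string. The threshold 2.6.25.6-55 is the stable update; Comment 2 names 2.6.25.4-37, which can be passed instead.

```sh
#!/bin/sh
# Sketch: succeed if kernel release $1 is at least release $2.
# release_at_least is a hypothetical helper name. It compares the first
# five '.'/'-' separated fields numerically; non-numeric fields such as
# "fc9" count as 0.
release_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, /[.-]/)
        split(want, w, /[.-]/)
        for (i = 1; i <= 5; i++) {
            a = (h[i] ~ /^[0-9]+$/) ? h[i] + 0 : 0
            b = (w[i] ~ /^[0-9]+$/) ? w[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

if release_at_least "$(uname -r)" "2.6.25.6-55"; then
    echo "Running kernel is at or past 2.6.25.6-55."
else
    echo "Running kernel predates 2.6.25.6-55."
fi
```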
