Description of Problem:
After the kernel upgrade from 2.4.9-34 to 2.4.18-17.7.x, shapecfg-based traffic shaping no longer works as expected. A configuration that worked under 2.4.9-34 and limited the server to 40Mb/s of output is no longer enforced under 2.4.18-17.7.x.

Steps to Reproduce:
1. Set up shapecfg to rate-limit traffic.
2. Boot into 2.4.9-34, run /sbin/cbq, and observe that traffic shaping works.
3. Boot into 2.4.18-17.7.x, run /sbin/cbq, and observe that traffic shaping does not work.

Additional Information:
I have used /sbin/tc -s qdisc to verify that the kernel's traffic shaping tables are indeed getting populated, and they are in both cases.

2.4.18-17.7.x:

qdisc tbf 8014: dev eth1 rate 40Mbit burst 0b lat 5.7s
 Sent 95405708292 bytes 65866884 pkts (dropped 1114107, overlimits 0)
qdisc tbf 8013: dev eth1 rate 30Mbit burst 7679b lat 3.8s
 Sent 110648510 bytes 82238 pkts (dropped 8, overlimits 0)
qdisc cbq 11: dev eth1 rate 100Mbit (bounded,isolated) prio no-transmit
 Sent 95553392562 bytes 65973876 pkts (dropped 1114136, overlimits 0)
 borrowed 0 overactions 0 avgidle 0 undertime 0
qdisc tbf 8012: dev eth3 rate 30Mbit burst 407037b lat 306.4s
 Sent 15523275 bytes 14198 pkts (dropped 0, overlimits 0)
qdisc tbf 8011: dev eth3 rate 40Mbit burst 790Kb lat 3.8s
 Sent 371566 bytes 277 pkts (dropped 0, overlimits 0)
qdisc cbq 13: dev eth3 rate 100Mbit (bounded,isolated) prio no-transmit
 Sent 65365073 bytes 54427 pkts (dropped 0, overlimits 0)
 borrowed 0 overactions 0 avgidle 0 undertime 0

2.4.9-34:

qdisc tbf 8010: dev eth1 rate 40Mbit burst 800Kb lat 95.4ms
 Sent 6120072085514 bytes 1395106 pkts (dropped 521517270, overlimits 183082)
 backlog 1154401b 799p
qdisc tbf 800f: dev eth1 rate 30Mbit burst 10Kb lat 1.6ms
 Sent 3015206901 bytes 3293553 pkts (dropped 3765, overlimits 594463)
qdisc cbq 11: dev eth1 rate 100Mbit (bounded,isolated) prio no-transmit
 Sent 6126443204970 bytes 6975099 pkts (dropped 521521035, overlimits 3276727909)
 backlog 799p
 borrowed 0 overactions 0 avgidle 419 undertime 0
qdisc tbf 800e: dev eth3 rate 30Mbit burst 400Kb lat 190.7ms
 Sent 168569898167 bytes 112910190 pkts (dropped 0, overlimits 0)
qdisc tbf 800d: dev eth3 rate 40Mbit burst 800Kb lat 1us
 Sent 23667003328 bytes 15937878 pkts (dropped 0, overlimits 0)
qdisc cbq 13: dev eth3 rate 100Mbit (bounded,isolated) prio no-transmit
 Sent 201429878569 bytes 136267731 pkts (dropped 0, overlimits 214334)
 borrowed 0 overactions 0 avgidle 624 undertime 0

Note that the output above is from a slightly different configuration, taken while I was trying to isolate the core problem. The 2.4.9 system has BUFFER=800Kb/8 and LIMIT=1200Kb in the CBQ config file, while the 2.4.18 system does not. With those values in the cbq config file, the 2.4.18 kernel was not dropping any packets at all. Without them, it appears that the kernel is performing some traffic shaping, just not what is expected.
Here is the CBQ config file in use:

DEVICE=eth1,100Mbit,10Mbit
RATE=40Mbit
WEIGHT=4Mbit
PRIO=5
BUFFER=800Kb/8
LIMIT=1200Kb
RULE=<local ip>,

And /sbin/cbq had AVPKT changed to 10000 so that shaping would work correctly in 2.4.9.
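For reference, a rough sketch of the kind of tc commands a cbq setup script would issue for a config like this. This is my own reconstruction, not taken from the actual /sbin/cbq script; the allot and cell values in particular are guesses, and the tbf line matches the tbf qdiscs visible in the tc -s qdisc output above:

```shell
# Hypothetical reconstruction of the shaping setup -- not the real script.
# Root CBQ qdisc on the 100Mbit interface, AVPKT=10000 as in the modified script.
tc qdisc add dev eth1 root handle 1: cbq bandwidth 100Mbit avpkt 10000 cell 8

# Class implementing RATE/WEIGHT/PRIO from the config file.
tc class add dev eth1 parent 1: classid 1:1 cbq bandwidth 100Mbit \
    rate 40Mbit weight 4Mbit prio 5 allot 1514 cell 8 avpkt 10000 bounded

# Leaf tbf qdisc implementing BUFFER/LIMIT (matches the "qdisc tbf" entries above).
tc qdisc add dev eth1 parent 1:1 tbf rate 40Mbit buffer 800Kb/8 limit 1200Kb

# Filter implementing the RULE line (placeholder IP kept as in the config).
tc filter add dev eth1 parent 1:0 protocol ip prio 100 u32 \
    match ip src <local ip> flowid 1:1
```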
2.4.18-17.7.x i586 with HZ=100 appears to work as expected, so it looks like the higher HZ values in 2.4.18-17.7.x i686 kernels break some assumptions in the traffic shaping code.
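The "burst 0b" and multi-second latencies in the tc output above would fit a timer-tick mismatch. As a back-of-the-envelope illustration of why HZ matters here (this is the documented tbf sizing rule, not the kernel code itself): the token bucket is refilled once per timer tick, so the burst buffer must hold at least one tick's worth of traffic, i.e. rate/HZ bytes. The HZ=512 figure below is an assumption on my part about the i686 errata kernels:

```python
# Minimum usable TBF burst for a given rate and kernel HZ.
# Rule of thumb from tbf: burst must be at least rate / HZ bytes,
# since tokens arrive only once per timer tick.

def min_burst(rate_bits_per_sec, hz):
    """Smallest TBF burst, in bytes, that can sustain the given rate."""
    return rate_bits_per_sec / 8 / hz

rate = 40_000_000  # 40Mbit/s, as in the config above

print(min_burst(rate, 100))  # HZ=100 (i586): 50000.0 bytes
print(min_burst(rate, 512))  # HZ=512 (i686, assumed): 9765.625 bytes
```

The arithmetic itself doesn't break anything, but any user-space tool that converts BUFFER/LIMIT times into ticks assuming HZ=100 will compute wrong values on a kernel running at a different HZ.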
The question is: is this a kernel bug, or a bug in shapecfg? If you can confirm it is the latter, reassign it to me (arjan) if you like.
sysstat is also afflicted by a similar problem in bug #73827
Oops! The bug number in my last comment is wrong. The sysstat bug is bug #74302.
So, uh, this has been quiet for a couple of weeks, and wasn't fixed in the latest errata release. Is there a workaround? Andrew mentioned assumptions based on the HZ value in the kernel. Should values passed to the traffic shaper be adjusted for this? If so, how so? I *really* need this traffic shaping.
This appears to have been fixed as of 2.4.18-26.7.x. Shaping is working at least up to 70Mb/s now.
I've tested this on RHL 7.2 with kernel 2.4.18-27.7.x and it doesn't seem to be fixed. I'm seeing much higher load than I was under 2.4.9-31, and I'm not seeing the bandwidth usage decrease. Are you sure you're seeing it work on -26? Have you tried -27? Maybe it was a regression.
This bug should really be reopened. The traffic shaping is NOT working in 2.4.18-27.7.x