Description of problem:
When running very heavy UDP Tx stress traffic with 10/100 adapters, load-sharing collapses to only one slave after a few seconds. Caused by an unsigned/signed cast error in the TLB code.

Version-Release number of selected component (if applicable):
kernel-2.4.20-1.1931.2.231.2.11.ent

How reproducible:
Configure a bond team with only 10/100 adapters and run very heavy UDP Tx stress traffic to many clients. Monitor Tx/Rx activity of the slaves.

Steps to Reproduce:
1. insmod bonding mode=5
2. ifconfig bond0 <ip-addr>
3. ifenslave bond0 eth0 eth1 eth2
4. start stress application (e.g. iperf, netperf, etc.)

Actual results:
After a few seconds only one slave takes part in load sharing while the others stay idle. Traffic may pass from slave to slave at 10 sec. intervals (re-balance timeout).

Expected results:
All slaves continuously take part in the load sharing.

Additional info:
A bug fix patch was sent by me on June 26th to the bond-devel, linux-net and linux-netdev lists. It was already accepted by Jeff Garzik into his net-drivers-2.4 BK tree.
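For readers unfamiliar with this class of bug, here is a minimal sketch of how a signed/unsigned mix-up in a "least loaded slave" selection could produce the symptom described above. It is not the bonding driver's code and not the submitted patch: the struct, the function names, and the scaling factors (speed in Mbps shifted by 20, load in bytes shifted by 3) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-slave state; NOT the kernel's structure. */
struct slave_info {
    uint32_t speed_mbps;  /* link speed, e.g. 10 or 100 */
    uint32_t load_bytes;  /* bytes assigned during the current interval */
};

/*
 * Buggy variant: the gap between capacity and load is computed in
 * unsigned arithmetic. Once load exceeds capacity (easy to reach on
 * 10/100 links under stress), the subtraction wraps to a huge value,
 * so the overloaded slave looks like the one with the most headroom.
 */
int least_loaded_buggy(const struct slave_info *s, int n)
{
    int best = -1;
    uint32_t max_gap = 0;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        uint32_t gap = (s[i].speed_mbps << 20) - (s[i].load_bytes << 3);
        if (best < 0 || gap > max_gap) {
            best = i;
            max_gap = gap;
        }
    }
    return best;
}

/*
 * Fixed variant: widen to a signed 64-bit type before subtracting, so
 * an overloaded slave gets a negative gap and loses the comparison.
 */
int least_loaded_fixed(const struct slave_info *s, int n)
{
    int best = -1;
    int64_t max_gap = INT64_MIN;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        int64_t gap = ((int64_t)s[i].speed_mbps << 20)
                    - ((int64_t)s[i].load_bytes << 3);
        if (gap > max_gap) {
            best = i;
            max_gap = gap;
        }
    }
    return best;
}
```

With three 100 Mbps slaves carrying 20,000,000, 1,000,000 and 5,000,000 bytes respectively, the buggy variant returns index 0 (the overloaded slave, whose gap has wrapped around), while the fixed variant returns index 1. In this sketch, every new flow would keep landing on the same overloaded slave until its load counter is reset, which resembles the collapse to a single slave reported here.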
ISSUE TRACKER 25886 opened as sev 1
Jeff, does Taroon already have the patch for this, or is it still in your queue?
The fix appears to be implemented in the RHEL 3 B1 candidate kernel (version 2.4.21-1.1931.2.349.2.2.ent).