Bug 227005
| Summary: | speed limit on bonding interface | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | David Kostal <david.kostal> |
| Component: | kernel | Assignee: | Andy Gospodarek <agospoda> |
| Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.4 | CC: | peterm |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | ia32e | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2007-02-15 16:42:06 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description (David Kostal, 2007-02-02 16:11:28 UTC)
Do you have any idea why this is happening? Are you able to reproduce this behaviour? Or could this be some sort of configuration problem only?

> I am able to receive 2x 1 Gbps streams on bond0 (balance-alb) from two other hosts.

This is working as expected for balance-alb, but rr mode would be limited to 1 Gbps, since the same MAC/IP combination would be used on all systems.

> I am able to send a 1 Gbps stream to one other host via bond0.

This is working as expected.

> I am able to send 2x 1 Gbps streams to host1 via bond0 and host3 via bond1.

This is working as expected.

> I am only able to send 2x 0.5 Gbps streams to host1 and host2 via bond0 (both slaves are used, each carrying traffic to a single host, but only at 0.5 Gbps).

Are you using TCP or UDP? Does this change when switching to rr mode and UDP?

> Is there any reason why the outgoing traffic via bond0 doesn't go over 1 Gbps? The host utilization is well under its limits.

There are no hard limitations in the driver that cap speeds at 1 Gbps. There are definitely some limitations on bonding and how much you can transmit to and receive from a single host -- generally the limitation is on reception, since the switch can't learn the destination MAC on multiple interfaces and stripe the traffic across them. This limitation is lifted in 802.3ad, xor, and balance-alb modes, since the switch can hash different connections over different interfaces, but each TCP/UDP stream (and, in alb's case, each host) will still be limited to the speed of a single slave interface.

I'm using TCP (with alb, tlb and rr). Actually I have two identical setups on two PE2850s.
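The per-stream limit Andy describes follows from how the xor-style transmit policies pick a slave: with the layer2 policy described in the bonding driver documentation, the slave is chosen from the source and destination MACs alone, so all frames to a given peer ride the same slave. A minimal Python sketch of that idea (the peer MACs are made up; this only models the `(src MAC XOR dst MAC) % n_slaves` hash, not the driver itself):

```python
# Sketch of the bonding driver's layer2 xmit hash: the slave index is
# derived from (src MAC XOR dst MAC) % n_slaves, using the low byte.
# Every frame to the same peer hashes to the same slave, so a single
# stream (or, for alb, a single peer host) never exceeds one slave's
# line rate, while different peers can spread across slaves.

def layer2_hash(src_mac: str, dst_mac: str, n_slaves: int) -> int:
    """Pick a slave index the way the layer2 xor policy does."""
    src = int(src_mac.split(":")[-1], 16)
    dst = int(dst_mac.split(":")[-1], 16)
    return (src ^ dst) % n_slaves

bond_mac = "00:04:23:d8:30:4a"                       # bond0's MAC from this bug
peers = ["00:11:22:33:44:01", "00:11:22:33:44:02"]   # hypothetical peer MACs

for peer in peers:
    print(peer, "-> slave", layer2_hash(bond_mac, peer, 2))
```

Two different peers can land on two different slaves (giving ~2 Gbps aggregate), but re-hashing the same peer always yields the same slave.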
My modprobe.conf on host0:

```
alias bond0 bonding
alias bond1 bonding
options bond0 miimon=50 mode=balance-alb max_bonds=2
options bond1 miimon=50 mode=balance-alb
```

and on host1:

```
install bond0 modprobe e1000; modprobe bonding --ignore-install -o bond0 \
    mode=balance-alb miimon=50 primary=eth2
install bond1 modprobe e1000; modprobe bonding --ignore-install -o bond1 \
    mode=balance-alb miimon=50
```

They are a little bit different because I did some testing there (assigning different physical interfaces to bonds, parameters for e1000, etc.). On 2.6.9-42.0.3.EL both behaved the same. Yesterday I upgraded to 2.6.9-42.0.8.ELsmp and host1 still behaves the same (config above), while host0 now behaves correctly: with 2 outgoing connections I get over 1.8 Gbps. host1 is now using the two onboard e1000s for bond0 (which I test), while host0 uses one onboard port and one port of a dual-port card in a PCI-X slot (eth0+eth2). But I do not think that this is the problem, because I am (was) able to send at full speed with any two cards out of my four. I do not know when I'll be able to switch the config on host1 to see whether a different modprobe.conf will help (the machines are used by other people too). Is the configuration of host1 wrong? Actually this is the only way to have two different bonding algorithms on bond0 and bond1, AFAIK.

Can you try to use netcat (nc) with UDP traffic? One problem with TCP is that you often don't know whether the limitation is on rx or tx, since TCP will make the traffic back off when the maximum throughput can't be reached. Glancing at your config it looks fine, though you should probably remove 'modprobe e1000;' from the 'install' lines on host1. Could you also send the output of /proc/net/bonding/bond0 and /proc/net/bonding/bond1 on these systems as well?

I am now testing with both UDP and TCP, same results. I upgraded the BIOS on both PE2850s to be the same (A06), no change.
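Andy's UDP test can also be done with a short script instead of nc. A loopback sketch (addresses and sizes are made up for illustration; a real test would run the receiver on host1/host2 and the sender on host0, and divide bytes received by elapsed time):

```python
# Minimal UDP send/receive sketch standing in for `nc -u`. Everything
# here runs on loopback purely for illustration; on the real hosts the
# two sockets would live on different machines.
import socket

DATAGRAMS, SIZE = 64, 1024   # small enough to fit the default socket buffer

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))   # port 0 = pick a free ephemeral port
recv_sock.settimeout(5)            # don't hang forever if a datagram drops
addr = recv_sock.getsockname()

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = b"x" * SIZE
for _ in range(DATAGRAMS):
    send_sock.sendto(payload, addr)

received = sum(len(recv_sock.recv(65535)) for _ in range(DATAGRAMS))
print(f"received {received} bytes")
```

Unlike TCP, UDP keeps sending at the offered rate, so a receive count lower than the send count points at an rx-side bottleneck rather than tx back-off.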
I replugged the cables on host1 (not working) to have the same assignments to bonds as on host0 (working), no change (they were different because of my previous testing of this issue). I changed /etc/modprobe.conf to be the same as on host0, loading only one "bonding" with max_bonds=2, no change. I am now confused, because I have two very similar configurations and one is working as I expect while the other one is not.

Here is the /proc output on (working) host0:

```
[root@paris ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v2.6.3 (June 8, 2005)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:04:23:d8:30:4a

Slave Interface: eth2
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:13:72:54:99:81

[root@paris ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v2.6.3 (June 8, 2005)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:04:23:d8:30:4b

Slave Interface: eth3
MII Status: down
Link Failure Count: 0
Permanent HW addr: 00:13:72:54:99:82
```

And on host1 (not working):

```
[root@sofia ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v2.6.3 (June 8, 2005)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:04:23:d8:2c:3a

Slave Interface: eth2
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:11:43:d4:94:a2

[root@sofia ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v2.6.3 (June 8, 2005)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:04:23:d8:2c:3b

Slave Interface: eth3
MII Status: down
Link Failure Count: 0
Permanent HW addr: 00:11:43:d4:94:a3
```

I'm testing on bond0, of course. The modprobe.conf on both nodes contains:

```
alias eth0 e1000
alias eth1 e1000
alias eth2 e1000
alias eth3 e1000
options e1000 FlowControl=1
alias bond0 bonding
alias bond1 bonding
options bond0 miimon=50 mode=balance-alb max_bonds=2
options bond1 miimon=50 mode=balance-alb
```

The kernel on both nodes is 2.6.9-42.0.8.ELsmp. eth0 and eth1 are on the PCI-X dual-port network card; eth2 and eth3 are onboard. Neither of these two hosts is overloaded when I run the tests. sysctl.conf is the same on both nodes.

It seems the problem is not Red Hat related, but some limitation on the Cisco Catalyst 4506. If I plug the cables into Catalyst ports that are not close together (different blocks of 8 ports, as labeled on the Cisco board), I can get 2x 1 Gbps on both machines. Please close this as not-a-bug (at least for Red Hat :)

Thanks for the update, David. I'll close this one out, but I'll remember that switches can cause problems sometimes too! :)
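When comparing two hosts like this, the relevant fields of /proc/net/bonding/bondN can be pulled out programmatically rather than eyeballed. A small sketch that parses the format shown in this bug (fed from a literal string here, since /proc/net/bonding/* only exists on a host with the bonding driver loaded; the helper name is made up):

```python
# Extract the bonding mode and per-slave MII status from the text
# format of /proc/net/bonding/bondN, as printed by driver v2.6.3.

def parse_bond(text: str) -> dict:
    """Return {"mode": <str>, "slaves": {iface: mii_status}}."""
    info = {"mode": None, "slaves": {}}
    current = None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            info["slaves"][current] = value   # status belongs to this slave
            current = None                    # bond-level MII Status is skipped
    return info

sample = """\
Bonding Mode: adaptive load balancing
MII Status: up
Slave Interface: eth1
MII Status: up
Slave Interface: eth3
MII Status: down
"""
print(parse_bond(sample))
```

Running it on bond1 from either host above would flag eth3 as down, which the side-by-side diff of the raw files also shows.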