Bug 971893
Summary: | bonding balance-tlb or balance-alb mode sending tons of null LLC packets.
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 18
Hardware: | x86_64
OS: | Linux
Status: | CLOSED ERRATA
Severity: | high
Priority: | unspecified
Reporter: | Nitin Sharma <nitinics>
Assignee: | Neil Horman <nhorman>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | dcbw, gansalmon, itamar, jbyers, jklimes, jonathan, kernel-maint, madhu.chinakonda, nhorman, nitinics
Fixed In Version: | kernel-3.10.13-101.fc18
Doc Type: | Bug Fix
Type: | Bug
Last Closed: | 2013-10-01 01:57:14 UTC
Description
Nitin Sharma
2013-06-07 13:59:52 UTC
Looks like these are LLC packets, but they are all NULL LLC packets. I am trying to understand why these would be sent in balance-alb or balance-tlb mode, as the bonding ALB source code makes no reference to LLC. Is this expected?

Thanks
Nitin

Please attach the binary tcpdump output, and the output of the command 'ip addr show'.

Created attachment 782842 [details]
2nd slave Pcap
Created attachment 782843 [details]
1st slave Pcap
[root@localhost ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p255p1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:25:90:c0:bb:d5 brd ff:ff:ff:ff:ff:ff
3: p255p2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:25:90:c0:bb:d4 brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:25:90:c0:bb:d4 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.17/26 brd 10.0.0.63 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 fe80::225:90ff:fec0:bbd4/64 scope link
       valid_lft forever preferred_lft forever

[root@localhost ~]# tcpdump -s 0 -i p255p1 -w firstslave.pcap
tcpdump: WARNING: p255p1: no IPv4 address assigned
tcpdump: listening on p255p1, link-type EN10MB (Ethernet), capture size 65535 bytes
^C36 packets captured
37 packets received by filter
0 packets dropped by kernel

[root@localhost ~]# tcpdump -s 0 -i p255p2 -w secondslave.pcap
tcpdump: WARNING: p255p2: no IPv4 address assigned
tcpdump: listening on p255p2, link-type EN10MB (Ethernet), capture size 65535 bytes
^C38 packets captured
38 packets received by filter
0 packets dropped by kernel

What are these ports connected to, and in what bonding mode? The frames also have the same source and destination MAC addresses, which suggests they are receiving their own frames. Does this happen if you use each interface individually, i.e. without bonding them?

(In reply to Neil Horman from comment #6)
> What are these ports connected to, and in what bonding mode? The frames
> also have the same source and destination MAC addresses, which suggests they
> are receiving their own frames. Does this happen if you use each interface
> individually, i.e. without bonding them?

The ports are connected to a switch. The bond is in balance-alb mode. The frames have the same src and dst MAC address, and these packets don't show up when the interfaces are not bonded. It also does not happen in active-backup bonding mode.

These could be for the switch to learn the src MAC of the interface (which could change as per the ALB implementation); however, the frequency with which they are sent is not tunable. Ideally, in our implementation we don't need to learn the src MAC address from these NULL packets; rather, we use a different host learning approach. So it would be great if the frequency at which these events occur could be made tunable in the code.

Thanks
Nitin

You may be right, these do look like learning packets, in that it appears that their length of 96 (hex 0x60) should be their ethertype (which is ETH_P_LOOP), but somehow it is getting interpreted as an 802.3 Ethernet frame with a length of 96. I'll see if I can recreate this and fix it up.

Wait a second, is there anything else going on here? i.e. are you actually having any other problems with bonding in alb mode? I ask because the more I look at it, the more this code clearly needs cleaning up and consolidation, but there doesn't appear to be anything wrong with this frame. I'm starting to think that it's just Wireshark that can't read the frame properly. I'll clean up the code, but unless you're looking at something else being a problem, I think this is notabug.

Correct. It is not a bug, rather a feature request: to be able to tune the frequency below dynamically via /sys/class/net, i.e.

#define BOND_ALB_LP_INTERVAL 1 /* In seconds, periodic send of
                                * learning packets to the switch
                                */

The issue I was facing was specifically with an OpenFlow switch implementation. This packet is sent very often to the controller for learning (as it is supposed to be), taking up much of the OF controller traffic.
Ideally, this mechanism seems to be needed to speed up learning on the switch only in case of failover events. However, the packets are sent periodically as per the implementation.

Ah! I'm sorry, I wish you had said that earlier. OK, yeah, I can look into doing that. I imagine we can just make it a module parameter, and you can adjust it on the fly via sysfs then.

Created attachment 787298 [details]
patch to make learning packet interval configurable
Here you go, this exports the ALB learning interval as a module parameter, modifiable in sysfs. Please give it a try and let me know if it suits your needs.
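The misreading discussed earlier in this thread (a type/length field of 0x0060 displayed as an 802.3 length of 96) follows from how that field is conventionally split: values up to 1500 are read as an IEEE 802.3 payload length, and values of 0x0600 (1536) and above as an EtherType. The Python sketch below illustrates that rule. The function name and the sample frame are hypothetical, and the zero-filled payload is an assumption made for illustration, not bytes taken from the attached captures.

```python
import struct

ETH_P_LOOP = 0x0060      # value named in the thread (96 decimal)
MAX_8023_LENGTH = 1500   # at or below this, the field is an 802.3 length
MIN_ETHERTYPE = 0x0600   # at or above this, the field is an EtherType


def classify_type_length(frame: bytes) -> str:
    """Classify the 2-byte type/length field of an untagged Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 12-13 follow the 6-byte destination and 6-byte source MAC.
    (field,) = struct.unpack("!H", frame[12:14])
    if field >= MIN_ETHERTYPE:
        return f"EtherType 0x{field:04x}"
    if field <= MAX_8023_LENGTH:
        return f"802.3 length {field}"
    return f"undefined value {field}"


# Hypothetical learning frame: same source and destination MAC (as observed
# in the thread), type/length set to ETH_P_LOOP, assumed zero-filled payload.
mac = bytes.fromhex("002590c0bbd4")
sample = mac + mac + struct.pack("!H", ETH_P_LOOP) + bytes(46)

print(classify_type_length(sample))  # prints "802.3 length 96"
```

A dissector applying this rule would go on to read the following bytes as an LLC header. If those bytes are zero, as assumed here, both service access point fields are 0x00, which would plausibly account for the NULL LLC display.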
[root@localhost kernel-bond]# modinfo bonding | grep alb
parm: mode:Mode of operation; 0 for balance-rr, 1 for active-backup, 2 for balance-xor, 3 for broadcast, 4 for 802.3ad, 5 for balance-tlb, 6 for balance-alb (charp)
parm: alb_lp_interval:The interval on which learning packets are sent in ALB mode (int)

I did apply the patch, however the frequency is still 3 lp bursts per second. The configuration was added as BONDING_OPTS="miimon=50 mode=balance-alb alb_lp_interval=60" in ifcfg-bond0. Is there any other way to validate?

You can use /sys/module/bonding/parameters/alb_lp_interval to ensure that your settings got picked up properly. It sounds like they may not have.

Sorry, my bonding module was compiled into the kernel, so I couldn't change it at runtime and had to set it from the kernel command line instead. It works as expected. Thanks for your help.

Understood, thanks. I'm not sure this will get accepted upstream, but I'll propose it and see how it flies.

Created attachment 795648 [details]
[PATCH] bonding: Make alb learning packet interval configurable
Running bonding in ALB mode requires that learning packets be sent periodically,
so that the switch knows where to send responding traffic. However, depending
on switch configuration, there may not be any need to send traffic at the
default rate of 3 packets per second, which represents little more than wasted
data. Allow the ALB learning packet interval to be made configurable via sysfs.
Signed-off-by: Neil Horman <nhorman>
---
drivers/net/bonding/bond_alb.c | 2 +-
drivers/net/bonding/bond_alb.h | 8 ++++----
drivers/net/bonding/bond_main.c | 1 +
drivers/net/bonding/bond_sysfs.c | 39 +++++++++++++++++++++++++++++++++++++++
drivers/net/bonding/bonding.h | 1 +
5 files changed, 46 insertions(+), 5 deletions(-)
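As a rough illustration of what a configurable interval changes, the Python sketch below models a periodic monitor that counts ticks and emits a learning packet once the count reaches the configured interval. It is a simplified userspace model, not the driver code from the patch: the class name, the `send` callback, and the rate of 10 ticks per second are all assumptions made for the example.

```python
from typing import Callable

TICKS_PER_SEC = 10  # assumed monitor rate, chosen only for this illustration


class LearningPacketTimer:
    """Toy model of a periodic sender with a configurable interval in seconds."""

    def __init__(self, lp_interval: int, send: Callable[[], None]) -> None:
        if lp_interval < 1:
            raise ValueError("lp_interval must be at least 1 second")
        self.lp_interval = lp_interval
        self.send = send
        self.counter = 0

    def tick(self) -> None:
        """Advance by one monitor tick; send when the interval has elapsed."""
        self.counter += 1
        if self.counter >= self.lp_interval * TICKS_PER_SEC:
            self.send()
            self.counter = 0


def count_sends(lp_interval: int, seconds: int) -> int:
    """Run the model for `seconds` and return how many sends occurred."""
    sent = []
    timer = LearningPacketTimer(lp_interval, lambda: sent.append(1))
    for _ in range(seconds * TICKS_PER_SEC):
        timer.tick()
    return len(sent)


print(count_sends(lp_interval=1, seconds=60))   # prints 60
print(count_sends(lp_interval=60, seconds=60))  # prints 1
```

In this model, raising the interval from 1 to 60 seconds reduces the number of sends over one minute from 60 to 1, which is the kind of reduction the reporter was after for the OpenFlow controller.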
I'm sorry, could I ask you to test out this alternate patch I've written? I like it better than my first pass as it allows per-device configuration via sysfs. Thanks!

Not a problem. I applied it and validated.

echo 60 > /sys/class/net/bond0/bonding/lp_interval

Thanks

Thanks, posted for review:
http://marc.info/?l=linux-netdev&m=137882251119752&w=2

Fixed in the next F18 kernel release.

(In reply to Neil Horman from comment #21)
> Fixed in the next F18 kernel release.

This should be applicable to F19 and F20 as well, right? (3.11.y based)

It is, but I figured that F19 is still planning updates to 3.12, and they'd get this fix automatically, won't they?

(In reply to Neil Horman from comment #23)
> It is, but I figured that F19 is still planning updates to 3.12, and
> they'd get this fix automatically, won't they?

Yes, but not for quite a while. 3.12 is only at -rc1 now. I'll cherry-pick your fix. It's easy enough to drop the patch when we do wind up rebasing, and there's no reason not to carry it.

kernel-3.11.2-201.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.11.2-201.fc19

kernel-3.10.13-101.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.10.13-101.fc18

Package kernel-3.11.2-201.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.11.2-201.fc19'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-17865/kernel-3.11.2-201.fc19
then log in and leave karma (feedback).

kernel-3.11.2-301.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.11.2-301.fc20

kernel-3.11.2-201.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.

kernel-3.11.2-301.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.

kernel-3.10.13-101.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.
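The per-device setting validated earlier in this thread (`echo 60 > /sys/class/net/bond0/bonding/lp_interval`) can also be scripted. The Python sketch below reads and writes that file. The helper names and the `sysfs_root` parameter are hypothetical conveniences (the latter exists so the code can be exercised against a scratch directory), the lower bound of 1 second is this sketch's own check, and writing to the real file is assumed to require root and a kernel carrying the fix.

```python
from pathlib import Path


def lp_interval_path(bond: str, sysfs_root: str = "/sys") -> Path:
    """Build the path of a bond's lp_interval attribute (layout from the thread)."""
    return Path(sysfs_root) / "class" / "net" / bond / "bonding" / "lp_interval"


def get_lp_interval(bond: str, sysfs_root: str = "/sys") -> int:
    """Return the learning packet interval, in seconds, for the given bond."""
    return int(lp_interval_path(bond, sysfs_root).read_text().strip())


def set_lp_interval(bond: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Write a new interval, the equivalent of `echo N > .../lp_interval`."""
    if seconds < 1:  # sanity check chosen for this sketch
        raise ValueError("interval must be a positive number of seconds")
    lp_interval_path(bond, sysfs_root).write_text(f"{seconds}\n")


if __name__ == "__main__":
    import tempfile

    # Exercise the helpers against a scratch tree instead of the live /sys.
    with tempfile.TemporaryDirectory() as root:
        lp_interval_path("bond0", root).parent.mkdir(parents=True)
        set_lp_interval("bond0", 60, root)
        print(get_lp_interval("bond0", root))  # prints 60
```

On a live system the default `sysfs_root` would be used, for example `set_lp_interval("bond0", 60)` followed by `get_lp_interval("bond0")` to confirm the value was accepted.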