Bug 684488

Summary: Bridging not Working as Expected with 8021q and bonding
Product: [Fedora] Fedora
Component: kernel
Version: 14
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: unspecified
Reporter: Jonathan Steffan <jonathansteffan>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: drjohnson1, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, nhorman
Doc Type: Bug Fix
Last Closed: 2012-08-16 13:47:22 UTC

Description Jonathan Steffan 2011-03-13 00:21:16 UTC
Description of problem:
When the bonding driver is used under an 802.1q interface inside a bridge, with more than one interface enslaved, tap devices that are members of the bridge don't get DHCP replies. With only one interface enslaved, everything works as expected. I've tried modes 1, 5, and 6, all with the same results. We are going across switches, so we need to use one of the aforementioned modes. I have also tried 8021q and bonding independently, and they both work.

Version-Release number of selected component (if applicable):
2.6.35.11-83.fc14.x86_64

How reproducible:
Always.

Steps to Reproduce:
This assumes the system has been booted to runlevel 1, so all network configuration has to be done by hand. It also assumes that 'alias bond0 bonding' has been configured.

1. Set up a bond and enslave two devices:
rmmod bonding # make sure we un-configure the module
modprobe bonding mode=1 # active-backup
ifconfig eth0 up
ifconfig eth1 up
ifconfig bond0 up
ifenslave bond0 eth0
ifenslave bond0 eth1

2. Set up a VLAN device on the bond:
modprobe 8021q
vconfig add bond0 vlanid
ifconfig bond0.vlanid up

3. Set up a bridge and add bond0.vlanid to the bridge:
brctl addbr br0
brctl addif br0 bond0.vlanid

4. Configure STP/forwarding delay:
brctl setfd br0 0
brctl stp br0 off

5. Set up IP connectivity. Use real information for your network:
ifconfig br0 1.2.3.4 netmask 255.255.255.0
route add default gw 1.2.3.1

6. Boot a KVM guest attached to the br0 bridge (or otherwise add a tap interface to the bridge).
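
For convenience, the bond, VLAN, and bridge steps above can be collected into one shell function. This is only a sketch: the names `run`, `setup_bond_vlan_bridge`, and `DRY_RUN` are made up for illustration and are not part of any tool, and the function prints the commands instead of running them unless `DRY_RUN=0` is set. IP configuration is left out because it depends on the local network.

```shell
# Hypothetical helper: print the command when DRY_RUN is unset or 1,
# otherwise execute it (executing requires root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the bond, VLAN, and bridge steps.
# $1: VLAN ID to use in place of the "vlanid" placeholder.
setup_bond_vlan_bridge() {
    vlan_id=$1
    if [ -z "$vlan_id" ]; then
        echo "usage: setup_bond_vlan_bridge VLAN_ID" >&2
        return 1
    fi

    # Bond with two slaves, mode 1 (active-backup)
    run rmmod bonding
    run modprobe bonding mode=1
    run ifconfig eth0 up
    run ifconfig eth1 up
    run ifconfig bond0 up
    run ifenslave bond0 eth0
    run ifenslave bond0 eth1

    # VLAN device on top of the bond
    run modprobe 8021q
    run vconfig add bond0 "$vlan_id"
    run ifconfig "bond0.$vlan_id" up

    # Bridge containing the VLAN device, STP off, no forwarding delay
    run brctl addbr br0
    run brctl addif br0 "bond0.$vlan_id"
    run brctl setfd br0 0
    run brctl stp br0 off
}
```

Calling `setup_bond_vlan_bridge 100` (100 is an arbitrary example VLAN ID) lists the commands; `DRY_RUN=0 setup_bond_vlan_bridge 100` would run them.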

  
Actual results:
On Host: tcpdump -i br0 port bootps or port bootpc
17:07:05.051350 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:05.051926 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:06.005151 IP dhcp-proxy-rtr1.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:07:06.014174 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:06.014399 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:06.015180 IP dhcp-proxy-rtr1.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:07:06.015730 IP dhcp-proxy-rtr2.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:07:07.991558 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:07.991787 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:07.992772 IP dhcp-proxy-rtr1.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:07:07.992805 IP dhcp-proxy-rtr2.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:07:11.946144 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:11.946384 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:07:11.947431 IP dhcp-proxy-rtr1.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:07:11.947546 IP dhcp-proxy-rtr2.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305

On the host, right after the guest is started (vnet0 does not exist before then): tcpdump -i vnet0 port bootps or port bootpc
17:10:08.324099 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:10:08.324714 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:10:09.285528 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:10:09.285799 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:10:11.262897 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:10:11.263154 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:10:15.217493 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:10:15.217756 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
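
To compare captures like these quickly, the requests and replies can be counted with a small filter. This is a sketch only: `count_dhcp` is a made-up name, and it assumes the tcpdump text format shown in this report, where each line contains either "BOOTP/DHCP, Request" or "BOOTP/DHCP, Reply".

```shell
# Hypothetical helper: read tcpdump text output on stdin and print
# how many DHCP requests and replies it contains.
count_dhcp() {
    awk '
        /BOOTP\/DHCP, Request/ { req++ }
        /BOOTP\/DHCP, Reply/   { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

For example, `tcpdump -l -i vnet0 port bootps or port bootpc | count_dhcp` prints the totals once the capture is stopped. A capture on vnet0 that shows requests but zero replies matches the failure described here.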


Expected results:
These expected results were generated by removing all but one interface from the bond0 device.
ifenslave -d bond0 eth1
 - or -
ifenslave -d bond0 eth0

On the host, right after the guest is started (vnet0 does not exist before then): tcpdump -i vnet0 port bootps or port bootpc
17:12:48.974243 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:12:49.006019 IP dhcp-proxy-rtr1.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:12:49.936032 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 387
17:12:49.947969 IP dhcp-proxy-rtr2.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:12:49.948216 IP dhcp-proxy-rtr1.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:12:51.913515 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:ed:3d:0d (oui Unknown), length 399
17:12:51.917175 IP dhcp-proxy-rtr1.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305
17:12:51.921076 IP dhcp-proxy-rtr2.example.org.bootps > 10.31.0.204.bootpc: BOOTP/DHCP, Reply, length 305


Additional info:
Forgive the terrible ASCII representation, but this is what has been tested and what does and does not work:

Not Working:
[eth0,eth1] <=> [bond0] <=> [bond0.vlantag] <=> [br0] <=> [vnet0]

Working:
[eth0] <=> [bond0] <=> [bond0.vlantag] <=> [br0] <=> [vnet0]
[eth1] <=> [bond0] <=> [bond0.vlantag] <=> [br0] <=> [vnet0]
[eth0,eth1] <=> [bond0] <=> [br0] <=> [vnet0]
[eth0] <=> [bond0] <=> [br0] <=> [vnet0]
[eth1] <=> [bond0] <=> [br0] <=> [vnet0]
[eth0] <=> [eth0.vlantag] <=> [br0] <=> [vnet0]
[eth1] <=> [eth1.vlantag] <=> [br0] <=> [vnet0]
[eth0] <=> [br0] <=> [vnet0]
[eth1] <=> [br0] <=> [vnet0]

Comment 1 Chuck Ebbert 2011-03-13 01:27:47 UTC
Is the switch configured for an 802.1q connection on those two ports?

Comment 2 Jonathan Steffan 2011-03-13 01:31:06 UTC
(In reply to comment #1)
> Is the switch configured for an 802.1q connection on those two ports?

Yes. The following both work:
[eth0] <=> [eth0.vlantag] <=> [br0] <=> [vnet0]
[eth1] <=> [eth1.vlantag] <=> [br0] <=> [vnet0]

Comment 3 Chuck Ebbert 2011-03-25 04:27:05 UTC
I'm pretty sure you do not want to do ifup on eth0 and eth1, and also you should do ifdown on bond0 before bringing up the vlan interfaces on it.

Comment 4 Jonathan Steffan 2011-06-14 18:02:31 UTC
ifconfig dev up does not do anything but activate the interface. It does not run the ifcfg scripts. The reason I'm doing an ifconfig dev up is because an interface needs to be up to join to a bond, iirc. Seeing as how I'm not using any ifcfg scripts to configure these interfaces (for the purpose of this bug report) I had to bring the interfaces up. Any updates here? This is still an issue.
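
One way to confirm which interfaces actually joined the bond, and their link state, is to read the bonding driver's status file. The sketch below assumes the usual `/proc/net/bonding/bond0` layout, with a "Slave Interface:" line followed by an "MII Status:" line for each slave; `bond_slaves` is a made-up helper name.

```shell
# Hypothetical helper: print "<slave> <MII status>" for each slave
# listed in a bonding status file.
# $1: path to the status file (defaults to /proc/net/bonding/bond0)
bond_slaves() {
    awk -F': ' '
        /^Slave Interface:/ { name = $2 }
        /^MII Status:/ && name != "" { print name, $2; name = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Running `bond_slaves` with no argument after the ifenslave calls should list both eth0 and eth1 if both were enslaved.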

Comment 5 Neil Horman 2011-06-14 18:12:38 UTC
Check the bridge forwarding tables. I expect you'll find that the MAC addresses for your tun/tap devices are associated with the incorrect interface. If so, this is an artifact of the bonding driver getting frames looped back to itself from the peer switch. To fix it, you'll need to use a bonding mode that prevents such behavior (802.3ad is the best method).
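
The forwarding table check suggested above can be done with `brctl showmacs br0`. As a sketch, the filter below pulls out the bridge port number on which a given MAC address was learned. `port_for_mac` is a made-up name, and it assumes the usual `brctl showmacs` column order (port number, MAC address, is-local flag, ageing timer) with one header line.

```shell
# Hypothetical helper: read `brctl showmacs <bridge>` output on stdin
# and print the port number(s) where the given MAC was learned.
# $1: MAC address to look for (case-insensitive)
port_for_mac() {
    awk -v mac="$1" '
        NR > 1 && tolower($2) == tolower(mac) { print $1 }
    '
}
```

For the guest in this report, `brctl showmacs br0 | port_for_mac 52:54:00:ed:3d:0d` would show the port. If that port belongs to bond0.vlanid rather than vnet0, the guest's MAC has been learned on the wrong side of the bridge, which is the situation described above. The port numbers should be matchable to interface names through `brctl showstp br0`.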

Comment 6 Fedora End Of Life 2012-08-16 13:47:26 UTC
This message is a notice that Fedora 14 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 14. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained.  At this time, all open bugs with a Fedora 'version'
of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advance warning of this 
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen 
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we were unable to fix it before Fedora 14 reached end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" (top right of this page) and open it against that 
version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping