Bug 1653041

Summary: guest connected to host bridge that is directly connected to a host ethernet (and from there to the physical network) can not get ip when host firewalld is active
Product: [Fedora] Fedora
Reporter: Laine Stump <laine>
Component: iptables
Assignee: Phil Sutter <psutter>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 29
CC: danken, egarver, jpopelka, laine, psutter, qe-baseos-daemons, todoleza, twoerner, yalzhang
Hardware: Unspecified
OS: Unspecified
Clone Of: 1650382
Last Closed: 2019-05-29 08:53:45 UTC
Type: Bug

Description Laine Stump 2018-11-25 02:24:36 UTC
(NB: as noted further down in the description, the presence of an nwfilter rule is actually irrelevant; the full text is included for context.)

+++ This bug was initially created as a clone of Bug #1650382 +++

Description of problem:
A guest with a bridge-type interface connected to a host shared bridge and with 'vdsm-no-mac-spoofing' configured cannot get an IP address when firewalld is running on the host.
Even after the network filter is removed, the guest still cannot get an IP address.

Version-Release number of selected component (if applicable):
# rpm -q libvirt-client firewalld
libvirt-client-4.5.0-14.module+el8+2210+474b8474.x86_64
firewalld-0.6.3-3.el8.noarch
# uname -a
Linux xxx 4.18.0-39.el8.x86_64 #1 SMP Wed Nov 14 10:44:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
100%

Steps to Reproduce:
1. Create a host shared bridge, then start a guest with bridge type interface:
# bash -x <<EOS
> nmcli con del em1
> nmcli con del 'Wired connection 1'
> nmcli con add type bridge ifname br0 con-name br0  autoconnect yes stp off
> nmcli con add type bridge-slave ifname em1 con-name em1 autoconnect yes master br0
> systemctl restart NetworkManager
> EOS

# systemctl restart firewalld
# systemctl restart libvirtd
# firewall-cmd --get-active-zones
libvirt
  interfaces: virbr0 virbr1 virbr2 virbr3 virbr4 virbr5 virbr6 virbr7 virbr8 virbr9
public
  interfaces: br0 em1

# virsh dumpxml rhel | grep /interface -B5
    <interface type='bridge'>
      <mac address='52:54:00:49:80:ae'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
# virsh start rhel
Domain rhel started

Log in to the guest and check that it gets an IP address and that the network works:
# ip addr
1: lo: ...
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:49:80:ae brd ff:ff:ff:ff:ff:ff
    inet 1*.7*.74.95/22 brd 10.73.75.255 scope global noprefixroute dynamic eth0
       valid_lft 43191sec preferred_lft 43191sec
    inet6 2620:52:0:4948:5054:ff:fe49:80ae/64 scope global noprefixroute dynamic 
       valid_lft 2591993sec preferred_lft 604793sec
    inet6 fe80::5054:ff:fe49:80ae/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

2. Add the nwfilter to the interface and start the guest again:
# virsh nwfilter-dumpxml vdsm-no-mac-spoofing
<filter name='vdsm-no-mac-spoofing' chain='root'>
  <uuid>06679ce1-a3be-43e4-bdaf-b06169780a35</uuid>
  <filterref filter='no-mac-spoofing'/>
  <filterref filter='no-arp-mac-spoofing'/>
</filter>

# virsh dumpxml rhel | grep /interface -B6
    <interface type='bridge'>
      <mac address='52:54:00:49:80:ae'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>

# virsh start rhel --console
....

3. Log in to the guest. It did not get an IP address during boot, and requesting one with dhclient also fails:
[root@localhost ~]# ip addr
1: lo: ...
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:49:80:ae brd ff:ff:ff:ff:ff:ff
    inet6 2620:52:0:4948:5054:ff:fe49:80ae/64 scope global noprefixroute dynamic 
       valid_lft 2591991sec preferred_lft 604791sec
    inet6 fe80::5054:ff:fe49:80ae/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

[root@localhost ~]# dhclient -d eth0
Internet Systems Consortium DHCP Client 4.2.5
Copyright 2004-2013 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eth0/52:54:00:49:80:ae
Sending on   LPF/eth0/52:54:00:49:80:ae
Sending on   Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8 (xid=0x2521d1f8)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 9 (xid=0x2521d1f8)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 9 (xid=0x2521d1f8)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 14 (xid=0x2521d1f8)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 18 (xid=0x2521d1f8)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3 (xid=0x2521d1f8)
No DHCPOFFERS received.
No working leases in persistent database - sleeping.

4. Destroy the guest and delete the nwfilter setting from the interface:
# virsh dumpxml rhel | grep /interface -B5
    <interface type='bridge'>
      <mac address='52:54:00:49:80:ae'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
# virsh start rhel
Domain rhel started

5. Log in to the guest. It still cannot get an IP address, even though there is *no* nwfilter setting after step 4:
[root@localhost ~]# dhclient -d eth0
Internet Systems Consortium DHCP Client 4.2.5
Copyright 2004-2013 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eth0/52:54:00:49:80:ae
Sending on   LPF/eth0/52:54:00:49:80:ae
Sending on   Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6 (xid=0xdc15aa6)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 14 (xid=0xdc15aa6)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7 (xid=0xdc15aa6)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 9 (xid=0xdc15aa6)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 9 (xid=0xdc15aa6)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7 (xid=0xdc15aa6)
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 9 (xid=0xdc15aa6)
No DHCPOFFERS received.
No working leases in persistent database - sleeping.

6. On the host, stop firewalld, then check that the guest can get an IP address:
# systemctl stop firewalld

Log in to the guest and run dhclient; the guest can get an IP address now:
# dhclient -d eth0
Internet Systems Consortium DHCP Client 4.2.5
Copyright 2004-2013 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eth0/52:54:00:49:80:ae
Sending on   LPF/eth0/52:54:00:49:80:ae
Sending on   Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 5 (xid=0x4bcd7ec9)
DHCPREQUEST on eth0 to 255.255.255.255 port 67 (xid=0x4bcd7ec9)
DHCPOFFER from 1*.7*.75.254
DHCPACK from 1*.7*.75.254 (xid=0x4bcd7ec9)
bound to 1*.7*.74.95 -- renewal in 18231 seconds.


Actual results:
A guest with a bridge-type interface connected to a host shared bridge and with 'vdsm-no-mac-spoofing' configured cannot get an IP address when firewalld is running on the host.
Even after the network filter is removed, the guest still cannot get an IP address.

Expected results:
The guest can get an IP address.

Additional info:
The 'vdsm-no-mac-spoofing' filter is created by vdsm and is used by the upper layer, RHV. The host is a Beaker system; the guest shares the host's DHCP server and should get an IP address in the same subnet as the host.

--- Additional comment from Eric Garver on 2018-11-16 11:32:56 EST ---

Laine, isn't this expected? libvirt is attaching the guest to br0 as per user config, which will get assigned to the default zone in firewalld. The default zone has a --set-target of "default" and as such the traffic gets rejected. There are no rules added to firewalld to allow the traffic.

If the user is creating their own bridge (br0), then they also need to create their own firewall rules to allow the VM traffic.

yalzhang@, does it work if you do the following?

   # firewall-cmd --set-target=accept
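
For anyone trying the suggestion above: as far as I know, firewall-cmd accepts --set-target only together with --permanent, and the change takes effect after a reload. The sketch below spells that out. It assumes the bridge sits in the "public" zone, as in the --get-active-zones output in the description; the helper name set_zone_target and the APPLY switch are made up for illustration. By default it only prints the commands so they can be reviewed first. Note that a later comment in this thread warns that an accept target leaves the host open to the physical network, so treat this as a diagnostic, not a fix.

```shell
#!/bin/sh
# Sketch: print (or, with APPLY=1, run) the commands that set a zone's target.
# set_zone_target and APPLY are hypothetical names, not part of firewalld.
set_zone_target() {
    zone=$1
    target=$2
    for cmd in \
        "firewall-cmd --permanent --zone=$zone --set-target=$target" \
        "firewall-cmd --reload"
    do
        if [ "${APPLY:-0}" = 1 ]; then
            $cmd || return 1      # run the command, stop on first failure
        else
            echo "$cmd"           # dry run: only show what would be run
        fi
    done
}

# Assumption: "public" is the zone that contains br0.
set_zone_target public ACCEPT
```

Run as root with APPLY=1 to apply; `set_zone_target public default` would restore the default target afterwards.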

--- Additional comment from Laine Stump on 2018-11-16 18:24:03 EST ---

In the past, as long as br_netfilter wasn't loaded, iptables hooks would never be encountered by packets being bridged to a guest. It's puzzling to me that putting br0 in a different zone has any effect at all for traffic that isn't being received for local IP stack processing by the host.

Now that my RHEL8 machine is running again (after a power-failure-induced disk trashing, and a yum update that killed X11) I'm hoping to try this myself, as it just doesn't make sense.

--- Additional comment from Laine Stump on 2018-11-19 10:29:35 EST ---

Okay, in my testing, the presence/absence of an nwfilter rule has *no effect* on the problem - whether or not nwfilter is involved, traffic going through a host bridge has the rules for the bridge's zone applied to it.
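
To check which zone's rules would apply to a given bridge, the zone can be read out of the `firewall-cmd --get-active-zones` output. The helper below is a minimal sketch that parses text in the format shown in the description; zone_of_iface is a hypothetical name, and the sample input is abbreviated from that output.

```shell
#!/bin/sh
# Print the zone whose "interfaces:" line lists the given interface.
# Reads `firewall-cmd --get-active-zones`-style text on stdin.
zone_of_iface() {
    awk -v ifc="$1" '
        /^[^ \t]/ { zone = $1; next }        # unindented line: a zone name
        $1 == "interfaces:" {                # indented line: that zone'"'"'s interfaces
            for (i = 2; i <= NF; i++)
                if ($i == ifc) { print zone; exit }
        }
    '
}

# Sample input, abbreviated from the output in the description.
zone_of_iface br0 <<'EOF'
libvirt
  interfaces: virbr0 virbr1
public
  interfaces: br0 em1
EOF
```

On a live host the same function can be fed directly: `firewall-cmd --get-active-zones | zone_of_iface br0`. firewall-cmd also has a built-in query, `firewall-cmd --get-zone-of-interface=br0`, which is simpler when only one interface is of interest.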

This is an incompatible change in behavior from, e.g., RHEL7. Testing on the latest RHEL7 with firewalld enabled, and a bridge directly attached to an ethernet (with the bridge in the default "public" zone), all traffic to/from the guest passes through the bridge with no problems.

Again, the guest traffic is not being *routed* by the host, it is being *bridged*, so IP-level rules should not come into play (unless the br_netfilter module is loaded; is that even a thing anymore, when nftables is in use?).
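
Whether br_netfilter is in play can be checked on the host. The following is a rough sketch, assuming br_netfilter is built as a loadable module (a built-in would not appear in /proc/modules); the function name br_netfilter_loaded is made up for illustration. When the module is loaded, the bridge-nf-call-* sysctls show whether bridged frames are handed to the IP-level hooks.

```shell
#!/bin/sh
# Succeed if br_netfilter appears in a /proc/modules-style file.
# The optional argument lets the check run against a saved copy of that file.
br_netfilter_loaded() {
    modules_file=${1:-/proc/modules}
    grep -q '^br_netfilter ' "$modules_file" 2>/dev/null
}

if br_netfilter_loaded; then
    echo "br_netfilter is loaded"
    # These sysctls exist only while the module is loaded; 1 means bridged
    # traffic is passed through the iptables/ip6tables hooks.
    sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
else
    echo "br_netfilter is not loaded"
fi
```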

AFAICT, this change in behavior means that anyone who wants bridged connections from their guests to the local LAN will need to put their bridge devices in a zone that has a default accept policy, which will be completely unacceptable, since that will leave the host open on all ports *to the physical network*.

If this really is what is happening, then this BZ needs a blocker+.

--- Additional comment from Laine Stump on 2018-11-21 10:12:27 EST ---

Dan Kenigsberg - I thought you should be alerted to this change in behavior, since I'm pretty sure most RHV guests use a bridged connection to the network.

--- Additional comment from Laine Stump on 2018-11-24 21:21:41 EST ---

NB: I just noticed that Fedora 29 exhibits the same behavior (even though the firewalld backend is not set to nftables). F27 did *not* (I know this because a working bridged network config on an F27 machine was broken by upgrading to F29, "unbroken" by putting the bridge into a zone with a default accept policy).

Due to this discovery, I'm cloning this BZ to Fedora so they'll be aware of the situation.

Comment 1 Eric Garver 2018-11-27 15:06:34 UTC
Reassigning to iptables as per bug 1650382 comment 6 and bug 1650382 comment 7.

Comment 2 Phil Sutter 2019-05-29 08:53:45 UTC
The kernel commit identified in bug 1611161 as resolving the issue is present in upstream kernels starting with v5.1. Given that F30 ships v5.1.5, I'm closing this ticket. Feel free to reopen in case you think we have to backport the fix to F29.
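
A quick way to see whether a host's running kernel is at or past v5.1 is to compare version strings. This is only a heuristic sketch: it relies on GNU `sort -V`, the function name kernel_at_least_5_1 is made up for illustration, and a distribution kernel older than 5.1 may still carry the fix as a backport, so a "no" here proves nothing by itself.

```shell
#!/bin/sh
# Succeed if the given kernel release is version 5.1 or newer.
kernel_at_least_5_1() {
    ver=${1%%-*}    # drop the release suffix: 4.18.0-39.el8.x86_64 -> 4.18.0
    # sort -V orders version strings; if 5.1 sorts first, ver is >= 5.1.
    [ "$(printf '%s\n' 5.1 "$ver" | sort -V | head -n 1)" = "5.1" ]
}

if kernel_at_least_5_1 "$(uname -r)"; then
    echo "running kernel is v5.1 or newer"
else
    echo "running kernel is older than v5.1 (fix present only if backported)"
fi
```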