Hide Forgot
Description of problem: If create a passthru mode macvtap interface, the system will set the physical ineterface promiscuous by default. Version-Release number of selected component (if applicable): # uname -r 3.10.0-29.el7.x86_64 # rpm -q iptables iptables-1.4.19.1-1.el7.x86_64 How reproducible: 100% Steps to Reproduce: 1.Create a passthru mode macvtap interface, and set it up. # ip link add link p4096p4 name ptru1 type macvtap mode passthru # ip link set ptru1 up 2.Check in dmesg 3. Actual results: dmesg shows: device p4096p4 entered promiscuous mode Expected results: physical interface should not entered promiscuous mode. Additional info:
Test w/ virt-manager, physical interface entered promiscuous mode. Aattach the xml infos # virsh dumpxml rhel7 <domain type='kvm' id='5'> <name>rhel7</name> <uuid>7fd345c7-9e97-44c3-9c7f-0980b22cff30</uuid> <memory unit='KiB'>4194304</memory> <currentMemory unit='KiB'>4194304</currentMemory> <vcpu placement='static'>4</vcpu> <resource> <partition>/machine</partition> </resource> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type> <boot dev='hd'/> </os> <features> <acpi/> <apic/> <pae/> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>restart</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='none'/> <source file='/home/rhel7cp1.qcow2_v3'/> <target dev='vda' bus='virtio'/> <alias name='virtio-disk0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </disk> <controller type='usb' index='0'> <alias name='usb0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> <controller type='virtio-serial' index='0'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </controller> <interface type='direct'> <mac address='52:54:00:f1:3b:75'/> <source dev='p4096p4' mode='passthrough'/> <target dev='macvtap0'/> <model type='virtio'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <serial type='pty'> <source path='/dev/pts/5'/> <target port='0'/> <alias name='serial0'/> </serial> <console type='pty' tty='/dev/pts/5'> <source path='/dev/pts/5'/> <target type='serial' port='0'/> <alias name='serial0'/> </console> <channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <input type='tablet' bus='usb'> <alias name='input0'/> </input> <input type='mouse' bus='ps2'/> <graphics type='spice' port='5900' autoport='yes' listen='127.0.0.1'> <listen type='address' address='127.0.0.1'/> </graphics> <sound model='ich6'> <alias name='sound0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </sound> <video> <model type='qxl' ram='65536' vram='65536' heads='1'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='virtio'> <alias name='balloon0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </memballoon> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c442,c719</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c442,c719</imagelabel> </seclabel> </domain>
And how is this libvirt related? As you said in bug description, libvirt is not needed to reproduce this issue. Also, iptables have nothing to do with this, the "ip" command comes from iproute package. Anyway, with my knowledge of networking, I'd say this is inevitable since the physical network interface needs to be able to catch frames addressed to several different MAC addresses. However, I have to admit I haven't really followed all the new virt-related networking techniques and thus I'm moving this bug to kernel component to decide whether this is a bug or not (I guess it isn't).
It's not inevitable, and it's not a kernel bug. macvlans have a flag MACVLAN_FLAG_NOPROMISC which let users disable promisc in passthrough mode: default can not be changed without breaking ABIs. iproute does not seem to know how to activate this flag, and neither does libvirt. they both should learn to set this flag. so moving back to libvirt, and will clone to iproute.
note: libvirt will need to program mac addresses for it to be really useful
Reassigning this to me, as I recently did some related work (Bug 848199)
Micheal, I have a few questions about MACVLAN_FLAG_NOPROMISC: 1) Is it only for passthru mode, or is it also useful for other modes when we are monitoring guest rxfilter events and updating the macvtap device mac address and multicast list appropriately? 2) If the answer to (1) is "works for other modes too", does the physdev promiscuous mode get turned back on if one of the attached macvtaps is put into promiscuous mode? Or would that lead to a guest in promiscuous mode that wasn't really seeing all the traffic? 3) I see the #define for the flag in if_link.h, and the enum for the attribute IFLA_MACVLAN_FLAGS, but there is no corresponding struct ifla_macvlan_flags. So where does attribute go in the netlink packet? Is it a separate attribute like ifla_vlan_flags? Or does it go into the ifinfomsg.ifi_flags (doesn't make sense, as there would then be no use for the IFLA_MACVLAN_FLAGS attribute type) Or is it a simple u32 attribute?
(In reply to Laine Stump from comment #8) > Micheal, > > I have a few questions about MACVLAN_FLAG_NOPROMISC: > > 1) Is it only for passthru mode, or is it also useful for other modes when > we are monitoring guest rxfilter events and updating the macvtap device mac > address and multicast list appropriately? This particular flag is only affects passthru devices, so you can only set it then. This also appears to be the only way to control the promiscuity setting of the lower level device through the macv* device (which is a bit odd). As such, I can think 2 possible scenarios: 1) User explicitely specify "nopromisc" in some xml setting. This might be problematic as it would allow the user to shoot him/her self in the foot by creating a possibly broken configuration. 2) Allow this setting to take place only when we trust and apply the guest rxfilter settings. This would allow non-promisc passthru devices to exist as long as all of guest filters are passed through to device so that the guest would receive all the traffic it is interested in. > > 2) If the answer to (1) is "works for other modes too", does the physdev > promiscuous mode get turned back on if one of the attached macvtaps is put > into promiscuous mode? Or would that lead to a guest in promiscuous mode > that wasn't really seeing all the traffic? > > 3) I see the #define for the flag in if_link.h, and the enum for the > attribute IFLA_MACVLAN_FLAGS, but there is no corresponding struct > ifla_macvlan_flags. So where does attribute go in the netlink packet? Is it > a separate attribute like ifla_vlan_flags? Or does it go into the > ifinfomsg.ifi_flags (doesn't make sense, as there would then be no use for > the IFLA_MACVLAN_FLAGS attribute type) Or is it a simple u32 attribute? It's just a u16 attribute. -vlad
(In reply to Vlad Yasevich from comment #9) > > This particular flag is only affects passthru devices, so you can only set > it then. This also appears to be the only way to control the promiscuity > setting > of the lower level device through the macv* device (which is a bit odd). What if you set NOPROMISC when the macvtap device is created, and the guest later sets promisc mode (which libvirt responds to by setting IFF_PROMISC on the macvtap interface, using ioctl(SOICSIFFLAGS))? Would the physdev correctly switch to promisc mode? Or would this be a broken config? (same question for what happens when the guest later turns *off* promisc mode)
(In reply to Laine Stump from comment #10) > (In reply to Vlad Yasevich from comment #9) > > > > > This particular flag is only affects passthru devices, so you can only set > > it then. This also appears to be the only way to control the promiscuity > > setting > > of the lower level device through the macv* device (which is a bit odd). > > What if you set NOPROMISC when the macvtap device is created, and the guest > later sets promisc mode (which libvirt responds to by setting IFF_PROMISC on > the macvtap interface, using ioctl(SOICSIFFLAGS))? Would the physdev > correctly switch to promisc mode? Or would this be a broken config? (same > question for what happens when the guest later turns *off* promisc mode) This is what I was referring to by my comment above. It would appear that setting IFF_PROMISC on the macvtap device does not change the promiscuity setting on the physical device. The only way to affect change on the physical device would be: 1) Directly control the setting on the physical device. 2) Use the nopromisc option only on passthru macv* devices. I am looking to see if this is a bug or oversight.. -vlad
Okay. To get back to basics, there are 3 devices involved: G - guest emulated ethernet device M - macvtap device on the host P - physical device on the host (assuming passthrough mode in the following discussion) If we don't set MACVLAN_FLAG_NOPROMISC when M is created, then P and M will just always be IFF_PROMISC. Would it do any good to change them both at some later time (using a separate ioctl(SOICSIFFLAGS)? If we *do* set MACVLAN_FLAG_NOPROMISC when M is created, then we can (and do, if configured) follow the IFF_PROMISC flag from G by paying attention to RX_FILTER_CHANGED. Currently we set M to mirror the state of this flag from G. Is it true that we should *also* set P to mirror the state of the flag, and is that enough for proper operation?
(In reply to Laine Stump from comment #12) > Okay. To get back to basics, there are 3 devices involved: > > G - guest emulated ethernet device > M - macvtap device on the host > P - physical device on the host > > (assuming passthrough mode in the following discussion) > > If we don't set MACVLAN_FLAG_NOPROMISC when M is created, then P and M will > just always be IFF_PROMISC. Would it do any good to change them both at some > later time (using a separate ioctl(SOICSIFFLAGS)? I am not sure what you mean by "would it do any good"... If you are asking if we could enable promiscuity using the ioctls, then yes, absolutely, you can do that and it would sort of emulate setting of MACVLAN_FLAG_NOPROMISC. > > If we *do* set MACVLAN_FLAG_NOPROMISC when M is created, then we can (and > do, if configured) follow the IFF_PROMISC flag from G by paying attention to > RX_FILTER_CHANGED. Currently we set M to mirror the state of this flag from > G. Is it true that we should *also* set P to mirror the state of the flag, > and is that enough for proper operation? Yes, that should be enough. Mirroring the flag to M allows macvtap to properly forward multicast traffic. Mirroring the flag P would allow the hw to actually receive said traffic and pass it to M. I think though that we should fix macvtap driver to properly propagate flag changes so that we don't have do the workaround. I am testing patches to do just that. -vlad
I don't see how this should block bug 1013584 as iproute does not depend on libvirt in any way. Therefore removing the probably incorrect dependency.
Yep, that was just an artifact of Bug 1013584 being created by cloning this bug - a newly cloned bug is automatically marked as dependent on the bug it was cloned from, which in my experience is wrong much more than half the time, but it's easy to forget to remove the dependency before hitting save...
Hi Laine, (In reply to Laine Stump from comment #20) > Yep, that was just an artifact of Bug 1013584 being created by cloning this > bug - a newly cloned bug is automatically marked as dependent on the bug it > was cloned from, which in my experience is wrong much more than half the > time, but it's easy to forget to remove the dependency before hitting save... Indeed, but in my opinion wrong dependencies are still better than none at all, as the information where a ticket originated from is often vital. Sure, failure to fix this up is one of the likely pitfalls of using the system. :)
Bug 1727263 is adding support for a Linux host bridge option BR_ISOLATED, which can be applied to the bridge port of a guest's conventional) tap device. This is a completely unrelated thing, except that it's "an option to be applied to a port", and so possibly could live in the same config space.
This bug was closed deferred as a result of bug triage. Please reopen if you disagree and provide justification why this bug should get enough priority. Most important would be information about impact on customer or layered product. Please indicate requested target release.