Description of problem:
Cannot create a working bond with a VLAN on top within a guest because MAC addresses can't be updated. To support this, the nova/libvirt driver should be rewritten, or should at least allow the following:

<hostdev mode='subsystem' type='pci' managed='yes'>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
</hostdev>

instead of:

<interface type='hostdev' managed='yes'>
<interface type='hostdev' managed='yes'>

From: https://libvirt.org/formatdomain.html

A PCI network device (specified by the <source> element) is directly assigned to the guest using generic device passthrough, after first optionally setting the device's MAC address to the configured value, and associating the device with an 802.1Qbh capable switch using an optionally specified <virtualport> element (see the examples of virtualport given above for type='direct' network devices). Note that - due to limitations in standard single-port PCI ethernet card driver design - only SR-IOV (Single Root I/O Virtualization) virtual function (VF) devices can be assigned in this manner; to assign a standard single-port PCI or PCIe ethernet card to a guest, use the traditional <hostdev> device definition and Since 0.9.11

To use VFIO device assignment rather than traditional/legacy KVM device assignment (VFIO is a new method of device assignment that is compatible with UEFI Secure Boot), a type='hostdev' interface can have an optional driver sub-element with a name attribute set to "vfio". To use legacy KVM device assignment you can set name to "kvm" (or simply omit the <driver> element, since "kvm" is currently the default). Since 1.0.5 (QEMU and KVM only, requires kernel 3.6 or newer)

Note that this "intelligent passthrough" of network devices is very similar to the functionality of a standard <hostdev> device, the difference being that this method allows specifying a MAC address and <virtualport> for the passed-through device.
If these capabilities are not required, if you have a standard single-port PCI, PCIe, or USB network card that doesn't support SR-IOV (and hence would anyway lose the configured MAC address during reset after being assigned to the guest domain), or if you are using a version of libvirt older than 0.9.11, you should use standard <hostdev> to assign the device to the guest instead of <interface type='hostdev'/>.

Similar to the functionality of a standard <hostdev> device, when managed is "yes", it is detached from the host before being passed on to the guest, and reattached to the host after the guest exits. If managed is omitted or "no", the user is responsible to call virNodeDeviceDettach (or virsh nodedev-detach) before starting the guest or hot-plugging the device, and virNodeDeviceReAttach (or virsh nodedev-reattach) after hot-unplug or stopping the guest.

====

Here is a suggestion:

I think that for the virtual guest to support this, the virtual hardware should be configured in such a way that it actually represents the current logical configuration within the VM. I'm pretty sure we're using the new PCI passthrough method instead of the legacy one for good reasons. The behaviors are not the same. What would be best? Add a flavor property that would make use of the "legacy" PCI passthrough, or change it globally, in which case any VMs that use the new method might break? Or maybe this should be adapted for SR-IOV NICs only... but then again, we might break some behavior that people now expect the system to have.
Here are some KCS articles that were written to address some of this:

How to bond SR-IOV Virtual Functions (VFs) which have been passed through to a guest VM using libvirt <interface type='hostdev'>
https://access.redhat.com/solutions/355853

How to bond SR-IOV Virtual Functions (VFs) which have been passed through to a guest VM using libvirt <hostdev>
https://access.redhat.com/solutions/2679351

Bond with VLAN tagging does not consistently work for SR-IOV VFs inside a VM guest
https://access.redhat.com/solutions/2661961

And we've determined that this could be supported with "trust" support in RHEL 7.3. So my guess here is that we could backport part of this to RHOSP 8.0 and integrate trust support within the nova/libvirt driver...

The patches mentioned that resolve this in 7.3 enable "trust" support in all network devices, where the user can enable "trust" on a driver to allow it to correctly alter MACs and update its table. This in and of itself is a feature: without it, nothing is inherently "broken"; we just have to be specific in how we define these virtual devices when creating VMs. So if RHOSP were capable of changing the XML used to create the guest when using the Nova libvirt driver, we could work around this; otherwise we can enable trust support for network devices when the feature is added (in 7.3).

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Create a guest with 2 SR-IOV NICs
2. Create a bond out of those two NICs
3. Create a VLAN using that bond
4. ifdown / ifenslave / disconnect the SR-IOV NIC

Actual results:
Doesn't fail over, pings stop, traffic stops

Expected results:
Should be supported

Additional info:
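To make the contrast in the description concrete, here is a minimal Python sketch that builds the two libvirt device definitions being compared. It is only an illustration, not what the Nova libvirt driver emits: the PCI address, the MAC value and the helper names (`interface_hostdev`, `plain_hostdev`, `to_xml`) are hypothetical, and only the element and attribute names come from the libvirt domain XML format quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical PCI address of one SR-IOV VF on the host (illustrative values).
VF_ADDRESS = {"domain": "0x0000", "bus": "0x06", "slot": "0x10", "function": "0x0"}


def interface_hostdev(address, mac=None):
    """Build <interface type='hostdev'>: libvirt can set the MAC before assignment."""
    iface = ET.Element("interface", {"type": "hostdev", "managed": "yes"})
    source = ET.SubElement(iface, "source")
    ET.SubElement(source, "address", {"type": "pci", **address})
    if mac is not None:
        ET.SubElement(iface, "mac", {"address": mac})
    return iface


def plain_hostdev(address):
    """Build a generic <hostdev>: no MAC or <virtualport> handling by libvirt."""
    dev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "pci", "managed": "yes"}
    )
    source = ET.SubElement(dev, "source")
    ET.SubElement(source, "address", dict(address))
    return dev


def to_xml(element):
    """Serialize an element to an XML string."""
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    # Form currently generated for SR-IOV ports (MAC configured by the host).
    print(to_xml(interface_hostdev(VF_ADDRESS, mac="52:54:00:6d:90:02")))
    # Form requested in this report (the guest owns the MAC).
    print(to_xml(plain_hostdev(VF_ADDRESS)))
```

The only structural difference that matters for this report is that the `<hostdev>` form carries no `<mac>` element, so the address configuration is left to the guest.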
I can see some problems we should discuss before moving forward. If we use the plain <hostdev> for SR-IOV cards, will the guests be responsible for assigning MAC addresses to the devices? How is that going to be achieved? Are there tweaks to make on modern operating systems? From what I know, Neutron always assigns a MAC address when building a port, to ensure there are no conflicts on the broadcast layer. Neutron also sets some anti-spoofing rules, which means a guest is prohibited from using an address other than the one assigned.
I agree with Sahid. It doesn't seem logical to change the default way we assign SR-IOV network interfaces, since in most cases we are required to set the MAC address of the device and the VLAN / virtual port before boot. On the other hand, backporting the "trusted" option to RHEL7 would still require additional work to be implemented upstream and backported to RHOS8; however, we haven't started this work yet. As an alternative, wouldn't it be possible to use OpenStack PCI passthrough [1]? It assigns PCI devices using the <hostdev> configuration; however, it has no network awareness, so the network has to be configured manually, not through Neutron.

Thanks,
Vladik
Well, is this path possible:
1) Upgrade 7.2 kernel to 7.3 kernel
2) Implement "trust" use in nova-compute/libvirt driver over SR-IOV nics
3) Create a metadata in images/flavor in order to enable trust for selected nics
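As a rough illustration of what step 2 would have to apply on the compute host, the sketch below builds the iproute2 command that toggles the per-VF trust setting. This assumes that the "trust" support discussed in this thread is the `vf NUM trust on|off` option of `ip link set`; the helper name `vf_trust_command`, the PF name `enp5s0f0` and the VF index are hypothetical, and the sketch only constructs the argument list rather than running anything.

```python
from typing import List


def vf_trust_command(pf_name: str, vf_index: int, trusted: bool = True) -> List[str]:
    """Return the argv for setting the trust flag of one VF on a physical function.

    Assumption: "trust" here means the iproute2 per-VF trust setting.
    """
    if not pf_name:
        raise ValueError("pf_name must be a non-empty interface name")
    if vf_index < 0:
        raise ValueError("vf_index must be non-negative")
    state = "on" if trusted else "off"
    return ["ip", "link", "set", "dev", pf_name, "vf", str(vf_index), "trust", state]


if __name__ == "__main__":
    # Hypothetical PF name and VF index; running the command needs root on the host.
    print(" ".join(vf_trust_command("enp5s0f0", 3)))
```

A driver-level implementation would still need a way to decide which ports are allowed to be trusted, which is the part covered by step 3.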
(In reply to David Hill from comment #5)
> Well, is this path possible:

From my point of view, it can be possible as an RFE. We will need to implement it upstream first.

> 1) Upgrade 7.2 kernel to 7.3 kernel
> 2) Implement "trust" use in nova-compute/libvirt driver over SR-IOV nics

This would probably be implemented in libvirt.

> 3) Create a metadata in images/flavor in order to enable trust for selected
> nics

We will need to implement this in the API (not in images/flavor), hence it will require a spec to be approved first. However, even then, I doubt it will be easy to backport to RHOS8.

Vladik
This was essentially duplicated in https://bugzilla.redhat.com/show_bug.cgi?id=1402584, which targeted OSP 14 and has now been completed. Because of this, I've renamed this and targeted it to OSP 10. However, as noted in https://bugzilla.redhat.com/show_bug.cgi?id=1636395#c9, we don't consider this a good candidate for backporting to OSP 10. As a result, I'm also going to close this as a WONTFIX.