Bug 1413372
| Summary: | libvirt overwrites externally set vlan tags in macvtap passthrough VFs since 2.x so Nova needs to craft the XML to include vlan tag | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Pablo Iranzo Gómez <pablo.iranzo> | |
| Component: | openstack-nova | Assignee: | Vladik Romanovsky <vromanso> | |
| Status: | CLOSED ERRATA | QA Contact: | Joe H. Rahme <jhakimra> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 8.0 (Liberty) | CC: | aguetta, awaugama, berrange, dasmith, eglynn, kchamart, laine, libvirt-maint, panbalag, pbarta, pmannidi, rbalakri, sbauza, sferdjao, sgordon, srevivo, vromanso, xuzhang | |
| Target Milestone: | async | Keywords: | ZStream | |
| Target Release: | 8.0 (Liberty) | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | openstack-nova-12.0.6-9.el7ost | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1420877 1420879 1420880 (view as bug list) | Environment: | ||
| Last Closed: | 2017-03-08 17:47:42 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1420877, 1420879, 1420880 | |||
| At the moment I'm checking the differences between the package sets related to nova, neutron and libvirt for the working and non working system
# diff sosreport-20170110-113046/overcloud-compute-1.localdomain/installed-rpms sosreport-20170115-145302/overcloud-compute-2.localdomain/installed-rpms |cut -f1,2|egrep 'neutron|libvirt|nova'
< libvirt-2.0.0-10.el7_3.2.x86_64                             Fri Dec 30 16:37:27 2016
< libvirt-client-2.0.0-10.el7_3.2.x86_64                      Fri Dec 30 16:37:04 2016
< libvirt-daemon-2.0.0-10.el7_3.2.x86_64                      Fri Dec 30 16:37:05 2016
< libvirt-daemon-config-network-2.0.0-10.el7_3.2.x86_64       Fri Dec 30 16:37:05 2016
< libvirt-daemon-config-nwfilter-2.0.0-10.el7_3.2.x86_64      Fri Dec 30 16:37:05 2016
< libvirt-daemon-driver-interface-2.0.0-10.el7_3.2.x86_64     Fri Dec 30 16:37:05 2016
< libvirt-daemon-driver-lxc-2.0.0-10.el7_3.2.x86_64           Fri Dec 30 16:37:05 2016
< libvirt-daemon-driver-network-2.0.0-10.el7_3.2.x86_64       Fri Dec 30 16:37:05 2016
< libvirt-daemon-driver-nodedev-2.0.0-10.el7_3.2.x86_64       Fri Dec 30 16:37:05 2016
< libvirt-daemon-driver-nwfilter-2.0.0-10.el7_3.2.x86_64      Fri Dec 30 16:37:05 2016
< libvirt-daemon-driver-qemu-2.0.0-10.el7_3.2.x86_64          Fri Dec 30 16:37:06 2016
< libvirt-daemon-driver-secret-2.0.0-10.el7_3.2.x86_64        Fri Dec 30 16:37:05 2016
< libvirt-daemon-driver-storage-2.0.0-10.el7_3.2.x86_64       Fri Dec 30 16:37:06 2016
< libvirt-daemon-kvm-2.0.0-10.el7_3.2.x86_64                  Fri Dec 30 16:37:27 2016
< libvirt-python-2.0.0-2.el7.x86_64                           Mon Oct 31 20:23:33 2016
< openstack-neutron-7.1.1-7.el7ost.noarch                     Mon Oct 31 20:25:56 2016
< openstack-neutron-bigswitch-agent-2015.3.8-1.el7ost.noarch  Mon Oct 31 20:13:28 2016
< openstack-neutron-bigswitch-lldp-2015.3.8-1.el7ost.noarch   Mon Oct 31 20:13:27 2016
< openstack-neutron-common-7.1.1-7.el7ost.noarch              Mon Oct 31 20:13:26 2016
< openstack-neutron-lbaas-7.1.1-1.el7ost.noarch               Mon Oct 31 20:25:56 2016
< openstack-neutron-metering-agent-7.1.1-7.el7ost.noarch      Mon Oct 31 20:28:30 2016
< openstack-neutron-ml2-7.1.1-7.el7ost.noarch                 Mon Oct 31 20:19:07 2016
< openstack-neutron-openvswitch-7.1.1-7.el7ost.noarch         Mon Oct 31 20:25:59 2016
< openstack-nova-api-12.0.4-16.el7ost.noarch                  Mon Oct 31 20:26:05 2016
< openstack-nova-cert-12.0.4-16.el7ost.noarch                 Mon Oct 31 20:26:05 2016
< openstack-nova-common-12.0.4-16.el7ost.noarch               Mon Oct 31 20:23:41 2016
< openstack-nova-compute-12.0.4-16.el7ost.noarch              Mon Oct 31 20:25:56 2016
< openstack-nova-conductor-12.0.4-16.el7ost.noarch            Mon Oct 31 20:26:04 2016
< openstack-nova-console-12.0.4-16.el7ost.noarch              Mon Oct 31 20:26:05 2016
< openstack-nova-novncproxy-12.0.4-16.el7ost.noarch           Mon Oct 31 20:26:04 2016
< openstack-nova-scheduler-12.0.4-16.el7ost.noarch            Mon Oct 31 20:26:04 2016
< python-neutron-7.1.1-7.el7ost.noarch                        Mon Oct 31 20:13:25 2016
< python-neutron-lbaas-7.1.1-1.el7ost.noarch                  Mon Oct 31 20:20:51 2016
< python-neutronclient-3.1.0-1.el7ost.noarch                  Mon Oct 31 20:13:20 2016
< python-nova-12.0.4-16.el7ost.noarch                         Mon Oct 31 20:23:41 2016
< python-novaclient-3.1.0-2.el7ost.noarch                     Mon Oct 31 20:13:22 2016
> libvirt-1.2.17-13.el7_2.5.x86_64                            Sun Dec 18 18:01:58 2016	1482076918
> libvirt-client-1.2.17-13.el7_2.5.x86_64                     Sun Dec 18 18:01:27 2016	1482076887
> libvirt-daemon-1.2.17-13.el7_2.5.x86_64                     Sun Dec 18 18:01:28 2016	1482076888
> libvirt-daemon-config-network-1.2.17-13.el7_2.5.x86_64      Sun Dec 18 18:01:30 2016	1482076890
> libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.5.x86_64     Sun Dec 18 18:01:30 2016	1482076890
> libvirt-daemon-driver-interface-1.2.17-13.el7_2.5.x86_64    Sun Dec 18 18:01:28 2016	1482076888
> libvirt-daemon-driver-lxc-1.2.17-13.el7_2.5.x86_64          Sun Dec 18 18:01:30 2016	1482076890
> libvirt-daemon-driver-network-1.2.17-13.el7_2.5.x86_64      Sun Dec 18 18:01:28 2016	1482076888
> libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.5.x86_64      Sun Dec 18 18:01:28 2016	1482076888
> libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.5.x86_64     Sun Dec 18 18:01:28 2016	1482076888
> libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64         Sun Dec 18 18:01:35 2016	1482076895
> libvirt-daemon-driver-secret-1.2.17-13.el7_2.5.x86_64       Sun Dec 18 18:01:28 2016	1482076888
> libvirt-daemon-driver-storage-1.2.17-13.el7_2.5.x86_64      Sun Dec 18 18:01:43 2016	1482076903
> libvirt-daemon-kvm-1.2.17-13.el7_2.5.x86_64                 Sun Dec 18 18:01:43 2016	1482076903
> libvirt-python-1.2.17-2.el7.x86_64                          Fri Jun  3 20:42:02 2016	1464975722
> openstack-neutron-7.1.1-4.el7ost.noarch                     Sun Dec 18 18:01:29 2016	1482076889
> openstack-neutron-bigswitch-agent-2015.3.8-1.el7ost.noarch  Fri Jun  3 20:27:18 2016	1464974838
> openstack-neutron-bigswitch-lldp-2015.3.8-1.el7ost.noarch   Fri Jun  3 20:27:18 2016	1464974838
> openstack-neutron-common-7.1.1-4.el7ost.noarch              Sun Dec 18 18:01:29 2016	1482076889
> openstack-neutron-lbaas-7.1.1-1.el7ost.noarch               Sun Dec 18 18:02:00 2016	1482076920
> openstack-neutron-metering-agent-7.1.1-4.el7ost.noarch      Sun Dec 18 18:02:02 2016	1482076922
> openstack-neutron-ml2-7.1.1-4.el7ost.noarch                 Sun Dec 18 18:02:02 2016	1482076922
> openstack-neutron-openvswitch-7.1.1-4.el7ost.noarch         Sun Dec 18 18:01:58 2016	1482076918
> openstack-nova-api-12.0.4-4.el7ost.noarch                   Sun Dec 18 18:01:58 2016	1482076918
> openstack-nova-cert-12.0.4-4.el7ost.noarch                  Sun Dec 18 18:01:58 2016	1482076918
> openstack-nova-common-12.0.4-4.el7ost.noarch                Sun Dec 18 18:01:33 2016	1482076893
> openstack-nova-compute-12.0.4-4.el7ost.noarch               Sun Dec 18 18:01:58 2016	1482076918
> openstack-nova-conductor-12.0.4-4.el7ost.noarch             Sun Dec 18 18:01:58 2016	1482076918
> openstack-nova-console-12.0.4-4.el7ost.noarch               Sun Dec 18 18:01:58 2016	1482076918
> openstack-nova-novncproxy-12.0.4-4.el7ost.noarch            Sun Dec 18 18:01:58 2016	1482076918
> openstack-nova-scheduler-12.0.4-4.el7ost.noarch             Sun Dec 18 18:01:58 2016	1482076918
> python-neutron-7.1.1-4.el7ost.noarch                        Sun Dec 18 18:01:29 2016	1482076889
> python-neutron-lbaas-7.1.1-1.el7ost.noarch                  Sun Dec 18 18:01:30 2016	1482076890
> python-neutronclient-3.1.0-1.el7ost.noarch                  Fri Jun  3 20:27:10 2016	1464974830
> python-nova-12.0.4-4.el7ost.noarch                          Sun Dec 18 18:01:33 2016	1482076893
> python-novaclient-3.1.0-2.el7ost.noarch                     Fri Jun  3 20:27:12 2016	1464974832
From the code, the patch listed at the mailing list:
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
index bb17b84..7db4497 100644
--- a/src/util/virnetdev.c
+++ b/src/util/virnetdev.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2007-2015 Red Hat, Inc.
+ * Copyright (C) 2007-2016 Red Hat, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public
@@ -2547,7 +2547,8 @@ virNetDevRestoreVfConfig(const char *pflinkdev,
  */
 int
 virNetDevReplaceNetConfig(const char *linkdev, int vf,
-                          const virMacAddr *macaddress, int vlanid,
+                          const virMacAddr *macaddress,
+                          virNetDevVlanPtr vlan,
                           const char *stateDir)
 {
     int ret = -1;
@@ -2566,11 +2567,29 @@ virNetDevReplaceNetConfig(const char *linkdev, int vf,
         linkdev = pfdevname;
     }
 
-    if (vf == -1)
+    if (vf == -1) {
+        if (vlan) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                           _("vlan can only be set for SR-IOV VFs, but "
+                             "%s is not a VF"), linkdev);
+            goto cleanup;
+        }
         ret = virNetDevReplaceMacAddress(linkdev, macaddress, stateDir);
-    else
+    } else {
+        int vlanid = 0; /* assure any current vlan tag is reset */
+
+        if (vlan) {
+            if (vlan->nTags != 1 || vlan->trunk) {
+                virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                               _("vlan trunking is not supported "
+                                 "by SR-IOV network devices"));
+                goto cleanup;
+            }
+            vlanid = vlan->tag[0];
+        }
         ret = virNetDevReplaceVfConfig(linkdev, vf, macaddress, vlanid,
                                        stateDir);
+    }
 
And the code in the .srpm for our version (https://access.redhat.com/downloads/content/rhel---7/x86_64/2456/libvirt/2.0.0-10.el7_3.2/x86_64/fd431d51/package)
    if (vf == -1) {
        if (vlan) {
            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                           _("vlan can only be set for SR-IOV VFs, but "
                             "%s is not a VF"), linkdev);
            goto cleanup;
        }
        ret = virNetDevReplaceMacAddress(linkdev, macaddress, stateDir);
    } else {
        int vlanid = 0; /* assure any current vlan tag is reset */
        if (vlan) {
            if (vlan->nTags != 1 || vlan->trunk) {
                virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                               _("vlan trunking is not supported "
                                 "by SR-IOV network devices"));
So latest, is theoretically containing the patch.
Downgrading the packages for libvirt to: libvirt-1.2.17-13.el7_2.5.x86_64 libvirt-client-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-config-network-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-interface-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-lxc-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-network-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-secret-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-storage-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-kvm-1.2.17-13.el7_2.5.x86_64 libvirt-python-1.2.17-2.el7.x86_64 The newly created VM does show the vlan tag on ip link|grep $MAC so this might be a regression. The libvirt config for the macvtap interfaces in the sosreportsdoesn't have any vlan tag info. Instead, openstack itself is calling "ip link set $PF vf $VF vlan $tag" prior to telling libvirt to start the guest (it's done this way because openstack decided to add vlan tagging support for macvtap passthrough with a hack external to libvirt sometime prior to libvirt adding support (with the patch you mention, which was added upstream in v1.3.5). So the patch is actually adding code that you *don't* want.
The problem is that openstack sets the vlan tag, then tels libvirt to start the guest. Prior to starting qemu libvirt configures the interface's mac address, and since it's now aware of the vlan tag in macvtap passthrough mode, it also recognizes when a tag isn't set, and makes sure to reset any existing tag on the interface (see the comment "assure any current vlan tag is reset" in the patch). That's what wasn't happening before, and is now causing the vlan tag to be reset.
I guess that in order to allow for backward compatibility in such a case, we will need to only reset the existing vlan tag if there is an explicit "untagged" flag in the config. The unfortunate thing here is that the behavior of resetting existing vlan tags for a VF when the new guest has no vlan tag is something we've done with vfio assignment of VFs (ie <interface type='hostdev'>) to guests since the beginning, so we will have to have different bahavior for the same config item based on macvtap vs. hostdev.
(Trying to get openstack to work around it by independently setting vlan tag (to satisfy old libvirt versions that don't support setting the vlan tag) as well as setting it in the libvirt config (to satisfy new libvirt versions) likely isn't going to fly. What openstack did in the past to set vlan tag worked only by chance, but it did work, so I suppose we need to preserve that behavior in libvirt.)
I'm not sure what's the best way to preserve the original behavior (not messing with the vlan tag when no <vlan> element is present) while still allowing a setup to sometimes specify that the vlan tag *should* be reset - perhaps setting <vlan><tag id='0'/></vlan> in the config? This isn't *exactly* the same thing as "no vlan tag" in the general case (all non-SRIOV documentation about vlan tags says that packets tagged for vlan 0 will have information about Class of Service, but no tag id), but for SRIOV VFs specifically, setting the tag to 0 effectively removes vlan tagging from the interface.
Dan - do you have an opinion about the "best" ("least bad") way to deal with this? (Can you think of something less ugly that what I'm suggesting?)
I don't think there is anything we should be fixing in libvirt. The libvirt implemented behaviour is doing the right thing here when no vlan tag is present in the XML. The problem here is that OpenStack was trying to configure stuff behind libvirt's back and unsurprisingly this broke when libvirt started to support this feature. If you go outside libvirt to configure things, it is fully expected that you may break when libvirt configures it itself. We simply need to fix openstack to use libvirt XML to do this - OpenStack can check libvirt version to determine when libvirt supports this feature or not, which is something Openstack has done in the past for detecting new NIC features in libvirt. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0465.html | 
Description of problem: Difference of behavior between: DOESN'T - libvirt-2.0.0-10.el7_3.2.x86_64.rpm WORKS - libvirt-1.2.17-13.el7_2.5.x86_64 How reproducible: [stack@undercloud ~]$ neutron net-show SB-X +---------------------------+--------------------------------------+ | Field | Value | +---------------------------+--------------------------------------+ | admin_state_up | True | | id | 693c8b1f-a026-409a-afba-f942198c23a5 | | mtu | 0 | | name | SB-X | | port_security_enabled | True | | provider:network_type | vlan | | provider:physical_network | physnet1 | | provider:segmentation_id | 691 | | qos_policy_id | | | router:external | False | | shared | True | | status | ACTIVE | | subnets | 80d0c575-1f06-4ab3-87a6-4b3e67b09b68 | | tenant_id | f40773c568164ac1871a54bdb0e4d33d | +---------------------------+--------------------------------------+ [stack@undercloud ~]$ neutron port-show portX +-----------------------+--------------------------------------------------------------------------------------------------------+ | Field | Value | +-----------------------+--------------------------------------------------------------------------------------------------------+ | admin_state_up | True | | allowed_address_pairs | | | binding:host_id | | | binding:profile | {} | | binding:vif_details | {} | | binding:vif_type | unbound | | binding:vnic_type | macvtap | | device_id | | | device_owner | | | dns_assignment | {"hostname": "host-10-10-10-5", "ip_address": "10.10.10.5", "fqdn": "host-10-10-10-5.openstacklocal."} | | dns_name | | | extra_dhcp_opts | | | fixed_ips | {"subnet_id": "80d0c575-1f06-4ab3-87a6-4b3e67b09b68", "ip_address": "10.10.10.5"} | | id | bf3a2bc0-1e71-475b-ad1a-8be70bebd9c6 | | mac_address | fa:16:3e:f6:d6:ac | | name | portX | | network_id | 693c8b1f-a026-409a-afba-f942198c23a5 | | port_security_enabled | True | | qos_policy_id | | | security_groups | 7924a67b-3ae6-45ca-af2f-d31667c5e41b | | status | DOWN | | tenant_id | f40773c568164ac1871a54bdb0e4d33d | +-----------------------+--------------------------------------------------------------------------------------------------------+ [stack@undercloud ~]$ nova boot --flavor m1.medium --image vm_tas_stable_328 --nic port-id=bf3a2bc0-1e71-475b-ad1a-8be70bebd9c6 SB-xWx [stack@undercloud ~]$ nova show sb-xwx +--------------------------------------+-------------------------------------------------------------+ | Property | Value | +--------------------------------------+-------------------------------------------------------------+ | OS-DCF:diskConfig | MANUAL | | OS-EXT-AZ:availability_zone | ntas05 | | OS-EXT-SRV-ATTR:host | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:hostname | sb-xwx | | OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:instance_name | instance-00000767 | root@overcloud-compute-2 ~]# virsh list Id Name State ---------------------------------------------------- 46 instance-00000767 running [root@overcloud-compute-2 ~]# virsh dumpxml 46 | grep SB <nova:name>SB-xWx</nova:name> [root@overcloud-compute-2 ~]# [root@overcloud-compute-2 ~]# virsh dumpxml 46 | grep -A4 interf <interface type='direct'> <mac address='fa:16:3e:f6:d6:ac'/> <source dev='enp3s31f4' mode='passthrough'/> <target dev='macvtap0'/> <model type='virtio'/> -- [root@overcloud-compute-2 ~]# ip link | grep 691 vf 62 MAC fa:16:3e:f6:d6:ac, vlan 691, spoof checking on, link-state auto On the non working: nova boot --flavor SMALL_1 --image myImage --nic port-id=1b8b7186-501f-4f8d-8cf0-308c23d5f58f XXX [stack@undercloud ~]$ nova show XXX +--------------------------------------+----------------------------------------------------------+ | Property | Value | +--------------------------------------+----------------------------------------------------------+ | OS-DCF:diskConfig | MANUAL | | OS-EXT-AZ:availability_zone | sriov_zone | | OS-EXT-SRV-ATTR:host | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:hostname | xxx | | OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:instance_name | instance-0000003f | | OS-EXT-SRV-ATTR:kernel_id | | | OS-EXT-SRV-ATTR:launch_index | 0 | | OS-EXT-SRV-ATTR:ramdisk_id | | | OS-EXT-SRV-ATTR:reservation_id | r-sgv22z54 | [root@overcloud-compute-2 ~]# virsh list Id Name State ---------------------------------------------------- 2 instance-0000003f running [root@overcloud-compute-2 ~]# virsh dumpxml 2 | grep XXX <nova:name>XXX</nova:name> [root@overcloud-compute-2 ~]# virsh dumpxml 2 | grep -A3 interf <interface type='direct'> <mac address='fa:16:3e:b1:15:9e'/> <source dev='enp9s31f2' mode='passthrough'/> <target dev='macvtap0'/> -- and the VF without the VLAN Tag vf 58 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off vf 59 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off vf 60 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off vf 61 MAC fa:16:3e:b1:15:9e, spoof checking on, link-state auto, trust off Actual results: On newer libvirt, vlan tag is not kept Expected results: Vlan tag should be kept Additional info: Customer initially appointed to https://www.redhat.com/archives/libvir-list/2016-May/msg00423.html