Description of problem: Difference of behavior between: DOESN'T - libvirt-2.0.0-10.el7_3.2.x86_64.rpm WORKS - libvirt-1.2.17-13.el7_2.5.x86_64 How reproducible: [stack@undercloud ~]$ neutron net-show SB-X +---------------------------+--------------------------------------+ | Field | Value | +---------------------------+--------------------------------------+ | admin_state_up | True | | id | 693c8b1f-a026-409a-afba-f942198c23a5 | | mtu | 0 | | name | SB-X | | port_security_enabled | True | | provider:network_type | vlan | | provider:physical_network | physnet1 | | provider:segmentation_id | 691 | | qos_policy_id | | | router:external | False | | shared | True | | status | ACTIVE | | subnets | 80d0c575-1f06-4ab3-87a6-4b3e67b09b68 | | tenant_id | f40773c568164ac1871a54bdb0e4d33d | +---------------------------+--------------------------------------+ [stack@undercloud ~]$ neutron port-show portX +-----------------------+--------------------------------------------------------------------------------------------------------+ | Field | Value | +-----------------------+--------------------------------------------------------------------------------------------------------+ | admin_state_up | True | | allowed_address_pairs | | | binding:host_id | | | binding:profile | {} | | binding:vif_details | {} | | binding:vif_type | unbound | | binding:vnic_type | macvtap | | device_id | | | device_owner | | | dns_assignment | {"hostname": "host-10-10-10-5", "ip_address": "10.10.10.5", "fqdn": "host-10-10-10-5.openstacklocal."} | | dns_name | | | extra_dhcp_opts | | | fixed_ips | {"subnet_id": "80d0c575-1f06-4ab3-87a6-4b3e67b09b68", "ip_address": "10.10.10.5"} | | id | bf3a2bc0-1e71-475b-ad1a-8be70bebd9c6 | | mac_address | fa:16:3e:f6:d6:ac | | name | portX | | network_id | 693c8b1f-a026-409a-afba-f942198c23a5 | | port_security_enabled | True | | qos_policy_id | | | security_groups | 7924a67b-3ae6-45ca-af2f-d31667c5e41b | | status | DOWN | | tenant_id | f40773c568164ac1871a54bdb0e4d33d | +-----------------------+--------------------------------------------------------------------------------------------------------+ [stack@undercloud ~]$ nova boot --flavor m1.medium --image vm_tas_stable_328 --nic port-id=bf3a2bc0-1e71-475b-ad1a-8be70bebd9c6 SB-xWx [stack@undercloud ~]$ nova show sb-xwx +--------------------------------------+-------------------------------------------------------------+ | Property | Value | +--------------------------------------+-------------------------------------------------------------+ | OS-DCF:diskConfig | MANUAL | | OS-EXT-AZ:availability_zone | ntas05 | | OS-EXT-SRV-ATTR:host | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:hostname | sb-xwx | | OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:instance_name | instance-00000767 | root@overcloud-compute-2 ~]# virsh list Id Name State ---------------------------------------------------- 46 instance-00000767 running [root@overcloud-compute-2 ~]# virsh dumpxml 46 | grep SB <nova:name>SB-xWx</nova:name> [root@overcloud-compute-2 ~]# [root@overcloud-compute-2 ~]# virsh dumpxml 46 | grep -A4 interf <interface type='direct'> <mac address='fa:16:3e:f6:d6:ac'/> <source dev='enp3s31f4' mode='passthrough'/> <target dev='macvtap0'/> <model type='virtio'/> -- [root@overcloud-compute-2 ~]# ip link | grep 691 vf 62 MAC fa:16:3e:f6:d6:ac, vlan 691, spoof checking on, link-state auto On the non working: nova boot --flavor SMALL_1 --image myImage --nic port-id=1b8b7186-501f-4f8d-8cf0-308c23d5f58f XXX [stack@undercloud ~]$ nova show XXX +--------------------------------------+----------------------------------------------------------+ | Property | Value | +--------------------------------------+----------------------------------------------------------+ | OS-DCF:diskConfig | MANUAL | | OS-EXT-AZ:availability_zone | sriov_zone | | OS-EXT-SRV-ATTR:host | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:hostname | xxx | | OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-compute-2.localdomain | | OS-EXT-SRV-ATTR:instance_name | instance-0000003f | | OS-EXT-SRV-ATTR:kernel_id | | | OS-EXT-SRV-ATTR:launch_index | 0 | | OS-EXT-SRV-ATTR:ramdisk_id | | | OS-EXT-SRV-ATTR:reservation_id | r-sgv22z54 | [root@overcloud-compute-2 ~]# virsh list Id Name State ---------------------------------------------------- 2 instance-0000003f running [root@overcloud-compute-2 ~]# virsh dumpxml 2 | grep XXX <nova:name>XXX</nova:name> [root@overcloud-compute-2 ~]# virsh dumpxml 2 | grep -A3 interf <interface type='direct'> <mac address='fa:16:3e:b1:15:9e'/> <source dev='enp9s31f2' mode='passthrough'/> <target dev='macvtap0'/> -- and the VF without the VLAN Tag vf 58 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off vf 59 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off vf 60 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off vf 61 MAC fa:16:3e:b1:15:9e, spoof checking on, link-state auto, trust off Actual results: On newer libvirt, vlan tag is not kept Expected results: Vlan tag should be kept Additional info: Customer initially appointed to https://www.redhat.com/archives/libvir-list/2016-May/msg00423.html
At the moment I'm checking the differences between the package sets related to nova, neutron and libvirt for the working and non working system # diff sosreport-20170110-113046/overcloud-compute-1.localdomain/installed-rpms sosreport-20170115-145302/overcloud-compute-2.localdomain/installed-rpms |cut -f1,2|egrep 'neutron|libvirt|nova' < libvirt-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:27 2016 < libvirt-client-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:04 2016 < libvirt-daemon-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-config-network-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-config-nwfilter-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-driver-interface-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-driver-lxc-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-driver-network-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-driver-nodedev-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-driver-nwfilter-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-driver-qemu-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:06 2016 < libvirt-daemon-driver-secret-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:05 2016 < libvirt-daemon-driver-storage-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:06 2016 < libvirt-daemon-kvm-2.0.0-10.el7_3.2.x86_64 Fri Dec 30 16:37:27 2016 < libvirt-python-2.0.0-2.el7.x86_64 Mon Oct 31 20:23:33 2016 < openstack-neutron-7.1.1-7.el7ost.noarch Mon Oct 31 20:25:56 2016 < openstack-neutron-bigswitch-agent-2015.3.8-1.el7ost.noarch Mon Oct 31 20:13:28 2016 < openstack-neutron-bigswitch-lldp-2015.3.8-1.el7ost.noarch Mon Oct 31 20:13:27 2016 < openstack-neutron-common-7.1.1-7.el7ost.noarch Mon Oct 31 20:13:26 2016 < openstack-neutron-lbaas-7.1.1-1.el7ost.noarch Mon Oct 31 20:25:56 2016 < openstack-neutron-metering-agent-7.1.1-7.el7ost.noarch Mon Oct 31 20:28:30 2016 < openstack-neutron-ml2-7.1.1-7.el7ost.noarch Mon Oct 31 20:19:07 2016 < openstack-neutron-openvswitch-7.1.1-7.el7ost.noarch Mon Oct 31 20:25:59 2016 < openstack-nova-api-12.0.4-16.el7ost.noarch Mon Oct 31 20:26:05 2016 < openstack-nova-cert-12.0.4-16.el7ost.noarch Mon Oct 31 20:26:05 2016 < openstack-nova-common-12.0.4-16.el7ost.noarch Mon Oct 31 20:23:41 2016 < openstack-nova-compute-12.0.4-16.el7ost.noarch Mon Oct 31 20:25:56 2016 < openstack-nova-conductor-12.0.4-16.el7ost.noarch Mon Oct 31 20:26:04 2016 < openstack-nova-console-12.0.4-16.el7ost.noarch Mon Oct 31 20:26:05 2016 < openstack-nova-novncproxy-12.0.4-16.el7ost.noarch Mon Oct 31 20:26:04 2016 < openstack-nova-scheduler-12.0.4-16.el7ost.noarch Mon Oct 31 20:26:04 2016 < python-neutron-7.1.1-7.el7ost.noarch Mon Oct 31 20:13:25 2016 < python-neutron-lbaas-7.1.1-1.el7ost.noarch Mon Oct 31 20:20:51 2016 < python-neutronclient-3.1.0-1.el7ost.noarch Mon Oct 31 20:13:20 2016 < python-nova-12.0.4-16.el7ost.noarch Mon Oct 31 20:23:41 2016 < python-novaclient-3.1.0-2.el7ost.noarch Mon Oct 31 20:13:22 2016 > libvirt-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:58 2016 1482076918 > libvirt-client-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:27 2016 1482076887 > libvirt-daemon-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:28 2016 1482076888 > libvirt-daemon-config-network-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:30 2016 1482076890 > libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:30 2016 1482076890 > libvirt-daemon-driver-interface-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:28 2016 1482076888 > libvirt-daemon-driver-lxc-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:30 2016 1482076890 > libvirt-daemon-driver-network-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:28 2016 1482076888 > libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:28 2016 1482076888 > libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:28 2016 1482076888 > libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:35 2016 1482076895 > libvirt-daemon-driver-secret-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:28 2016 1482076888 > libvirt-daemon-driver-storage-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:43 2016 1482076903 > libvirt-daemon-kvm-1.2.17-13.el7_2.5.x86_64 Sun Dec 18 18:01:43 2016 1482076903 > libvirt-python-1.2.17-2.el7.x86_64 Fri Jun 3 20:42:02 2016 1464975722 > openstack-neutron-7.1.1-4.el7ost.noarch Sun Dec 18 18:01:29 2016 1482076889 > openstack-neutron-bigswitch-agent-2015.3.8-1.el7ost.noarch Fri Jun 3 20:27:18 2016 1464974838 > openstack-neutron-bigswitch-lldp-2015.3.8-1.el7ost.noarch Fri Jun 3 20:27:18 2016 1464974838 > openstack-neutron-common-7.1.1-4.el7ost.noarch Sun Dec 18 18:01:29 2016 1482076889 > openstack-neutron-lbaas-7.1.1-1.el7ost.noarch Sun Dec 18 18:02:00 2016 1482076920 > openstack-neutron-metering-agent-7.1.1-4.el7ost.noarch Sun Dec 18 18:02:02 2016 1482076922 > openstack-neutron-ml2-7.1.1-4.el7ost.noarch Sun Dec 18 18:02:02 2016 1482076922 > openstack-neutron-openvswitch-7.1.1-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > openstack-nova-api-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > openstack-nova-cert-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > openstack-nova-common-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:33 2016 1482076893 > openstack-nova-compute-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > openstack-nova-conductor-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > openstack-nova-console-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > openstack-nova-novncproxy-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > openstack-nova-scheduler-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:58 2016 1482076918 > python-neutron-7.1.1-4.el7ost.noarch Sun Dec 18 18:01:29 2016 1482076889 > python-neutron-lbaas-7.1.1-1.el7ost.noarch Sun Dec 18 18:01:30 2016 1482076890 > python-neutronclient-3.1.0-1.el7ost.noarch Fri Jun 3 20:27:10 2016 1464974830 > python-nova-12.0.4-4.el7ost.noarch Sun Dec 18 18:01:33 2016 1482076893 > python-novaclient-3.1.0-2.el7ost.noarch Fri Jun 3 20:27:12 2016 1464974832
From the code, the patch listed at the mailing list: diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index bb17b84..7db4497 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2007-2015 Red Hat, Inc. + * Copyright (C) 2007-2016 Red Hat, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -2547,7 +2547,8 @@ virNetDevRestoreVfConfig(const char *pflinkdev, */ int virNetDevReplaceNetConfig(const char *linkdev, int vf, - const virMacAddr *macaddress, int vlanid, + const virMacAddr *macaddress, + virNetDevVlanPtr vlan, const char *stateDir) { int ret = -1; @@ -2566,11 +2567,29 @@ virNetDevReplaceNetConfig(const char *linkdev, int vf, linkdev = pfdevname; } - if (vf == -1) + if (vf == -1) { + if (vlan) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("vlan can only be set for SR-IOV VFs, but " + "%s is not a VF"), linkdev); + goto cleanup; + } ret = virNetDevReplaceMacAddress(linkdev, macaddress, stateDir); - else + } else { + int vlanid = 0; /* assure any current vlan tag is reset */ + + if (vlan) { + if (vlan->nTags != 1 || vlan->trunk) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("vlan trunking is not supported " + "by SR-IOV network devices")); + goto cleanup; + } + vlanid = vlan->tag[0]; + } ret = virNetDevReplaceVfConfig(linkdev, vf, macaddress, vlanid, stateDir); + } And the code in the .srpm for our version (https://access.redhat.com/downloads/content/rhel---7/x86_64/2456/libvirt/2.0.0-10.el7_3.2/x86_64/fd431d51/package) if (vf == -1) { if (vlan) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("vlan can only be set for SR-IOV VFs, but " "%s is not a VF"), linkdev); goto cleanup; } ret = virNetDevReplaceMacAddress(linkdev, macaddress, stateDir); } else { int vlanid = 0; /* assure any current vlan tag is reset */ if (vlan) { if (vlan->nTags != 1 || vlan->trunk) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("vlan trunking is not supported " "by SR-IOV network devices")); So latest, is theoretically containing the patch.
Downgrading the packages for libvirt to: libvirt-1.2.17-13.el7_2.5.x86_64 libvirt-client-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-config-network-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-interface-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-lxc-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-network-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-secret-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-driver-storage-1.2.17-13.el7_2.5.x86_64 libvirt-daemon-kvm-1.2.17-13.el7_2.5.x86_64 libvirt-python-1.2.17-2.el7.x86_64 The newly created VM does show the vlan tag on ip link|grep $MAC so this might be a regression.
The libvirt config for the macvtap interfaces in the sosreportsdoesn't have any vlan tag info. Instead, openstack itself is calling "ip link set $PF vf $VF vlan $tag" prior to telling libvirt to start the guest (it's done this way because openstack decided to add vlan tagging support for macvtap passthrough with a hack external to libvirt sometime prior to libvirt adding support (with the patch you mention, which was added upstream in v1.3.5). So the patch is actually adding code that you *don't* want. The problem is that openstack sets the vlan tag, then tels libvirt to start the guest. Prior to starting qemu libvirt configures the interface's mac address, and since it's now aware of the vlan tag in macvtap passthrough mode, it also recognizes when a tag isn't set, and makes sure to reset any existing tag on the interface (see the comment "assure any current vlan tag is reset" in the patch). That's what wasn't happening before, and is now causing the vlan tag to be reset. I guess that in order to allow for backward compatibility in such a case, we will need to only reset the existing vlan tag if there is an explicit "untagged" flag in the config. The unfortunate thing here is that the behavior of resetting existing vlan tags for a VF when the new guest has no vlan tag is something we've done with vfio assignment of VFs (ie <interface type='hostdev'>) to guests since the beginning, so we will have to have different bahavior for the same config item based on macvtap vs. hostdev. (Trying to get openstack to work around it by independently setting vlan tag (to satisfy old libvirt versions that don't support setting the vlan tag) as well as setting it in the libvirt config (to satisfy new libvirt versions) likely isn't going to fly. What openstack did in the past to set vlan tag worked only by chance, but it did work, so I suppose we need to preserve that behavior in libvirt.) I'm not sure what's the best way to preserve the original behavior (not messing with the vlan tag when no <vlan> element is present) while still allowing a setup to sometimes specify that the vlan tag *should* be reset - perhaps setting <vlan><tag id='0'/></vlan> in the config? This isn't *exactly* the same thing as "no vlan tag" in the general case (all non-SRIOV documentation about vlan tags says that packets tagged for vlan 0 will have information about Class of Service, but no tag id), but for SRIOV VFs specifically, setting the tag to 0 effectively removes vlan tagging from the interface. Dan - do you have an opinion about the "best" ("least bad") way to deal with this? (Can you think of something less ugly that what I'm suggesting?)
I don't think there is anything we should be fixing in libvirt. The libvirt implemented behaviour is doing the right thing here when no vlan tag is present in the XML. The problem here is that OpenStack was trying to configure stuff behind libvirt's back and unsurprisingly this broke when libvirt started to support this feature. If you go outside libvirt to configure things, it is fully expected that you may break when libvirt configures it itself. We simply need to fix openstack to use libvirt XML to do this - OpenStack can check libvirt version to determine when libvirt supports this feature or not, which is something Openstack has done in the past for detecting new NIC features in libvirt.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0465.html