Description of problem: One of Juniper's plugins relies on nova to create tap interfaces in the correct mode - multi-queue or single-queue. Can we backport the nova networking multiqueue feature, plus the respective code in the vrouter plug path?

For Juniper's Contrail implementation, this is called from https://github.com/openstack/nova/blob/newton-eol/nova/virt/libvirt/vif.py#L749

Compare upstream Newton, https://github.com/openstack/nova/blob/newton-eol/nova/network/linux_net.py#L1314:
~~~
(...)
def create_tap_dev(dev, mac_address=None):
(...)
~~~
to upstream master, https://github.com/openstack/nova/blob/master/nova/network/linux_utils.py#L74:
~~~
def create_tap_dev(dev, mac_address=None, multiqueue=False):
~~~
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L757
~~~
        multiqueue = self._is_multiqueue_enabled(instance.image_meta,
                                                 instance.flavor)
        linux_net_utils.create_tap_dev(dev, multiqueue=multiqueue)
~~~
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L192
~~~
    def _is_multiqueue_enabled(self, image_meta, flavor):
        _, vhost_queues = self._get_virtio_mq_settings(image_meta, flavor)
        return vhost_queues > 1 if vhost_queues is not None else False

    def _get_virtio_mq_settings(self, image_meta, flavor):
        """A method to set the number of virtio queues,
           if it has been requested in extra specs.
        """
        driver = None
        vhost_queues = None
        if not isinstance(image_meta, objects.ImageMeta):
            image_meta = objects.ImageMeta.from_dict(image_meta)
        img_props = image_meta.properties
        if img_props.get('hw_vif_multiqueue_enabled'):
            driver = 'vhost'
            max_tap_queues = self._get_max_tap_queues()
            if max_tap_queues:
                vhost_queues = (max_tap_queues if flavor.vcpus > max_tap_queues
                                else flavor.vcpus)
            else:
                vhost_queues = flavor.vcpus

        return (driver, vhost_queues)

    def _get_max_tap_queues(self):
        # NOTE(kengo.sakai): In kernels prior to 3.0,
        # multiple queues on a tap interface is not supported.
        # In kernels 3.x, the number of queues on a tap interface
        # is limited to 8. From 4.0, the number is 256.
        # See: https://bugs.launchpad.net/nova/+bug/1570631
        kernel_version = int(os.uname()[2].split(".")[0])
        if kernel_version <= 2:
            return 1
        elif kernel_version == 3:
            return 8
        elif kernel_version == 4:
            return 256
        else:
            return None
~~~
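The kernel-version mapping in `_get_max_tap_queues` above can be exercised in isolation. A minimal standalone sketch (the function name `max_tap_queues` and the `kernel_release` parameter are mine; the version thresholds are taken from the upstream code):

```python
import os


def max_tap_queues(kernel_release=None):
    """Mirror nova's _get_max_tap_queues() mapping: tap interfaces
    support 1 queue on 2.x kernels, 8 on 3.x, 256 on 4.x, and the
    upstream code returns None (no cap) for anything newer."""
    if kernel_release is None:
        kernel_release = os.uname()[2]  # e.g. '3.10.0-862.el7.x86_64'
    major = int(kernel_release.split(".")[0])
    if major <= 2:
        return 1
    elif major == 3:
        return 8
    elif major == 4:
        return 256
    return None
```

On a RHEL 7 compute node (kernel 3.10), the cap is 8, so a flavor with more vCPUs than that would be clamped to 8 vhost queues by `_get_virtio_mq_settings`.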
We need to ensure a few things:

* MQ support in `create_tap_dev`
* enable the `multiqueue` flag in all necessary places in `vif.py` (probably not only `plug_vrouter`)
* update nova/rootwrap.d/compute.filters / nova/rootwrap.d/network.filters (???)
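For the first bullet: upstream's `create_tap_dev` simply appends `multi_queue` to the `ip tuntap add` invocation when requested. A sketch of just the command-building portion (the helper name `build_tap_cmd` is mine; upstream builds the tuple inline and passes it to `utils.execute` with `run_as_root=True`):

```python
def build_tap_cmd(dev, multiqueue=False):
    """Build the 'ip tuntap' command that creates a tap device,
    mirroring upstream create_tap_dev() logic: the multi_queue
    flag must be present at creation time, since it cannot be
    toggled on an existing tap interface."""
    cmd = ('ip', 'tuntap', 'add', dev, 'mode', 'tap')
    if multiqueue:
        cmd = cmd + ('multi_queue',)
    return cmd
```

Because this command runs through rootwrap, the `ip tuntap` filter must also permit the extra `multi_queue` argument - hence the third bullet.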
To give some more background: if nova sets up the tap interface as single queue, then libvirt / qemu will not start:
~~~
error: Failed to start domain instance-00000008
error: Unable to create tap device tap0003: Invalid argument
~~~
The reason is that the instance XML is created with e.g.:
~~~
    <interface type='ethernet'>
      <mac address='02:d9:f2:fb:00:03'/>
      <target dev='tap0003'/>
      <model type='virtio'/>
      <driver name='vhost' queues='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
    </interface>
~~~
However, nova has already created that tap interface in single-queue mode before this point. This can be reproduced manually: I modified an instance that was booted with nova and added a second interface to it.

## Test 1 ##

Just creating a multi queue ethernet port:
~~~
[root@overcloud-compute-1 ~]# diff -u instance-00000008.xml instance-00000008.ethernet.xml
--- instance-00000008.xml          2018-07-25 15:36:57.435311864 +0000
+++ instance-00000008.ethernet.xml 2018-07-25 15:44:54.886835979 +0000
@@ -75,22 +75,22 @@
         <target dev='tapb87f9248-45'/>
         <model type='virtio'/>
         <driver name='vhost' queues='2'/>
-        <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
+        <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
+      </interface>
+      <interface type='ethernet'>
+        <mac address='02:d9:f2:fb:00:03'/>
+        <target dev='tap0003'/>
+        <model type='virtio'/>
+        <driver name='vhost' queues='2'/>
+        <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
       </interface>
-      <serial type='file'>
-        <source path='/var/lib/nova/instances/f00c48ca-6824-4585-a4b0-4aede96438d7/console.log'/>
-        <target type='isa-serial' port='0'>
-          <model name='isa-serial'/>
-        </target>
-      </serial>
       <serial type='pty'>
         <target type='isa-serial' port='1'>
           <model name='isa-serial'/>
         </target>
       </serial>
-      <console type='file'>
-        <source path='/var/lib/nova/instances/f00c48ca-6824-4585-a4b0-4aede96438d7/console.log'/>
-        <target type='serial' port='0'/>
+      <console type='pty'>
+        <target type='serial' port='1'/>
       </console>
       <input type='tablet' bus='usb'>
         <address type='usb' bus='0' port='1'/>
~~~
The below works, as seen from the instance:
~~~
[root@rhel-test2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc mq state UP qlen 1000
    link/ether fa:16:3e:c5:20:ba brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.11/24 brd 192.168.0.255 scope global dynamic eth0
       valid_lft 86373sec preferred_lft 86373sec
    inet6 2000:192:168:1:f816:3eff:fec5:20ba/64 scope global noprefixroute dynamic
       valid_lft 86376sec preferred_lft 14376sec
    inet6 fe80::f816:3eff:fec5:20ba/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 02:d9:f2:fb:00:03 brd ff:ff:ff:ff:ff:ff

[root@rhel-test2 ~]# ethtool -l eth1
Channel parameters for eth1:
Pre-set maximums:
RX:             0
TX:             0
Other:          0
Combined:       2
Current hardware settings:
RX:             0
TX:             0
Other:          0
Combined:       1
~~~
And as seen from the hypervisor:
~~~
[root@overcloud-compute-1 ~]# ip -d link ls tap0003
30: tap0003: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether fe:d9:f2:fb:00:03 brd ff:ff:ff:ff:ff:ff promiscuity 0
    tun addrgenmode eui64 numtxqueues 256 numrxqueues 256 gso_max_size 65536 gso_max_segs 65535
~~~

## Test 2 ##

Same XML definition as above.
~~~
[root@overcloud-compute-1 ~]# ip tuntap add tap0003 mode tap
[root@overcloud-compute-1 ~]# ip link set tap0003 up
[root@overcloud-compute-1 ~]# virsh start instance-00000008
error: Failed to start domain instance-00000008
error: Unable to create tap device tap0003: Invalid argument
~~~

## Test 3 ##

~~~
[root@overcloud-compute-1 ~]# ip tuntap add tap0003 mode tap multi_queue
[root@overcloud-compute-1 ~]# ip link set dev tap0003 up
[root@overcloud-compute-1 ~]# virsh start instance-00000008
Domain instance-00000008 started
~~~

## Conclusion ##

tuntap interfaces cannot change their type from single-queue to multi-queue after creation. Because nova predefines the tap interface as single-queue, libvirt cannot create it as multi-queue and fails. You either need to pre-create the interface with the `multi_queue` (`mq`) flag set, or not pre-create the tap interface at all, because libvirt will create it for you.
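Given that conclusion, a pre-existing tap's mode can at least be detected before deciding whether it must be deleted and recreated. A rough sketch - the heuristic of reading `numtxqueues` from `ip -d link` output (as shown in the hypervisor output above) and the helper name are my own assumptions, not anything nova does:

```python
import re


def tap_is_multiqueue(ip_d_link_output):
    """Heuristic: a tap created with IFF_MULTI_QUEUE reports
    numtxqueues > 1 in 'ip -d link' output, while a plain
    single-queue tap reports numtxqueues 1."""
    m = re.search(r'numtxqueues (\d+)', ip_d_link_output)
    return m is not None and int(m.group(1)) > 1


# Sample taken from the hypervisor output shown above.
sample = ("30: tap0003: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq\n"
          "    link/ether fe:d9:f2:fb:00:03 brd ff:ff:ff:ff:ff:ff promiscuity 0\n"
          "    tun addrgenmode eui64 numtxqueues 256 numrxqueues 256\n")
```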
To finalize the theoretical part of this, I created the following small C binary:
~~~
/* Author: akaris
   Taking some inspiration from
   https://www.kernel.org/doc/Documentation/networking/tuntap.txt
   Copy this code into taptest.c
   Compile and run:
       gcc taptest.c -o taptest
       ./taptest <interface name> <mode>    # mode = [ sq | mq ]
*/
#include <fcntl.h>
#include <string.h>   /* memset, strcmp */
#include <unistd.h>   /* close */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

int tun_alloc(char *dev)
{
    struct ifreq ifr;
    int fd, err;

    if( (fd = open("/dev/net/tun", O_RDWR)) < 0 ) {
        printf("Cannot open /dev/net/tun\n");
        exit(1);
    }

    memset(&ifr, 0, sizeof(ifr));

    /* Flags: IFF_TUN   - TUN device (no Ethernet headers)
     *        IFF_TAP   - TAP device
     *        IFF_NO_PI - Do not provide packet information
     */
    ifr.ifr_flags = IFF_TAP;
    if( *dev )
        strncpy(ifr.ifr_name, dev, IFNAMSIZ);

    if( (err = ioctl(fd, TUNSETIFF, (void *) &ifr)) < 0 ){
        close(fd);
        return err;
    }
    strcpy(dev, ifr.ifr_name);
    return fd;
}

int tun_alloc_mq(char *dev, int queues, int *fds)
{
    struct ifreq ifr;
    int fd, err, i;

    if (!dev)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    /* Flags: IFF_TUN         - TUN device (no Ethernet headers)
     *        IFF_TAP         - TAP device
     *        IFF_NO_PI       - Do not provide packet information
     *        IFF_MULTI_QUEUE - Create a queue of multiqueue device
     */
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
    strcpy(ifr.ifr_name, dev);

    for (i = 0; i < queues; i++) {
        if ((fd = open("/dev/net/tun", O_RDWR)) < 0)
            goto err;
        err = ioctl(fd, TUNSETIFF, (void *)&ifr);
        if (err) {
            close(fd);
            goto err;
        }
        fds[i] = fd;
    }
    return 0;
err:
    for (--i; i >= 0; i--)
        close(fds[i]);
    return err;
}

int main(int argc, char **argv)
{
    if(argc > 1) {
        char * device_name = argv[1];
        char * mode = "sq";
        if(argc > 2) {
            mode = argv[2];
        }
        if(!strcmp("sq",mode)) {
            int fd = tun_alloc(device_name);
            if(fd < 0) {
                printf("Cannot create tunnel %s\n",device_name);
                exit(1);
            }
            printf("FD is %d\n",fd);
            sleep(300);
        } else if(!strcmp("mq",mode)) {
            int fds[2];
            int ret_val = tun_alloc_mq(device_name,2,fds);
            if(ret_val < 0) {
                printf("Cannot create tunnel %s\n",device_name);
                exit(1);
            }
            printf("Multiqueue FDs are %d,%d\n",fds[0],fds[1]);
            sleep(300);
        } else {
            printf("Mode '%s' is not supported\n",mode);
        }
    } else {
        printf("Please provide the name of the tunnel interface to be created\n");
        exit(1);
    }
    return 0;
}
~~~
Test:
~~~
[root@overcloud-compute-1 ~]# ip tuntap add sq mode tap
[root@overcloud-compute-1 ~]# ip tuntap add mq mode tap multi_queue
[root@overcloud-compute-1 ~]# ./taptest sq sq
FD is 3
[root@overcloud-compute-1 ~]# ./taptest sq mq
Cannot create tunnel sq
[root@overcloud-compute-1 ~]# ./taptest mq sq
Cannot create tunnel mq
[root@overcloud-compute-1 ~]# ./taptest mq mq
Multiqueue FDs are 3,4
~~~
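The same TUNSETIFF request can be built from Python. The sketch below only packs the `struct ifreq` payload that the C code above passes to ioctl(2), so it can be inspected without root; actually attaching would additionally require `fcntl.ioctl` on an open `/dev/net/tun` fd with root privileges. The flag values are taken from linux/if_tun.h, and the ioctl number assumes x86_64:

```python
import struct

# Constants from linux/if_tun.h (TUNSETIFF value is for x86_64).
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000
IFF_MULTI_QUEUE = 0x0100
TUNSETIFF = 0x400454ca


def build_ifreq(name, multiqueue=False):
    """Pack the struct ifreq used with the TUNSETIFF ioctl:
    16 bytes of interface name followed by the short flags field,
    mirroring tun_alloc() / tun_alloc_mq() above."""
    flags = IFF_TAP | IFF_NO_PI
    if multiqueue:
        flags |= IFF_MULTI_QUEUE
    return struct.pack('16sH', name.encode(), flags)
```

For example, `fcntl.ioctl(tun_fd, TUNSETIFF, build_ifreq('tap0003', True))` on an open `/dev/net/tun` descriptor would mirror one iteration of `tun_alloc_mq`.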
It's *not* in OSP 10 downstream:
~~~
 799     def plug_vrouter(self, instance, vif):
 800         """Plug into Contrail's network port
 801
 802         Bind the vif to a Contrail virtual port.
 803         """
 804         dev = self.get_vif_devname(vif)
 805         ip_addr = '0.0.0.0'
 806         ip6_addr = None
 807         subnets = vif['network']['subnets']
 808         for subnet in subnets:
 809             if not subnet['ips']:
 810                 continue
 811             ips = subnet['ips'][0]
 812             if not ips['address']:
 813                 continue
 814             if (ips['version'] == 4):
 815                 if ips['address'] is not None:
 816                     ip_addr = ips['address']
 817             if (ips['version'] == 6):
 818                 if ips['address'] is not None:
 819                     ip6_addr = ips['address']
 820
 821         ptype = 'NovaVMPort'
 822         if (CONF.libvirt.virt_type == 'lxc'):
 823             ptype = 'NameSpacePort'
 824
 825         cmd_args = ("--oper=add --uuid=%s --instance_uuid=%s --vn_uuid=%s "
 826                     "--vm_project_uuid=%s --ip_address=%s --ipv6_address=%s"
 827                     " --vm_name=%s --mac=%s --tap_name=%s --port_type=%s "
 828                     "--tx_vlan_id=%d --rx_vlan_id=%d" % (vif['id'],
 829                     instance.uuid, vif['network']['id'],
 830                     instance.project_id, ip_addr, ip6_addr,
 831                     instance.display_name, vif['address'],
 832                     vif['devname'], ptype, -1, -1))
 833         try:
 834             linux_net.create_tap_dev(dev)
 835             utils.execute('vrouter-port-control', cmd_args, run_as_root=True)
 836         except processutils.ProcessExecutionError:
 837             LOG.exception(_LE("Failed while plugging vif"), instance=instance)
~~~
It's in OSP 11 downstream:
~~~
 808     def plug_vrouter(self, instance, vif):
 809         """Plug into Contrail's network port
 810
 811         Bind the vif to a Contrail virtual port.
 812         """
 813         dev = self.get_vif_devname(vif)
 814         ip_addr = '0.0.0.0'
 815         ip6_addr = None
 816         subnets = vif['network']['subnets']
 817         for subnet in subnets:
 818             if not subnet['ips']:
 819                 continue
 820             ips = subnet['ips'][0]
 821             if not ips['address']:
 822                 continue
 823             if (ips['version'] == 4):
 824                 if ips['address'] is not None:
 825                     ip_addr = ips['address']
 826             if (ips['version'] == 6):
 827                 if ips['address'] is not None:
 828                     ip6_addr = ips['address']
 829
 830         ptype = 'NovaVMPort'
 831         if (CONF.libvirt.virt_type == 'lxc'):
 832             ptype = 'NameSpacePort'
 833
 834         cmd_args = ("--oper=add --uuid=%s --instance_uuid=%s --vn_uuid=%s "
 835                     "--vm_project_uuid=%s --ip_address=%s --ipv6_address=%s"
 836                     " --vm_name=%s --mac=%s --tap_name=%s --port_type=%s "
 837                     "--tx_vlan_id=%d --rx_vlan_id=%d" % (vif['id'],
 838                     instance.uuid, vif['network']['id'],
 839                     instance.project_id, ip_addr, ip6_addr,
 840                     instance.display_name, vif['address'],
 841                     vif['devname'], ptype, -1, -1))
 842         try:
 843             multiqueue = self._is_multiqueue_enabled(instance.image_meta,
 844                                                      instance.flavor)
 845             linux_net.create_tap_dev(dev, multiqueue=multiqueue)
 846             utils.execute('vrouter-port-control', cmd_args, run_as_root=True)
 847         except processutils.ProcessExecutionError:
 848             LOG.exception(_LE("Failed while plugging vif"), instance=instance)
~~~
Hmm, perhaps we should just remove line 843 and let libvirt create the TAP device, so we ensure that the multiqueue flag will be set correctly in all places. I guess we added that code upstream to support libvirt versions below 1.3.1, but that is not necessary for OSP 10, which ships with RHEL 7.
~~~
 843             multiqueue = self._is_multiqueue_enabled(instance.image_meta,
 844                                                      instance.flavor)
~~~
Hi Sahid,

I don't know *why* we are creating that tap interface with nova. From my tests, it looks redundant ... libvirt seems to create it quite alright.

However, I don't see how:
~~~
 843             multiqueue = self._is_multiqueue_enabled(instance.image_meta,
 844                                                      instance.flavor)
~~~
would be related to the creation of the tap interface ;-)

- Andreas
I suppose you mean: https://github.com/openstack/nova/blob/newton-eol/nova/virt/libvirt/vif.py#L784 ?
(In reply to Andreas Karis from comment #10)
> I suppose you mean:
> https://github.com/openstack/nova/blob/newton-eol/nova/virt/libvirt/vif.py#L784
>
> ?

Yes, right :-) Would it be possible for you to try without this line?
Guys, a few things:

1) Why don't we simply backport the fix? Instead, you are trying to invent a new change which requires additional testing.
2) If you plan to remove that, IMHO you should also fix the other places - not only the vrouter part...
Hi,

From what I understand:

* Red Hat does not support nova networking in Red Hat OpenStack Platform 10 with our own networking solutions. It's pretty much only vrouter that's a) using nova networking and b) needing the multi-queue feature, so we only need to address this particular part of the code, and not all other places.
* Backporting the fix is more complex than simply removing that one line. And given that we control the libvirt version here, which creates the tap interface itself, we don't need to keep that particular part of the code which pre-creates the tap interface.
* Backporting that fix is a feature backport (nova networking did not have the multiqueue feature until Mitaka).

OpenStack Platform 10 entered maintenance support in June of this year: https://access.redhat.com/support/policy/updates/openstack/platform

The above link also stipulates the terms of support for maintenance support:
~~~
Full Support:
During the Production Phase, qualified Critical and Important Security Advisories (RHSAs) and urgent and selected High Priority Bug Fix Advisories (RHBAs) may be released as they become available. Other errata advisories may be delivered as appropriate. If available, select enhanced software functionality may be provided at the discretion of Red Hat. Maintenance releases will also include available and qualified errata advisories (RHSAs, RHBAs, and RHEAs). Maintenance releases are cumulative and include the contents of previously released updates. The focus for minor releases during this phase lies on resolving defects of medium or higher priority.

In addition, and only during Full Support:
- Customers and Partners have the ability to request new features which are introduced upstream to be selectively backported, pending the review of Red Hat Product Management and Engineering, until the end of this phase
- The installer components may be updated until the end of this phase
- Partner may introduce new plugins to be certified with the version until the end of this phase

Maintenance Support:
Same as Full Support, excluding:
- introduction of new features through backports
- introduction of additional partner plugins
~~~

Kind regards,

Andreas
So project maintenance/support is a good reason :) I will try to check this fix on an env where we have the problem and come back with info on whether it helps.
Renaming, as this is not related to nova networks. vrouter integration with OpenStack is provided via neutron; however, in OSP 10 vrouter support in nova is not delegated to os-vif. The changes required to enable this multiqueue support from nova appear to be confined to the plug logic in vif.py, the helper functions in linux_net.py, and the related test code. In Rocky these will be factored out into a separate vrouter os-vif plugin, but they are in-tree in this release. The plug logic in vif.py is common to nova networks and neutron; however, the code path used for vrouter is only executed in a neutron deployment.
Hi, What's the status for this one? Thanks! - Andreas
Check with Andrew and/or Joe, we could consider releasing this as OtherQA.
The customer can wait for the official release. Thanks for the work on this!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0074