Bug 1360519
| Summary: | RFE: vhost-user multi-queue and live migration support | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | bigswitch <rhosp-bugs-internal> |
| Component: | qemu-kvm-rhev | Assignee: | Amnon Ilan <ailan> |
| Status: | CLOSED NEXTRELEASE | QA Contact: | Pei Zhang <pezhang> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.2 | CC: | atelang, atragler, chayang, chegu_vinod, fbaudin, jraju, juzhang, knoel, marcandre.lureau, mburns, mkletzan, pezhang, plancast, rbalakri, rhosp-bugs-internal, sgordon, sherold, smerrow, virt-maint, xfu |
| Target Milestone: | rc | Keywords: | FutureFeature, OtherQA, TestOnly |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-01-25 09:31:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1389435, 1402132, 1411879 | ||
|
Description
bigswitch
2016-07-27 00:54:39 UTC
Your request came directly to engineering. Who is the product manager or partner manager you are working with?
What features or fixes in qemu are you looking for? The plan is to support RHEL 7.3 compute nodes with qemu-kvm-rhev based on upstream QEMU 2.6. However, Red Hat also backports upstream features and fixes from later QEMU versions. Therefore, we prefer that software does not rely on QEMU version numbers to discover the availability of features/fixes. Also, there are no plans to rebase the version of qemu-kvm shipped with RHEL. Thanks.
(In reply to bigswitch from comment #0)
> Description of problem:
> We have this package dependency to support Big Switch's DPDK based
> NFVswitch. Please package it in next RHEL release
Can you be more specific as to exactly which QEMU capabilities the solution requires, ideally with links to commits? Simply requesting a given version number does not guarantee that functionality will be enabled in our build of it if you don't specify what is actually needed.
Hi,
This is the requirement:
We need vhost-user multiqueue and live migration.
b931bfbf0429 ("vhost-user: add multiple queue support")
f6f56291de87 ("vhost user: add support of live migration")
Plus the bugfixes and other patches required for these features.
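The preference stated earlier in this thread, that software should not rely on QEMU version numbers to discover features, can be illustrated with a minimal sketch. The capability name and the version threshold below are hypothetical placeholders, not real QEMU or libvirt identifiers, and how a capability set is actually obtained is outside the scope of this sketch.

```python
# Hypothetical illustration: version comparison vs. capability probing.
# The threshold and the capability name are placeholders, not real
# QEMU or libvirt identifiers.


def version_tuple(version):
    """Turn a dotted version string such as '2.3.0' into (2, 3, 0)."""
    return tuple(int(part) for part in version.split("."))


def supported_by_version(qemu_version, minimum):
    """Fragile check: assumes a feature exists only from `minimum` onward."""
    return version_tuple(qemu_version) >= version_tuple(minimum)


def supported_by_capability(capabilities, feature):
    """Preferred check: ask whether the build advertises the feature."""
    return feature in capabilities


# A downstream build can carry backported features under an older version.
build_version = "2.3.0"
build_capabilities = {"vhost-user-multiqueue"}  # hypothetical name

print(supported_by_version(build_version, "2.5.0"))  # placeholder threshold
print(supported_by_capability(build_capabilities, "vhost-user-multiqueue"))
```

With a backported feature, the version comparison reports the feature as missing while the capability check reports it as present, which is the failure mode the request to avoid version checks is meant to prevent.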
Multi-queue is needed to get better networking performance when deploying with BSN's DPDK based virtual switch (NFVSwitch) with NFV workloads.
For BigSwitch: The current state of the multi-queue support is that it is included in 7.2.z. It was backported on top of the version included in 7.2 (so the fact that the version is lower than upstream is OK). It is considered Tech Preview in 7.2. It will move to full support in RHEL 7.3. BigSwitch is going to test this directly and verify.
Update on multi-queue support:
We are testing the following workflows:
1. Attach a vhostuser interface to the VM when bringing up the VM
This works!
[Output from virsh dumpxml instanceXYZ]
<interface type='vhostuser'>
  <mac address='fa:16:3e:f6:7c:3f'/>
  <source type='unix' path='/run/vhost/vhost2' mode='server'/>
  <model type='virtio'/>
  <driver name='vhost' queues='4'/>
  <alias name='net0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
2. Bring up a VM, and then attach the vhostuser interface to it
This doesn't work.
The following error is observed in the log: /var/log/nova/nova-compute.log
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] Traceback (most recent call last):
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1504, in attach_interface
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] guest.attach_device(cfg, persistent=True, live=live)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 250, in attach_device
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] self._domain.attachDeviceFlags(conf.to_xml(), flags=flags)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] rv = execute(f, *args, **kwargs)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] six.reraise(c, e, tb)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] rv = meth(*args, **kwargs)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 554, in attachDeviceFlags
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] libvirtError: unsupported configuration: Multiqueue network is not supported for: vhostuser
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]
2016-10-18 18:35:53.001 20623 WARNING nova.compute.manager [req-a35c4a76-2299-47ad-af98-0e3f5448638a e8461562c57a48a3a89ef5326c6d70f9 b745cf763efa49cd9fa5b523bf097fa1 - - -] [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] attach interface failed , try to deallocate port cee4095b-3f91-4118-935a-6b6c930949be, reason: Failed to attach network adapter device to 330532f9-4e36-401f-8f3a-4caaa8ebd4d0
Line of relevance:
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] libvirtError: unsupported configuration: Multiqueue network is not supported for: vhostuser
So it appears that flow 2 (hot-plugging the vhostuser interface into a running VM) doesn't work, and unfortunately both of these flows are used by customers.
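The two checks used in this report (reading the queue count from the interface definition, and picking the libvirtError out of the nova-compute traceback) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names are hypothetical, and treating a missing `queues` attribute as a single queue is an assumption of this sketch.

```python
import re
import xml.etree.ElementTree as ET

# Matches the message part of a log line that reports a libvirtError.
# The "raise libvirtError (...)" source line in the traceback does not match.
LIBVIRT_ERROR = re.compile(r"libvirtError: (?P<message>.+)$")


def vhostuser_queue_count(interface_xml):
    """Return the queue count of a vhostuser <interface>, or None otherwise."""
    interface = ET.fromstring(interface_xml)
    if interface.tag != "interface" or interface.get("type") != "vhostuser":
        return None
    driver = interface.find("driver")
    if driver is None:
        return 1  # assumption: no <driver> element means a single queue
    return int(driver.get("queues", "1"))


def libvirt_errors(log_lines):
    """Collect distinct libvirtError messages in order of first appearance."""
    seen = []
    for line in log_lines:
        match = LIBVIRT_ERROR.search(line.rstrip())
        if match and match.group("message") not in seen:
            seen.append(match.group("message"))
    return seen


sample_xml = """<interface type='vhostuser'>
  <source type='unix' path='/run/vhost/vhost2' mode='server'/>
  <model type='virtio'/>
  <driver name='vhost' queues='4'/>
</interface>"""

prefix = "2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver "
sample_log = [
    prefix + "if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)",
    prefix + "libvirtError: unsupported configuration: Multiqueue network is not supported for: vhostuser",
    prefix + "libvirtError: unsupported configuration: Multiqueue network is not supported for: vhostuser",
]

print(vhostuser_queue_count(sample_xml))
print(libvirt_errors(sample_log))
```

A queue count greater than one combined with the "Multiqueue network is not supported" message is the combination reported for the hot-plug flow above.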
Please provide package versions for the host compute node - libvirt, qemu-kvm-rhev, kernel... Is this rhel-7.2 still? What is the rhosp version? Thanks.
This is with RHOSP9
[root@overcloud-compute-0 heat-admin]# libvirtd --version
libvirtd (libvirt) 1.2.17
[root@overcloud-compute-0 heat-admin]# uname -a
Linux overcloud-compute-0.localdomain 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@overcloud-compute-0 heat-admin]# rpm -qa | grep qemu-
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
ipxe-roms-qemu-20160127-1.git6366fa7a.el7.noarch
qemu-img-rhev-2.3.0-31.el7_2.21.x86_64
qemu-kvm-common-rhev-2.3.0-31.el7_2.21.x86_64
[root@overcloud-compute-0 heat-admin]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)
(In reply to bigswitch from comment #19)
> This is with RHOSP9
>
> [root@overcloud-compute-0 heat-admin]# libvirtd --version
> libvirtd (libvirt) 1.2.17
>
> [root@overcloud-compute-0 heat-admin]# uname -a
> Linux overcloud-compute-0.localdomain 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri
> Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
>
> [root@overcloud-compute-0 heat-admin]# rpm -qa | grep qemu-
> libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64
> qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
> ipxe-roms-qemu-20160127-1.git6366fa7a.el7.noarch
> qemu-img-rhev-2.3.0-31.el7_2.21.x86_64
> qemu-kvm-common-rhev-2.3.0-31.el7_2.21.x86_64
>
> [root@overcloud-compute-0 heat-admin]# cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 7.2 (Maipo)
Versions look good. Thanks!
Studying comment #17 again, this looks like a bug in libvirt vhost-user hot-plug support. We can either move this BZ to libvirt or you can enter a new BZ. I prefer the latter. This BZ is about basic vhost-user MQ and live migration support. (This BZ is already overloaded with 2 features.) Did you also verify live migration support? If so, I think this BZ is verified.
Can you please enter a new BZ for libvirt to fix vhost-user MQ hot-plug support? Thanks.
> Studying comment #17 again, this looks like a bug in libvirt vhost-user
> hot-plug support. We can either move this BZ to libvirt or you can enter a
> new BZ. I prefer the latter.
Thanks, sounds good. opened https://bugzilla.redhat.com/show_bug.cgi?id=1386976 to track this separately.
> This BZ is about basic vhost-user MQ and live migration support. (This BZ is
> already overloaded with 2 features.) Did you also verify live migration
> support? If so, I think this BZ is verified.
We are awaiting a beefy hardware delivery to test out migration. Due to https://bugzilla.redhat.com/show_bug.cgi?id=1381704 we can't test it on differing computes. We shall update this BZ once we test live migration.
We tested migration and ran into the issue mentioned here: https://access.redhat.com/solutions/2191071
Looks like the issue is known and there is no actual resolution: "Resolution: Currently due to OPEN bugs in nova, we recommend not to (live or cold) migrate instances that are using numa+cpu-pinning."
Is it OK to close this specific bz and monitor the remaining issues using the relevant libvirt and Nova bzs?
Regarding live VM migration feature of a VM using virtio backed by vhost-user/OVS-DPDK : In addition to addressing pending issues in qemu/libvirt and/or OpenStack to allow for live VM migration to work, customers expect that live VM migration continues to work fine for the cases where the VM being migrated is hosting an actual DPDK enabled application in the presence of traffic through that DPDK application. Please include the KVM migration performance improvements (e.g. reduced downtime etc) that were pursued as part of the OPNFV KVM subgroup effort.
Amnon, it is ok to close this ticket and track the remaining issues with more specific BZs
(In reply to bigswitch from comment #31)
> Amnon, it is ok to close this ticket and track the remaining issues with
> more specific BZs
Thanks, closing this bug. |