Bug 1386976

Summary: attaching a new multi-queue vhost-user interface to a running VM fails
Product: Red Hat Enterprise Linux 7
Reporter: bigswitch <rhosp-bugs-internal>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: yalzhang <yalzhang>
Severity: high
Priority: high
Version: 7.0
CC: dyuan, jdenemar, juzhang, knoel, mburns, pezhang, rbalakri, rhosp-bugs-internal, srevivo, xuzhang, yalzhang
Target Milestone: pre-dev-freeze
Keywords: ZStream
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-2.5.0-1.el7
Last Closed: 2017-08-01 17:19:14 UTC
Type: Bug
Bug Blocks: 1404186

Description bigswitch 2016-10-19 23:55:42 UTC
[Creating a new BZ based on https://bugzilla.redhat.com/show_bug.cgi?id=1360519#c24 ]

Update on multi-queue support:
We are testing the following two workflows:

1. Attach a vhostuser interface to the VM when bringing up the VM
This works!
[Output from virsh dumpxml instanceXYZ]

    <interface type='vhostuser'>
      <mac address='fa:16:3e:f6:7c:3f'/>
      <source type='unix' path='/run/vhost/vhost2' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' queues='4'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>


2. Bring up a VM, and then attach the vhostuser interface to it
This doesn't work.

The following error is observed in the log: /var/log/nova/nova-compute.log
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] Traceback (most recent call last):
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1504, in attach_interface
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]     guest.attach_device(cfg, persistent=True, live=live)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 250, in attach_device
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]     self._domain.attachDeviceFlags(conf.to_xml(), flags=flags)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]     rv = execute(f, *args, **kwargs)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]     six.reraise(c, e, tb)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]     rv = meth(*args, **kwargs)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 554, in attachDeviceFlags
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]     if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] libvirtError: unsupported configuration: Multiqueue network is not supported for: vhostuser
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0]
2016-10-18 18:35:53.001 20623 WARNING nova.compute.manager [req-a35c4a76-2299-47ad-af98-0e3f5448638a e8461562c57a48a3a89ef5326c6d70f9 b745cf763efa49cd9fa5b523bf097fa1 - - -] [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] attach interface failed , try to deallocate port cee4095b-3f91-4118-935a-6b6c930949be, reason: Failed to attach network adapter device to 330532f9-4e36-401f-8f3a-4caaa8ebd4d0

Line of relevance:
2016-10-18 18:35:53.000 20623 ERROR nova.virt.libvirt.driver [instance: 330532f9-4e36-401f-8f3a-4caaa8ebd4d0] libvirtError: unsupported configuration: Multiqueue network is not supported for: vhostuser


So it appears that flow 2 (hot-plugging the interface into a running VM) doesn't work, and unfortunately our customers use both flows.
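
The failing call in the traceback is `attachDeviceFlags()` from the libvirt Python bindings, so flow 2 can in principle be exercised without nova. The following is a minimal sketch under stated assumptions, not nova's code: the helper names `build_vhostuser_xml` and `hotplug_vhostuser`, the `qemu:///system` URI and the choice of flags are illustrative, while the XML mirrors the `virsh dumpxml` output above (without the `alias` and `address` elements, which libvirt normally assigns itself).

```python
import xml.etree.ElementTree as ET


def build_vhostuser_xml(mac, socket_path, queues):
    """Build an <interface type='vhostuser'> element like the dumpxml output above."""
    iface = ET.Element("interface", {"type": "vhostuser"})
    ET.SubElement(iface, "mac", {"address": mac})
    ET.SubElement(iface, "source",
                  {"type": "unix", "path": socket_path, "mode": "server"})
    ET.SubElement(iface, "model", {"type": "virtio"})
    # queues > 1 is what triggered "Multiqueue network is not supported" on hot-plug
    ET.SubElement(iface, "driver", {"name": "vhost", "queues": str(queues)})
    return ET.tostring(iface, encoding="unicode")


def hotplug_vhostuser(domain_name, xml, uri="qemu:///system"):
    """Attach the interface to a running domain (flow 2). Hypothetical helper."""
    import libvirt  # imported lazily so the XML helper works without libvirt-python

    conn = libvirt.open(uri)
    try:
        dom = conn.lookupByName(domain_name)
        # Apply to the live domain and to its persistent configuration.
        flags = libvirt.VIR_DOMAIN_AFFECT_LIVE | libvirt.VIR_DOMAIN_AFFECT_CONFIG
        dom.attachDeviceFlags(xml, flags)
    finally:
        conn.close()
```

With a running domain, something like `hotplug_vhostuser("instanceXYZ", build_vhostuser_xml("fa:16:3e:f6:7c:3f", "/run/vhost/vhost2", 4))` would be expected to raise `libvirtError` on an affected libvirt build.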

(In reply to bigswitch from comment #19)
> This is with RHOSP9
> 
> [root@overcloud-compute-0 heat-admin]# libvirtd --version
> libvirtd (libvirt) 1.2.17
> 
> [root@overcloud-compute-0 heat-admin]# uname -a
> Linux overcloud-compute-0.localdomain 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
> 
> [root@overcloud-compute-0 heat-admin]# rpm -qa | grep qemu-
> libvirt-daemon-driver-qemu-1.2.17-13.el7_2.5.x86_64
> qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
> ipxe-roms-qemu-20160127-1.git6366fa7a.el7.noarch
> qemu-img-rhev-2.3.0-31.el7_2.21.x86_64
> qemu-kvm-common-rhev-2.3.0-31.el7_2.21.x86_64
> 
> [root@overcloud-compute-0 heat-admin]# cat /etc/redhat-release 
> Red Hat Enterprise Linux Server release 7.2 (Maipo)

Versions look good. Thanks!

Studying comment #17 again, this looks like a bug in libvirt vhost-user hot-plug support. We can either move this BZ to libvirt or you can enter a new BZ. I prefer the latter.

Can you please enter a new BZ for libvirt to fix vhost-user MQ hot-plug support? Thanks.

Comment 1 Jaroslav Suchanek 2016-10-25 08:56:44 UTC
Can it be related to bug 1366108? Miso, can you provide a scratch build for
testing? Thanks.

Comment 2 Michal Privoznik 2016-10-25 10:44:17 UTC
Sure. Here are the patches that I intend to propose upstream:

https://github.com/zippy2/libvirt/commits/vhost_mq

Here's a scratch build of the current git HEAD with them applied to test:

https://mprivozn.fedorapeople.org/vhostmq/

bigswitch, can you please check whether that fixes your issue?

Comment 5 Michal Privoznik 2016-11-04 12:29:44 UTC
Patches proposed on the upstream list:

https://www.redhat.com/archives/libvir-list/2016-November/msg00252.html

Comment 6 Michal Privoznik 2016-11-10 16:35:07 UTC
I've just pushed the patches upstream:

commit 21db4ab0528ca6c78744148ca4b7515aaeb4d0bf
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Oct 25 12:18:23 2016 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Nov 10 16:47:32 2016 +0100

    qemuDomainAttachNetDevice: Enable multiqueue for vhost-user
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1386976
    
    We have everything ready. Actually the only limitation was our
    check that denied hotplug of vhost-user.
    
    Signed-off-by: Michal Privoznik <mprivozn>

commit 0e82fa4c345acb7ad52e0da0e54f7375eda57657
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Oct 25 12:16:36 2016 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Nov 10 16:47:32 2016 +0100

    qemuDomainAttachNetDevice: Don't overwrite error on rollback
    
    If there is an error hotpluging a net device (for whatever
    reason) a rollback operation is performed. However, whilst doing
    so various helper functions that are called report errors on
    their own. This results in the original error to be overwritten
    and thus misleading the user.
    
    Signed-off-by: Michal Privoznik <mprivozn>

v2.4.0-56-g0f62843
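
The second commit message describes a general pattern: when a rollback runs after a failure, errors raised by the cleanup helpers should not replace the error that caused the rollback. The sketch below illustrates that pattern in Python only; it is not libvirt's C implementation, and `attach_with_rollback` and its arguments are hypothetical names.

```python
def attach_with_rollback(attach, rollback_steps):
    """Run attach(); on failure run every rollback step, then re-raise the original error."""
    try:
        attach()
    except Exception:
        for step in rollback_steps:
            try:
                step()
            except Exception:
                # A failing cleanup helper must not hide the error the user needs to see.
                pass
        # Bare raise re-raises the exception from attach(), not one from a rollback step.
        raise
```

The caller therefore sees the original failure (for example the unsupported-configuration message) even if a cleanup step also fails.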

Comment 9 yalzhang@redhat.com 2017-02-27 12:05:36 UTC
Setting this bug to verified; see bug 1366505#c6.

Comment 10 bigswitch 2017-04-12 17:47:16 UTC
[stack@rhosp10 ~]$ nova list
+--------------------------------------+---------+--------+------------+-------------+------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks         |
+--------------------------------------+---------+--------+------------+-------------+------------------+
| 36464502-cc38-471a-89ea-63368e40ca0d | nfv-vm1 | ACTIVE | -          | Running     | nfv1=10.10.10.5  |
| 7a042537-77f1-4cbd-b461-0785bc60c9f8 | vm2     | ACTIVE | -          | Running     | nfv1=10.10.10.8  |
| 9b52d181-64b7-42eb-8a79-65d978254e57 | vm3     | ACTIVE | -          | Running     | nfv3=10.10.12.10 |
| 611454a8-e10e-4097-ba0f-210fedb9a323 | vm4     | ACTIVE | -          | Running     | nfv1=10.10.10.6  |
+--------------------------------------+---------+--------+------------+-------------+------------------+
[stack@rhosp10 ~]$ nova interface-attach --port-id 542ad3dd-974d-4754-bca4-8a2d05b0fc4c vm3
ERROR (ClientException): Failed to attach network adapter device to 9b52d181-64b7-42eb-8a79-65d978254e57 (HTTP 500) (Request-ID: req-c18da442-7099-487b-85aa-cbbe10303654)

rpm -qa | grep qemu-
qemu-img-rhev-2.6.0-27.el7.x86_64
qemu-kvm-rhev-2.6.0-27.el7.x86_64
ipxe-roms-qemu-20160127-5.git6366fa7a.el7.noarch
qemu-kvm-common-rhev-2.6.0-27.el7.x86_64
libvirt-daemon-driver-qemu-2.0.0-10.el7_3.4.x86_64

Comment 11 Michal Privoznik 2017-04-13 05:09:36 UTC
(In reply to bigswitch from comment #10)
> [stack@rhosp10 ~]$ nova list
> +--------------------------------------+---------+--------+------------+-------------+------------------+
> | ID                                   | Name    | Status | Task State | Power State | Networks         |
> +--------------------------------------+---------+--------+------------+-------------+------------------+
> | 36464502-cc38-471a-89ea-63368e40ca0d | nfv-vm1 | ACTIVE | -          | Running     | nfv1=10.10.10.5  |
> | 7a042537-77f1-4cbd-b461-0785bc60c9f8 | vm2     | ACTIVE | -          | Running     | nfv1=10.10.10.8  |
> | 9b52d181-64b7-42eb-8a79-65d978254e57 | vm3     | ACTIVE | -          | Running     | nfv3=10.10.12.10 |
> | 611454a8-e10e-4097-ba0f-210fedb9a323 | vm4     | ACTIVE | -          | Running     | nfv1=10.10.10.6  |
> +--------------------------------------+---------+--------+------------+-------------+------------------+
> [stack@rhosp10 ~]$ nova interface-attach --port-id 542ad3dd-974d-4754-bca4-8a2d05b0fc4c vm3
> ERROR (ClientException): Failed to attach network adapter device to 9b52d181-64b7-42eb-8a79-65d978254e57 (HTTP 500) (Request-ID: req-c18da442-7099-487b-85aa-cbbe10303654)
> 
> rpm -qa | grep qemu-
> qemu-img-rhev-2.6.0-27.el7.x86_64
> qemu-kvm-rhev-2.6.0-27.el7.x86_64
> ipxe-roms-qemu-20160127-5.git6366fa7a.el7.noarch
> qemu-kvm-common-rhev-2.6.0-27.el7.x86_64

> libvirt-daemon-driver-qemu-2.0.0-10.el7_3.4.x86_64

The 'Fixed in' field of this bug states the bug was fixed in 2.5.0. Therefore it is no surprise that it doesn't work with 2.0.0. Upgrade libvirt and it will work.
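
The comparison made in this comment can be scripted. The following is a rough sketch, assuming package strings in the form printed by `rpm -qa`; the function names and the regular expression are illustrative. It compares only the upstream version number against the "Fixed In Version" value, so a downstream build that carries a backported fix under a lower version number would be reported as too old.

```python
import re

FIXED_IN = (2, 5, 0)  # from the "Fixed In Version" field: libvirt-2.5.0-1.el7


def upstream_version(nvr):
    """Extract the upstream x.y.z version from an RPM name-version-release string."""
    match = re.search(r"-(\d+)\.(\d+)\.(\d+)-", nvr)
    if match is None:
        raise ValueError("no x.y.z version found in %r" % nvr)
    return tuple(int(part) for part in match.groups())


def at_least_fixed_version(nvr, fixed_in=FIXED_IN):
    """Return True if the package's upstream version is >= the fixed version."""
    return upstream_version(nvr) >= fixed_in
```

For the package quoted above, `at_least_fixed_version("libvirt-daemon-driver-qemu-2.0.0-10.el7_3.4.x86_64")` evaluates to `False`, which matches the advice to upgrade.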

Comment 14 errata-xmlrpc 2017-08-01 17:19:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

Comment 15 errata-xmlrpc 2017-08-01 23:59:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846