Bug 1597940

Summary: vhost-user socket path is not recognized by libvirt
Product: Red Hat Enterprise Linux 7 Reporter: Pei Zhang <pezhang>
Component: libvirt Assignee: Daniel Berrangé <berrange>
Status: CLOSED ERRATA QA Contact: Luyao Huang <lhuang>
Severity: urgent Docs Contact:
Priority: high    
Version: 7.6 CC: berrange, chayang, ctrautma, dyuan, jdenemar, jiyan, juzhang, lmen, maxime.coquelin, michen, siliu, tli, xuzhang, yalzhang
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-4.5.0-2.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1598269 (view as bug list) Environment:
Last Closed: 2018-10-30 09:57:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
VM XML none

Description Pei Zhang 2018-07-04 01:00:44 UTC
Created attachment 1456347 [details]
VM XML

Description of problem:
Booting a VM with vhost-user interfaces fails. It seems libvirt fails to translate the vhost-user socket path into the QEMU command line.

Version-Release number of selected component (if applicable):
3.10.0-916.el7.x86_64
libvirt-4.5.0-1.el7.x86_64
qemu-kvm-rhev-2.12.0-6.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Boot a VM with vhost-user sockets; the full XML is attached.
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:22'/>
      <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' queues='2' rx_queue_size='512'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:23'/>
      <source type='unix' path='/tmp/vhostuser1.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' queues='2' rx_queue_size='512'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>

2. Boot the VM; it fails. The QEMU command line shows that "path=" is not passed through (the chardev is given as "fd=" instead). This option is important to vhost-user.

# virsh start rhel7.6_nonrt
error: Failed to start domain rhel7.6_nonrt
error: internal error: qemu unexpectedly closed the monitor: 2018-07-04T00:52:53.619540Z qemu-kvm: -chardev socket,id=charnet1,fd=30,server: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhostuser0.sock,server
2018-07-04T00:52:54.498544Z qemu-kvm: -chardev socket,id=charnet2,fd=31,server: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhostuser1.sock,server
2018-07-04T00:52:54.500301Z qemu-kvm: chardev "charnet1" does not support FD passing
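
To cross-check what QEMU should receive, the socket paths requested in the domain XML can be listed with a short script. This is only a sketch: it parses the two interfaces from step 1, wrapped in a `<devices>` element so the fragment is a single XML document. The names `DEVICES_XML` and `vhostuser_sockets` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The two interfaces from step 1, wrapped in <devices> (an assumption made
# here so that the fragment parses as one XML document).
DEVICES_XML = """
<devices>
  <interface type='vhostuser'>
    <mac address='18:66:da:5f:dd:22'/>
    <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
    <model type='virtio'/>
    <driver name='vhost' queues='2' rx_queue_size='512'/>
  </interface>
  <interface type='vhostuser'>
    <mac address='18:66:da:5f:dd:23'/>
    <source type='unix' path='/tmp/vhostuser1.sock' mode='server'/>
    <model type='virtio'/>
    <driver name='vhost' queues='2' rx_queue_size='512'/>
  </interface>
</devices>
"""


def vhostuser_sockets(xml_text):
    """Return (mac, path, mode) for every vhostuser interface in xml_text."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for iface in root.iter("interface"):
        if iface.get("type") != "vhostuser":
            continue
        source = iface.find("source")
        if source is None or source.get("type") != "unix":
            continue
        mac = iface.find("mac")
        result.append((
            mac.get("address") if mac is not None else None,
            source.get("path"),
            source.get("mode"),
        ))
    return result


if __name__ == "__main__":
    for mac, path, mode in vhostuser_sockets(DEVICES_XML):
        print(mac, path, mode)
```

Each path reported here is one that would be expected to appear as "path=" on the corresponding -chardev option.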


Actual results:
The VM with vhost-user fails to boot.


Expected results:
The VM with vhost-user should boot.


Additional info:
1. This is not a QEMU bug, as the QEMU command line works as expected.

2. This should be a libvirt regression, as libvirt-4.4.0-2.el7.x86_64 works well.

Comment 4 Daniel Berrangé 2018-07-04 11:24:39 UTC
Libvirt switched to using FD passing for all UNIX sockets.

Unfortunately it appears QEMU has a bug which causes this to break vhostuser. I've sent a fix for the QEMU bug:

  https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg01147.html

but we'll need to work around it in libvirt too.

Comment 5 Pei Zhang 2018-07-05 01:53:06 UTC
(In reply to Daniel Berrange from comment #4)
> Libvirt switched to using FD passing for all UNIX sockets.
> 
> Unfortunately it appears QEMU has a bug which causes this to break
> vhostuser. I've sent a fix for the QEMU bug:
> 
>   https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg01147.html
> 
> but we'll need to workaround it in libvirt too

Thanks Daniel. 

From the QE perspective, a BZ in the qemu-kvm-rhev component is needed, so we filed bug [1] to track this issue:

[1] Bug 1598269 - vhost-user socket path is not recognized by libvirt - [QEMU side]



Best Regards,
Pei

Comment 6 Daniel Berrangé 2018-07-05 11:37:02 UTC
The libvirt fix is proposed here:

https://www.redhat.com/archives/libvir-list/2018-July/msg00316.html

this is *not* dependent on any QEMU change.
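
Judging by the command lines recorded later in this report, the visible effect is that vhost-user chardevs are given as "path=" while other UNIX socket chardevs are still given as "fd=". A minimal sketch of that rule follows. It is inferred from the observed output, not taken from libvirt's source, and the function name and parameters are hypothetical.

```python
def socket_chardev_arg(chardev_id, path, *, server=False, nowait=False,
                       vhostuser=False, fd=None):
    """Build the value of a QEMU "-chardev" option for a UNIX socket.

    Assumed rule (inferred from this report, not from libvirt's code):
    vhost-user chardevs are always given as path=...; other chardevs use
    fd=... when a pre-opened descriptor number is supplied.
    """
    parts = ["socket", "id=" + chardev_id]
    if vhostuser or fd is None:
        parts.append("path=" + path)
    else:
        parts.append("fd=%d" % fd)
    if server:
        parts.append("server")
    if nowait:
        parts.append("nowait")
    return ",".join(parts)


if __name__ == "__main__":
    # vhost-user interface in server mode: the path is kept even if an fd exists.
    print(socket_chardev_arg("charnet1", "/var/lib/libvirt/qemu/vhost0.sock",
                             server=True, vhostuser=True, fd=30))
    # Guest agent channel: the descriptor is passed instead of the path.
    print(socket_chardev_arg("charchannel1", "/var/lib/libvirt/qemu/r6.agent",
                             server=True, nowait=True, fd=29))
```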

Comment 9 Pei Zhang 2018-07-10 02:04:39 UTC
Update:

Versions:
libvirt-4.5.0-2.el7.x86_64


Results:
This bug can no longer be reproduced with the above version. A guest with vhost-user boots and works as expected.

Thanks Daniel for your efforts.

Comment 10 Luyao Huang 2018-08-21 03:11:00 UTC
Verified this bug with libvirt-4.5.0-6.el7.x86_64:

1. Install openvswitch + dpdk and set up the environment.

2. Prepare the vhostuser server and client ports:

# ovs-vsctl show
94687455-bb2b-48a1-8842-a68c6c62412b
    Bridge "ovsbr0"
        Port "vhost-user0"
            Interface "vhost-user0"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/libvirt/qemu/vhost0.sock"}
        Port "vhost-user2"
            Interface "vhost-user2"
                type: dpdkvhostuser
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "vhost-user1"
            Interface "vhost-user1"
                type: dpdkvhostuser

3. Add vhost-user interfaces to the guest XML:

    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:22'/>
      <source type='unix' path='/var/lib/libvirt/qemu/vhost0.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='512'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:89:c3:cf'/>
      <source type='unix' path='/run/openvswitch/vhost-user1' mode='client'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='512'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>

4. Start the guest:
# virsh start vm1
Domain vm1 started

5. Check the QEMU command line and make sure libvirt uses path= for the vhost-user device chardevs:
# ps aux|grep qemu
...-chardev socket,id=charnet1,path=/var/lib/libvirt/qemu/vhost0.sock,server -netdev vhost-user,chardev=charnet1,id=hostnet1 -device virtio-net-pci,rx_queue_size=512,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:22,bus=pci.0,addr=0x8 -chardev socket,id=charnet2,path=/run/openvswitch/vhost-user1 -netdev vhost-user,chardev=charnet2,id=hostnet2 -device virtio-net-pci,rx_queue_size=512,netdev=hostnet2,id=net2,mac=52:54:00:89:c3:cf,bus=pci.0,addr=0x9

6. Start the guest with a UNIX channel (such as the guest agent):

    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/r6.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>

7. Check that libvirt uses fd= instead of path= for this chardev:

# ps aux|grep qemu
-chardev socket,id=charchannel1,fd=29,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0,port=2
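
The checks in steps 5 and 7 can also be scripted. The sketch below takes a QEMU command line as text, finds every -netdev vhost-user option, and reports whether the chardev it refers to was given as path= or fd=. The helper names are hypothetical, and the parsing is deliberately simple: it assumes options are separated by whitespace and does not handle QEMU's ",," escaping inside values.

```python
import shlex


def parse_opts(value):
    """Split "socket,id=x,path=/p,server" into a dict; bare flags map to True."""
    opts = {}
    for item in value.split(","):
        key, sep, val = item.partition("=")
        opts[key] = val if sep else True
    return opts


def vhostuser_chardev_report(cmdline):
    """Map each chardev id used by a vhost-user netdev to "path", "fd" or "unknown"."""
    argv = shlex.split(cmdline)
    chardevs = {}
    vhostuser_ids = []
    # Walk the arguments pairwise: an option name followed by its value.
    for flag, value in zip(argv, argv[1:]):
        if flag == "-chardev":
            opts = parse_opts(value)
            chardevs[opts.get("id")] = opts
        elif flag == "-netdev" and value.split(",")[0] == "vhost-user":
            vhostuser_ids.append(parse_opts(value).get("chardev"))
    report = {}
    for cid in vhostuser_ids:
        opts = chardevs.get(cid, {})
        if "path" in opts:
            report[cid] = "path"
        elif "fd" in opts:
            report[cid] = "fd"
        else:
            report[cid] = "unknown"
    return report


if __name__ == "__main__":
    # Fragment of the command line from step 5.
    fixed = ("-chardev socket,id=charnet1,path=/var/lib/libvirt/qemu/vhost0.sock,server "
             "-netdev vhost-user,chardev=charnet1,id=hostnet1 "
             "-chardev socket,id=charnet2,path=/run/openvswitch/vhost-user1 "
             "-netdev vhost-user,chardev=charnet2,id=hostnet2")
    print(vhostuser_chardev_report(fixed))
```

Any vhost-user chardev reported as "fd" would indicate the behaviour originally described in this bug. Chardevs that are not referenced by a vhost-user netdev, such as the agent channel in step 7, are ignored by this check.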

Comment 12 errata-xmlrpc 2018-10-30 09:57:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113