Bug 1544948

Summary: Virsh start <guest in server mode> hang even when Selinux=Permissive
Product: Red Hat Enterprise Linux 7 Reporter: Jean-Tsung Hsiao <jhsiao>
Component: openvswitch Assignee: Aaron Conole <aconole>
Status: CLOSED NOTABUG QA Contact: ovs-qe
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.5 CC: aconole, ailan, atragler, chayang, ctrautma, jdenemar, jhsiao, juzhang, knoel, kzhang, maxime.coquelin, michen, pezhang, ralongi, rbalakri, rcain, virt-maint, yanghliu
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-02-22 15:56:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description          Flags
script to boot OVS   none

Description Jean-Tsung Hsiao 2018-02-13 20:06:11 UTC
Description of problem: Virsh start <guest in server mode> hang even when Selinux=Permissive

[root@netqe17 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     vhost-server                   paused

[root@netqe17 ~]# ll /tmp/vhost*
srwxrwxr-x. 1 qemu hugetlbfs 0 Feb 13 14:55 /tmp/vhost0


Version-Release number of selected component (if applicable):

[root@netqe17 ~]# rpm -qa | grep qemu
qemu-img-rhev-2.10.0-20.el7.x86_64
libvirt-daemon-driver-qemu-3.9.0-10.el7.x86_64
qemu-kvm-rhev-2.10.0-20.el7.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
qemu-kvm-common-rhev-2.10.0-20.el7.x86_64

[root@netqe17 ~]# uname -a
Linux netqe17.knqe.lab.eng.bos.redhat.com 3.10.0-843.el7.x86_64 #1 SMP Wed Jan 31 12:09:22 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

[root@netqe17 ~]# rpm -q openvswitch
openvswitch-2.9.0-0.3.20180124git26cdc33.el7fdp.x86_64

How reproducible: Reproducible


Steps to Reproduce:
1. Configure OVS-DPDK with vhost0 and vhost1 of type=dpdkvhostuserclient
2. Start the guest in server mode
3. virsh start does NOT return; virsh list --all shows the guest in the paused state.
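
Step 1 can be sketched roughly as below. This is only an illustration, not the reporter's actual script: the bridge name ovsbr0 and the socket paths are borrowed from the `ovs-vsctl show` output in comment 4 (host netqe12) and are assumptions for netqe17, and the helper name add_vhost_client_port and the OVS_VSCTL override are hypothetical.

```shell
#!/bin/sh
# Sketch: add a dpdkvhostuserclient port to an existing OVS-DPDK bridge.
# OVS_VSCTL is a hypothetical override so the command can be previewed
# (for example OVS_VSCTL="echo ovs-vsctl") instead of executed.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

add_vhost_client_port() {
    # $1 = bridge name, $2 = port name, $3 = vhost-user socket path
    # With type=dpdkvhostuserclient, OVS acts as the vhost-user client,
    # so the guest side (QEMU) has to create the socket (mode='server').
    $OVS_VSCTL add-port "$1" "$2" -- \
        set Interface "$2" type=dpdkvhostuserclient \
        "options:vhost-server-path=$3"
}
```

Typical use would be `add_vhost_client_port ovsbr0 vhost0 /tmp/vhost0`, repeated for vhost1 with /tmp/vhost1.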


Actual results:


Expected results:


Additional info:

Comment 2 Jiri Denemark 2018-02-14 12:46:12 UTC
Could you please attach the domain XML?

Comment 3 Jiri Denemark 2018-02-14 15:43:04 UTC
Oh, and libvirtd's debug log and the qemu log too. See https://wiki.libvirt.org/page/DebugLogs for instructions.

Comment 4 Rick Alongi 2018-02-14 17:00:53 UTC
Below is some additional information on hitting this problem.  The existence (or lack thereof) and permissions of the vhost sockets appear to be an issue:

[root@netqe12 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     1                              paused

[root@netqe12 ~]# getenforce
Permissive

[root@netqe12 ~]# ls -ald /tmp
drwxrwxrwx. 9 root root 4096 Feb 14 11:45 /tmp

Note that only /tmp/vhost0 exists, and its permissions are not 777:
[root@netqe12 ~]# ls -alth /tmp/vhost*
srwxrwxr-x. 1 qemu qemu 0 Feb 14 11:49 /tmp/vhost0

After changing permissions to 777 on /tmp/vhost0, /tmp/vhost1 gets created, but without 777 permissions.  The VM is still stuck in the "Paused" state:
[root@netqe12 ~]# chmod 777 /tmp/vhost*

[root@netqe12 ~]# ls -alth /tmp/vhost*
srwxrwxr-x. 1 qemu qemu 0 Feb 14 11:50 /tmp/vhost1
srwxrwxrwx. 1 qemu qemu 0 Feb 14 11:49 /tmp/vhost0

[root@netqe12 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     1                              paused

Change permissions on /tmp/vhost1 to 777; VM starts successfully:
[root@netqe12 ~]# chmod 777 /tmp/vhost1

[root@netqe12 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     1                              running
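
The manual ls/chmod checks above could be condensed into a small helper along these lines. It is a sketch only: the function name report_socket_modes is hypothetical, it relies on GNU `stat -c`, and it assumes the process connecting to the socket (here OVS) runs as a user outside the socket's owner and group, so that the "other" write bit is the one that matters. Connecting to a UNIX domain socket requires write permission on the socket file.

```shell
#!/bin/sh
# Sketch: print mode and ownership for each given path and flag entries
# that users outside the owner/group cannot write to (and so cannot connect to).
report_socket_modes() {
    local f mode owner other
    for f in "$@"; do
        [ -e "$f" ] || { echo "missing $f"; continue; }
        mode="$(stat -c '%a' "$f")"       # octal mode, e.g. 775
        owner="$(stat -c '%U:%G' "$f")"   # e.g. qemu:qemu
        other=$(( mode % 10 ))            # last octal digit = "other" bits
        if [ $(( other & 2 )) -ne 0 ]; then
            echo "ok $mode $owner $f"
        else
            echo "no-other-write $mode $owner $f"
        fi
    done
}
```

For example, `report_socket_modes /tmp/vhost0 /tmp/vhost1` would list both sockets in one pass.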

[root@netqe12 ~]# rpm -q openvswitch
openvswitch-2.9.0-0.4.20180124git26cdc33.el7fdp.x86_64

[root@netqe12 ~]# uname -r
3.10.0-845.el7.x86_64

[root@netqe12 ~]# rpm -q libvirt
libvirt-3.9.0-12.el7.x86_64

[root@netqe12 ~]# rpm -qa | grep qemu
libvirt-daemon-driver-qemu-3.9.0-12.el7.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
qemu-kvm-common-rhev-2.9.0-14.el7.x86_64
qemu-kvm-rhev-2.9.0-14.el7.x86_64
qemu-img-rhev-2.9.0-14.el7.x86_64

[root@netqe12 ~]# ovs-vsctl show
f6475677-13ff-4b85-b9bc-872626c3e052
    Bridge "ovsbr0"
        Port "dpdk-1005"
            Interface "dpdk-1005"
                type: dpdk
                options: {dpdk-devargs="0000:81:00.0"}
        Port "dpdk-1010"
            Interface "dpdk-1010"
                type: dpdk
                options: {dpdk-devargs="0000:81:00.1"}
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "vhost1"
            Interface "vhost1"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhost1"}
        Port "vhost0"
            Interface "vhost0"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhost0"}
    ovs_version: "2.9.0"

[root@netqe12 ~]# virsh dumpxml 1
<domain type='kvm' id='2'>
  <name>1</name>
  <uuid>5851b5e2-7e90-40fd-96fb-66fe66fe2326</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB' nodeset='0'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>3</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-2' memory='4194304' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/1.qcow2'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:db:4c:25'/>
      <source type='unix' path='tmp/vhost0' mode='server'/>
      <target dev='vhost0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:0d:a8:b6'/>
      <source type='unix' path='tmp/vhost1' mode='server'/>
      <target dev='vhost1'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-2-1/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c381,c465</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c381,c465</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
</domain>

Comment 6 Pei Zhang 2018-02-15 00:12:29 UTC
(In reply to Rick Alongi from comment #4)
> [...]
>     <interface type='vhostuser'>
>       <mac address='52:54:00:db:4c:25'/>
>       <source type='unix' path='tmp/vhost0' mode='server'/>
>       <target dev='vhost0'/>
>       <model type='virtio'/>
>       <driver name='vhost'/>
>       <alias name='net0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x09'
> function='0x0'/>
>     </interface>
>     <interface type='vhostuser'>
>       <mac address='52:54:00:0d:a8:b6'/>
>       <source type='unix' path='tmp/vhost1' mode='server'/>

Hi Rick, could you please try path='/tmp/vhost1' (an absolute path, with the leading slash) here? The quoted XML has path='tmp/vhost1'.

FYI: in my latest testing, everything works well. 

Thanks,
Pei

>       <target dev='vhost1'/>
>       <model type='virtio'/>
>       <driver name='vhost'/>
>       <alias name='net1'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x08'
> function='0x0'/>
>     </interface>
> [...]
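
The point raised in this comment (path='tmp/vhost1' versus path='/tmp/vhost1') can be checked mechanically. The following is a rough sketch with a hypothetical function name, check_vhostuser_paths; it assumes the attribute order that `virsh dumpxml` prints in this report (type before path), so an XML-aware tool would be more robust. A relative path is resolved against the working directory of the process that creates the socket, which may not be the directory intended.

```shell
#!/bin/sh
# Sketch: read a libvirt domain XML on stdin and classify each
# <source type='unix' path='...'> socket path as absolute or relative.
check_vhostuser_paths() {
    sed -n "s/.*<source type='unix' path='\([^']*\)'.*/\1/p" |
    while IFS= read -r p; do
        case "$p" in
            /*) echo "absolute $p" ;;
            *)  echo "relative $p" ;;
        esac
    done
}
```

It could be run as `virsh dumpxml 1 | check_vhostuser_paths`.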

Comment 7 Jean-Tsung Hsiao 2018-02-15 00:46:43 UTC
[root@netqe17 ~]# virsh dumpxml vhost-server
<domain type='kvm' id='2'>
  <name>vhost-server</name>
  <uuid>f06f5eae-5bdc-4391-b392-b998cad21e2d</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='13'/>
    <vcpupin vcpu='3' cpuset='3'/>
    <vcpupin vcpu='4' cpuset='15'/>
    <emulatorpin cpuset='7,9'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <ioapic driver='qemu'/>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Haswell-noTSX</model>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='f16c'/>
    <feature policy='require' name='rdrand'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='abm'/>
    <numa>
      <cell id='0' cpus='0-4' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/home/images/vhost-server.img'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='sata0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:92:e4:96'/>
      <source type='unix' path='/tmp/vhost0' mode='server'/>
      <target dev='vhost0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:f2:44:21'/>
      <source type='unix' path='/tmp/vhost0' mode='server'/>
      <target dev='vhost0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:8c:2c:6d'/>
      <source bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-2-vhost-server/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c423,c864</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c423,c864</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+997:+1001</label>
    <imagelabel>+997:+1001</imagelabel>
  </seclabel>
</domain>
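
In the dump above, both vhostuser interfaces name path='/tmp/vhost0' and target dev='vhost0'. The report does not say whether that is intended. A quick way to spot such repeats is sketched below; the function name find_duplicate_vhostuser_paths is hypothetical, and the sed pattern assumes the attribute order shown in this dump.

```shell
#!/bin/sh
# Sketch: read a libvirt domain XML on stdin and print every socket path
# that appears in more than one <source type='unix' path='...'> element.
find_duplicate_vhostuser_paths() {
    sed -n "s/.*<source type='unix' path='\([^']*\)'.*/\1/p" | sort | uniq -d
}
```

For example, `virsh dumpxml vhost-server | find_duplicate_vhostuser_paths` prints nothing when every vhostuser interface has its own socket.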

Comment 8 Rick Alongi 2018-02-15 02:26:16 UTC
Hi Pei,

I'm a bit confused why you are requesting in Comment 6 that I add an entry that already exists in the XML file.  Maybe I am missing something?

Below is an XML file (3.xml) I use to define the VM named "3".  As you can see, there is an entry for both /tmp/vhost0 and /tmp/vhost1.

[root@netqe12 home]# cat 3.xml 
<domain type='kvm'>
  <name>3</name>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB' nodeset='0'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>3</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-2' memory='4194304' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/3.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='vhostuser'>
      <source type='unix' path='tmp/vhost0' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <source type='unix' path='tmp/vhost1' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'/>
</domain>

I then execute "virsh start 3". The VM won't go to a Running state; it stays in "Paused" state.

Below is the dumpxml output of VM "3" when in the "Paused" state.  Again, note that there are entries for both /tmp/vhost0 and /tmp/vhost1:

[root@netqe12 home]# cat 3_dumpxml.xml 
<domain type='kvm' id='4'>
  <name>3</name>
  <uuid>a9cbadf2-5df6-4682-b865-d9833fadd743</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB' nodeset='0'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>3</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-2' memory='4194304' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/3.qcow2'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:b9:c9:10'/>
      <source type='unix' path='tmp/vhost0' mode='server'/>
      <target dev='vhost0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:73:48:f7'/>
      <source type='unix' path='tmp/vhost1' mode='server'/>
      <target dev='vhost1'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-4-3/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c592,c954</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c592,c954</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
</domain>

Note the sequence of events below to bring VM "3" to a running state:

While VM "3" is in "Paused" state:

[root@netqe12 home]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     2                              paused
 4     3                              paused
 -     1                              shut off

[root@netqe12 home]# ls -alh /tmp/vhost*
srwxrwxr-x. 1 qemu qemu 0 Feb 14 21:17 /tmp/vhost0

[root@netqe12 home]# chmod 777 /tmp/vhost0

[root@netqe12 home]# ls -alh /tmp/vhost*
srwxrwxrwx. 1 qemu qemu 0 Feb 14 21:17 /tmp/vhost0
srwxrwxr-x. 1 qemu qemu 0 Feb 14 21:18 /tmp/vhost1

[root@netqe12 home]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     2                              paused
 5     3                              paused
 -     1                              shut off

[root@netqe12 home]# chmod 777 /tmp/vhost1

[root@netqe12 home]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     2                              paused
 5     3                              running
 -     1                              shut off

Comment 10 Jiri Denemark 2018-02-15 08:03:52 UTC
(In reply to Rick Alongi from comment #8)
> Hi Pei,
> 
> I'm a bit confused why you are requesting in Comment 6 that I add an entry
> that already exists in the XML file.  Maybe I am missing something?
> 
> Below is an XML file (3.xml) I use to define the VM named "3".  As you can
> see, there is an entry for both /tmp/vhost0 and /tmp/vhost1.

Not really.

>     <interface type='vhostuser'>
>       <source type='unix' path='tmp/vhost0' mode='server'/>
>       <model type='virtio'/>
>       <driver name='vhost'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x09'
> function='0x0'/>
>     </interface>
>     <interface type='vhostuser'>
>       <source type='unix' path='tmp/vhost1' mode='server'/>
>       <model type='virtio'/>
>       <driver name='vhost'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x08'
> function='0x0'/>
>     </interface>

As you can see, the path is 'tmp/vhost0' (and 'tmp/vhost1') rather than '/tmp/vhost0'. In other words the leading '/' is missing in the XML in both cases. That's what Pei pointed to in comment 6.

In general, using relative paths in domain XMLs is a bad idea.
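
A minimal sketch of how a domain XML could be checked for this before it is defined. It assumes the XML is available as a string; the function name and the trimmed sample XML are illustrative only and are not part of libvirt.

```python
import os.path
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for a domain XML; only the vhostuser <source>
# elements matter for this check. The second path is deliberately absolute.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <interface type='vhostuser'>
      <source type='unix' path='tmp/vhost0' mode='server'/>
    </interface>
    <interface type='vhostuser'>
      <source type='unix' path='/tmp/vhost1' mode='server'/>
    </interface>
  </devices>
</domain>
"""


def relative_vhostuser_paths(domain_xml):
    """Return the vhost-user socket paths that are not absolute."""
    root = ET.fromstring(domain_xml)
    sources = root.findall("./devices/interface[@type='vhostuser']/source")
    return [s.get("path", "") for s in sources
            if not os.path.isabs(s.get("path", ""))]


if __name__ == "__main__":
    print(relative_vhostuser_paths(DOMAIN_XML))
```

Running the check on the sample reports only 'tmp/vhost0', the entry with the missing leading '/'.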

Comment 11 Pei Zhang 2018-02-15 08:23:43 UTC
Created attachment 1396317 [details]
script to boot OVS

Hi Rick, Jean-Tsung,

I still cannot reproduce this issue on my host with the versions below:

3.10.0-851.el7.x86_64
qemu-kvm-rhev-2.10.0-20.el7.x86_64
libvirt-3.9.0-13.el7.x86_64
openvswitch-2.9.0-0.6.20171212git6625e43.el7fdp.x86_64

# ll /tmp/vhostuser*
srwxrwxr-x 1 qemu qemu 0 Feb 15 03:06 /tmp/vhostuser0.sock
srwxrwxr-x 1 qemu qemu 0 Feb 15 03:06 /tmp/vhostuser1.sock


Thanks to Rick for sharing his host. However, I can still boot the VM successfully on Rick's host.

The differences are:
(1) As Jiri explained in Comment 10, I'm using the absolute socket path like below. 

<source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>

(2) I'm not sure whether we are using the same method to boot ovs with dpdkvhostuserclient. I always clean the ovs environment before each test. I'd like to share the script I used; please check the attachment in this comment. Running this script sets up the ovs environment.

# sh boot_ovs_client.sh 



Thanks,
Pei

Comment 12 Jiri Denemark 2018-02-15 09:29:17 UTC
Anyway, the relative path should not make a big difference, since QEMU is started by libvirt with / as its working directory.

I'm not sure whether the sockets are supposed to be created by ovs or QEMU, although the ownership and
'server' mode suggest they were created by QEMU and ovs is apparently unable to connect to them until
 writing is permitted to others. Libvirt is not involved in this in any way.
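
A rough sketch of the permission reasoning above, assuming the connecting process is not root and that only the classic owner/group/other mode bits apply (no ACLs, no SELinux denial). Connecting to a UNIX socket on Linux needs write permission on the socket file. The uid/gid 107 is taken from the dac seclabel in the domain XML; the vswitch uid 993 and the hugetlbfs gid 1001 are hypothetical values chosen for illustration.

```python
import stat


def can_connect(mode, owner_uid, owner_gid, uid, gids):
    """Approximate the write-permission check for connecting to a socket file.

    Ignores root's override, ACLs, SELinux, and search permission on the
    parent directories.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


QEMU_UID = QEMU_GID = 107   # from the '+107:+107' dac label
OVS_UID = 993               # hypothetical uid of the vswitch process
HUGETLBFS_GID = 1001        # hypothetical gid of the hugetlbfs group
OVS_GIDS = {993, HUGETLBFS_GID}

if __name__ == "__main__":
    # srwxrwxr-x qemu:qemu -> "others" cannot write, so connect fails
    print(can_connect(0o775, QEMU_UID, QEMU_GID, OVS_UID, OVS_GIDS))
    # after chmod 777 -> connect is allowed
    print(can_connect(0o777, QEMU_UID, QEMU_GID, OVS_UID, OVS_GIDS))
    # srwxrwxr-x qemu:hugetlbfs -> allowed through the group bits
    print(can_connect(0o775, QEMU_UID, HUGETLBFS_GID, OVS_UID, OVS_GIDS))
```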

In the libvirtd.log I can see we spawn a new QEMU process, wait for the monitor socket to appear and
connect to it (uninteresting parts of the log were dropped):

2018-02-14 19:04:10.330+0000: 20109: debug : virCommandRunAsync:2477 : About to run LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-1/master-key.aes -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,dump-guest-core=off -cpu host,tsc-deadline=on -m 4096 -realtime mlock=off -smp 3,sockets=3,cores=1,threads=1 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/1-1,share=yes,size=4294967296 -numa node,nodeid=0,cpus=0-2,memdev=ram-node0 -uuid 5851b5e2-7e90-40fd-96fb-66fe66fe2326 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/var/lib/libvirt/images/1.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev socket,id=charnet0,path=tmp/vhost0,server -netdev vhost-user,chardev=charnet0,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:db:4c:25,bus=pci.0,addr=0x9 -chardev socket,id=charnet1,path=tmp/vhost1,server -netdev vhost-user,chardev=charnet1,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:0d:a8:b6,bus=pci.0,addr=0x8 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-1-1/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on
2018-02-14 19:04:10.383+0000: 20109: debug : qemuProcessWaitForMonitor:2161 : Connect monitor to 0x7fa8502807b0 '1'
2018-02-14 19:04:10.511+0000: 20109: info : qemuMonitorOpenInternal:856 : QEMU_MONITOR_NEW: mon=0x7fa88000ead0 refs=2 fd=27
2018-02-14 19:04:10.511+0000: 20109: debug : qemuDomainObjEnterMonitorInternal:4895 : Entering monitor (mon=0x7fa88000ead0 vm=0x7fa8502807b0 name=1)
2018-02-14 19:04:10.511+0000: 20109: debug : qemuMonitorSetCapabilities:1656 : mon:0x7fa88000ead0 vm:0x7fa8502807b0 json:1 fd:27
2018-02-14 19:04:10.512+0000: 20109: info : qemuMonitorSend:1061 : QEMU_MONITOR_SEND_MSG: mon=0x7fa88000ead0 msg={"execute":"qmp_capabilities","id":"libvirt-1"}
 fd=-1

(BTW, qemuMonitorSend just queues the QMP message, it does not actually send it yet)

In the meantime in QEMU log:

2018-02-14T19:04:10.453564Z qemu-kvm: -chardev socket,id=charnet0,path=tmp/vhost0,server: QEMU waiting for connection on: disconnected:unix:tmp/vhost0,server
2018-02-14T19:04:38.929598Z qemu-kvm: -chardev socket,id=charnet1,path=tmp/vhost1,server: QEMU waiting for connection on: disconnected:unix:tmp/vhost1,server
2018-02-14T19:04:52.932101Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/2 (label charserial0)

At this point the processing of the virDomainCreate() API hangs waiting for the QMP greeting from QEMU, which
is sent about 43 seconds later (I guess after the permissions on both sockets were manually modified).
After the greeting, libvirt actually sends the qmp_capabilities command and continues starting the
domain:

2018-02-14 19:04:53.691+0000: 20104: info : qemuMonitorIOProcess:438 : QEMU_MONITOR_IO_PROCESS: mon=0x7fa88000ead0 buf={"QMP": {"version": {"qemu": {"micro": 0, "minor": 9, "major": 2}, "package": "(qemu-kvm-rhev-2.9.0-14.el7)"}, "capabilities": []}}
 len=133
2018-02-14 19:04:53.691+0000: 20104: info : qemuMonitorIOWrite:543 : QEMU_MONITOR_IO_WRITE: mon=0x7fa88000ead0 buf={"execute":"qmp_capabilities","id":"libvirt-1"}
 len=49 ret=49 errno=0
2018-02-14 19:04:53.691+0000: 20104: info : qemuMonitorIOProcess:438 : QEMU_MONITOR_IO_PROCESS: mon=0x7fa88000ead0 buf={"return": {}, "id": "libvirt-1"}
 len=35
2018-02-14 19:04:53.691+0000: 20109: debug : qemuDomainObjExitMonitorInternal:4918 : Exited monitor (mon=0x7fa88000ead0 vm=0x7fa8502807b0 name=1)
2018-02-14 19:04:53.691+0000: 20109: debug : qemuDomainObjEndJob:4826 : Stopping job: async nested (async=start vm=0x7fa8502807b0 name=1)
2018-02-14 19:04:53.691+0000: 20109: debug : qemuDomainObjBeginJobInternal:4625 : Starting job: async nested (vm=0x7fa8502807b0 name=1, current job=none async=start)
2018-02-14 19:04:53.691+0000: 20109: debug : qemuDomainObjBeginJobInternal:4666 : Started job: async nested (async=start vm=0x7fa8502807b0 name=1)
2018-02-14 19:04:53.691+0000: 20109: debug : qemuDomainObjEnterMonitorInternal:4895 : Entering monitor (mon=0x7fa88000ead0 vm=0x7fa8502807b0 name=1)
2018-02-14 19:04:53.691+0000: 20109: debug : qemuMonitorGetMigrationCapabilities:3942 : mon:0x7fa88000ead0 vm:0x7fa8502807b0 json:1 fd:27
2018-02-14 19:04:53.691+0000: 20109: info : qemuMonitorSend:1061 : QEMU_MONITOR_SEND_MSG: mon=0x7fa88000ead0 msg={"execute":"query-migrate-capabilities","id":"libvirt-2"}
 fd=-1
2018-02-14 19:04:53.691+0000: 20104: info : qemuMonitorIOWrite:543 : QEMU_MONITOR_IO_WRITE: mon=0x7fa88000ead0 buf={"execute":"query-migrate-capabilities","id":"libvirt-2"}
 len=59 ret=59 errno=0
...

In other words, virsh start is stuck because the QEMU monitor is not responding.

IMHO there are two issues here:
- the host is not configured properly, i.e., ovs is not allowed to connect to the sockets
  created by QEMU
- QEMU just hangs until ovs connects to the vhost sockets without any timeout
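
The "about 43 seconds" figure can be reproduced from the two libvirtd.log timestamps quoted above: the moment qmp_capabilities was queued and the moment the QMP greeting arrived. A small sketch, assuming the libvirtd timestamp layout shown in the log lines; the helper name is illustrative.

```python
from datetime import datetime

# Layout of the libvirtd.log timestamps, e.g. "2018-02-14 19:04:10.512+0000"
LOG_FMT = "%Y-%m-%d %H:%M:%S.%f%z"


def gap_seconds(start, end):
    """Seconds elapsed between two libvirtd.log timestamps."""
    delta = datetime.strptime(end, LOG_FMT) - datetime.strptime(start, LOG_FMT)
    return delta.total_seconds()


QUEUED = "2018-02-14 19:04:10.512+0000"    # qemuMonitorSend queues qmp_capabilities
GREETING = "2018-02-14 19:04:53.691+0000"  # QMP greeting read from the monitor

if __name__ == "__main__":
    print(round(gap_seconds(QUEUED, GREETING), 3))
```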

Comment 13 Jean-Tsung Hsiao 2018-02-16 02:24:28 UTC
Two stories here:

I. A successful scenario

I provisioned a server (netqe19) using the latest compose, RHEL-7.5-20180215.1.

I got the same issue before modifying /etc/libvirt/qemu.conf.

But after making the following change, the guest started successfully:

#group = "root"
group = "hugetlbfs"
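
A small sketch for checking which value of a setting is actually in effect in a file such as /etc/libvirt/qemu.conf, where commented-out defaults sit next to the active line. It is a simplified reader that only handles plain `key = "value"` lines, not the full qemu.conf syntax; the function name is illustrative.

```python
def read_setting(conf_text, key):
    """Return the last uncommented value assigned to `key`, or None."""
    value = None
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out defaults
        name, sep, raw = line.partition("=")
        if sep and name.strip() == key:
            value = raw.strip().strip('"')
    return value


SAMPLE = '''
#group = "root"
group = "hugetlbfs"
'''

if __name__ == "__main__":
    print(read_setting(SAMPLE, "group"))
```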

Other related packages:
[root@netqe19 ~]# rpm -qa | grep qemu |sort
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
libvirt-daemon-driver-qemu-3.9.0-13.el7.x86_64
qemu-img-rhev-2.10.0-20.el7.x86_64
qemu-kvm-common-rhev-2.10.0-20.el7.x86_64
qemu-kvm-rhev-2.10.0-20.el7.x86_64
[root@netqe19 ~]# rpm -qa | grep libvirt | sort
libvirt-3.9.0-13.el7.x86_64
libvirt-client-3.9.0-13.el7.x86_64
libvirt-daemon-3.9.0-13.el7.x86_64
libvirt-daemon-config-network-3.9.0-13.el7.x86_64
libvirt-daemon-config-nwfilter-3.9.0-13.el7.x86_64
libvirt-daemon-driver-interface-3.9.0-13.el7.x86_64
libvirt-daemon-driver-lxc-3.9.0-13.el7.x86_64
libvirt-daemon-driver-network-3.9.0-13.el7.x86_64
libvirt-daemon-driver-nodedev-3.9.0-13.el7.x86_64
libvirt-daemon-driver-nwfilter-3.9.0-13.el7.x86_64
libvirt-daemon-driver-qemu-3.9.0-13.el7.x86_64
libvirt-daemon-driver-secret-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-core-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-disk-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-iscsi-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-logical-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-mpath-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-rbd-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-scsi-3.9.0-13.el7.x86_64
libvirt-libs-3.9.0-13.el7.x86_64
[root@netqe19 ~]# rpm -q openvswitch
openvswitch-2.9.0-0.4.20180124git26cdc33.el7fdp.x86_64
[root@netqe19 ~]# 


II. A failed scenario

I updated another server, netqe17, from RHEL-7.5-20180201.2 to RHEL-7.5-20180215.1. But it still failed after changing /etc/libvirt/qemu.conf to have group = "hugetlbfs".

Other related packages:
[root@netqe17 ~]# rpm -qa | grep qemu |sort
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
libvirt-daemon-driver-qemu-3.9.0-13.el7.x86_64
qemu-img-rhev-2.10.0-20.el7.x86_64
qemu-kvm-common-rhev-2.10.0-20.el7.x86_64
qemu-kvm-rhev-2.10.0-20.el7.x86_64
[root@netqe17 ~]# 
[root@netqe17 ~]# rpm -qa | grep libvirt | sort
libvirt-3.9.0-13.el7.x86_64
libvirt-client-3.9.0-13.el7.x86_64
libvirt-daemon-3.9.0-13.el7.x86_64
libvirt-daemon-config-network-3.9.0-13.el7.x86_64
libvirt-daemon-config-nwfilter-3.9.0-13.el7.x86_64
libvirt-daemon-driver-interface-3.9.0-13.el7.x86_64
libvirt-daemon-driver-lxc-3.9.0-13.el7.x86_64
libvirt-daemon-driver-network-3.9.0-13.el7.x86_64
libvirt-daemon-driver-nodedev-3.9.0-13.el7.x86_64
libvirt-daemon-driver-nwfilter-3.9.0-13.el7.x86_64
libvirt-daemon-driver-qemu-3.9.0-13.el7.x86_64
libvirt-daemon-driver-secret-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-core-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-disk-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-iscsi-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-logical-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-mpath-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-rbd-3.9.0-13.el7.x86_64
libvirt-daemon-driver-storage-scsi-3.9.0-13.el7.x86_64
libvirt-glib-1.0.0-1.el7.x86_64
libvirt-libs-3.9.0-13.el7.x86_64
libvirt-python-3.9.0-1.el7.x86_64
[root@netqe17 ~]# 
[root@netqe17 ~]# rpm -q openvswitch
openvswitch-2.9.0-0.4.20180124git26cdc33.el7fdp.x86_64
[root@netqe17 ~]#

Comment 14 Maxime Coquelin 2018-02-19 13:30:24 UTC
Hi Jean-Tsung,

In comment 13, can you check on your failing host whether /tmp/vhost0 and/or /tmp/vhost1 are still there after the VM and ovs are shut down?

If this is the case, could you try to remove the files and retry?
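
The check requested here can be sketched as follows: given the expected socket paths, report which ones still exist as socket files once the VM and ovs are stopped. The function name is illustrative, and the default paths are simply the ones used in this report.

```python
import os
import stat


def existing_sockets(paths):
    """Return the subset of `paths` that currently exist as UNIX socket files."""
    found = []
    for path in paths:
        try:
            mode = os.lstat(path).st_mode
        except FileNotFoundError:
            continue  # nothing left behind at this path
        if stat.S_ISSOCK(mode):
            found.append(path)
    return found


if __name__ == "__main__":
    for leftover in existing_sockets(["/tmp/vhost0", "/tmp/vhost1"]):
        print("still present:", leftover)
```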

Adding Aaron in cc:, as IIRC, he worked on the vhost-user socket permissions.

Thanks,
Maxime

Comment 15 Jean-Tsung Hsiao 2018-02-19 15:12:35 UTC
(In reply to Maxime Coquelin from comment #14)
> Hi Jean-Tsung,
> 
> In comment 13, can you check in you failing host whether the /tmp/vhost0
> and/or /tmp/vhost1 are still here VM and ovs are shut down?
> 
> If this is the case, could you try to remove the files and retry?
Yes, both /tmp/vhost0 and /tmp/vhost1 are still there even after the VM is destroyed and OVS is stopped.

Please take a look below.

[root@netqe17 ~]# ll /tmp/vhost*
srwxrwxr-x. 1 qemu hugetlbfs 0 Feb 18 22:15 /tmp/vhost0
srwxrwxr-x. 1 qemu hugetlbfs 0 Feb 18 22:15 /tmp/vhost1
[root@netqe17 ~]# 
[root@netqe17 ~]# 
[root@netqe17 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 
 -     vhost-server                   shut off

 
[root@netqe17 ~]# systemctl status openvswitch
● openvswitch.service - Open vSwitch
   Loaded: loaded (/usr/lib/systemd/system/openvswitch.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

> 
> Adding Aaron in cc:, as IIRC, he worked on the vhost-user socket permissions.
> 
> Thanks,
> Maxime

The system, netqe17, is available now for you or Aaron to debug.

Comment 16 Maxime Coquelin 2018-02-20 08:24:44 UTC
(In reply to Jean-Tsung Hsiao from comment #15)
> (In reply to Maxime Coquelin from comment #14)
> > Hi Jean-Tsung,
> > 
> > In comment 13, can you check in you failing host whether the /tmp/vhost0
> > and/or /tmp/vhost1 are still here VM and ovs are shut down?
> > 
> > If this is the case, could you try to remove the files and retry?
> Yes, both /tmp/vhost0 and /tmp/vhost1 are still even after VM destroyed and
> OVS stop.
> 
> Please take a look below.
> 
> [root@netqe17 ~]# ll /tmp/vhost*
> srwxrwxr-x. 1 qemu hugetlbfs 0 Feb 18 22:15 /tmp/vhost0
> srwxrwxr-x. 1 qemu hugetlbfs 0 Feb 18 22:15 /tmp/vhost1
> [root@netqe17 ~]# 
> [root@netqe17 ~]# 
> [root@netqe17 ~]# virsh list --all
>  Id    Name                           State
> ----------------------------------------------------
>  
>  -     vhost-server                   shut off
> 
>  
> [root@netqe17 ~]# systemctl status openvswitch
> ● openvswitch.service - Open vSwitch
>    Loaded: loaded (/usr/lib/systemd/system/openvswitch.service; disabled;
> vendor preset: disabled)
>    Active: inactive (dead)

Have you tried as I suggested to manually remove the socket files and
restart OVS and the VM?
 
> > 
> > Adding Aaron in cc:, as IIRC, he worked on the vhost-user socket permissions.
> > 
> > Thanks,
> > Maxime
> 
> The system, netqe17, is available now for you or Aaron to debug.

Maybe Aaron will be a better fit to have a look at it, but I can have a look if
you send me the connection info privately.

Comment 17 Jean-Tsung Hsiao 2018-02-20 12:48:23 UTC
(In reply to Maxime Coquelin from comment #16)
> (In reply to Jean-Tsung Hsiao from comment #15)
> > (In reply to Maxime Coquelin from comment #14)
> > > Hi Jean-Tsung,
> > > 
> > > In comment 13, can you check in you failing host whether the /tmp/vhost0
> > > and/or /tmp/vhost1 are still here VM and ovs are shut down?
> > > 
> > > If this is the case, could you try to remove the files and retry?
> > Yes, both /tmp/vhost0 and /tmp/vhost1 are still even after VM destroyed and
> > OVS stop.
> > 
> > Please take a look below.
> > 
> > [root@netqe17 ~]# ll /tmp/vhost*
> > srwxrwxr-x. 1 qemu hugetlbfs 0 Feb 18 22:15 /tmp/vhost0
> > srwxrwxr-x. 1 qemu hugetlbfs 0 Feb 18 22:15 /tmp/vhost1
> > [root@netqe17 ~]# 
> > [root@netqe17 ~]# 
> > [root@netqe17 ~]# virsh list --all
> >  Id    Name                           State
> > ----------------------------------------------------
> >  
> >  -     vhost-server                   shut off
> > 
> >  
> > [root@netqe17 ~]# systemctl status openvswitch
> > ● openvswitch.service - Open vSwitch
> >    Loaded: loaded (/usr/lib/systemd/system/openvswitch.service; disabled;
> > vendor preset: disabled)
> >    Active: inactive (dead)
> 
> Have you tried as I suggested to manually remove the socket files and
> restart OVS and the VM?
>  
> > > 
> > > Adding Aaron in cc:, as IIRC, he worked on the vhost-user socket permissions.
> > > 
> > > Thanks,
> > > Maxime
> > 
> > The system, netqe17, is available now for you or Aaron to debug.
> 
> Maybe Aaron will be a better fit to have a look at it, but I can have a look
> if 
> you provide me in private info for connecting.

Hi Maxime,

I am sending an email with the netqe17 login info.

Thanks!

Jean

Comment 19 Aaron Conole 2018-02-22 15:56:08 UTC
I found a couple of issues with the setup.  I changed them, and things are now working.

First, I found an issue with the kernel cmdline:

  [root@netqe17 ~]# cat /proc/cmdline | grep iommu
  BOOT_IMAGE=/vmlinuz-3.10.0-851.el7.x86_64 root=/dev/mapper/rhel_netqe17-root ro intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=32 crashkernel=auto rd.lvm.lv=rhel_netqe17/root rd.lvm.lv=rhel_netqe17/swap console=ttyS1,115200 LANG=en_US.UTF-8

Notice there is no iommu=pt setting (or any iommu= setting at all; only intel_iommu=on is present).  This can cause problems using an iommu depending on the guest setup.  I added iommu=pt and rebooted.
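
The same observation can be made programmatically. The sketch below splits a kernel command line into parameters and looks for the two IOMMU-related keys; the helper name is illustrative, and the sample string is the /proc/cmdline output shown above.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None.

    Repeated keys (such as rd.lvm.lv) keep only their last value.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-3.10.0-851.el7.x86_64 "
    "root=/dev/mapper/rhel_netqe17-root ro intel_iommu=on "
    "default_hugepagesz=1GB hugepagesz=1G hugepages=32 crashkernel=auto "
    "rd.lvm.lv=rhel_netqe17/root rd.lvm.lv=rhel_netqe17/swap "
    "console=ttyS1,115200 LANG=en_US.UTF-8"
)

if __name__ == "__main__":
    params = parse_cmdline(CMDLINE)
    print("intel_iommu =", params.get("intel_iommu"))
    print("iommu present:", "iommu" in params)
```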

Additionally, the xml had a duplicated vhost stanza:

    <interface type='vhostuser'>
      <mac address='52:54:00:92:e4:96'/>
      <source type='unix' path='/tmp/vhost0' mode='server'/>
      <target dev='vhost0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

This appeared twice, and explains the following log:

2018-02-22T15:51:22.349389Z qemu-kvm: -chardev socket,id=charnet0,path=/tmp/vhost0,server: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost0,server
2018-02-22T15:51:23.115822Z qemu-kvm: -chardev socket,id=charnet1,path=/tmp/vhost1,server: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost1,server
2018-02-22T15:51:22.349389Z qemu-kvm: -chardev socket,id=charnet2,path=/tmp/vhost0,server: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost0,server

There is possibly a bug to report here: libvirt could have detected from the domain xml that one server socket was being used by two different charnet devices (which is an impossible situation to resolve).  I am closing this bug, but suggest opening another bug about the duplicate server socket.
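
As an illustration of the kind of check meant here, the sketch below scans a domain XML for vhostuser interfaces in server mode and reports any socket path used more than once. It is not libvirt code; the function name and the trimmed sample XML are assumptions made for the example.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the faulty XML: the /tmp/vhost0 stanza appears twice.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <interface type='vhostuser'>
      <source type='unix' path='/tmp/vhost0' mode='server'/>
    </interface>
    <interface type='vhostuser'>
      <source type='unix' path='/tmp/vhost1' mode='server'/>
    </interface>
    <interface type='vhostuser'>
      <source type='unix' path='/tmp/vhost0' mode='server'/>
    </interface>
  </devices>
</domain>
"""


def duplicate_server_sockets(domain_xml):
    """Return server-mode vhost-user socket paths that occur more than once."""
    root = ET.fromstring(domain_xml)
    paths = [
        src.get("path")
        for src in root.findall("./devices/interface[@type='vhostuser']/source")
        if src.get("mode") == "server"
    ]
    return sorted(p for p, count in Counter(paths).items() if count > 1)


if __name__ == "__main__":
    print(duplicate_server_sockets(DOMAIN_XML))
```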

Comment 20 Jean-Tsung Hsiao 2018-02-22 16:10:55 UTC
(In reply to Aaron Conole from comment #19)
> I found a couple of issues with the setup.  I changed them, and things are
> now working.
> 
> First, I found an issue with the kernel cmdline:
> 
>   [root@netqe17 ~]# cat /proc/cmdline | grep iommu
>   BOOT_IMAGE=/vmlinuz-3.10.0-851.el7.x86_64
> root=/dev/mapper/rhel_netqe17-root ro intel_iommu=on default_hugepagesz=1GB
> hugepagesz=1G hugepages=32 crashkernel=auto rd.lvm.lv=rhel_netqe17/root
> rd.lvm.lv=rhel_netqe17/swap console=ttyS1,115200 LANG=en_US.UTF-8
> 
> Notice there is no iommu=pt setting (or iommu setting at all).  This can
> cause problems using an iommu depending on the guest setup.  I added this
> and rebooted.
> 
> Additionally, the xml had a duplicated vhost stanza:
> 
>     <interface type='vhostuser'>
>       <mac address='52:54:00:92:e4:96'/>
>       <source type='unix' path='/tmp/vhost0' mode='server'/>
>       <target dev='vhost0'/>
>       <model type='virtio'/>
>       <driver name='vhost'/>
>       <alias name='net0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x03'
> function='0x0'/>
>     </interface>
> 
> This appeared twice, and explains the following log:
> 
> 2018-02-22T15:51:22.349389Z qemu-kvm: -chardev
> socket,id=charnet0,path=/tmp/vhost0,server: info: QEMU waiting for
> connection on: disconnected:unix:/tmp/vhost0,server
> 2018-02-22T15:51:23.115822Z qemu-kvm: -chardev
> socket,id=charnet1,path=/tmp/vhost1,server: info: QEMU waiting for
> connection on: disconnected:unix:/tmp/vhost1,server
> 2018-02-22T15:51:22.349389Z qemu-kvm: -chardev
> socket,id=charnet2,path=/tmp/vhost0,server: info: QEMU waiting for
> connection on: disconnected:unix:/tmp/vhost0,server
> 
> There is possible a bug to report here - the libvirt domain xml could have
> realized that a server socket was being used by two different charnet
> devices (which is an impossible situation to resolve).  Closing this bug,
> but suggest another bug to be opened about the duplicate server socket.

So, this was caused by my "cut and paste" error. That's why it only happened on netqe17, but not on netqe19 and netqe5.

NOTE: We should open a bug to address the /etc/libvirt/qemu.conf issue. Right now, the file needs to be edited to include group = "hugetlbfs".

Comment 21 Aaron Conole 2018-02-22 16:18:56 UTC
We should definitely open a bug that libvirt didn't detect the copy and paste error.  I believe it should not have allowed two different chardevs with the same server socket.

As for the qemu.conf issue, I don't know.  Possibly.

Comment 22 Maxime Coquelin 2018-02-27 13:49:14 UTC
(In reply to Aaron Conole from comment #21)
> Definitely we should open a bug that libvirt didn't detect the copy and
> paste error.  I believe it should have not allowed two different chardevs
> with the same server socket.
> 
> As for the qemu.conf issue, I don't know.  Possibly.

I opened the libvirt bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1549582

Cheers,
Maxime