Bug 1444785

Summary: The second migration fails over openvswitch+dpdk
Product: Red Hat Enterprise Linux 7
Reporter: Pei Zhang <pezhang>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED DUPLICATE
QA Contact: yafu <yafu>
Severity: high
Docs Contact:
Priority: unspecified
Version: 7.4
CC: chayang, chhu, dyuan, juzhang, michen, pezhang, rbalakri, xfu, xuzhang
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-26 20:04:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                                              Flags
/var/log/libvirt/libvirtd.log from src host              none
/var/log/libvirt/libvirtd.log from des host              none
/var/log/libvirt/qemu/rhel7.4_nonrt.log from src host    none
/var/log/libvirt/qemu/rhel7.4_nonrt.log from des host    none

Description Pei Zhang 2017-04-24 09:53:22 UTC
Description of problem:
This issue was found when testing ping-pong migration over openvswitch and dpdk. The first migration, from the src host to the des host, works. The second migration, from des back to src, fails. This is a regression.


Version-Release number of selected component (if applicable):
libvirt-3.2.0-3.el7.x86_64
kernel-3.10.0-653.el7.x86_64
qemu-kvm-rhev-2.9.0-1.el7.x86_64
dpdk-16.11-4.el7fdp.x86_64
openvswitch-2.6.1-15.git20161206.el7fdp.x86_64


How reproducible:
100%


Steps to Reproduce:
1. Install dpdk and openvswitch.

2. Start openvswitch with 3 vhost-user sockets; see [1].

3. Boot the VM; see [2].

4. Migrate the VM from the src host to the des host.

5. Migrate the VM from the des host back to the src host. This migration fails:
# /bin/virsh migrate --verbose --persistent --live rhel7.4_nonrt qemu+ssh://10.73.72.152/system 
error: operation failed: migration job: is not active
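
Steps 4 and 5 can be sketched as a small script. This is only a minimal sketch, not the test harness used for this report: the `virsh migrate` flags, the domain name, and the src address 10.73.72.152 are taken from the command above, while `DES_HOST` is a placeholder because the report does not give the des host address, and the helper name `migrate_cmd` is hypothetical. The script only builds and prints the two commands; each one has to be run on the host that currently runs the VM.

```python
import shlex

DOMAIN = "rhel7.4_nonrt"   # domain name from the report
SRC_HOST = "10.73.72.152"  # src host address, from the failing command above
DES_HOST = "DES_HOST"      # placeholder: the des host address is not in the report


def migrate_cmd(domain, target_host):
    """Build the virsh command that live-migrates `domain` to `target_host`."""
    return [
        "virsh", "migrate", "--verbose", "--persistent", "--live",
        domain, f"qemu+ssh://{target_host}/system",
    ]


# Ping-pong: first src -> des (run on src), then des -> src (run on des).
ping = migrate_cmd(DOMAIN, DES_HOST)
pong = migrate_cmd(DOMAIN, SRC_HOST)

if __name__ == "__main__":
    print("on src:", shlex.join(ping))
    print("on des:", shlex.join(pong))
```

With the affected libvirt build, the second command is the one that ends with "migration job: is not active".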


Actual results:
Migration fails.


Expected results:
Both migrations should succeed.


Additional info:
1. This is a regression:
libvirt-3.2.0-1.el7.x86_64  works
libvirt-3.2.0-3.el7.x86_64  fails

Reference
[1]
# ovs-vsctl show
c1b3795a-0aca-4771-b588-672e7c0875cf
    Bridge "ovsbr1"
        Port "dpdk1"
            Interface "dpdk1"
                type: dpdk
                options: {n_rxq="1"}
        Port "ovsbr1"
            Interface "ovsbr1"
                type: internal
        Port "vhost-user1"
            Interface "vhost-user1"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser1.sock"}
    Bridge "ovsbr0"
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                options: {n_rxq="1"}
        Port "vhost-user0"
            Interface "vhost-user0"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser0.sock"}
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
    Bridge "ovsbr2"
        Port "dpdk2"
            Interface "dpdk2"
                type: dpdk
                options: {n_rxq="1"}
        Port "vhost-user2"
            Interface "vhost-user2"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser2.sock"}
        Port "ovsbr2"
            Interface "ovsbr2"
                type: internal


[2]
<domain type='kvm'>
  <name>rhel7.4_nonrt</name>
  <uuid>2a9c4b5c-28cb-11e7-acd8-14187748a2bb</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='19'/>
    <vcpupin vcpu='1' cpuset='18'/>
    <vcpupin vcpu='2' cpuset='16'/>
    <vcpupin vcpu='3' cpuset='14'/>
    <emulatorpin cpuset='5,7,9,11,13,15'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pmu state='off'/>
    <vmport state='off'/>
  </features>
  <cpu mode='host-passthrough'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-3' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='threads'/>
      <source file='/mnt/nfv/rhel7.4_nonrt.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='none'/>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='bridge'>
      <mac address='18:66:da:5f:dd:01'/>
      <source bridge='switch'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:02'/>
      <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' queues='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:03'/>
      <source type='unix' path='/tmp/vhostuser1.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' queues='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:04'/>
      <source type='unix' path='/tmp/vhostuser2.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' queues='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>
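
Because the OVS ports in [1] are of type dpdkvhostuserclient, each `vhost-server-path` there has to match the socket path of a server-mode vhostuser interface in [2]. A minimal sketch of that cross-check follows, assuming the domain XML and the `ovs-vsctl show` output are available as strings. The inputs below are abbreviated excerpts of [1] and [2], and the helper names `qemu_server_sockets` and `ovs_client_sockets` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Abbreviated excerpt of the domain XML in [2]: only the vhostuser interfaces.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <interface type='vhostuser'>
      <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
    </interface>
    <interface type='vhostuser'>
      <source type='unix' path='/tmp/vhostuser1.sock' mode='server'/>
    </interface>
    <interface type='vhostuser'>
      <source type='unix' path='/tmp/vhostuser2.sock' mode='server'/>
    </interface>
  </devices>
</domain>
"""

# Abbreviated excerpt of the `ovs-vsctl show` output in [1].
OVS_SHOW = """
                options: {vhost-server-path="/tmp/vhostuser1.sock"}
                options: {vhost-server-path="/tmp/vhostuser0.sock"}
                options: {vhost-server-path="/tmp/vhostuser2.sock"}
"""


def qemu_server_sockets(domain_xml):
    """Socket paths of vhostuser interfaces where QEMU is the server."""
    root = ET.fromstring(domain_xml.strip())
    sources = root.findall("./devices/interface[@type='vhostuser']/source")
    return {s.get("path") for s in sources if s.get("mode") == "server"}


def ovs_client_sockets(ovs_show):
    """Socket paths that the OVS vhost-user client ports connect to."""
    return set(re.findall(r'vhost-server-path="([^"]+)"', ovs_show))


# Sockets QEMU would create that no OVS port is configured to connect to.
missing = qemu_server_sockets(DOMAIN_XML) - ovs_client_sockets(OVS_SHOW)
print("sockets without a matching OVS port:", sorted(missing))
```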

Comment 3 Jiri Denemark 2017-04-24 10:33:40 UTC
Could you upload debug logs from both sides of migration?

Comment 4 Xuesong Zhang 2017-04-25 02:45:15 UTC
This bug seems to be a duplicate of the regression BZ1441165, which libvirt QE found during consumption testing of libvirt-3.2.0-2.el7.x86_64.

Comment 5 Pei Zhang 2017-04-25 03:01:12 UTC
Created attachment 1273793
/var/log/libvirt/libvirtd.log from src host

Comment 6 Pei Zhang 2017-04-25 03:02:02 UTC
Created attachment 1273794
/var/log/libvirt/libvirtd.log from des host

Comment 7 Pei Zhang 2017-04-25 03:04:46 UTC
Created attachment 1273796
/var/log/libvirt/qemu/rhel7.4_nonrt.log from src host

Comment 8 Pei Zhang 2017-04-25 03:05:19 UTC
Created attachment 1273797
/var/log/libvirt/qemu/rhel7.4_nonrt.log from des host

Comment 9 Jiri Denemark 2017-04-26 20:03:55 UTC
Yes, this is a duplicate of bug 1441165.

Comment 10 Jiri Denemark 2017-04-26 20:04:17 UTC

*** This bug has been marked as a duplicate of bug 1441165 ***