Bug 1538494 - Guest crashed on the source host when cancel migration by virDomainMigrateBegin3Params sometimes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: Minjia Cai
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-01-25 08:30 UTC by yafu
Modified: 2018-04-11 00:58 UTC

Fixed In Version: qemu-kvm-rhev-2.10.0-21.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-11 00:58:52 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirtd log on source and target host (6.33 MB, application/x-gzip)
2018-01-25 08:44 UTC, yafu


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:1104 0 normal SHIPPED_LIVE Important: qemu-kvm-rhev security, bug fix, and enhancement update 2018-04-10 22:54:38 UTC

Description yafu 2018-01-25 08:30:27 UTC
Description of problem:
The guest sometimes crashes on the source host when a migration is cancelled by virDomainMigrateBegin3Params.

Version-Release number of selected component (if applicable):
Source host:
libvirt-3.9.0-9.el7.x86_64
qemu-kvm-rhev-2.10.0-18.el7.x86_64
target host:
libvirt-3.2.0-14.el7_4.9.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.14.x86_64

How reproducible:
20%

Steps to Reproduce:
1.Start a guest with iommu config virtio video:
#virsh dumpxml vm3
<features>
    <acpi/>
    <apic/>
    <pmu state='on'/>
    <vmport state='off'/>
    <ioapic driver='qemu'/>
  </features>
...
<device>
...
 <iommu model='intel'>
      <driver intremap='on' caching_mode='on' eim='on' iotlb='on'/>
 </iommu>
</device>

2.Enable iommu in the guest os:
(1)Edit file  "/etc/default/grub":
Append "intel_iommu=on" or "amd_iommu=on" to the value of "GRUB_CMDLINE_LINUX=......", for example:
GRUB_CMDLINE_LINUX="rd.md=0 rd.lvm=0 rd.dm=0 vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/rhcrashkernel-param || :) rd.luks=0 vconsole.font=latarcyrheb-sun16 rhgb quiet intel_iommu=on"

(2)Then run the following command to generate the updated grub file:
# grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot the guest.

3.Do cross migration from 7.5->7.4:
#while true; do virsh migrate vm3 qemu+ssh://10.66.4.116/system --live --verbose; done
Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-01-25T05:30:31.339549Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-01-25T05:30:31.339599Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-01-25T05:30:31.339604Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-01-25T05:30:31.341097Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:31.341181Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:31.341250Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:31.341293Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:31.341470Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [ 90 %]error: internal error: qemu unexpectedly closed the monitor: 2018-01-25T05:30:39.601378Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-01-25T05:30:39.601627Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-01-25T05:30:39.601632Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-01-25T05:30:39.601690Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:39.604516Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:39.604653Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:39.604743Z qemu-kvm: warning: TSC frequency mismatch between VM (2099998 kHz) and host (2399998 kHz), and TSC scaling unavailable
2018-01-25T05:30:39.604900Z qemu-kvm: load of migration failed: Operation not permitted

error: Unable to read from monitor: Connection reset by peer

error: Requested operation is not valid: domain is not running


4.The qemu log on the source host:
...
2018-01-22 12:08:20.285+0000: initiating migration
main_channel_client_handle_migrate_connected: client 0x55c8c4017100 connected: 1 seamless 1
qemu-kvm: block/io.c:1557: bdrv_co_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
2018-01-22 12:08:24.328+0000: shutting down, reason=crashed

Actual results:
The guest crashed on the source host after the migration was cancelled.

Expected results:
Because iommu migration is only supported starting with the 7.5 qemu-kvm, the migration failure is the expected result. However, the guest on the source host should not crash after the migration is cancelled.

Additional info:
1.The issue reproduces more easily when the host is under load. If it does not reproduce, use stress to add load during the migration:
# stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 1000s &
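
A bounded variant of the reproduction loop can make the ~20% hit rate easier to measure. The following is only a sketch: the matched substrings are taken from the virsh output quoted above, while the helper names (classify, reproduce, virsh_migrate) and the default attempt limit are hypothetical and not part of any tool mentioned in this report.

```python
import subprocess
from typing import Callable, Optional

# Substrings copied from the virsh output quoted in this report.
EXPECTED_FAILURE = "load of migration failed"
CRASH_SYMPTOMS = (
    "domain is not running",
    "Unable to read from monitor",
)


def classify(output: str) -> str:
    """Classify the combined stdout/stderr of one migration attempt."""
    if any(symptom in output for symptom in CRASH_SYMPTOMS):
        return "source-crashed"
    if EXPECTED_FAILURE in output:
        return "expected-failure"
    return "unknown"


def reproduce(run_migrate: Callable[[], str], max_attempts: int = 200) -> Optional[int]:
    """Return the 1-based attempt on which the crash symptom appeared, else None."""
    for attempt in range(1, max_attempts + 1):
        if classify(run_migrate()) == "source-crashed":
            return attempt
    return None


def virsh_migrate(domain: str, uri: str) -> str:
    """Run one 'virsh migrate' with the flags used above and return its output."""
    proc = subprocess.run(
        ["virsh", "migrate", domain, uri, "--live", "--verbose"],
        capture_output=True,
        text=True,
    )
    return proc.stdout + proc.stderr
```

Usage would look like reproduce(lambda: virsh_migrate("vm3", "qemu+ssh://10.66.4.116/system")), which stops at the first attempt that shows the crash instead of looping forever.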

Comment 2 yafu 2018-01-25 08:44:45 UTC
Created attachment 1385954 [details]
libvirtd log on source and target host

Comment 3 Jiri Denemark 2018-01-29 15:11:09 UTC
The source successfully transfers the guest's memory to the destination, stops virtual CPUs, and enters the pre-switchover state. Afterwards libvirt calls migrate-continue, which is honored by QEMU as it starts another migration pass. Soon after this the source QEMU reports failed migration. At this point vCPUs are automatically resumed by QEMU. This is all a result of the destination QEMU not being able to load migration data. At this point libvirt starts to reset migration capabilities to make sure the next migration will see a clean environment. One of the migrate-set-capabilities calls (yes, libvirt is currently stupid and calls it repeatedly for one capability at a time) doesn't finish in a few milliseconds as usual and QEMU aborts.

I doubt it is actually relevant, but in the case covered by the attached log files the last migrate-set-capabilities was clearing "compress" capability and QEMU aborted about 5 seconds after getting the command without replying.

I'm moving this bug to qemu-kvm-rhev for further investigation as I didn't see anything wrong on the libvirt side.
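
For illustration only, the difference between resetting one capability per command and sending them in a single migrate-set-capabilities command can be sketched as QMP JSON. The command name and its "capabilities" list argument are QMP's; the choice of capabilities below (other than "compress", which is mentioned above) and the helper names are assumptions, and this is not what libvirt actually sends.

```python
import json
from typing import Dict, List


def per_capability_commands(caps: Dict[str, bool]) -> List[str]:
    """Build one migrate-set-capabilities command per capability."""
    return [
        json.dumps({
            "execute": "migrate-set-capabilities",
            "arguments": {
                "capabilities": [{"capability": name, "state": state}],
            },
        })
        for name, state in caps.items()
    ]


def batched_command(caps: Dict[str, bool]) -> str:
    """Build a single migrate-set-capabilities command covering all capabilities."""
    return json.dumps({
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": name, "state": state}
                for name, state in caps.items()
            ],
        },
    })


# Illustrative selection of capabilities to clear after a failed migration.
RESET = {"xbzrle": False, "auto-converge": False, "compress": False}
```

Either form should leave the capabilities in the same state; the batched form only reduces the number of monitor round trips. It would not by itself avoid the abort described above.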

Comment 4 Dr. David Alan Gilbert 2018-01-29 18:45:45 UTC
hmm, another race with block ownership then;  I guess the timing dependency is whether the source realises the destination has failed before the end of migration or at the end, but I'll need to dig to figure that out.
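
As a reading aid for the assertion in the source-host qemu log from the description, here is a small sketch that splits the message into file, line, function and expression. The mapping of 0x0800 to BDRV_O_INACTIVE is an assumption that should be checked against the block layer headers of the QEMU version in use; the helper name parse_assertion is hypothetical.

```python
import re
from typing import Dict, Optional

ASSERT_RE = re.compile(
    r"qemu-kvm: (?P<file>[^:\s]+):(?P<line>\d+): (?P<func>\w+): "
    r"Assertion `(?P<expr>.+)' failed\."
)

# Assumption: 0x0800 is BDRV_O_INACTIVE (image not currently owned by this QEMU).
ASSUMED_FLAG_NAMES = {0x0800: "BDRV_O_INACTIVE"}


def parse_assertion(log_line: str) -> Optional[Dict[str, object]]:
    """Extract the location and expression of a glibc-style assertion message."""
    match = ASSERT_RE.search(log_line)
    if match is None:
        return None
    # Replace known hex constants in the expression with their assumed names.
    flags = [
        ASSUMED_FLAG_NAMES.get(int(token, 16), token)
        for token in re.findall(r"0x[0-9a-fA-F]+", match.group("expr"))
    ]
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "func": match.group("func"),
        "expr": match.group("expr"),
        "flags": flags,
    }
```

If the assumption holds, the message reads as a write reaching bdrv_co_pwritev while the block node is still marked inactive, which would fit a race over block ownership between source and destination.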

Comment 5 Minjia Cai 2018-01-31 06:21:58 UTC
Reproduce:
Version-Release number of selected component (if applicable):
Source host:
libvirt-3.9.0-6.el7.x86_64
qemu-kvm-rhev-2.10.0-18.el7.x86_64
target host:
libvirt-3.2.0-14.el7_4.9.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.14.x86_64

Steps to Reproduce:
1.Start a guest with iommu config virtio video:
<domain type='kvm'>
  <name>iommu1</name>
  <uuid>1b3268d6-b59c-406b-a14c-33b000b15b6c</uuid>
  <memory unit='KiB'>4240000</memory>
  <currentMemory unit='KiB'>4240000</currentMemory>
  <vcpu placement='static' cpuset='0-1'>4</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel7.4.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <ioapic driver='qemu'/>
  </features>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>coredump-restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/home/micai/rhel75-64-virtio-scsi.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x16'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x17'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0x18'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x19'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0x1a'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x2'/>
    </controller>
    <controller type='pci' index='11' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='11' port='0x1b'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x3'/>
    </controller>
    <controller type='pci' index='12' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='12' port='0x1c'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x4'/>
    </controller>
    <controller type='pci' index='13' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='13' port='0x1d'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x5'/>
    </controller>
    <controller type='pci' index='14' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='14' port='0x1e'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x6'/>
    </controller>
    <controller type='pci' index='15' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='15' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='16' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='16' port='0x1f'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x7'/>
    </controller>
    <controller type='pci' index='17' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='17' port='0x20'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0' multifunction='on'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:a5:95:18'/>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x11' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/var/log/libvirt/qemu/rhel7.3-serial1.log' append='on'/>
      <target type='isa-serial' port='1'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='file'>
      <source path='/var/log/libvirt/qemu/rhel7.3-serial1.log' append='on'/>
      <target type='serial' port='1'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='2'/>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='3'/>
    </redirdev>
    <watchdog model='ib700' action='reset'/>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </memballoon>
    <panic model='isa'/>
    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' eim='on' iotlb='on'/>
    </iommu>
  </devices>
</domain>
2.Enable iommu in the guest os:
(1)Edit file  "/etc/default/grub":
Append "intel_iommu=on" into "GRUB_CMDLINE_LINUX=......"
(2)Then run the following command to generate the updated grub file:
# grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot the guest.

3.On the source host, add load:
# stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 1000s &
4.Do cross migration from 7.5->7.4:
#while true; do virsh migrate vm3 qemu+ssh://10.66.4.116/system --live --verbose; done

Normal phenomenon:
[root@dhcp-10-122 micai]# while true; do virsh migrate iommu1  qemu+ssh://10.66.10.208/system --live --verbose; done
Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-01-31T05:28:21.158642Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-01-31T05:28:21.158685Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-01-31T05:28:21.158689Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-01-31T05:28:21.159430Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:21.159483Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:21.159512Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:21.159539Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:21.159654Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-01-31T05:28:26.369607Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-01-31T05:28:26.369639Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-01-31T05:28:26.369643Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-01-31T05:28:26.369680Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:26.370457Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:26.370500Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:26.370547Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392142 kHz), and TSC scaling unavailable
2018-01-31T05:28:26.370661Z qemu-kvm: load of migration failed: Operation not permitted

Abnormal phenomenon:
The source guest crashes:
[root@dhcp-10-122 micai]# while true; do virsh migrate iommu1  qemu+ssh://10.66.10.208/system --live --verbose; done
Migration: [100 %]
error: Requested operation is not valid: domain is not running

error: Requested operation is not valid: domain is not running

error: Requested operation is not valid: domain is not running

5.The qemu log on the source host:
main_channel_link: add main channel client
main_channel_client_handle_pong: net test: latency 0.418000 ms, bitrate 144133999 bps (137.456893 Mbps)
red_qxl_set_cursor_peer:
inputs_connect: inputs channel client create
main_channel_handle_message: agent start
red_channel_client_disconnect: rcc=0x55ec46ef09c0 (channel=0x55ec46859100 type=3 id=0)
red_channel_client_disconnect: rcc=0x55ec5309fdd0 (channel=0x55ec479cf8a0 type=4 id=0)
red_channel_client_disconnect: rcc=0x55ec5309a5b0 (channel=0x55ec46858a30 type=2 id=0)
red_channel_client_disconnect: rcc=0x55ec46f149f0 (channel=0x55ec46858960 type=1 id=0)
main_channel_client_on_disconnect: rcc=0x55ec46f149f0
red_client_destroy: destroy client 0x55ec4689c1e0 with #channels=6
red_qxl_disconnect_cursor_peer:
red_qxl_disconnect_display_peer:
red_channel_client_disconnect: rcc=0x55ec496821b0 (channel=0x55ec479cf960 type=9 id=0)
red_channel_client_disconnect: rcc=0x55ec46f2b1b0 (channel=0x55ec479cfa30 type=9 id=1)
main_channel_link: add main channel client
main_channel_client_handle_pong: net test: latency 5.584000 ms, bitrate 1014363546 bps (967.372461 Mbps)
red_qxl_set_cursor_peer:
inputs_connect: inputs channel client create
2018-01-31 04:44:06.770+0000: shutting down, reason=crashed
2018-01-31 04:45:30.604+0000: starting up libvirt version: 3.9.0, package: 6.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2017-12-12-04:11:26, x86-019.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-18.el7), hostname: dhcp-10-122.nay.redhat.com
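
To tell a crash apart from a normal shutdown without reading the whole per-domain log, the "shutting down, reason=..." lines can be extracted. This is a sketch based only on the line format visible in the logs quoted in this report; the helper names are hypothetical.

```python
import re
from typing import List, Tuple

SHUTDOWN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\+\d{4}): "
    r"shutting down, reason=(?P<reason>\w+)",
    re.MULTILINE,
)


def shutdown_reasons(log_text: str) -> List[Tuple[str, str]]:
    """Return (timestamp, reason) for every shutdown line in the log text."""
    return [(m.group("ts"), m.group("reason")) for m in SHUTDOWN_RE.finditer(log_text)]


def has_crash(log_text: str) -> bool:
    """True if any shutdown line reports reason=crashed."""
    return any(reason == "crashed" for _, reason in shutdown_reasons(log_text))
```

Applied to a log such as /var/log/libvirt/qemu/iommu1.log, has_crash() would flag the run above, while a log ending in reason=migrated would not be flagged.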

Comment 6 Dr. David Alan Gilbert 2018-02-02 17:21:19 UTC
Getting this to fail is quite hard; I've done ~160 migrates so far today and had one failure. I can kind of imagine why it *might* fail, but currently don't have any way to make sure.

Comment 10 Dr. David Alan Gilbert 2018-02-05 09:43:32 UTC
Posted upstream:
migration: Recover block devices if failure in device state

Comment 15 Miroslav Rezanina 2018-02-20 13:38:55 UTC
Fix included in qemu-kvm-rhev-2.10.0-21.el7
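
A quick way to check whether a host's package is at or past the fixed build is to compare version and release numerically. This is a simplified sketch, not RPM's real version comparison: it only looks at the dotted version and the leading integer of the release, and it assumes that later builds keep the fix. The helper names are hypothetical.

```python
import re
from typing import Tuple

# Build named in the comment above: qemu-kvm-rhev-2.10.0-21.el7
FIXED: Tuple[Tuple[int, ...], int] = ((2, 10, 0), 21)

NVR_RE = re.compile(r"^qemu-kvm-rhev-(?P<ver>\d+(?:\.\d+)*)-(?P<rel>\d+)")


def parse_nvr(nvr: str) -> Tuple[Tuple[int, ...], int]:
    """Split e.g. 'qemu-kvm-rhev-2.10.0-18.el7.x86_64' into ((2, 10, 0), 18)."""
    match = NVR_RE.match(nvr)
    if match is None:
        raise ValueError(f"not a qemu-kvm-rhev package string: {nvr!r}")
    version = tuple(int(part) for part in match.group("ver").split("."))
    return version, int(match.group("rel"))


def has_fix(nvr: str) -> bool:
    """True if the package is at or after the build that includes the fix."""
    return parse_nvr(nvr) >= FIXED
```

For example, the source host package from the description (qemu-kvm-rhev-2.10.0-18.el7.x86_64) compares as older than the fixed build.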

Comment 17 Yongxue Hong 2018-02-23 04:31:51 UTC
Reproduction:
Src host(RHEL 7.5):
libvirt: libvirt-3.9.0-9.el7.x86_64
qemu: qemu-kvm-rhev-2.10.0-18.el7.x86_64
host: 3.10.0-855.el7.x86_64
guest: 3.10.0-855.el7.x86_64

Dst host(RHEL 7.4):
libvirt: libvirt-3.9.0-9.el7.x86_64
qemu: qemu-kvm-rhev-2.10.0-16.el7.x86_64
host: 3.10.0-693.20.1.el7.x86_64

The steps to reproduce are the same as in comment 5.

Actual result:
[root@dhcp-10-122 yhong]# while true; do virsh migrate iommu1 qemu+ssh://10.66.10.208/system --live --verbose; done
Migration: [100 %]
error: Requested operation is not valid: domain is not running
error: Requested operation is not valid: domain is not running

The src libvirt log:
[root@dhcp-10-122 qemu]# cat iommu1.log 
2018-02-23 02:42:46.107+0000: starting up libvirt version: 3.9.0, package: 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-18.el7), hostname: dhcp-10-122.nay.redhat.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=iommu1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-16-iommu1/master-key.aes -machine pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-16-iommu1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,addr=0x3 -device pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2 -device pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3 -device pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4 -device pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5 -device pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6 -device 
pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7 -device pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,addr=0x11 -add-fd set=2,fd=30 -chardev file,id=charserial0,path=/dev/fdset/2,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-16-iommu1/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc 0.0.0.0:10,share=allow-exclusive -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0 -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device 
usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic -msg timestamp=on
2018-02-23 02:43:07.705+0000: initiating migration
2018-02-23 02:43:09.428+0000: shutting down, reason=migrated
2018-02-23T02:43:09.430823Z qemu-kvm: terminating on signal 15 from pid 10049 (/usr/sbin/libvirtd)

The dst libvirt log:
[root@dhcp-10-208 qemu]# cat iommu1.log 
2018-02-23 02:30:37.280+0000: shutting down, reason=crashed
2018-02-23 02:31:00.094+0000: starting up libvirt version: 3.9.0, package: 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-16.el7), hostname: dhcp-10-208.nay.redhat.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=iommu1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-11-iommu1/master-key.aes -machine pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-11-iommu1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,addr=0x3 -device pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2 -device pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3 -device pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4 -device pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5 -device pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6 -device 
pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7 -device pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,addr=0x11 -add-fd set=2,fd=30 -chardev file,id=charserial0,path=/dev/fdset/2,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-11-iommu1/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc 0.0.0.0:10,share=allow-exclusive -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0 -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device 
usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic -msg timestamp=on
2018-02-23T02:31:01.546540Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
2018-02-23T02:31:01.546660Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
2018-02-23T02:31:01.546705Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
2018-02-23T02:31:01.546755Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
[root@dhcp-10-208 qemu]# 

The dst guest crashed.
Gdb backtrace info:
[root@dhcp-10-208 qemu]# gdb -p 14424
(gdb) bt
#0  0x00007fe431bb5f0f in ppoll () at /lib64/libc.so.6
#1  0x00005561f6fe6579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffdbe677a00, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  0x00005561f6fe6579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=1606302113) at util/qemu-timer.c:334
#3  0x00005561f6fe7378 in main_loop_wait (timeout=1606302113) at util/main-loop.c:255
#4  0x00005561f6fe7378 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
#5  0x00005561f6cc784a in main () at vl.c:1917
#6  0x00005561f6cc784a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4805
(gdb) bt full
#0  0x00007fe431bb5f0f in ppoll () at /lib64/libc.so.6
#1  0x00005561f6fe6579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffdbe677a00, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
        ts = {tv_sec = 1, tv_nsec = 606302113}
Python Exception <class 'gdb.error'> That operation is not available on integers of more than 8 bytes.: 
#2  0x00005561f6fe6579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=1606302113) at util/qemu-timer.c:334
        ts = {tv_sec = 1, tv_nsec = 606302113}
Python Exception <class 'gdb.error'> That operation is not available on integers of more than 8 bytes.: 
#3  0x00005561f6fe7378 in main_loop_wait (timeout=1606302113) at util/main-loop.c:255
        context = 0x5561f8476a50
        ret = <optimized out>
        spin_counter = 0
        ret = -1100514764
        timeout = 4294967295
        timeout_ns = <optimized out>
#4  0x00005561f6fe7378 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
        ret = -1100514764
        timeout = 4294967295
        timeout_ns = <optimized out>
#5  0x00005561f6cc784a in main () at vl.c:1917
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = <optimized out>
        boot_once = 0x0
        cyls = <optimized out>
        heads = <optimized out>
        secs = <optimized out>
        translation = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        hda_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = <optimized out>
        olist = <optimized out>
        optind = 130
        optarg = 0x7ffdbe679f81 "timestamp=on"
        loadvm = <optimized out>
        machine_class = 0x0
        cpu_model = <optimized out>
        vga_model = 0x0
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        defconfig = <optimized out>
        userconfig = <optimized out>
        nographic = <optimized out>
        display_type = <optimized out>
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffdbe677b90}
        __func__ = "main"
        __FUNCTION__ = "main"
#6  0x00005561f6cc784a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4805
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = <optimized out>
        boot_once = 0x0
        cyls = <optimized out>
        heads = <optimized out>
        secs = <optimized out>
        translation = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        hda_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = <optimized out>
        olist = <optimized out>
        optind = 130
        optarg = 0x7ffdbe679f81 "timestamp=on"
        loadvm = <optimized out>
        machine_class = 0x0
        cpu_model = <optimized out>
        vga_model = 0x0
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        defconfig = <optimized out>
        userconfig = <optimized out>
        nographic = <optimized out>
        display_type = <optimized out>
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffdbe677b90}
        __func__ = "main"
        __FUNCTION__ = "main"
(gdb) 


Verification (repeated 5 times) with qemu-kvm-rhev-2.10.0-21.el7.x86_64 on the src host:
Src host:
libvirt: libvirt-3.9.0-9.el7.x86_64
qemu: qemu-kvm-rhev-2.10.0-21.el7.x86_64
host: 3.10.0-855.el7.x86_64
guest: 3.10.0-855.el7.x86_64

Dst host:
libvirt: libvirt-3.9.0-9.el7.x86_64
qemu: qemu-kvm-rhev-2.10.0-16.el7.x86_64
host: 3.10.0-693.20.1.el7.x86_64

The verification steps are the same as in comment 5.

Actual result:
[root@dhcp-10-122 yhong]# while true; do virsh migrate iommu1 qemu+ssh://10.66.10.208/system --live --verbose; done
Migration: [100 %]
error: Requested operation is not valid: domain is not running

error: Requested operation is not valid: domain is not running

The src libvirt log info:
[root@dhcp-10-122 qemu]# cat iommu1.log 
2018-02-23 03:04:54.797+0000: starting up libvirt version: 3.9.0, package: 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-21.el7), hostname: dhcp-10-122.nay.redhat.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=iommu1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-18-iommu1/master-key.aes -machine pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-18-iommu1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,addr=0x3 -device pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2 -device pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3 -device pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4 -device pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5 -device pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6 -device 
pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7 -device pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,addr=0x11 -add-fd set=2,fd=30 -chardev file,id=charserial0,path=/dev/fdset/2,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-18-iommu1/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc 0.0.0.0:10,share=allow-exclusive -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0 -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device 
usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic -msg timestamp=on
2018-02-23 03:07:18.624+0000: initiating migration
2018-02-23 03:07:26.862+0000: shutting down, reason=migrated
2018-02-23T03:07:26.864252Z qemu-kvm: terminating on signal 15 from pid 10049 (/usr/sbin/libvirtd)

The dst libvirt log info:
[root@dhcp-10-208 qemu]# cat iommu1.log 
2018-02-23 04:22:41.744+0000: shutting down, reason=crashed
2018-02-23 04:23:02.188+0000: starting up libvirt version: 3.9.0, package: 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version: 2.10.0(qemu-kvm-rhev-2.10.0-16.el7), hostname: dhcp-10-208.nay.redhat.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=iommu1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-10-iommu1/master-key.aes -machine pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-10-iommu1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,addr=0x3 -device pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2 -device pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3 -device pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4 -device pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5 -device pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6 -device 
pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7 -device pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,addr=0x11 -add-fd set=2,fd=30 -chardev file,id=charserial0,path=/dev/fdset/2,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-10-iommu1/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc 0.0.0.0:10,share=allow-exclusive -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0 -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device 
usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic -msg timestamp=on
2018-02-23T04:23:03.778396Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-23T04:23:03.778497Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-23T04:23:03.778537Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-23T04:23:03.778574Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
[root@dhcp-10-208 qemu]# 
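
When the migration loop is repeated many times, the shutdown reasons recorded in a domain log such as the one above can be tallied with a short script. This is only a minimal sketch, assuming the line format seen in these logs (`<timestamp>: shutting down, reason=<reason>`); the function name `shutdown_reasons` is hypothetical and not part of libvirt.

```python
import re
from collections import Counter

# Assumed line format, copied from the libvirt domain logs in this bug:
#   2018-02-23 04:22:41.744+0000: shutting down, reason=crashed
SHUTDOWN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\+\d{4}): "
    r"shutting down, reason=(?P<reason>\w+)"
)


def shutdown_reasons(log_text):
    """Return a list of (timestamp, reason) pairs, one per shutdown line."""
    events = []
    for line in log_text.splitlines():
        match = SHUTDOWN_RE.match(line.strip())
        if match:
            events.append((match.group("ts"), match.group("reason")))
    return events


def count_reasons(log_text):
    """Count how often each shutdown reason (e.g. crashed, migrated) occurs."""
    return Counter(reason for _, reason in shutdown_reasons(log_text))
```

For example, `count_reasons(open("/var/log/libvirt/qemu/iommu1.log").read())` would give the number of `crashed` versus `migrated` shutdowns on a host, assuming the log lives at libvirt's default location.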

Gdb backtrace info:
(gdb) bt
#0  0x00007ff792319f0f in ppoll () at /lib64/libc.so.6
#1  0x00005622b42ec579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffc8581e060, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  0x00005622b42ec579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=27078081) at util/qemu-timer.c:334
#3  0x00005622b42ed378 in main_loop_wait (timeout=27078081) at util/main-loop.c:255
#4  0x00005622b42ed378 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
#5  0x00005622b3fcd84a in main () at vl.c:1917
#6  0x00005622b3fcd84a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4805
(gdb) bt full
#0  0x00007ff792319f0f in ppoll () at /lib64/libc.so.6
#1  0x00005622b42ec579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffc8581e060, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
        ts = {tv_sec = 0, tv_nsec = 27078081}
Python Exception <class 'gdb.error'> That operation is not available on integers of more than 8 bytes.: 
#2  0x00005622b42ec579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=27078081) at util/qemu-timer.c:334
        ts = {tv_sec = 0, tv_nsec = 27078081}
Python Exception <class 'gdb.error'> That operation is not available on integers of more than 8 bytes.: 
#3  0x00005622b42ed378 in main_loop_wait (timeout=27078081) at util/main-loop.c:255
        context = 0x5622b67cea50
        ret = <optimized out>
        spin_counter = 0
        ret = -2055085932
        timeout = 4294967295
        timeout_ns = <optimized out>
#4  0x00005622b42ed378 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
        ret = -2055085932
        timeout = 4294967295
        timeout_ns = <optimized out>
#5  0x00005622b3fcd84a in main () at vl.c:1917
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = <optimized out>
        boot_once = 0x0
        cyls = <optimized out>
        heads = <optimized out>
        secs = <optimized out>
        translation = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        hda_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = <optimized out>
        olist = <optimized out>
        optind = 130
        optarg = 0x7ffc85820f81 "timestamp=on"
        loadvm = <optimized out>
        machine_class = 0x0
        cpu_model = <optimized out>
        vga_model = 0x0
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        defconfig = <optimized out>
        userconfig = <optimized out>
        nographic = <optimized out>
        display_type = <optimized out>
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffc8581e1f0}
        __func__ = "main"
        __FUNCTION__ = "main"
#6  0x00005622b3fcd84a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4805
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = <optimized out>
        boot_once = 0x0
        cyls = <optimized out>
        heads = <optimized out>
        secs = <optimized out>
        translation = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        hda_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = <optimized out>
        olist = <optimized out>
        optind = 130
        optarg = 0x7ffc85820f81 "timestamp=on"
        loadvm = <optimized out>
        machine_class = 0x0
        cpu_model = <optimized out>
        vga_model = 0x0
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        defconfig = <optimized out>
        userconfig = <optimized out>
        nographic = <optimized out>
        display_type = <optimized out>
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffc8581e1f0}
        __func__ = "main"
        __FUNCTION__ = "main"

How reproducible:
20%

Note: The dst guest crashes easily during booting.
This bug does not seem to be fixed with qemu-kvm-rhev-2.10.0-21.el7.x86_64.
Or is there a problem with the verification steps?
Thanks.

Comment 18 yafu 2018-02-23 05:50:07 UTC
(In reply to Yongxue Hong from comment #17)
> Reproduction:
> Src host(RHEL 7.5):
> libvirt: libvirt-3.9.0-9.el7.x86_64
> qemu: qemu-kvm-rhev-2.10.0-18.el7.x86_64
> host: 3.10.0-855.el7.x86_64
> guest: 3.10.0-855.el7.x86_64
> 
> Dst host(RHEL 7.4):
> libvirt: libvirt-3.9.0-9.el7.x86_64
> qemu: qemu-kvm-rhev-2.10.0-16.el7.x86_64
> host: 3.10.0-693.20.1.el7.x86_64
> 
> Step of reproduction same as comment 5.
> 
> Actual result:
> [root@dhcp-10-122 yhong]# while true; do virsh migrate iommu1
> qemu+ssh://10.66.10.208/system --live --verbose; done
> Migration: [100 %]
> error: Requested operation is not valid: domain is not running
> error: Requested operation is not valid: domain is not running
> 
> The src libvirt log info :
> [root@dhcp-10-122 qemu]# cat iommu1.log 
> 2018-02-23 02:42:46.107+0000: starting up libvirt version: 3.9.0, package:
> 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>,
> 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version:
> 2.10.0(qemu-kvm-rhev-2.10.0-18.el7), hostname: dhcp-10-122.nay.redhat.com
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name
> guest=iommu1,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-16-iommu1/
> master-key.aes -machine
> pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split
> -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
> 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-16-iommu1/monitor.
> sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet
> -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1
> -boot strict=on -device
> intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device
> pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,
> addr=0x2 -device
> pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device
> pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device
> pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device
> pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device
> pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device
> pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device
> pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,
> addr=0x3 -device
> pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device
> pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2
> -device
> pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3
> -device
> pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4
> -device
> pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5
> -device
> pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6
> -device pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2
> -device
> pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7
> -device
> pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,
> addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device
> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,
> addr=0x5 -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device
> ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device
> virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive
> file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-
> virtio-disk0,cache=none -device
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,
> id=virtio-disk0,bootindex=1 -netdev
> tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,
> addr=0x11 -add-fd set=2,fd=30 -chardev
> file,id=charserial0,path=/dev/fdset/2,append=on -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-16-
> iommu1/org.qemu.guest_agent.0,server,nowait -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,
> name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,
> name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice
> port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc
> 0.0.0.0:10,share=allow-exclusive -device
> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,
> vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0
> -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device
> usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev
> spicevmc,id=charredir1,name=usbredir -device
> usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device
> virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic -msg
> timestamp=on
> 2018-02-23 02:43:07.705+0000: initiating migration
> 2018-02-23 02:43:09.428+0000: shutting down, reason=migrated
> 2018-02-23T02:43:09.430823Z qemu-kvm: terminating on signal 15 from pid
> 10049 (/usr/sbin/libvirtd)
> 
> The dst libvirt log info :
> [root@dhcp-10-208 qemu]# cat iommu1.log 
> 2018-02-23 02:30:37.280+0000: shutting down, reason=crashed
> 2018-02-23 02:31:00.094+0000: starting up libvirt version: 3.9.0, package:
> 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>,
> 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version:
> 2.10.0(qemu-kvm-rhev-2.10.0-16.el7), hostname: dhcp-10-208.nay.redhat.com
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name
> guest=iommu1,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-11-iommu1/
> master-key.aes -machine
> pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split
> -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
> 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-11-iommu1/monitor.
> sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet
> -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1
> -boot strict=on -device
> intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device
> pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,
> addr=0x2 -device
> pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device
> pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device
> pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device
> pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device
> pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device
> pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device
> pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,
> addr=0x3 -device
> pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device
> pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2
> -device
> pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3
> -device
> pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4
> -device
> pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5
> -device
> pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6
> -device pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2
> -device
> pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7
> -device
> pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,
> addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device
> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,
> addr=0x5 -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device
> ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device
> virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive
> file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-
> virtio-disk0,cache=none -device
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,
> id=virtio-disk0,bootindex=1 -netdev
> tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,
> addr=0x11 -add-fd set=2,fd=30 -chardev
> file,id=charserial0,path=/dev/fdset/2,append=on -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-11-
> iommu1/org.qemu.guest_agent.0,server,nowait -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,
> name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,
> name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice
> port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc
> 0.0.0.0:10,share=allow-exclusive -device
> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,
> vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0
> -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device
> usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev
> spicevmc,id=charredir1,name=usbredir -device
> usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -incoming defer
> -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic
> -msg timestamp=on
> 2018-02-23T02:31:01.546540Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
> 2018-02-23T02:31:01.546660Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
> 2018-02-23T02:31:01.546705Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
> 2018-02-23T02:31:01.546755Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392162 kHz), and TSC scaling unavailable
> [root@dhcp-10-208 qemu]# 
> 
> The dst guest crashed.
> GDB backtrace info:
> [root@dhcp-10-208 qemu]# gdb -p 14424
> (gdb) bt
> #0  0x00007fe431bb5f0f in ppoll () at /lib64/libc.so.6
> #1  0x00005561f6fe6579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffdbe677a00,
> __nfds=<optimized out>, __fds=<optimized out>) at
> /usr/include/bits/poll2.h:77
> #2  0x00005561f6fe6579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized
> out>, timeout=timeout@entry=1606302113) at util/qemu-timer.c:334
> #3  0x00005561f6fe7378 in main_loop_wait (timeout=1606302113) at
> util/main-loop.c:255
> #4  0x00005561f6fe7378 in main_loop_wait (nonblocking=nonblocking@entry=0)
> at util/main-loop.c:515
> #5  0x00005561f6cc784a in main () at vl.c:1917
> #6  0x00005561f6cc784a in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:4805
> (gdb) bt full
> #0  0x00007fe431bb5f0f in ppoll () at /lib64/libc.so.6
> #1  0x00005561f6fe6579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffdbe677a00,
> __nfds=<optimized out>, __fds=<optimized out>) at
> /usr/include/bits/poll2.h:77
>         ts = {tv_sec = 1, tv_nsec = 606302113}
> Python Exception <class 'gdb.error'> That operation is not available on
> integers of more than 8 bytes.: 
> #2  0x00005561f6fe6579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized
> out>, timeout=timeout@entry=1606302113) at util/qemu-timer.c:334
>         ts = {tv_sec = 1, tv_nsec = 606302113}
> Python Exception <class 'gdb.error'> That operation is not available on
> integers of more than 8 bytes.: 
> #3  0x00005561f6fe7378 in main_loop_wait (timeout=1606302113) at
> util/main-loop.c:255
>         context = 0x5561f8476a50
>         ret = <optimized out>
>         spin_counter = 0
>         ret = -1100514764
>         timeout = 4294967295
>         timeout_ns = <optimized out>
> #4  0x00005561f6fe7378 in main_loop_wait (nonblocking=nonblocking@entry=0)
> at util/main-loop.c:515
>         ret = -1100514764
>         timeout = 4294967295
>         timeout_ns = <optimized out>
> #5  0x00005561f6cc784a in main () at vl.c:1917
>         i = <optimized out>
>         snapshot = <optimized out>
>         linux_boot = <optimized out>
>         initrd_filename = <optimized out>
>         kernel_filename = <optimized out>
>         kernel_cmdline = <optimized out>
>         boot_order = <optimized out>
>         boot_once = 0x0
>         cyls = <optimized out>
>         heads = <optimized out>
>         secs = <optimized out>
>         translation = <optimized out>
>         opts = <optimized out>
>         machine_opts = <optimized out>
>         hda_opts = <optimized out>
>         icount_opts = <optimized out>
>         accel_opts = <optimized out>
>         olist = <optimized out>
>         optind = 130
>         optarg = 0x7ffdbe679f81 "timestamp=on"
> ---Type <return> to continue, or q <return> to quit---
>         loadvm = <optimized out>
>         machine_class = 0x0
>         cpu_model = <optimized out>
>         vga_model = 0x0
>         qtest_chrdev = <optimized out>
>         qtest_log = <optimized out>
>         pid_file = <optimized out>
>         incoming = <optimized out>
>         defconfig = <optimized out>
>         userconfig = <optimized out>
>         nographic = <optimized out>
>         display_type = <optimized out>
>         display_remote = <optimized out>
>         log_mask = <optimized out>
>         log_file = <optimized out>
>         trace_file = <optimized out>
>         maxram_size = <optimized out>
>         ram_slots = <optimized out>
>         vmstate_dump_file = <optimized out>
>         main_loop_err = 0x0
>         err = 0x0
>         list_data_dirs = <optimized out>
>         bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffdbe677b90}
>         __func__ = "main"
>         __FUNCTION__ = "main"
> #6  0x00005561f6cc784a in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:4805
>         i = <optimized out>
>         snapshot = <optimized out>
>         linux_boot = <optimized out>
>         initrd_filename = <optimized out>
>         kernel_filename = <optimized out>
>         kernel_cmdline = <optimized out>
>         boot_order = <optimized out>
>         boot_once = 0x0
>         cyls = <optimized out>
>         heads = <optimized out>
>         secs = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         translation = <optimized out>
>         opts = <optimized out>
>         machine_opts = <optimized out>
>         hda_opts = <optimized out>
>         icount_opts = <optimized out>
>         accel_opts = <optimized out>
>         olist = <optimized out>
>         optind = 130
>         optarg = 0x7ffdbe679f81 "timestamp=on"
>         loadvm = <optimized out>
>         machine_class = 0x0
>         cpu_model = <optimized out>
>         vga_model = 0x0
>         qtest_chrdev = <optimized out>
>         qtest_log = <optimized out>
>         pid_file = <optimized out>
>         incoming = <optimized out>
>         defconfig = <optimized out>
>         userconfig = <optimized out>
>         nographic = <optimized out>
>         display_type = <optimized out>
>         display_remote = <optimized out>
>         log_mask = <optimized out>
>         log_file = <optimized out>
>         trace_file = <optimized out>
>         maxram_size = <optimized out>
>         ram_slots = <optimized out>
>         vmstate_dump_file = <optimized out>
>         main_loop_err = 0x0
>         err = 0x0
>         list_data_dirs = <optimized out>
>         bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffdbe677b90}
>         __func__ = "main"
>         __FUNCTION__ = "main"
> (gdb) 
> 
> 
> Verification (repeated 5 times) with qemu-kvm-rhev-2.10.0-21.el7.x86_64 on
> the src host:
> Src host:
> libvirt: libvirt-3.9.0-9.el7.x86_64
> qemu: qemu-kvm-rhev-2.10.0-21.el7.x86_64
> host: 3.10.0-855.el7.x86_64
> guest: 3.10.0-855.el7.x86_64
> 
> Dst host:
> libvirt: libvirt-3.9.0-9.el7.x86_64
> qemu: qemu-kvm-rhev-2.10.0-16.el7.x86_64
> host: 3.10.0-693.20.1.el7.x86_64
> 
> The verification steps are the same as in comment 5.
> 
> Actual result:
> [root@dhcp-10-122 yhong]# while true; do virsh migrate iommu1
> qemu+ssh://10.66.10.208/system --live --verbose; done
> Migration: [100 %]
> error: Requested operation is not valid: domain is not running
> 
> error: Requested operation is not valid: domain is not running
> 
> The src libvirt log info:
> [root@dhcp-10-122 qemu]# cat iommu1.log 
> 2018-02-23 03:04:54.797+0000: starting up libvirt version: 3.9.0, package:
> 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>,
> 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version:
> 2.10.0(qemu-kvm-rhev-2.10.0-21.el7), hostname: dhcp-10-122.nay.redhat.com
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name
> guest=iommu1,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-18-iommu1/
> master-key.aes -machine
> pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split
> -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
> 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-18-iommu1/monitor.
> sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet
> -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1
> -boot strict=on -device
> intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device
> pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,
> addr=0x2 -device
> pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device
> pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device
> pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device
> pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device
> pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device
> pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device
> pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,
> addr=0x3 -device
> pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device
> pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2
> -device
> pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3
> -device
> pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4
> -device
> pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5
> -device
> pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6
> -device pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2
> -device
> pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7
> -device
> pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,
> addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device
> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,
> addr=0x5 -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device
> ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device
> virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive
> file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-
> virtio-disk0,cache=none -device
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,
> id=virtio-disk0,bootindex=1 -netdev
> tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,
> addr=0x11 -add-fd set=2,fd=30 -chardev
> file,id=charserial0,path=/dev/fdset/2,append=on -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-18-
> iommu1/org.qemu.guest_agent.0,server,nowait -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,
> name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,
> name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice
> port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc
> 0.0.0.0:10,share=allow-exclusive -device
> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,
> vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0
> -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device
> usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev
> spicevmc,id=charredir1,name=usbredir -device
> usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device
> virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic -msg
> timestamp=on
> 2018-02-23 03:07:18.624+0000: initiating migration
> 2018-02-23 03:07:26.862+0000: shutting down, reason=migrated
> 2018-02-23T03:07:26.864252Z qemu-kvm: terminating on signal 15 from pid
> 10049 (/usr/sbin/libvirtd)
> 
> The dst libvirt log info:
> [root@dhcp-10-208 qemu]# cat iommu1.log 
> 2018-02-23 04:22:41.744+0000: shutting down, reason=crashed
> 2018-02-23 04:23:02.188+0000: starting up libvirt version: 3.9.0, package:
> 9.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>,
> 2018-01-23-10:22:35, x86-040.build.eng.bos.redhat.com), qemu version:
> 2.10.0(qemu-kvm-rhev-2.10.0-16.el7), hostname: dhcp-10-208.nay.redhat.com
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name
> guest=iommu1,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-10-iommu1/
> master-key.aes -machine
> pc-q35-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split
> -m 4141 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
> 1b3268d6-b59c-406b-a14c-33b000b15b6c -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-10-iommu1/monitor.
> sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet
> -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1
> -boot strict=on -device
> intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on -device
> pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,
> addr=0x2 -device
> pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device
> pcie-root-port,port=0x13,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x3 -device
> pcie-root-port,port=0x14,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x4 -device
> pcie-root-port,port=0x15,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x5 -device
> pcie-root-port,port=0x16,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x6 -device
> pcie-root-port,port=0x17,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x7 -device
> pcie-root-port,port=0x18,chassis=8,id=pci.8,bus=pcie.0,multifunction=on,
> addr=0x3 -device
> pcie-root-port,port=0x19,chassis=9,id=pci.9,bus=pcie.0,addr=0x3.0x1 -device
> pcie-root-port,port=0x1a,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x2
> -device
> pcie-root-port,port=0x1b,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x3
> -device
> pcie-root-port,port=0x1c,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x4
> -device
> pcie-root-port,port=0x1d,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x5
> -device
> pcie-root-port,port=0x1e,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x6
> -device pcie-root-port,port=0xa,chassis=15,id=pci.15,bus=pcie.0,addr=0x2.0x2
> -device
> pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7
> -device
> pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,
> addr=0x4 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x5.0x7 -device
> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,
> addr=0x5 -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x5.0x1 -device
> ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x5.0x2 -device
> virtio-serial-pci,id=virtio-serial0,bus=pcie.0,addr=0x6 -drive
> file=/home/yhong/rhel75-x86_64-scsi-30G.qcow2,format=qcow2,if=none,id=drive-
> virtio-disk0,cache=none -device
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x0,drive=drive-virtio-disk0,
> id=virtio-disk0,bootindex=1 -netdev
> tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a5:95:18,bus=pcie.0,
> addr=0x11 -add-fd set=2,fd=30 -chardev
> file,id=charserial0,path=/dev/fdset/2,append=on -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-10-
> iommu1/org.qemu.guest_agent.0,server,nowait -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,
> name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,
> name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice
> port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vnc
> 0.0.0.0:10,share=allow-exclusive -device
> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,
> vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ib700,id=watchdog0
> -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device
> usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev
> spicevmc,id=charredir1,name=usbredir -device
> usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -incoming defer
> -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -device pvpanic
> -msg timestamp=on
> 2018-02-23T04:23:03.778396Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
> 2018-02-23T04:23:03.778497Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
> 2018-02-23T04:23:03.778537Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
> 2018-02-23T04:23:03.778574Z qemu-kvm: warning: TSC frequency mismatch
> between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
> [root@dhcp-10-208 qemu]# 
> 
> GDB backtrace info:
> (gdb) bt
> #0  0x00007ff792319f0f in ppoll () at /lib64/libc.so.6
> #1  0x00005622b42ec579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffc8581e060,
> __nfds=<optimized out>, __fds=<optimized out>) at
> /usr/include/bits/poll2.h:77
> #2  0x00005622b42ec579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized
> out>, timeout=timeout@entry=27078081) at util/qemu-timer.c:334
> #3  0x00005622b42ed378 in main_loop_wait (timeout=27078081) at
> util/main-loop.c:255
> #4  0x00005622b42ed378 in main_loop_wait (nonblocking=nonblocking@entry=0)
> at util/main-loop.c:515
> #5  0x00005622b3fcd84a in main () at vl.c:1917
> #6  0x00005622b3fcd84a in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:4805
> (gdb) bt full
> #0  0x00007ff792319f0f in ppoll () at /lib64/libc.so.6
> #1  0x00005622b42ec579 in qemu_poll_ns (__ss=0x0, __timeout=0x7ffc8581e060,
> __nfds=<optimized out>, __fds=<optimized out>) at
> /usr/include/bits/poll2.h:77
>         ts = {tv_sec = 0, tv_nsec = 27078081}
> Python Exception <class 'gdb.error'> That operation is not available on
> integers of more than 8 bytes.: 
> #2  0x00005622b42ec579 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized
> out>, timeout=timeout@entry=27078081) at util/qemu-timer.c:334
>         ts = {tv_sec = 0, tv_nsec = 27078081}
> Python Exception <class 'gdb.error'> That operation is not available on
> integers of more than 8 bytes.: 
> #3  0x00005622b42ed378 in main_loop_wait (timeout=27078081) at
> util/main-loop.c:255
>         context = 0x5622b67cea50
>         ret = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         spin_counter = 0
>         ret = -2055085932
>         timeout = 4294967295
>         timeout_ns = <optimized out>
> #4  0x00005622b42ed378 in main_loop_wait (nonblocking=nonblocking@entry=0)
> at util/main-loop.c:515
>         ret = -2055085932
>         timeout = 4294967295
>         timeout_ns = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
> #5  0x00005622b3fcd84a in main () at vl.c:1917
>         i = <optimized out>
>         snapshot = <optimized out>
>         linux_boot = <optimized out>
>         initrd_filename = <optimized out>
>         kernel_filename = <optimized out>
>         kernel_cmdline = <optimized out>
>         boot_order = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         boot_once = 0x0
>         cyls = <optimized out>
>         heads = <optimized out>
>         secs = <optimized out>
>         translation = <optimized out>
>         opts = <optimized out>
>         machine_opts = <optimized out>
>         hda_opts = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         icount_opts = <optimized out>
>         accel_opts = <optimized out>
>         olist = <optimized out>
>         optind = 130
>         optarg = 0x7ffc85820f81 "timestamp=on"
>         loadvm = <optimized out>
>         machine_class = 0x0
>         cpu_model = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         vga_model = 0x0
>         qtest_chrdev = <optimized out>
>         qtest_log = <optimized out>
>         pid_file = <optimized out>
>         incoming = <optimized out>
>         defconfig = <optimized out>
>         userconfig = <optimized out>
>         nographic = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         display_type = <optimized out>
>         display_remote = <optimized out>
>         log_mask = <optimized out>
>         log_file = <optimized out>
>         trace_file = <optimized out>
>         maxram_size = <optimized out>
>         ram_slots = <optimized out>
>         vmstate_dump_file = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         main_loop_err = 0x0
>         err = 0x0
>         list_data_dirs = <optimized out>
>         bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffc8581e1f0}
>         __func__ = "main"
>         __FUNCTION__ = "main"
> #6  0x00005622b3fcd84a in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:4805
>         i = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         snapshot = <optimized out>
>         linux_boot = <optimized out>
>         initrd_filename = <optimized out>
>         kernel_filename = <optimized out>
>         kernel_cmdline = <optimized out>
>         boot_order = <optimized out>
>         boot_once = 0x0
>         cyls = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         heads = <optimized out>
>         secs = <optimized out>
>         translation = <optimized out>
>         opts = <optimized out>
>         machine_opts = <optimized out>
>         hda_opts = <optimized out>
>         icount_opts = <optimized out>
>         accel_opts = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         olist = <optimized out>
>         optind = 130
>         optarg = 0x7ffc85820f81 "timestamp=on"
>         loadvm = <optimized out>
>         machine_class = 0x0
>         cpu_model = <optimized out>
>         vga_model = 0x0
>         qtest_chrdev = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         qtest_log = <optimized out>
>         pid_file = <optimized out>
>         incoming = <optimized out>
>         defconfig = <optimized out>
>         userconfig = <optimized out>
>         nographic = <optimized out>
>         display_type = <optimized out>
>         display_remote = <optimized out>
> ---Type <return> to continue, or q <return> to quit---
>         log_mask = <optimized out>
>         log_file = <optimized out>
>         trace_file = <optimized out>
>         maxram_size = <optimized out>
>         ram_slots = <optimized out>
>         vmstate_dump_file = <optimized out>
>         main_loop_err = 0x0
>         err = 0x0
> ---Type <return> to continue, or q <return> to quit---
>         list_data_dirs = <optimized out>
>         bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffc8581e1f0}
>         __func__ = "main"
>         __FUNCTION__ = "main"
> 
> How reproducible:
> 20%
> 
> Note: The dst guest crashes easily during boot.
> This bug does not seem to be fixed with qemu-kvm-rhev-2.10.0-21.el7.x86_64.
> Or is there a problem with the verification steps?
> Thanks.

The target host needs to run the 7.4 version of the qemu-kvm-rhev package. For more details, please see comment 1 and comment 5.

Comment 19 Yongxue Hong 2018-02-24 02:22:18 UTC
Sorry about comment 17; I forgot to downgrade the version of qemu.

Re-verified with qemu-kvm-rhev-2.9.0-16.el7_4.14.x86_64 on the dst host.

[root@dhcp-10-122 yhong]# while true; do virsh migrate iommu1 qemu+ssh://10.66.10.208/system --live --verbose; done
Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:57:27.279245Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:57:27.279289Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:57:27.279293Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:57:27.286693Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:27.286806Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:27.286861Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:27.286913Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:27.287073Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:57:43.302112Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:57:43.302160Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:57:43.302166Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:57:43.302208Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:43.302286Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:43.302320Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:43.302350Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:43.302467Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:57:58.152070Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:57:58.152116Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:57:58.152123Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:57:58.152169Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:58.152253Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:58.152285Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:58.152317Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:57:58.152434Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:58:12.554427Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:58:12.554472Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:58:12.554477Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:58:12.554507Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:12.554581Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:12.554623Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:12.554660Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:12.554819Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:58:26.782324Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:58:26.782370Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:58:26.782374Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:58:26.789790Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:26.789866Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:26.789901Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:26.789940Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:26.790068Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:58:41.126756Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:58:41.126802Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:58:41.126807Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:58:41.126855Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:41.126919Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:41.126971Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:41.127013Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:41.127150Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:58:55.583743Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:58:55.583788Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:58:55.583793Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:58:55.583840Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:55.583937Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:55.583982Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:55.584040Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:58:55.584212Z qemu-kvm: load of migration failed: Operation not permitted

Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2018-02-24T01:59:10.125045Z qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
2018-02-24T01:59:10.125091Z qemu-kvm: Failed to load ich9_ahci:ahci
2018-02-24T01:59:10.125096Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
2018-02-24T01:59:10.131670Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:59:10.131751Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:59:10.131802Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:59:10.131836Z qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
2018-02-24T01:59:10.131958Z qemu-kvm: load of migration failed: Operation not permitted

Then cancel the migration with 'Ctrl + c'.

Migration: [ 93 %]^J^Cerror: operation aborted: migration job: canceled by client

Migration: [ 81 %]^C^Jerror: operation aborted: migration job: canceled by client

Migration: [ 38 %]^C^Jerror: operation aborted: migration job: canceled by client

^Cerror: End of file while reading data: Killed by signal 2.: Input/output error


Migration: [ 36 %]^J^Cerror: operation aborted: migration job: canceled by client


^Cerror: internal error: client socket is closed

Actual result:
The migration failed, and the source guest kept running normally without crashing.
This matches the expected result in comment 0.

So I think this bug is fixed in qemu-kvm-rhev-2.10.0-21.el7, and I am changing the status to VERIFIED.
Thanks.

Comment 20 Qunfang Zhang 2018-02-24 02:24:41 UTC
Thanks Yongxue, setting to VERIFIED.

Comment 22 errata-xmlrpc 2018-04-11 00:58:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104
