Bug 1086168 - qemu-kvm can not cancel migration in src host when network of dst host failed
Summary: qemu-kvm can not cancel migration in src host when network of dst host failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 1168156
Depends On:
Blocks: 1167197 1168790
 
Reported: 2014-04-10 09:43 UTC by Jun Li
Modified: 2015-11-19 04:52 UTC (History)

Fixed In Version: qemu-kvm-1.5.3-87.el7
Doc Type: Bug Fix
Doc Text:
A failure in the destination host or network during a migration could lead to a long (~15 min) TCP timeout before migrate_cancel took effect. The fix uses the shutdown(2) system call in the migrate_cancel path to force the socket to close quickly.
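
As an illustration of why shutdown(2) helps, the following is a minimal Python sketch, not the QEMU patch itself. A socketpair and a blocked recv() stand in (as assumptions) for the migration socket and the stalled transfer; the function names are hypothetical. The point it shows is that shutdown() called from one thread makes a blocking socket call in another thread return at once instead of waiting for a TCP timeout.

```python
import socket
import threading
import time


def blocked_reader(sock, result):
    # recv() blocks here because the peer never sends anything,
    # much like I/O stalls when the destination stops responding.
    result["data"] = sock.recv(4096)


def shutdown_unblocks_io():
    """Return (unblocked, data) after shutting down a socket with a blocked reader."""
    a, b = socket.socketpair()
    result = {}
    worker = threading.Thread(target=blocked_reader, args=(a, result))
    worker.start()
    time.sleep(0.2)  # give the worker time to enter recv()

    # shutdown(2) marks the socket as closed for reading and writing;
    # the pending recv() returns end-of-file instead of waiting further.
    a.shutdown(socket.SHUT_RDWR)

    worker.join(timeout=5)
    unblocked = not worker.is_alive()
    a.close()
    b.close()
    return unblocked, result.get("data")


if __name__ == "__main__":
    print(shutdown_unblocks_io())
```

The sketch was written with Linux behaviour in mind; a plain close() from another thread would not reliably wake the blocked call, which is why shutdown() is the relevant primitive here.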
Clone Of:
Clones: 1167197 1168790
Environment:
Last Closed: 2015-11-19 04:52:23 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2015:2213 | 0 | normal | SHIPPED_LIVE | qemu-kvm bug fix and enhancement update | 2015-11-19 08:16:10 UTC

Description Jun Li 2014-04-10 09:43:15 UTC
Description of problem:
qemu-kvm cannot cancel a migration on the src host when the network of the dst host has failed.
The dst host's network can fail in ways such as:
Scenario 1. The network cable of the dst host is unplugged;
Scenario 2. iptables is used to drop the data from the src host.

The steps below use iptables (scenario 2) to demonstrate the issue.

Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-60.el7.x86_64
3.10.0-121.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the following command line on both the src host and the dst host.
src:
gdb --args /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S
--
dst:
gdb --args /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S \
-incoming tcp::5800,server,nowait 
2. Start the migration from the src host to the dst host.
(qemu) migrate -d tcp:10.66.4.247:5800
3. While the migration is in progress, block the migration port with the firewall on the dst host.
# iptables -A INPUT -p tcp -d 10.66.4.247 --dport 5800 -j DROP
4. Cancel the migration on the src host (as it can no longer finish).
(qemu) migrate_cancel
5. Check on the src host whether the migration was cancelled.
(qemu) info migrate


Actual results:
After step 5, the migration is not cancelled on the src host; its status is still active.
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: active
total time: 19296 milliseconds
expected downtime: 30 milliseconds
setup: 94 milliseconds
transferred ram: 284456 kbytes
throughput: 268.57 mbps
remaining ram: 550920 kbytes
total ram: 4211404 kbytes
duplicate: 1182023 pages
skipped: 0 pages
normal: 611672 pages
normal bytes: 2446688 kbytes
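
The check in step 5 could be scripted rather than read by eye. The following is a minimal sketch, assuming the `info migrate` text has already been captured from the monitor; the helper names are hypothetical, and the sample is an excerpt of the output above.

```python
SAMPLE = """\
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: active
total time: 19296 milliseconds
expected downtime: 30 milliseconds
setup: 94 milliseconds
transferred ram: 284456 kbytes
"""


def parse_info_migrate(text):
    """Parse `info migrate` output into a dict mapping field name to value string."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the echoed monitor prompt.
        if not line or line.startswith("(qemu)"):
            continue
        # Split on the first colon only; values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def migration_status(text):
    """Return the value of the 'Migration status' field, or None if absent."""
    return parse_info_migrate(text).get("Migration status")


if __name__ == "__main__":
    print(migration_status(SAMPLE))
```

With the bug present, the status stays "active" after migrate_cancel; with the fix it is expected to read "cancelled".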

Expected results:
The migration is cancelled when migrate_cancel is run.

Additional info:

Comment 2 Jun Li 2014-04-10 10:09:50 UTC
Also tested with qemu-kvm-rhev-1.5.3-50.el7.x86_64; it hits this issue too.

Comment 3 Qian Guo 2014-11-27 01:13:27 UTC
*** Bug 1168156 has been marked as a duplicate of this bug. ***

Comment 4 Dr. David Alan Gilbert 2015-01-08 11:13:40 UTC
Posted fix upstream to qemu-devel.

That still leaves a ~2 min timeout if you migrate to a host that's already dead; but that's a lot better than the ~15 mins that you get if it happens in the middle.
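
The two figures are consistent with stock Linux TCP retransmission behaviour, although the bug does not state which settings were in effect. As a rough worked example, assuming the source host uses the defaults net.ipv4.tcp_syn_retries=6 and net.ipv4.tcp_retries2=15, a 1 s initial SYN timeout, and a retransmission timeout that doubles from 200 ms up to a 120 s cap (all of these are assumptions, and the function names are hypothetical):

```python
def syn_timeout(initial_rto=1.0, syn_retries=6):
    """Seconds until connect() gives up: the first SYN plus syn_retries
    retransmissions, each waiting twice as long as the one before."""
    return sum(initial_rto * 2 ** i for i in range(syn_retries + 1))


def established_timeout(rto_min=0.2, rto_max=120.0, retries2=15):
    """Seconds until an established connection is dropped: the timeout doubles
    from rto_min on every retransmission and is capped at rto_max."""
    return sum(min(rto_min * 2 ** i, rto_max) for i in range(retries2 + 1))


if __name__ == "__main__":
    print(f"connect to dead host: ~{syn_timeout() / 60:.1f} min")
    print(f"peer lost mid-transfer: ~{established_timeout() / 60:.1f} min")
```

Under those assumptions the sums come to about 127 s and about 925 s, in line with the ~2 min and ~15 min mentioned above. Real timeouts vary with the measured round-trip time.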

Comment 8 Miroslav Rezanina 2015-03-18 11:24:03 UTC
Fix included in qemu-kvm-1.5.3-87.el7

Comment 9 huiqingding 2015-03-27 07:37:31 UTC
Reproduced this bug using the following versions:
kernel-3.10.0-234.el7.x86_64
qemu-kvm-1.5.3-60.el7.x86_64

Steps to Reproduce:
1. Boot the guest with the following command line on both the src host and the dst host.
src:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S
--
dst:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S \
-incoming tcp::5800,server,nowait 
2. Start the migration from the src host to the dst host.
(qemu) migrate -d tcp:10.66.9.152:5800
3. Unplug the network cable of the dst host.
4. Cancel the migration on the src host (as it can no longer finish).
(qemu) migrate_cancel
5. Check on the src host whether the migration was cancelled.
(qemu) info migrate


Actual results:
After step 5, the migration is not cancelled on the src host; its status is still active.
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: active
total time: 47332 milliseconds
expected downtime: 30 milliseconds
setup: 15 milliseconds
transferred ram: 472101 kbytes
throughput: 268.57 mbps
remaining ram: 3732476 kbytes
total ram: 4211404 kbytes
duplicate: 4393 pages
skipped: 0 pages
normal: 1111387 pages
normal bytes: 4445548 kbytes
(qemu) info status
VM status: running
(qemu) info status
VM status: running

Comment 10 huiqingding 2015-03-27 07:43:54 UTC
Tested this bug using the following versions:
kernel-3.10.0-234.el7.x86_64
qemu-kvm-1.5.3-87.el7.x86_64

Steps to Test:
1. Boot the guest with the following command line on both the src host and the dst host.
src:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S
--
dst:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S \
-incoming tcp::5800,server,nowait 
2. Start the migration from the src host to the dst host.
(qemu) migrate -d tcp:10.66.9.152:5800
3. Unplug the network cable of the dst host.
4. Cancel the migration on the src host (as it can no longer finish).
(qemu) migrate_cancel
5. Check on the src host whether the migration was cancelled.
(qemu) info migrate


Actual results:
After step 5, the migration is cancelled on the src host.
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: cancelled
total time: 0 milliseconds

Based on the above results, I think this bug has been fixed.

Comment 12 Shaolong Hu 2015-06-18 09:18:52 UTC
Verified on qemu-kvm-1.5.3-92.el7.x86_64:


1. cmd:

/usr/libexec/qemu-kvm -enable-kvm -M pc -smp 4 -m 4G -name rhel6.3-64 -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 -drive file=/root/RHEL-Server-7.2-64-virtio.qcow2,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=qcow2,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0,bootindex=1 -netdev tap,script=/etc/qemu-ifup,id=netdev0 -device virtio-net-pci,netdev=netdev0,id=device-net0,mac=aa:54:00:11:22:33 -boot order=cd -monitor stdio -usb -device usb-tablet,id=input0 -chardev socket,id=s1,path=/tmp/s1,server,nowait -device isa-serial,chardev=s1 -monitor tcp::1234,server,nowait -vga qxl -global qxl-vga.revision=3 -spice port=5920,disable-ticketing -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -vnc :10 -qmp tcp:0:5555,server,nowait

2. Start the migration; on the dst host, drop incoming migration packets with:
iptables -A INPUT -p tcp -d 10.66.84.12 --dport 5556 -j DROP

3. Cancel the migration on the src host; the migration is cancelled immediately:

(qemu) migrate_cancel 
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: cancelled
total time: 0 milliseconds

Comment 16 errata-xmlrpc 2015-11-19 04:52:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2213.html

