Bug 1086168
| Summary: | qemu-kvm can not cancel migration in src host when network of dst host failed | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jun Li <juli> | |
| Component: | qemu-kvm | Assignee: | Dr. David Alan Gilbert <dgilbert> | |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | medium | |||
| Version: | 7.0 | CC: | dgilbert, hhuang, huding, jen, juzhang, lmiksik, meyang, michen, qzhang, rbalakri, virt-maint, xfu | |
| Target Milestone: | rc | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | qemu-kvm-1.5.3-87.el7 | Doc Type: | Bug Fix | |
| Doc Text: | A failure in the destination host or network during a migration could lead to a long (~15 min) TCP timeout before migration_cancel could be used. Employ the shutdown(2) system call in migration_cancel to force the socket to be closed quickly. | Story Points: | --- | |
| Clone Of: | ||||
| : | 1167197 1168790 (view as bug list) | Environment: | ||
| Last Closed: | 2015-11-19 04:52:23 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1167197, 1168790 | |||
|
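The Doc Text in the table above relies on one property of shutdown(2): a thread blocked in a socket call returns as soon as another thread shuts the socket down, without waiting for the TCP timeout. The following is a minimal sketch of that mechanism only, not the actual qemu-kvm patch. It assumes a local socket pair in place of the migration connection, and the function name `cancel_blocked_sender` and its parameters are hypothetical.

```python
import socket
import threading
import time


def cancel_blocked_sender(settle=0.2, join_timeout=5.0):
    """Block a sender on a stalled peer, then free it with shutdown().

    Returns (finished, error): whether the sender thread ended, and the
    exception it saw (or None). All names here are illustrative only.
    """
    # Local connected pair; "dst" never reads, standing in for an
    # unreachable destination host.
    src, dst = socket.socketpair()
    outcome = {"error": None}

    def sender():
        chunk = b"\0" * 65536
        try:
            while True:
                src.send(chunk)  # blocks once the socket buffers are full
        except OSError as exc:
            outcome["error"] = exc

    worker = threading.Thread(target=sender, daemon=True)
    worker.start()
    time.sleep(settle)  # give the sender time to fill the buffers and block

    # Without this call the sender stays blocked until the peer reads
    # or the connection times out.
    src.shutdown(socket.SHUT_RDWR)
    worker.join(join_timeout)
    finished = not worker.is_alive()

    src.close()
    dst.close()
    return finished, outcome["error"]


if __name__ == "__main__":
    finished, error = cancel_blocked_sender()
    print("sender finished:", finished, "error:", type(error).__name__)
```

The sketch calls shutdown() rather than close() because shutdown acts on the connection itself and wakes a thread that is already blocked on it, which matches the approach named in the Doc Text.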
Description
Jun Li
2014-04-10 09:43:15 UTC

Also tested with qemu-kvm-rhev-1.5.3-50.el7.x86_64; it hits this issue too.

*** Bug 1168156 has been marked as a duplicate of this bug. ***

Posted fix upstream to qemu-devel. That still leaves a ~2 min timeout if you migrate to a host that's already dead, but that's a lot better than the ~15 mins that you get if it happens in the middle.

Fix included in qemu-kvm-1.5.3-87.el7

Reproduced this bug using the following versions:
kernel-3.10.0-234.el7.x86_64
qemu-kvm-1.5.3-60.el7.x86_64

Steps to Reproduce:
1. Boot the guest with the following command line on the src host and the dst host.
src:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S
--
dst:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S \
-incoming tcp::5800,server,nowait
2. Do the migration from the src host to the dst host.
(qemu) migrate -d tcp:10.66.9.152:5800
3. Unplug the network cable of the dst host.
4. Cancel the migration on the src host (as this migration cannot finish).
(qemu) migrate_cancel
5. Check whether the migration was cancelled on the src host.
(qemu) info migrate

Actual results:
After step 5, the migration cannot be cancelled on the src host.
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: active
total time: 47332 milliseconds
expected downtime: 30 milliseconds
setup: 15 milliseconds
transferred ram: 472101 kbytes
throughput: 268.57 mbps
remaining ram: 3732476 kbytes
total ram: 4211404 kbytes
duplicate: 4393 pages
skipped: 0 pages
normal: 1111387 pages
normal bytes: 4445548 kbytes
(qemu) info status
VM status: running
(qemu) info status
VM status: running

Tested this bug using the following versions:
kernel-3.10.0-234.el7.x86_64
qemu-kvm-1.5.3-87.el7.x86_64

Steps to Test:
1. Boot the guest with the following command line on the src host and the dst host.
src:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S
--
dst:
# /usr/libexec/qemu-kvm -M pc -m 4G -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 8,maxcpus=8 -qmp tcp::8888,server,nowait -vnc :1 -monitor stdio -boot menu=on -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=4 -device virtio-net-pci,netdev=tap0,id=net0,mq=on,mac=24:be:05:15:11:11 -drive file=/dev/sdc,if=none,id=img,rerror=stop,werror=stop,format=raw -device virtio-blk-pci,drive=img,id=sys-img,scsi=off,addr=0x4 -drive file=/mnt/ISO/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=cdrom0,rerror=stop,werror=stop -device ide-cd,drive=cdrom0,bus=ide.0,unit=0,id=disk-cdrom0,bootindex=1 -drive file=/mnt/virtio-win-1.7.0.iso,if=none,id=cdrom1,rerror=stop,werror=stop -device ide-cd,drive=cdrom1,bus=ide.1,unit=0,id=disk-cdrom1 -S \
-incoming tcp::5800,server,nowait
2. Do the migration from the src host to the dst host.
(qemu) migrate -d tcp:10.66.9.152:5800
3. Unplug the network cable of the dst host.
4. Cancel the migration on the src host (as this migration cannot finish).
(qemu) migrate_cancel
5. Check whether the migration was cancelled on the src host.
(qemu) info migrate

Actual results:
After step 5, the migration can be cancelled on the src host.
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: cancelled
total time: 0 milliseconds

Based on the above results, I think this bug has been fixed.

Verified on qemu-kvm-1.5.3-92.el7.x86_64:
1. cmd:
/usr/libexec/qemu-kvm -enable-kvm -M pc -smp 4 -m 4G -name rhel6.3-64 -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 -drive file=/root/RHEL-Server-7.2-64-virtio.qcow2,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=qcow2,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0,bootindex=1 -netdev tap,script=/etc/qemu-ifup,id=netdev0 -device virtio-net-pci,netdev=netdev0,id=device-net0,mac=aa:54:00:11:22:33 -boot order=cd -monitor stdio -usb -device usb-tablet,id=input0 -chardev socket,id=s1,path=/tmp/s1,server,nowait -device isa-serial,chardev=s1 -monitor tcp::1234,server,nowait -vga qxl -global qxl-vga.revision=3 -spice port=5920,disable-ticketing -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -vnc :10 -qmp tcp:0:5555,server,nowait
2. Start the migration; on the dst host, drop incoming migration packets with:
iptables -A INPUT -p tcp -d 10.66.84.12 --dport 5556 -j DROP
3. Cancel the migration on the src host; the migration is cancelled immediately:
(qemu) migrate_cancel
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: cancelled
total time: 0 milliseconds

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-2213.html
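The verification steps above come down to reading the "Migration status:" line of the `info migrate` output after `migrate_cancel`. A small sketch of how that check could be scripted follows. It assumes the monitor output has already been captured as text in the layout shown in this report; the helper name `migration_status` is hypothetical, and other qemu-kvm versions may print the output differently.

```python
def migration_status(info_migrate_output):
    """Return the value of the 'Migration status:' line, or None if absent."""
    for line in info_migrate_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Migration status":
            return value.strip()
    return None


if __name__ == "__main__":
    # Sample text copied from the verification output in this report.
    sample = (
        "capabilities: xbzrle: off rdma-pin-all: off "
        "auto-converge: off zero-blocks: off\n"
        "Migration status: cancelled\n"
        "total time: 0 milliseconds\n"
    )
    print(migration_status(sample))
```

With the unfixed build the same helper would report `active` for the output captured in the reproduction steps, which is the condition this bug describes.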