Bug 1355683
| Summary: | qemu core dump when do postcopy migration again after canceling a migration in postcopy phase | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Qianqian Zhu <qizhu> |
| Component: | qemu-kvm-rhev | Assignee: | Dr. David Alan Gilbert <dgilbert> |
| Status: | CLOSED ERRATA | QA Contact: | Qianqian Zhu <qizhu> |
| Severity: | unspecified | Priority: | unspecified |
| Version: | 7.3 | CC: | amit.shah, chayang, juzhang, knoel, quintela, virt-maint |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Fixed In Version: | qemu-kvm-rhev-2.6.0-17.el7 | Type: | Bug |
| Last Closed: | 2016-11-07 21:23:14 UTC | | |
Description (Qianqian Zhu, 2016-07-12 08:51:13 UTC)
Yes, I can recreate this. It should be an unusual circumstance in practice; cancelling after postcopy has started is unsafe unless you control the destination. If the destination hasn't started running, it's OK to restart the source and try again, so libvirt could potentially do that - however, it would issue a "continue" to the source before retrying the migration, so it wouldn't hit this case. I'll look into it.

Test with:
qemu-kvm-rhev-2.6.0-13.el7.1355683a.x86_64
kernel-3.10.0-461.el7.x86_64

Steps:
1. Launch src guest.
2. Launch guest on dest host with same cmd.
3. Start postcopy migration, then cancel it immediately:
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d tcp:10.73.72.55:1234
(qemu) migrate_start_postcopy
(qemu) migrate_cancel
4. Launch guest on dest host again.
5. Start postcopy migration again:
(qemu) migrate -d tcp:10.73.72.55:1234
(qemu) migrate_start_postcopy

Results:
No core dump; postcopy migration succeeded and the guest works well after step 5.

Cancelling a normal migration succeeded, but with the error below:
(qemu) migrate_cancel
(qemu) 2016-07-20T08:14:09.855908Z qemu-kvm: socket_writev_buffer: Got err=32 for (73885/18446744073709551615)

Cancelling in postcopy phase:
(qemu) 2016-07-20T08:06:34.581064Z qemu-kvm: socket_writev_buffer: Got err=32 for (131337/18446744073709551615)
2016-07-20T08:06:34.581090Z qemu-kvm: RP: Received invalid message 0x0000 length 0x0000

Fix included in qemu-kvm-rhev-2.6.0-17.el7.

Verified with:
qemu-kvm-rhev-2.6.0-20.el7.x86_64
kernel-3.10.0-491.el7.x86_64
Steps same as comment 4.
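The HMP steps above can also be driven programmatically. A minimal sketch, assuming a guest started with a QMP socket as in the reproducer command line (`-qmp tcp::5555,server,nowait`); the destination URI is the one from the steps, and the host/port are placeholders. The QMP equivalents of the HMP commands are `migrate-set-capabilities`, `migrate`, `migrate-start-postcopy`, and `migrate_cancel`:

```python
import json
import socket

def qmp_commands():
    """QMP command sequence matching the HMP reproduction steps."""
    return [
        {"execute": "qmp_capabilities"},  # mandatory QMP handshake
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "postcopy-ram", "state": True}]}},
        {"execute": "migrate",
         "arguments": {"uri": "tcp:10.73.72.55:1234"}},
        {"execute": "migrate-start-postcopy"},
        {"execute": "migrate_cancel"},
    ]

def run(host="127.0.0.1", port=5555):
    """Send the sequence to a live QMP socket (requires a running guest)."""
    with socket.create_connection((host, port)) as sock:
        chan = sock.makefile("rw")
        chan.readline()  # discard the QMP greeting banner
        for cmd in qmp_commands():
            chan.write(json.dumps(cmd) + "\n")
            chan.flush()
            print(chan.readline().strip())  # response or event

if __name__ == "__main__":
    # Without a guest available, just show the wire-format commands.
    for cmd in qmp_commands():
        print(json.dumps(cmd))
```

Note that, as the comment above says, cancelling once postcopy has started is unsafe unless you control the destination, so a sequence like this is only for reproducing the bug.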
cli:
/usr/libexec/qemu-kvm -name linux -cpu SandyBridge -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 7bef3814-631a-48bb-bae8-2b1de75f7a13 -nodefaults -monitor stdio -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot order=c,menu=on -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/mntnfs/RHEL-Server-7.3-64-virtio.qcow2,if=none,cache=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on -spice port=5901,disable-ticketing -vga qxl -global qxl-vga.revision=3 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=3C:D9:2B:09:AB:44,bus=pci.0,addr=0x3 -qmp tcp::5555,server,nowait

Result:
Postcopy migration succeeded and the guest works well. Cancelling produces the same warning:
(qemu) 2016-07-20T08:06:34.581064Z qemu-kvm: socket_writev_buffer: Got err=32 for (131337/18446744073709551615)
2016-07-20T08:06:34.581090Z qemu-kvm: RP: Received invalid message 0x0000 length 0x0000

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-2673.html
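Steps 2 and 4 say the destination is launched with the "same cmd" as the source; for an incoming migration the destination additionally needs `-incoming` matching the URI the source passes to `migrate -d`. A small sketch of that relationship; the abbreviated command string and the listen URI (all interfaces, port 1234) are assumptions for illustration:

```python
# Source command line, abbreviated here for illustration (the full line is in
# the report above).
SRC_CMD = "/usr/libexec/qemu-kvm -name linux -m 2048 ..."

def dest_cmd(src_cmd, listen_uri="tcp:0:1234"):
    """Destination command line: same as the source, plus -incoming so the
    destination QEMU waits for the migration stream instead of booting."""
    return f"{src_cmd} -incoming {listen_uri}"
```

The port in `listen_uri` has to match the one the source migrates to (`tcp:10.73.72.55:1234` in the steps).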