Bug 1395360
Summary: | Post-copy migration fails with XBZRLE compression | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Marcel Kolaja <mkolaja> |
Component: | qemu-kvm-rhev | Assignee: | Dr. David Alan Gilbert <dgilbert> |
Status: | CLOSED ERRATA | QA Contact: | xianwang <xianwang> |
Severity: | unspecified | Docs Contact: | |
Priority: | high | ||
Version: | 7.3 | CC: | chayang, dgilbert, hhuang, jherrman, juzhang, knoel, michal.skrivanek, mrezanin, mzamazal, qizhu, qzhang, virt-maint, xianwang, zhengtli |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.6.0-28.el7_3.1 | Doc Type: | Bug Fix |
Doc Text: |
Using post-copy migration with XOR-based zero run-length encoding (XBZRLE) compression previously caused the migration to fail and the guest to stay in a paused state. This update disables XBZRLE page compression for post-copy migration, and thus avoids the described problem.
|
Story Points: | --- |
Clone Of: | 1368422 | Environment: | |
Last Closed: | 2017-01-17 20:10:19 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1368422 | ||
Bug Blocks: |
Description
Marcel Kolaja
2016-11-15 19:12:16 UTC
Fix included in qemu-kvm-rhev-2.6.0-28.el7_3.1

Hi Qunfang,

there are issues with the build target configuration. The package with the fix should be qemu-kvm-rhev-2.6.0-28.el7_3.1. As soon as the target is fixed, I'll build the correct version. Version -28 added some ARM-only fixes, so it wasn't released for x86_64/ppc64. However, we will keep -28 for z-stream instead of -27.

Mirek

Hi Mirek,

Got it. Thanks for the information.

Bug reproduced:
This bug has been reproduced on the PPC platform.

Version-Release number of selected component (if applicable):
Host:
kernel: 3.10.0-514.el7.ppc64le
qemu-kvm-rhev-2.6.0-20.el7.ppc64le
SLOF-20160223-6.gitdbbfda4.el7
Guest:
3.10.0-514.el7.ppc64le

Steps to Reproduce:
1. This reproduction uses a single host, i.e., src = dst. Boot a VM with the qemu CLI on a ppc64le host (the full CLI is listed under "Additional info"), then boot another VM on the same host with the same CLI as the first one, appending "-incoming tcp:0:5801".
2. In the guest, run "test", a memory-intensive program that produces dirty pages during migration (the program is listed under "Additional info").
    #gcc test.c -o test
    #./test
3. Set the migration configuration in HMP and start the migration.
    (qemu) migrate_set_speed 10
    (qemu) migrate_set_capability xbzrle on
    (qemu) migrate_set_capability postcopy-ram on
    (qemu) migrate -d tcp:127.0.0.1:5801
4. Check the migration status and, once dirty pages are being produced, switch to post-copy.
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
    dirty sync count: 8
    dirty pages rate: 5650 pages
    .......other info....
    (qemu) migrate_start_postcopy

Actual results:
The migration fails and the VM gets paused.
In src HMP:
    (qemu) 2016-12-05T08:48:52.199109Z qemu-kvm: RP: Sibling indicated error 1
    2016-12-05T08:48:52.279863Z qemu-kvm: socket_writev_buffer: Got err=104 for (32768/18446744073709551615)
    (qemu) info status
    VM status: paused (postmigrate)
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
    Migration status: failed
    total time: 0 milliseconds

While in dst HMP:
    (qemu) 2016-12-05T08:48:52.157951Z qemu-kvm: Unknown combination of migration flags: 0x40 (postcopy mode)
    2016-12-05T08:48:52.158039Z qemu-kvm: error while loading state section id 2(ram)
    2016-12-05T08:48:52.158049Z qemu-kvm: postcopy_ram_listen_thread: loadvm failed: -22

Additional info:
(1) The full qemu CLI:
    /usr/libexec/qemu-kvm \
     -name 'avocado-vt-vm1' \
     -sandbox off \
     -nodefaults \
     -machine pseries-rhel7.3.0 \
     -vga std \
     -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=03 \
     -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 \
     -device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
     -chardev socket,id=console0,path=/tmp/console0,server,nowait \
     -device spapr-vty,chardev=console0 \
     -chardev socket,id=console1,path=/tmp/console1,server,nowait \
     -device spapr-vty,chardev=console1 \
     -drive file=/root/R1.qcow2,if=none,id=blk1 \
     -device virtio-blk-pci,scsi=off,drive=blk1,id=blk-disk1,bootindex=1 \
     -device virtio-net-pci,mac=9a:7b:7c:7d:7e:71,id=idtlLxAk,vectors=4,netdev=idlkwV8e,bus=pci.0,addr=05 \
     -netdev tap,id=idlkwV8e,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
     -m 4096 \
     -smp 4 \
     -cpu host \
     -device usb-kbd \
     -device usb-mouse \
     -qmp tcp:0:8881,server,nowait \
     -vnc :1 \
     -msg timestamp=on \
     -rtc base=localtime,clock=vm,driftfix=slew \
     -boot order=cdn,once=c,menu=off,strict=off \
     -monitor stdio \
     -enable-kvm

(2) The program used in step 2:
    #gcc test.c -o test
    #./test
    #cat test.c
    #include <stdlib.h>
    #include <stdio.h>
    #include <signal.h>
    #include <unistd.h>   /* alarm() */

    /* SIGALRM handler: end the test when the alarm fires. */
    void wakeup(int sig)
    {
        (void)sig;
        exit(0);
    }

    int main(void)
    {
        signal(SIGALRM, wakeup);
        alarm(120);   /* stop after 120 seconds */

        /* 40960 pages of 4096 bytes each */
        char *buf = (char *) calloc(40960, 4096);

        while (1) {
            int i;
            /* write one byte every 1024 bytes so that every page stays dirty */
            for (i = 0; i < 40960 * 4; i++) {
                buf[i * 4096 / 4]++;
            }
            printf(".");
        }
    }

Bug verify

The bug is verified as passing on both ppc64le and x86 with qemu-kvm-rhev-2.6.0-28.el7_3.1.

Bug is verified in ppc version:
Host:
3.10.0-514.el7.ppc64le
qemu-kvm-rhev-2.6.0-28.el7_3.1.ppc64le
SLOF-20160223-6.gitdbbfda4.el7
Guest:
3.10.0-514.el7.ppc64le

Steps: the same as in the reproduction above.

Actual results:
The migration succeeded and the VM is running.
In src HMP:
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
    Migration status: completed
    dirty sync count: 7
    postcopy request count: 15
In dst HMP:
    (qemu) info status
    VM status: running

Bug is verified in x86 version:
Host:
3.10.0-514.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.1.x86_64
Guest:
3.10.0-514.el7.x86_64

Steps: the same as in the reproduction above.

Actual results:
The migration succeeded and the VM is running.
In src HMP:
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
    Migration status: completed
    dirty sync count: 4
    postcopy request count: 46
In dst HMP:
    (qemu) info status
    VM status: running

So this bug is verified, and its status should be changed to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0115.html