Bug 1368422
Summary: | Post-copy migration fails with XBZRLE compression | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Milan Zamazal <mzamazal> | |
Component: | qemu-kvm-rhev | Assignee: | Dr. David Alan Gilbert <dgilbert> | |
Status: | CLOSED ERRATA | QA Contact: | xianwang <xianwang> | |
Severity: | unspecified | Docs Contact: | ||
Priority: | high | |||
Version: | 7.3 | CC: | chayang, dgilbert, hhuang, jherrman, juzhang, michal.skrivanek, mrezanin, mtessun, mzamazal, qizhu, qzhang, virt-maint, xianwang | |
Target Milestone: | rc | Keywords: | ZStream | |
Target Release: | 7.4 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | qemu-kvm-rhev-2.8.0-1 | Doc Type: | Bug Fix | |
Doc Text: | Using post-copy migration with XOR-based zero run-length encoding (XBZRLE) compression previously caused the migration to fail and the guest to stay in a paused state. This update disables XBZRLE page compression for post-copy migration, and thus avoids the described problem. |
Story Points: | --- | |
Clone Of: | ||||
: | 1395360 | Environment: | |
Last Closed: | 2017-08-01 23:34:44 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1395265, 1395360, 1401400 |
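For readers unfamiliar with the compression named in the Doc Text, the following is a minimal sketch of the general idea behind XOR-based zero run-length encoding: only the bytes that changed since the previously sent copy of a page are transmitted, as (unchanged-run, changed-run, data) records. It is an illustration only. The function names `sketch_delta_encode` and `sketch_delta_apply`, the fixed 16-bit length fields, and the choice to store XOR values are assumptions made for this example and are not QEMU's actual code or wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store and load a 16-bit length in little-endian order (sketch-only format). */
static void put_u16(uint8_t *dst, size_t v)
{
    dst[0] = (uint8_t)(v & 0xff);
    dst[1] = (uint8_t)((v >> 8) & 0xff);
}

static size_t get_u16(const uint8_t *src)
{
    return (size_t)src[0] | ((size_t)src[1] << 8);
}

/*
 * Encode new_page as a delta against old_page.
 * Each record is: unchanged-run length, changed-run length, XOR bytes.
 * Returns the number of bytes written, or -1 if out is too small.
 */
long sketch_delta_encode(const uint8_t *old_page, const uint8_t *new_page,
                         size_t len, uint8_t *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    assert(len <= 0xffff); /* lengths must fit the 16-bit fields */
    while (i < len) {
        size_t zrun = 0, nzrun = 0, k;

        /* Count bytes that did not change (their XOR would be zero). */
        while (i + zrun < len && old_page[i + zrun] == new_page[i + zrun]) {
            zrun++;
        }
        i += zrun;
        if (i == len) {
            break; /* trailing unchanged bytes need no record */
        }
        /* Count bytes that did change. */
        while (i + nzrun < len && old_page[i + nzrun] != new_page[i + nzrun]) {
            nzrun++;
        }
        if (nzrun + 4 > out_cap - o) {
            return -1;
        }
        put_u16(out + o, zrun);
        put_u16(out + o + 2, nzrun);
        o += 4;
        for (k = 0; k < nzrun; k++) {
            out[o + k] = old_page[i + k] ^ new_page[i + k];
        }
        o += nzrun;
        i += nzrun;
    }
    return (long)o;
}

/*
 * Apply a delta to page, which must hold the old contents on entry.
 * Returns 0 on success, or -1 if the delta is malformed.
 */
int sketch_delta_apply(const uint8_t *in, size_t in_len,
                       uint8_t *page, size_t len)
{
    size_t i = 0, o = 0;

    while (o < in_len) {
        size_t zrun, nzrun, k;

        if (in_len - o < 4) {
            return -1;
        }
        zrun = get_u16(in + o);
        nzrun = get_u16(in + o + 2);
        o += 4;
        if (zrun > len - i || nzrun > len - i - zrun || nzrun > in_len - o) {
            return -1;
        }
        i += zrun; /* skip the unchanged bytes */
        for (k = 0; k < nzrun; k++) {
            page[i + k] ^= in[o + k];
        }
        i += nzrun;
        o += nzrun;
    }
    return 0;
}
```

A caller would encode a dirty page against its cached older copy, send the records, and have the receiver apply them to its own copy of the old page. The scheme only works if the receiver still holds exactly the old contents the sender encoded against.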
Description
Milan Zamazal
2016-08-19 10:49:44 UTC
This bug has been verified on both ppc and x86.

Bug reproduced on the PPC platform:

Version-Release number of selected component (if applicable):
Host:
    kernel: 3.10.0-558.el7.ppc64le
    qemu-kvm-rhev-2.6.0-22.el7.ppc64le
    SLOF-20160223-6.gitdbbfda4.el7.noarch
Guest:
    3.10.0-558.el7.ppc64le

Steps to Reproduce:
1. Boot a VM on the src host with the following qemu CLI:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -nodefaults \
    -machine pseries-rhel7.3.0 \
    -vga std \
    -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=03 \
    -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 \
    -chardev socket,id=devorg.qemu.guest_agent.0,path=/tmp/virtio_port-org.qemu.guest_agent.0-20160516-164929-dHQ00mMM,server,nowait \
    -device virtserialport,chardev=devorg.qemu.guest_agent.0,name=org.qemu.guest_agent.0,id=org.qemu.guest_agent.0,bus=virtio_serial_pci0.0 \
    -device nec-usb-xhci,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
    -drive file=/root/RHEL.7.3.qcow2,if=none,id=blk1 \
    -device virtio-blk-pci,scsi=off,drive=blk1,id=blk-disk1,bootindex=1 \
    -drive id=drive_cd1,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/root/RHEL-7.3-20161019.0-Server-ppc64le-dvd1.iso \
    -device scsi-cd,id=cd1,drive=drive_cd1,bootindex=2 \
    -device virtio-net-pci,mac=9a:7b:7c:7d:7e:71,id=idtlLxAk,vectors=4,netdev=idlkwV8e,bus=pci.0,addr=05 \
    -netdev tap,id=idlkwV8e,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
    -m 8G \
    -smp 2 \
    -cpu host \
    -device usb-kbd \
    -device usb-tablet \
    -qmp tcp:0:8881,server,nowait \
    -vnc :1 \
    -msg timestamp=on \
    -rtc base=localtime,clock=vm,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -monitor stdio \
    -enable-kvm
2. Boot a VM on the dst host with the same qemu CLI as on the src host, appending "-incoming tcp:0:5801".
3. Run "test" in the guest. It is a memory-intensive program that keeps producing dirty pages during migration; its source is listed under "Additional info".
    # gcc test.c -o test
    # ./test
4. Set the migration capabilities in HMP and start the migration:
    (qemu) migrate_set_capability xbzrle on
    (qemu) migrate_set_capability postcopy-ram on
    (qemu) migrate -d tcp:10.19.112.39:5801
5. Check the migration status and, once dirty pages are being produced, switch to post-copy:
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off
    dirty sync count: 7
    dirty pages rate: 13587 pages
    .......other info....
    (qemu) migrate_start_postcopy

Actual results:
The migration fails and the VM gets paused.
In src HMP:
    (qemu) migrate_start_postcopy
    (qemu) 2017-02-13T06:56:00.913043Z qemu-kvm: RP: Sibling indicated error 1
    2017-02-13T06:56:01.105488Z qemu-kvm: socket_writev_buffer: Got err=104 for (32768/18446744073709551615)
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
    Migration status: failed
    total time: 0 milliseconds
    (qemu) info status
    VM status: paused (postmigrate)
In dst HMP:
    (qemu) 2017-02-13T06:56:00.911136Z qemu-kvm: Unknown combination of migration flags: 0x40 (postcopy m)
    2017-02-13T06:56:00.911222Z qemu-kvm: error while loading state section id 2(ram)
    2017-02-13T06:56:00.911233Z qemu-kvm: postcopy_ram_listen_thread: loadvm failed: -22

Additional info:
(1) The program used in step 3:
    # gcc test.c -o test
    # ./test
    # cat test.c
    #include <stdlib.h>
    #include <stdio.h>
    #include <signal.h>
    #include <unistd.h>   /* alarm() */

    /* SIGALRM handler: end the test when the alarm fires. */
    void wakeup(int sig)
    {
        (void)sig;
        exit(0);
    }

    int main()
    {
        signal(SIGALRM, wakeup);
        alarm(120);   /* run for 120 seconds */
        /* 40960 pages of 4096 bytes each */
        char *buf = (char *) calloc(40960, 4096);
        if (buf == NULL) {
            return 1;
        }
        while (1) {
            int i;
            /* write one byte every 1024 bytes, i.e. four writes per page */
            for (i = 0; i < 40960 * 4; i++) {
                buf[i * 4096 / 4]++;
            }
            printf(".");
        }
    }

Bug verification on the ppc platform:
The bug is verified with the following versions:
Host:
    kernel: 3.10.0-558.el7.ppc64le
    qemu-kvm-rhev-2.8.0-1.el7.ppc64le
    SLOF-20160223-6.gitdbbfda4.el7.noarch
Guest:
    3.10.0-558.el7.ppc64le
Steps: the same as for reproducing the bug.
Actual results:
The migration succeeded and the VM is running.
In src HMP:
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off
    Migration status: completed
    dirty sync count: 22
    postcopy request count: 1492
In dst HMP:
    (qemu) info status
    VM status: running

Bug verification on the x86 platform:
The bug is verified with the following versions:
Host:
    3.10.0-563.el7.x86_64
    qemu-kvm-rhev-2.8.0-1.el7.x86_64
Guest:
    3.10.0-514.10.1.el7.x86_64
Steps: the same as for reproducing the bug.

Actual results:
The migration succeeded and the VM is running.
In src HMP:
    (qemu) info migrate
    capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off
    Migration status: completed
    dirty sync count: 14
    postcopy request count: 3010
In dst HMP:
    (qemu) info status
    VM status: running

So, this bug is fixed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:2392
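The behaviour seen in the verification runs (the xbzrle capability still reported as "on" while post-copy migration completes) matches the Doc Text statement that XBZRLE page compression is disabled for post-copy. The following is a minimal sketch of that kind of gating, assuming a source-side state with a capability flag and a migration phase. All names here (`sketch_migration`, `sketch_start_postcopy`, `sketch_may_use_xbzrle`) are hypothetical and chosen for illustration; this is not the actual qemu-kvm-rhev patch.

```c
#include <stdbool.h>

/* Hypothetical migration phases for this sketch. */
enum sketch_phase {
    SKETCH_PRECOPY,
    SKETCH_POSTCOPY
};

/* Hypothetical source-side migration state. */
struct sketch_migration {
    bool xbzrle_enabled;       /* migrate_set_capability xbzrle on */
    bool postcopy_ram_enabled; /* migrate_set_capability postcopy-ram on */
    enum sketch_phase phase;
};

/*
 * Switch to post-copy, as migrate_start_postcopy requests in HMP.
 * Returns false if the postcopy-ram capability was not enabled.
 */
bool sketch_start_postcopy(struct sketch_migration *m)
{
    if (!m->postcopy_ram_enabled) {
        return false;
    }
    m->phase = SKETCH_POSTCOPY;
    return true;
}

/*
 * Decide whether a dirty page may be sent as an XBZRLE delta.
 * The capability stays enabled, but once the migration has entered
 * the post-copy phase every page is sent uncompressed.
 */
bool sketch_may_use_xbzrle(const struct sketch_migration *m)
{
    return m->xbzrle_enabled && m->phase != SKETCH_POSTCOPY;
}
```

With this shape, the capability list printed by "info migrate" would be unchanged by the fix, and only the per-page decision would differ after the switch to post-copy.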