Bug 1395360
| Summary: | Post-copy migration fails with XBZRLE compression | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Marcel Kolaja <mkolaja> |
| Component: | qemu-kvm-rhev | Assignee: | Dr. David Alan Gilbert <dgilbert> |
| Status: | CLOSED ERRATA | QA Contact: | xianwang <xianwang> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.3 | CC: | chayang, dgilbert, hhuang, jherrman, juzhang, knoel, michal.skrivanek, mrezanin, mzamazal, qizhu, qzhang, virt-maint, xianwang, zhengtli |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-rhev-2.6.0-28.el7_3.1 | Doc Type: | Bug Fix |
| Doc Text: | Using post-copy migration with XOR-based zero run-length encoding (XBZRLE) compression previously caused the migration to fail and the guest to stay in a paused state. This update disables XBZRLE page compression for post-copy migration, and thus avoids the described problem. | Story Points: | --- |
| Clone Of: | 1368422 | Environment: | |
| Last Closed: | 2017-01-17 20:10:19 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1368422 | ||
| Bug Blocks: | |||
Description
Marcel Kolaja
2016-11-15 19:12:16 UTC
Fix included in qemu-kvm-rhev-2.6.0-28.el7_3.1
Hi Qunfang, there are issues with the build target configuration. The package with the fix should be qemu-kvm-rhev-2.6.0-28.el7_3.1. As soon as the target is fixed, I'll build the correct version. Version -28 added some ARM-only fixes, so it wasn't released for x86_64/ppc64. However, we will keep -28 for z-stream instead of -27. Mirek
Hi Mirek, got it. Thanks for the information.
Bug reproduced:
This bug has been reproduced on the PPC platform.
Version-Release number of selected component (if applicable):
Host:
kernel:3.10.0-514.el7.ppc64le
qemu-kvm-rhev-2.6.0-20.el7.ppc64le
SLOF-20160223-6.gitdbbfda4.el7
Guest:
3.10.0-514.el7.ppc64le
Steps to Reproduce:
1. This reproduction uses a single host, i.e., src = dst.
Boot a VM with the qemu CLI on a ppc64le host (the full CLI is under "Additional info"), then boot another VM on the same host with the same CLI as the first one, appending "-incoming tcp:0:5801".
2. In the guest, run "test", a memory-intensive program that produces dirty pages during migration; its source is under "Additional info".
#gcc test.c -o test
#./test
3. Set the migration configuration in HMP and start the migration
(qemu) migrate_set_speed 10
(qemu) migrate_set_capability xbzrle on
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d tcp:127.0.0.1:5801
4. Check the migration status; once dirty pages are being produced, switch to post-copy.
(qemu) info migrate
capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
dirty sync count: 8
dirty pages rate: 5650 pages
.......other info....
(qemu) migrate_start_postcopy
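For a scripted reproduction, the HMP commands in steps 3 and 4 have QMP counterparts that could be sent to the QMP socket opened by the command line in "Additional info" (-qmp tcp:0:8881). The following is a minimal sketch in C that only formats the JSON messages. The helper names are hypothetical, no socket handling is included, the qmp_capabilities handshake is assumed to have been completed already, and migrate_set_speed is left out because the units of its QMP argument should be checked against the QMP reference first.

```c
#include <stdio.h>

/*
 * Hypothetical helpers that format QMP messages matching the HMP steps.
 * No JSON escaping is done, so only pass plain capability names and URIs.
 * Each function returns what snprintf returns.
 */

/* HMP: migrate_set_capability <cap> on|off */
int qmp_set_capability(char *buf, size_t len, const char *cap, int on)
{
    return snprintf(buf, len,
        "{\"execute\": \"migrate-set-capabilities\", \"arguments\": "
        "{\"capabilities\": [{\"capability\": \"%s\", \"state\": %s}]}}",
        cap, on ? "true" : "false");
}

/* HMP: migrate -d <uri> */
int qmp_migrate(char *buf, size_t len, const char *uri)
{
    return snprintf(buf, len,
        "{\"execute\": \"migrate\", \"arguments\": {\"uri\": \"%s\"}}", uri);
}

/* HMP: migrate_start_postcopy */
int qmp_start_postcopy(char *buf, size_t len)
{
    return snprintf(buf, len, "{\"execute\": \"migrate-start-postcopy\"}");
}

/* HMP: info migrate */
int qmp_query_migrate(char *buf, size_t len)
{
    return snprintf(buf, len, "{\"execute\": \"query-migrate\"}");
}
```

To mirror the steps, format and send the xbzrle and postcopy-ram capability messages, then the migrate message with tcp:127.0.0.1:5801, poll with query-migrate, and finally send migrate-start-postcopy.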
Actual results:
The migration fails and the VM gets paused.
In src HMP:
(qemu) 2016-12-05T08:48:52.199109Z qemu-kvm: RP: Sibling indicated error 1
2016-12-05T08:48:52.279863Z qemu-kvm: socket_writev_buffer: Got err=104 for (32768/18446744073709551615)
(qemu) info status
VM status: paused (postmigrate)
(qemu) info migrate
capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
Migration status: failed
total time: 0 milliseconds
While in dst HMP:
(qemu) 2016-12-05T08:48:52.157951Z qemu-kvm: Unknown combination of migration flags: 0x40 (postcopy mode)
2016-12-05T08:48:52.158039Z qemu-kvm: error while loading state section id 2(ram)
2016-12-05T08:48:52.158049Z qemu-kvm: postcopy_ram_listen_thread: loadvm failed: -22
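As an aid to reading the flag value 0x40 in the destination log above, here is a small lookup sketch in C. It assumes the bit values match the RAM_SAVE_FLAG_* constants in QEMU's migration/ram.c around version 2.6; the table and the helper ram_flag_name are hypothetical and should be checked against the actual source tree. Under that assumption, 0x40 is the XBZRLE page flag, which is consistent with XBZRLE being the trigger named in the Doc Text.

```c
#include <stddef.h>

/*
 * Assumed to mirror the RAM_SAVE_FLAG_* bits in QEMU's migration/ram.c
 * (circa 2.6). This table is an illustration, not copied from the source.
 */
struct ram_flag {
    unsigned int bit;
    const char *name;
};

static const struct ram_flag ram_flags[] = {
    { 0x01,  "FULL" },          /* obsolete */
    { 0x02,  "COMPRESS" },      /* zero page */
    { 0x04,  "MEM_SIZE" },
    { 0x08,  "PAGE" },
    { 0x10,  "EOS" },
    { 0x20,  "CONTINUE" },
    { 0x40,  "XBZRLE" },
    { 0x80,  "HOOK" },
    { 0x100, "COMPRESS_PAGE" },
};

/* Return the name of a single-bit flag, or NULL if it is not in the table. */
const char *ram_flag_name(unsigned int flag)
{
    size_t i;

    for (i = 0; i < sizeof(ram_flags) / sizeof(ram_flags[0]); i++) {
        if (ram_flags[i].bit == flag) {
            return ram_flags[i].name;
        }
    }
    return NULL;
}
```

Calling ram_flag_name(0x40) with this table yields "XBZRLE"; a value outside the table yields NULL.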
Additional info:
(1) the full qemu CLI:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox off \
-nodefaults \
-machine pseries-rhel7.3.0 \
-vga std \
-device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=03 \
-device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 \
-device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
-chardev socket,id=console0,path=/tmp/console0,server,nowait \
-device spapr-vty,chardev=console0 \
-chardev socket,id=console1,path=/tmp/console1,server,nowait \
-device spapr-vty,chardev=console1 \
-drive file=/root/R1.qcow2,if=none,id=blk1 \
-device virtio-blk-pci,scsi=off,drive=blk1,id=blk-disk1,bootindex=1 \
-device virtio-net-pci,mac=9a:7b:7c:7d:7e:71,id=idtlLxAk,vectors=4,netdev=idlkwV8e,bus=pci.0,addr=05 \
-netdev tap,id=idlkwV8e,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
-m 4096 \
-smp 4 \
-cpu host \
-device usb-kbd \
-device usb-mouse \
-qmp tcp:0:8881,server,nowait \
-vnc :1 \
-msg timestamp=on \
-rtc base=localtime,clock=vm,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-monitor stdio \
-enable-kvm
(2) the program specified in step 2
#gcc test.c -o test
#./test
#cat test.c
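#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

/* SIGALRM handler: end the test once the alarm fires. */
void wakeup(int sig)
{
    (void)sig;
    _exit(0);   /* async-signal-safe, unlike exit() */
}

int main(void)
{
    /* Stop the workload after 120 seconds. */
    signal(SIGALRM, wakeup);
    alarm(120);
    /* 40960 blocks of 4096 bytes (160 MiB), zero-initialized. */
    char *buf = (char *) calloc(40960, 4096);
    if (buf == NULL) {
        perror("calloc");
        return 1;
    }
    /* Keep dirtying memory: one write every 1024 bytes across the buffer. */
    while (1) {
        int i;
        for (i = 0; i < 40960 * 4; i++) {
            buf[i * 4096 / 4]++;
        }
        printf(".");
    }
}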
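The amount of memory this program dirties follows from its constants. A short sketch of the arithmetic in C, with hypothetical helper names (wl_*) that are not part of test.c:

```c
#include <stddef.h>

/* Constants taken from test.c. */
enum {
    WL_BLOCKS     = 40960,      /* calloc element count */
    WL_BLOCK_SIZE = 4096,       /* calloc element size in bytes */
    WL_ITERATIONS = 40960 * 4   /* inner loop bound */
};

/* Total buffer size in bytes: 40960 * 4096 = 160 MiB. */
size_t wl_buffer_bytes(void)
{
    return (size_t)WL_BLOCKS * WL_BLOCK_SIZE;
}

/* Distance between two consecutive writes: i * 4096 / 4 = i * 1024. */
size_t wl_stride_bytes(void)
{
    return WL_BLOCK_SIZE / 4;
}

/* Offset of the last byte written in one pass of the inner loop. */
size_t wl_last_offset(void)
{
    return (size_t)(WL_ITERATIONS - 1) * wl_stride_bytes();
}

/* Writes landing in each 4096-byte block per pass. */
size_t wl_writes_per_block(void)
{
    return WL_BLOCK_SIZE / wl_stride_bytes();
}
```

Each pass therefore touches every 4096-byte block of the 160 MiB buffer four times and stays inside the allocation, so the whole buffer is dirtied again on every pass.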
Bug verification
The bug is verified as fixed on both ppc64le and x86 with qemu-kvm-rhev-2.6.0-28.el7_3.1
Verification on ppc64le:
Host:
3.10.0-514.el7.ppc64le
qemu-kvm-rhev-2.6.0-28.el7_3.1.ppc64le
SLOF-20160223-6.gitdbbfda4.el7
Guest:
3.10.0-514.el7.ppc64le
Steps:
The same as in the reproduction above.
Actual results:
The migration succeeded and the VM is running
In src HMP:
(qemu) info migrate
capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
Migration status: completed
dirty sync count: 7
postcopy request count: 15
In dst HMP:
(qemu) info status
VM status: running
Verification on x86_64:
Host:
3.10.0-514.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.1.x86_64
Guest:
3.10.0-514.el7.x86_64
Steps:
The same as in the reproduction above.
Actual results:
The migration succeeded and the VM is running
In src HMP:
(qemu) info migrate
capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on
Migration status: completed
dirty sync count: 4
postcopy request count: 46
In dst HMP:
(qemu) info status
VM status: running
So this bug is verified; its status should be changed to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0115.html