Description of problem:
Migrate a guest with xbzrle enabled while running the following command (copying cdrom data) inside the guest during migration. The guest sometimes core dumps after migration on the dst host side.

Guest running:
#while true; do cp -r /media/RHEL7\ X86_64 /home/test; sleep 1; rm -rf /home/test; done

Version-Release number of selected component (if applicable):
kernel-3.10.0-63.el7.x86_64
qemu-kvm-1.5.3-30.el7.x86_64

How reproducible:
1/10

Steps to Reproduce:
1. Boot up a guest with a cdrom attached, e.g.:
/usr/libexec/qemu-kvm -cpu SandyBridge -enable-kvm -m 2048 \
  -smp 2,sockets=2,cores=1,threads=1 -enable-kvm -name t2-rhel6.4-32 \
  -uuid 61b6c504-5a8b-4fe1-8347-6c929b750dde -k en-us \
  -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection \
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
  -device usb-tablet,id=input0 \
  -drive file=/mnt/RHEL-Server-6.4-64-virtio.qcow2,if=none,id=disk0,format=qcow2,werror=stop,rerror=stop,aio=native \
  -device ide-drive,bus=ide.0,unit=1,drive=disk0,id=disk0,bootindex=1 \
  -drive file=/mnt/boot.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
  -device ide-drive,drive=drive-ide0-1-0,bus=ide.1,unit=0,id=cdrom \
  -netdev tap,id=hostnet0 \
  -device rtl8139,netdev=hostnet0,id=net0,mac=44:37:E6:5E:91:85,bus=pci.0,addr=0x5 \
  -monitor stdio -qmp tcp:0:6666,server,nowait \
  -chardev socket,path=/tmp/isa-serial,server,nowait,id=isa1 \
  -device isa-serial,chardev=isa1,id=isa-serial1 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x8 \
  -chardev socket,id=charchannel0,path=/tmp/serial-socket,server,nowait \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm \
  -chardev socket,path=/tmp/foo,server,nowait,id=foo \
  -device virtconsole,chardev=foo,id=console0 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9 \
  -vnc :10 -k en-us -boot dc \
  -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
  -device virtserialport,bus=virtio-serial0.0,chardev=qga0,name=org.qemu.guest_agent.0 \
  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

2. Boot the guest on the dst host with "-incoming tcp:0:5800".

3. On the src host:
(qemu) migrate_set_capability xbzrle on
(qemu) migrate_set_cache_size 512M

4. Inside the guest:
#while true; do cp -r /media/RHEL7\ X86_64 /home/test; sleep 1; rm -rf /home/test; done

5. Migrate the guest:
(qemu) migrate -d tcp:t2:5800

6. Change the cache size during migration (I'm not sure whether this is relevant to the issue, but this is what I did when I encountered the bug):
(qemu) migrate_set_cache_size 1G
(qemu) migrate_set_cache_size 512M
(qemu) migrate_set_cache_size 1G

Actual results:
Migration finished, but the guest core dumped and then restarted automatically on the dst host side.

Expected results:
The guest should not core dump.

Additional info:
The core dump file will be attached.
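
Since the reproduction rate is only about 1/10, it may help to script steps 3, 5 and 6 rather than typing them into the monitor each time. The following is a minimal sketch, not part of the original report: it assumes the QMP socket from the command line above (-qmp tcp:0:6666) is reachable from wherever the script runs, and it uses the QMP counterparts of the HMP commands as they existed in QEMU of this era (migrate-set-capabilities, migrate-set-cache-size, migrate). Later QEMU releases replaced migrate-set-cache-size, so treat the command names as version-dependent. The helper names build_repro_commands and send_commands, and the one-second delay, are hypothetical.

```python
import json
import socket
import time

MIB = 1024 * 1024


def build_repro_commands(dest_uri, cache_sizes):
    """Build the QMP command list mirroring steps 3, 5 and 6.

    cache_sizes[0] is set before the migration starts; the remaining
    sizes are applied one after another while it is running.
    """
    first, *rest = cache_sizes
    commands = [
        # QMP requires capability negotiation before any other command.
        {"execute": "qmp_capabilities"},
        # Step 3: enable xbzrle and set the initial cache size (in bytes).
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "xbzrle", "state": True}]}},
        {"execute": "migrate-set-cache-size", "arguments": {"value": first}},
        # Step 5: start the migration (QMP "migrate" returns immediately).
        {"execute": "migrate", "arguments": {"uri": dest_uri}},
    ]
    # Step 6: toggle the cache size while the migration is in flight.
    commands += [
        {"execute": "migrate-set-cache-size", "arguments": {"value": size}}
        for size in rest
    ]
    return commands


def send_commands(host, port, commands, delay=1.0):
    """Send each command over a QMP TCP socket and collect the replies.

    The delay between commands is an assumption standing in for the
    pauses of someone typing at the monitor.
    """
    replies = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8")
        json.loads(stream.readline())  # discard the QMP greeting banner
        for command in commands:
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            reply = json.loads(stream.readline())
            while "event" in reply:  # skip asynchronous event messages
                reply = json.loads(stream.readline())
            replies.append(reply)
            time.sleep(delay)
    return replies
```

A run matching the report would look something like send_commands("localhost", 6666, build_repro_commands("tcp:t2:5800", [512 * MIB, 1024 * MIB, 512 * MIB, 1024 * MIB])), executed on the src host while the copy loop from step 4 is running in the guest.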
Created attachment 839325
vmcore-dmesg.txt file inside guest
The steps for the above issue (comment 4) and bug 1066338 differ as follows:

This bug: Switch the cache size between 512M and 1G several times (about 4~5 times in my test) after migration starts.

Bug 1066338: Wait until the migration "remaining ram" becomes small, then, before migration finishes, change the cache size to 128M. So in that bug I changed the migration cache size only once.
Hi Qunfang,
Can you retest this on the latest 7.0.z please (at least qemu-kvm-1.5.3-60.el7_0.3 or newer) and the current 7.1 world (at least qemu-kvm-1.5.3-63.el7)?

I'm hoping this is fixed by the fixes in bz 1066338/bz1110191.

Thanks,

Dave
(In reply to Dr. David Alan Gilbert from comment #10)
> Hi Qunfang,
> Can you retest this on the latest 7.0.z please
> (at least qemu-kvm-1.5.3-60.el7_0.3 or newer)
> and the current 7.1 world (at least qemu-kvm-1.5.3-63.el7)
>
> I'm hoping this is fixed by the fixes in bz 1066338/bz1110191.
>
> Thanks,
>
> Dave

Hi Dave,

I just tested both the rhel7.0-z build qemu-kvm-1.5.3-60.el7_0.5.x86_64.rpm and the rhel7.1 build qemu-kvm-1.5.3-66.el7.x86_64.rpm. For each build I did ping-pong migration 10 times. The issue could not be reproduced any more, and the guest works well after migration. The test steps are the same as in comment 0.

So, this bug should be fixed IMO.

Thanks,
Qunfang
Thanks Qunfang; marking as a duplicate of bz 1066338.

*** This bug has been marked as a duplicate of bug 1066338 ***