=Comment: #0=================================================
SANTWANA SAMANTRAY <santwana.samantray.com> -
Local migration of kvm guest fails in Fedora12-Alpha using qemu-kvm.

After the guest is migrated locally using qemu-kvm, it stops responding. Checking the status of the guest using "info status" in the qemu monitor shows the status as "Paused". Issuing "cont" to unpause the guest and start it again produces the BUG and call trace below in the migrated guest.

Attachment: /var/log/messages of the guest

Call Trace:
Sep 7 14:52:40 rhel6 kernel: [<ffffffff8102907f>] ? kvm_mmu_op+0x30/0x55
Sep 7 14:52:40 rhel6 kernel: [<ffffffff810291cd>] ? kvm_deferred_mmu_op+0x46/0x94
Sep 7 14:52:40 rhel6 kernel: [<ffffffff8102927a>] ? kvm_mmu_write+0x33/0x3a
Sep 7 14:52:40 rhel6 kernel: [<ffffffff81029322>] ? kvm_set_pte_at+0x25/0x2a
Sep 7 14:52:40 rhel6 kernel: [<ffffffff810b49c6>] ? __do_fault+0x300/0x3d5
Sep 7 14:52:40 rhel6 kernel: [<ffffffff810b6a3e>] ? handle_mm_fault+0x349/0x7c5
Sep 7 14:52:40 rhel6 kernel: [<ffffffff813ad4b8>] ? do_page_fault+0x5b5/0x9e9
Sep 7 14:52:40 rhel6 kernel: [<ffffffff810bca82>] ? do_mmap_pgoff+0x304/0x367
Sep 7 14:52:40 rhel6 kernel: [<ffffffff8102a12a>] ? default_spin_lock_flags+0x9/0xf
Sep 7 14:52:40 rhel6 kernel: [<ffffffff813aa959>] ? trace_hardirqs_off_thunk+0x3a/0x6c
Sep 7 14:52:40 rhel6 kernel: [<ffffffff813ab015>] ? page_fault+0x25/0x30

Commands used for migration are as below:

qemu-kvm -no-hpet -drive file=/var/lib/libvirt/images/rhel6.raw,if=ide,cache=writeback,index=0 -smp 4 -cpu qemu64,+sse2 -m 2048 -net nic,macaddr=00:21:9B:85:98:E8,model=virtio -net tap,script=/home/qemu-ifup-latest -vnc :1 -name rhel6_qemu

and

qemu-kvm -no-hpet -drive file=/var/lib/libvirt/images/rhel6.raw,if=ide,cache=writeback,index=0 -smp 4 -cpu qemu64,+sse2 -m 2048 -net nic,macaddr=00:21:9B:85:98:E8,model=virtio -net tap,script=/home/qemu-ifup-latest -vnc :2 -name rhel6_qemu_migrate -incoming tcp:0:4564

=Comment: #3=================================================
SANTWANA SAMANTRAY <santwana.samantray.com> -
I tried local migration with the latest version of qemu-kvm (qemu-0.10.92-4.fc12.x86_64). After the local migration of the guest, the messages below were seen in the dmesg output of the guest:

bad partial csum: csum=8448/34203 len=54
bad partial csum: csum=8448/34203 len=54
bad partial csum: csum=8448/34203 len=54

The guest was accessible for some time, but later it stopped responding to keystrokes and mouse movement. Even the ssh session to the guest was unresponsive. The message below was seen in /var/log/messages of the guest just before it stopped responding:

gnome-session[1893]: WARNING: Detected that screensaver has left the bus
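The two qemu-kvm command lines in comment #0 differ only in the VNC display, the guest name, and the -incoming option on the destination. A minimal sketch of that relationship, assuming hypothetical helper names (source_cmd, destination_cmd, monitor_migrate_command) that are not part of qemu-kvm. The monitor command that starts the migration is not shown in the report; the "migrate tcp:..." form below is an assumption about what was typed.

```python
# Illustrative helpers only: they rebuild the command lines quoted in
# comment #0 from one shared configuration, so that the source and
# destination guests are guaranteed to use the same machine definition.

SHARED_ARGS = [
    "-no-hpet",
    "-drive", "file=/var/lib/libvirt/images/rhel6.raw,if=ide,cache=writeback,index=0",
    "-smp", "4",
    "-cpu", "qemu64,+sse2",
    "-m", "2048",
    "-net", "nic,macaddr=00:21:9B:85:98:E8,model=virtio",
    "-net", "tap,script=/home/qemu-ifup-latest",
]


def source_cmd():
    """Command line for the guest that will be migrated away."""
    return ["qemu-kvm", *SHARED_ARGS, "-vnc", ":1", "-name", "rhel6_qemu"]


def destination_cmd(port=4564):
    """Command line for the guest that waits for the incoming migration."""
    return [
        "qemu-kvm", *SHARED_ARGS,
        "-vnc", ":2",
        "-name", "rhel6_qemu_migrate",
        "-incoming", f"tcp:0:{port}",
    ]


def monitor_migrate_command(port=4564):
    """Assumed monitor command typed on the source to start a local migration."""
    return f"migrate tcp:localhost:{port}"


if __name__ == "__main__":
    print(" ".join(source_cmd()))
    print(" ".join(destination_cmd()))
    print(monitor_migrate_command())
```

Keeping the shared arguments in one place makes it harder to start the destination with a different -cpu, -m, or device list by accident, which would be a separate source of migration failures from the one tracked here.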
Created attachment 361658
/var/log/messages of the guest
(In reply to comment #0)
> Sep 7 14:52:40 rhel6 kernel: [<ffffffff8102907f>] ? kvm_mmu_op+0x30/0x55

Okay, looks like a guest kernel PV MMU issue.

2.6.29.4-1.el6.x86_64 is the kernel version.

Santwana: could you try and reproduce with 2.6.30 from F11 updates or using 2.6.31 from F12?
Fedora 12-Alpha was using 2.6.31-rc5 which contains a known pvmmu bug. Should be fixed in 2.6.31. Closing bug, please reopen if problems still persist with FC12.
------- Comment From santwana.samantray.com 2009-10-07 01:25 EDT-------
Hello Redhat,

I verified this issue with the F12 release (kernel version 2.6.31-33.fc12.x86_64) as host, and the issue is still reproducible. After local migration, the guest was accessible for some time, but later it stopped responding to keystrokes and mouse movement. Even the ssh session to the guest was unresponsive. Checking the status of the guest using "info status" in the qemu monitor shows "running", but the guest is still unresponsive.

Thanks,
Santwana
The bug is in the guest code, so you should upgrade the guest kernel as well.
------- Comment From santwana.samantray.com 2009-10-15 02:29 EDT-------
Hello Redhat,

I verified this issue with the F12 release (kernel version 2.6.31-33.fc12.x86_64) as host and guest kernel 2.6.31-27.el6.x86_64. After local migration, the guest did not respond to keystrokes or mouse movement. "info status" in the qemu monitor shows "running", but the guest is still unresponsive.

Thanks,
Santwana
AFAIK the fix which Marcelo thought resolves this was in 2.6.31-27.el6.x86_64, so we may be looking at a different problem
OK, can reproduce it. Migration seems stable with either "no-kvmclock" or

commit 11ed4b344c0eb6f1c5d11a07c307e94174a13900
Author: Glauber Costa <glommer>
Date: Fri Oct 16 15:27:38 2009 -0400

    properly save kvm system time msr registers

    Currently, the msrs involved in setting up pvclock are not saved over
    migration and/or save/restore. This patch puts their value in special
    fields in our CPUState, and deal with them using vmstate.

    kvm also has to account for it, by including them in the msr list
    for the ioctls.
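The effect described in the commit message can be illustrated with a toy model. This is only a sketch, not QEMU code: the field names, the save_state/load_state helpers, and the use of bit 0 of the system time MSR as the "enabled" flag are simplifying assumptions made for illustration. It shows that a field left out of the saved state comes up as zero on the destination, so the paravirtual clock the guest had set up is no longer enabled after migration.

```python
# Toy model of CPU state save/restore across migration.
# All names here are hypothetical and chosen for illustration only.

BASE_FIELDS = ["rip", "rsp"]
PVCLOCK_FIELDS = ["wall_clock_msr", "system_time_msr"]
ALL_FIELDS = BASE_FIELDS + PVCLOCK_FIELDS


def save_state(cpu, fields):
    """Serialize only the listed fields, as a migration stream would."""
    return {name: cpu[name] for name in fields}


def load_state(saved):
    """Rebuild CPU state on the destination; unsaved fields start at 0."""
    return {name: saved.get(name, 0) for name in ALL_FIELDS}


def pvclock_enabled(cpu):
    """Assumption: bit 0 of the system time MSR means 'clock enabled'."""
    return bool(cpu["system_time_msr"] & 1)


if __name__ == "__main__":
    source = {
        "rip": 0xFFFFFFFF8102907F,
        "rsp": 0xFFFF880000001000,
        "wall_clock_msr": 0x1233000,
        "system_time_msr": 0x1234000 | 1,  # guest enabled its pvclock
    }

    # Before the fix: the pvclock MSRs are not part of the saved state.
    before = load_state(save_state(source, BASE_FIELDS))
    # After the fix: the MSRs travel with the rest of the CPU state.
    after = load_state(save_state(source, ALL_FIELDS))

    print("pvclock enabled before fix:", pvclock_enabled(before))
    print("pvclock enabled after fix: ", pvclock_enabled(after))
```

This is also consistent with the "no-kvmclock" workaround mentioned above: a guest that never uses the paravirtual clock has nothing to lose when those registers are dropped.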
Glauber: should the fix Marcelo points out be backported to 0.11.0 for Fedora 12?
Yes. If we have vmstate in place, it should be quite easy. If not, I am backporting it to RHEL5, and we can use the same patch
There's no vmstate in 0.11.0.

Note, this is on F12VirtBlocker and it's less than two weeks to F12 GA freeze.
Ok, so we'll probably be able to use the same patch I'll write for RHEL. I'll work out something.
This should be fixed with:

* Wed Oct 21 2009 Glauber Costa <gcosta> - 2:0.11.0-8
- Properly save kvm time registers (#524229)
------- Comment From santwana.samantray.com 2009-10-27 04:40 EDT-------
Hi,

I was able to reproduce this issue in the latest F12 rawhide (kernel version 2.6.31.5-96.fc12.x86_64). The guest becomes unresponsive after migration. However, with the "no-kvmclock" option, the guest stays responsive after local migration.

Thanks,
Santwana
The fix is in qemu-kvm-0.11.0-8. Please make sure that you are using this version of qemu-kvm.
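A quick way to check an installed package string against the fixed build is sketched below. This is a simplified comparison, assuming purely numeric dotted versions and a release that starts with a number; it is not RPM's real version comparison algorithm, and the parse_evr and has_fix names are hypothetical.

```python
import re


def parse_evr(version_release):
    """Split a string such as '0.11.0-8.fc12' into ((0, 11, 0), 8).

    Simplification: only the leading number of the release is used and
    the version is assumed to be dotted integers.
    """
    version, _, release = version_release.partition("-")
    release_number = int(re.match(r"\d+", release).group())
    return tuple(int(part) for part in version.split(".")), release_number


def has_fix(installed, fixed="0.11.0-8"):
    """True if the installed version-release is at least the fixed build."""
    return parse_evr(installed) >= parse_evr(fixed)


if __name__ == "__main__":
    for candidate in ("0.11.0-7.fc12", "0.11.0-8.fc12"):
        print(candidate, "has fix:", has_fix(candidate))
```

On a real system the installed version-release string would come from the package manager (for example the output of an rpm query); how it is obtained is left out of this sketch.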
------- Comment From santwana.samantray.com 2009-10-29 05:33 EDT-------
Hello Redhat,

The qemu-kvm version in the latest F12 rawhide (kernel version 2.6.31.5-96.fc12.x86_64) is qemu-kvm-0.11.0-7.fc12.x86_64. Can you give us a pointer for downloading "qemu-kvm-0.11.0-8", so that we can verify with that release and update the bug?

Thanks,
Santwana
Should be in rawhide since the 27th:

http://www.redhat.com/archives/fedora-test-list/2009-October/msg00674.html

You can also download it from:

http://koji.fedoraproject.org/koji/buildinfo?buildID=137730
------- Comment From santwana.samantray.com 2009-10-29 08:50 EDT-------
Hello Redhat,

Thanks for the link. It has been in rawhide since the 27th. After installing "qemu-kvm-0.11.0-8", the issue is resolved. We can close this issue.

Thanks for your support,
Santwana

------- Comment From bnpoorni.com 2009-10-29 08:55 EDT-------
Closing as per the above comment...