Red Hat Bugzilla – Bug 517230
Guest VM freeze during live migration
Last modified: 2009-10-26 15:16:02 EDT
Description of problem:
When I initiate a live migration of a RHEL 5.3 guest VM on a Fedora 11 host, the migration succeeds, but the guest VM on the target physical host is frozen. It does not respond to ping either.
Version-Release number of selected component (if applicable):
Fedora 11 - Host
RHEL 5.3 - Guest
Steps to Reproduce:
1. Create a RHEL 5.3 Guest VM
2. Initiate a migration using `migrate --live qemu+tcp://target-host/system`
3. The migration succeeds.
4. On target-host, check the VM status. It is frozen.
Actual results:
Guest VM freezes
Expected results:
Guest VM shouldn't freeze
Additional info:
I've followed the configuration and recommendations on the KVM Migration page.
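For reference, step 2 could be scripted roughly as follows. This is only a sketch: the domain name `rhel53-guest` is hypothetical (the report does not name the guest), and the helper just prints the virsh command rather than running it.

```shell
#!/bin/sh
# Sketch only: print the virsh command for a live migration.
# "rhel53-guest" is a hypothetical domain name; "target-host" is the
# placeholder host name used in the report.
build_migrate_cmd() {
    domain="$1"
    target="$2"
    printf 'virsh migrate --live %s qemu+tcp://%s/system\n' "$domain" "$target"
}

build_migrate_cmd rhel53-guest target-host
```

After the migration, running `virsh list --all` on the target shows whether libvirt considers the guest running, which is separate from whether the guest OS is actually responsive.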
What specific version of the qemu and libvirt packages are you running? The versions that shipped with F-11 had a number of migration problems, which should be fixed in the versions in updates-testing. There's also a difference in behavior in the very newest qemu packages that will require a libvirt update to address. So knowing these versions should give us a first step to try to help.
The test was done on an up-to-date Fedora 11, but without the updates-testing repo.
Ritesh: could you attach the guest log file from /var/log/libvirt/qemu on the target host?
You could also try running libvirtd on both sides with LIBVIRT_DEBUG and attaching those log files
See also http://fedoraproject.org/wiki/Reporting_virtualization_bugs for other things to help with debugging
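A rough sketch of how the requested logs might be collected. The domain name and the debug output file are hypothetical, and the helpers only print the path and command line. It assumes the libvirtd service has been stopped so the daemon can be run by hand in the foreground.

```shell
#!/bin/sh
# Sketch only: helpers for collecting the logs requested above.
# The domain name and output file passed in are hypothetical.

# Path of the per-guest qemu log that libvirt writes on a host.
guest_log_path() {
    printf '/var/log/libvirt/qemu/%s.log\n' "$1"
}

# Command line for running libvirtd in the foreground with debug
# output captured to a file (stop the libvirtd service first).
debug_libvirtd_cmd() {
    printf 'LIBVIRT_DEBUG=1 libvirtd --verbose > %s 2>&1\n' "$1"
}

guest_log_path rhel53-guest
debug_libvirtd_cmd /tmp/libvirtd-debug.log
```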
(In reply to comment #1)
> What specific version of the qemu and libvirt packages are you running? The
> versions that shipped with F-11 had a number of migration problems, which
> should be fixed in the versions in updates-testing.
What fixes are you talking about?
> There's also a difference
> in behavior in the very newest qemu packages that will require a libvirt update
> to address.
Yep, that's bug #516187 - that qemu change is only in 0.10.6, so that's not the problem here
(In reply to comment #3)
> Ritesh: could you attach the guest log file from /var/log/libvirt/qemu on the
> target host?
> You could also try running libvirtd on both sides with LIBVIRT_DEBUG and
> attaching those log files
Running in debug mode made no difference.
I didn't mention earlier that virt-manager would not allow me to migrate. I think you have a bugzilla for that.
I used virsh to migrate using qemu+tcp.
On the target, the VM gets migrated. virsh on the target lists the VM, and libvirtd shows that a process (with the same command line) has been created.
But as I said in this report, it is the VM that is frozen/dead/crashed.
I tried RHEL5 and Debian kFreeBSD guests.
RHEL5 was frozen. I suspect the VGA driver. RHEL5 was running in init 5.
For Debian kFreeBSD, I noticed that after the migration it had reset itself and was at the FreeBSD boot loader prompt. *BUT* the boot wouldn't proceed. So basically, the process is corrupt.
I suspect it could be the video driver that qemu is trying to emulate.
BTW, in good old qemu, Ctrl + Alt + 2 switched to the monitor. How do I do the same in virsh/virt-manager?
Created attachment 357529
Shared storage, accessible from both physical hosts, was used to host the VM image files.
To keep things simple and to be sure that this is a real bug, sVirt was disabled for this test.
The only KVM-related messages I can find on the machine are these:
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdf7146
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
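For readers decoding these lines: in Intel's architectural MSR numbering, 0x186 is IA32_PERFEVTSEL0 and 0xc1 is IA32_PMC0, so the guest appears to be programming a performance counter that this KVM does not emulate. A small sketch that annotates such lines (the helper names are made up for illustration):

```shell
#!/bin/sh
# Sketch only: map the MSR addresses seen in the kvm messages to their
# architectural names (Intel numbering). Helper names are illustrative.
msr_name() {
    case "$1" in
        0x186) echo "IA32_PERFEVTSEL0" ;;
        0xc1)  echo "IA32_PMC0" ;;
        *)     echo "unknown" ;;
    esac
}

# Read "kvm: ... wrmsr: <addr> data <value>" lines on stdin and append
# the register name to each line that contains an MSR address.
annotate_wrmsr() {
    while read -r line; do
        addr=$(printf '%s\n' "$line" | sed -n 's/.*wrmsr: \(0x[0-9a-f]*\) data.*/\1/p')
        if [ -n "$addr" ]; then
            printf '%s  # %s\n' "$line" "$(msr_name "$addr")"
        fi
    done
}

printf '%s\n' 'kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079' | annotate_wrmsr
```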
Created attachment 357530
(In reply to comment #3)
> (In reply to comment #1)
> > What specific version of the qemu and libvirt packages are you running? The
> > versions that shipped with F-11 had a number of migration problems, which
> > should be fixed in the versions in updates-testing.
> What fixes are you talking about?
The ones that went in (the F-11) package here:
* Tue May 12 2009 Mark McLoughlin <firstname.lastname@example.org> - 2:0.10.4-1
- Update to 0.10.4
- Fix yet more qcow2 corruption (#498405)
- AIO cancellation fixes (#497170)
- Fix VPC image size overflow (#491981)
- Fix oops with 2.6.25 virtio guest (#470386)
- Enable pulseaudio driver (#495964, #496627)
- Fix cpuid initialization
- Fix HPET emulation
- Fix storage hotplug error handling
- Migration fixes
- Block range checking fixes
- Make PCI config status register read-only
- Handle newer Xorg keymap names
- Don't leak memory on NIC hot-unplug
- Hook up keypad keys for qemu console emulation
- Correctly run on kernels lacking mmu notifiers
- Support DDIM option ROMs
- Fix PCI NIC error handling
That was shipped post F-11 GA, so I wasn't sure if he was still using the GA packages or not. Apparently not, so it's got to be something else.
Okay, I have no real idea what the problem is here; it needs someone to dig into it
Thanks for the data
Apart from the kernel messages (which I think are safe to ignore), I am not sure where to start.
The emulator IMO is not frozen/hung because in the kFreeBSD example, it did accept the key press and then froze.
Okay!! Just to re-cap.
* I'm omitting the RHEL5 VM Migration because I couldn't find much info on what really happened with its state.
* On the kFreeBSD VM, after the migration (which the source reported as successful), the VM's OS on the target host had been reset and was sitting at the boot loader prompt. So the migration had actually failed. Proceeding from that state also failed, which suggests the process state was garbage as well.
So, it looks like a migration problem, where it does migrate the process, but then corrupts it.
One route to get further with this is to try and reproduce without libvirt
http://wiki.libvirt.org/page/QEMUSwitchToLibvirt might be of some help
If we can reproduce without libvirt, it'll help us narrow down the problem
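A rough sketch of what a libvirt-free reproduction could look like, using qemu's own incoming-migration option and monitor command. Port 4444 and the `<same guest options>` placeholder are illustrative only. The actual guest options would come from the command line libvirt logs for the guest, and the binary name may differ by distribution.

```shell
#!/bin/sh
# Sketch only: commands for a qemu-to-qemu live migration without libvirt.
# Port 4444 and "<same guest options>" are illustrative placeholders.

# Target host: start qemu with the same guest options as the source,
# waiting for incoming migration data on the given TCP port.
incoming_cmd() {
    printf 'qemu-kvm <same guest options> -incoming tcp:0:%s\n' "$1"
}

# Source host: command to type in the qemu monitor of the running guest.
monitor_migrate_cmd() {
    printf 'migrate -d tcp:%s:%s\n' "$1" "$2"
}

incoming_cmd 4444
monitor_migrate_cmd target-host 4444
```

While the transfer runs, `info migrate` in the source monitor reports its status. If the guest still ends up frozen or reset on the target, libvirt is out of the picture.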
Ritesh - any updates here, or will you just be waiting for RHEL 6 now? Will close based on that - please re-open if needed.