Bug 517230
Field | Value | Field | Value
---|---|---|---
Summary: | Guest VM freeze during live migration | |
Product: | [Fedora] Fedora | Reporter: | Ritesh Raj Sarraf <rsarraf>
Component: | libvirt | Assignee: | Daniel Veillard <veillard>
Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 11 | CC: | andriusb, berrange, clalance, coughlan, crobinso, ehabkost, gcosta, itamar, markmc, quintela, veillard, virt-maint, xdl-redhat-bugzilla
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2009-10-26 19:16:02 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 480594 | |
Attachments: | | |
Description
Ritesh Raj Sarraf
2009-08-13 06:15:43 UTC
What specific version of the qemu and libvirt packages are you running? The versions that shipped with F-11 had a number of migration problems, which should be fixed in the versions in updates-testing. There's also a difference in behavior in the very newest qemu packages that will require a libvirt update to address. So knowing these versions should give us a first step to try to help.

Chris Lalancette

qemu-kvm-0.10.5-3.fc11.x86_64
libvirt-0.6.2-13.fc11.x86_64
libvirt-python-0.6.2-13.fc11.x86_64

The test was done on an up-to-date Fedora 11. But not with the testing repo.

Ritesh: could you attach the guest log file from /var/log/libvirt/qemu on the target host?

You could also try running libvirtd on both sides with LIBVIRT_DEBUG and attaching those log files

See also http://fedoraproject.org/wiki/Reporting_virtualization_bugs for other things to help with debugging

(In reply to comment #1)
> What specific version of the qemu and libvirt packages are you running? The
> versions that shipped with F-11 had a number of migration problems, which
> should be fixed in the versions in updates-testing.

What fixes are you talking about?

> There's also a difference
> in behavior in the very newest qemu packages that will require a libvirt update
> to address.

Yep, that's bug #516187 - that qemu change is only in 0.10.6, so that's not the problem here

(In reply to comment #3)
> Ritesh: could you attach the guest log file from /var/log/libvirt/qemu on the
> target host?
>

Sure. Attached.

> You could also try running libvirtd on both sides with LIBVIRT_DEBUG and
> attaching those log files
>

Even when run in debug mode, things weren't different.

I didn't mention earlier that virt-manager would not allow me to migrate. I think you have a bugzilla for that. I used virsh to migrate using qemu+tcp.

On the target, the VM gets migrated. virsh, on the target lists the VM. libvirtd shows that a process (with the same syntax) is created.
But as I said in this report, it is the VM that is frozen/dead/crashed.

I tried RHEL5 and Debian kFreeBSD. RHEL5 was frozen. I suspect the vga driver. RHEL5 was running in init 5.

For Debian kFreeBSD, I noticed that after the migration it had reset itself and was at the FreeBSD bootloader prompt. *BUT* the boot wouldn't proceed. So basically, the process is corrupt. I suspect it could be the video driver that qemu is trying to emulate.

BTW, in good old qemu, Alt + 2 worked. How do I emulate the same in virsh/virt-manager?

Created attachment 357529
log file
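The migration described above (virsh, with the destination reached over qemu+tcp) can be sketched as a small command builder. This is a minimal sketch, not the reporter's exact invocation: the guest name `kfreebsd-guest`, the host `target.example.com`, and the helper `build_migrate_command` are hypothetical, and the use of `--live` and of `qemu:///system` as the source connection are assumptions based on the bug summary.

```python
import shlex


def build_migrate_command(domain, dest_host, live=True):
    """Return the argv for a virsh migration to dest_host over qemu+tcp.

    Assumption: both hosts run the system libvirtd instance, so the
    source URI is qemu:///system and the destination ends in /system.
    """
    dest_uri = f"qemu+tcp://{dest_host}/system"
    argv = ["virsh", "--connect", "qemu:///system", "migrate"]
    if live:
        # Keep the guest running while its memory is transferred.
        argv.append("--live")
    argv += [domain, dest_uri]
    return argv


if __name__ == "__main__":
    # Hypothetical guest and host names, for illustration only.
    cmd = build_migrate_command("kfreebsd-guest", "target.example.com")
    print(shlex.join(cmd))
```

Building the argv as a list rather than a single string avoids shell quoting problems if the command is later passed to `subprocess.run`.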
Shared storage, accessible from both physical hosts, was used to host the VM image files.
To keep things simple and to be sure that this was a real bug, svirt was disabled in this test.
The only kvm relevant messages I can find on the machine are these:

kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdf7146
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079

Created attachment 357530
more logs
cpuinfo
lspci -vvv
dmidecode
virsh-capabilities
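The kvm messages quoted above can be decoded with a short parser, which may help when judging whether they matter. This is a sketch under stated assumptions: the field after `kvm:` is presumably the process ID of the qemu process, and the MSR names in `MSR_NAMES` are an annotation taken from Intel's architectural MSR list (0x186 is IA32_PERFEVTSEL0, 0xC1 is IA32_PMC0), not something stated in this report. Both are performance-counter registers, which would fit writes from a guest's performance-monitoring code.

```python
import re

# Matches lines such as:
#   kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
KVM_WRMSR = re.compile(
    r"kvm: (?P<pid>\d+): cpu(?P<cpu>\d+) unimplemented perfctr wrmsr: "
    r"(?P<msr>0x[0-9a-fA-F]+) data (?P<data>0x[0-9a-fA-F]+)"
)

# Annotation (not from the report): Intel architectural MSR names.
MSR_NAMES = {0x186: "IA32_PERFEVTSEL0", 0xC1: "IA32_PMC0"}


def parse_wrmsr(log_text):
    """Extract every 'unimplemented perfctr wrmsr' event from log_text."""
    events = []
    for match in KVM_WRMSR.finditer(log_text):
        msr = int(match.group("msr"), 16)
        events.append({
            "pid": int(match.group("pid")),
            "cpu": int(match.group("cpu")),
            "msr": msr,
            "name": MSR_NAMES.get(msr, "unknown"),
            "data": int(match.group("data"), 16),
        })
    return events


# The three messages quoted in this report.
SAMPLE = """kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdf7146
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079"""

if __name__ == "__main__":
    for e in parse_wrmsr(SAMPLE):
        print(f"pid={e['pid']} cpu={e['cpu']} msr={e['msr']:#x} "
              f"({e['name']}) data={e['data']:#x}")
```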
(In reply to comment #3)
> (In reply to comment #1)
> > What specific version of the qemu and libvirt packages are you running? The
> > versions that shipped with F-11 had a number of migration problems, which
> > should be fixed in the versions in updates-testing.
>
> What fixes are you talking about?

The ones that went in (the F-11) package here:

* Tue May 12 2009 Mark McLoughlin <markmc> - 2:0.10.4-1
- Update to 0.10.4
- Fix yet more qcow2 corruption (#498405)
- AIO cancellation fixes (#497170)
- Fix VPC image size overflow (#491981)
- Fix oops with 2.6.25 virtio guest (#470386)
- Enable pulseaudio driver (#495964, #496627)
- Fix cpuid initialization
- Fix HPET emulation
- Fix storage hotplug error handling
- Migration fixes
- Block range checking fixes
- Make PCI config status register read-only
- Handle newer Xorg keymap names
- Don't leak memory on NIC hot-unplug
- Hook up keypad keys for qemu console emulation
- Correctly run on kernels lacking mmu notifiers
- Support DDIM option ROMs
- Fix PCI NIC error handling

That was shipped post F-11 GA, so I wasn't sure if he was still using the GA packages or not. Apparently not, so it's got to be something else.

Chris Lalancette

Okay, no real idea what the problem here is, it needs someone to dig down into it

Thanks for the data

Apart from the kernel messages (which I think is safe to ignore), I am not sure where to start with. The emulator IMO is not frozen/hung because in the kFreeBSD example, it did accept the key press and then froze.

Okay!! Just to re-cap.

* I'm omitting the RHEL5 VM Migration because I couldn't find much info on what really happened with its state.
* On the kFreeBSD VM, upon Migration (which is reported as Successful on the source), on the target host the VM's OS had been reset. It was at the boot loader prompt. So actually, the migration had failed. And, proceeding further from that state also failed, which would mean that the process also was garbage.
So, it looks like a migration problem, where it does migrate the process, but then corrupts it.

One route to get further with this is to try and reproduce without libvirt

http://wiki.libvirt.org/page/QEMUSwitchToLibvirt might be of some help

If we can reproduce without libvirt, it'll help us narrow down the problem

Ritesh - any updates here, or will you just be waiting for RHEL 6 now?

Will close based on that - please re-open if needed.
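The suggestion above to reproduce without libvirt can be sketched as follows. QEMU's own migration works by starting the destination with the same command line as the source plus `-incoming`, then issuing `migrate` in the source's monitor. This is only a sketch under assumptions: the command line shown is a hypothetical, heavily shortened one (the real one would be copied from the guest log under /var/log/libvirt/qemu), and the port 4444, the host name, and both helper functions are illustrative.

```python
def destination_argv(source_argv, port):
    """Return the source command line plus -incoming, so the destination
    QEMU starts paused and waits for migration data on the given TCP port."""
    return list(source_argv) + ["-incoming", f"tcp:0:{port}"]


def monitor_migrate_command(dest_host, port):
    """Return the command to type in the source QEMU monitor.

    -d detaches, so the monitor stays usable while migration runs.
    """
    return f"migrate -d tcp:{dest_host}:{port}"


if __name__ == "__main__":
    # Hypothetical command line; the image must sit on storage shared
    # by both hosts, as in the setup described in this report.
    source = ["qemu-kvm", "-m", "512", "-hda", "/shared/guest.img"]
    print(" ".join(destination_argv(source, 4444)))
    print(monitor_migrate_command("target.example.com", 4444))
```

If the guest still resets or freezes on the destination when migrated this way, libvirt is unlikely to be at fault, which is the narrowing-down step asked for above.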