Bug 517230

Summary: Guest VM freeze during live migration
Product: [Fedora] Fedora
Reporter: Ritesh Raj Sarraf <rsarraf>
Component: libvirt
Assignee: Daniel Veillard <veillard>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: medium
Version: 11
CC: andriusb, berrange, clalance, coughlan, crobinso, ehabkost, gcosta, itamar, markmc, quintela, veillard, virt-maint, xdl-redhat-bugzilla
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-10-26 19:16:02 UTC
Bug Blocks: 480594
Attachments:
log file (flags: none)
more logs (flags: none)

Description Ritesh Raj Sarraf 2009-08-13 06:15:43 UTC
Description of problem:
When I initiate a live migration of a RHEL 5.3 guest VM on a Fedora 11 host, the migration succeeds, but the guest VM on the target physical host is frozen. It does not respond to ping over the network either.


Version-Release number of selected component (if applicable):
Fedora 11  - Host
RHEL 5.3 - Guest

How reproducible:
Always

Steps to Reproduce:
1. Create a RHEL 5.3 Guest VM
2. Initiate a migration using `migrate --live qemu+tcp://target-host/system`
3. Migration will succeed.
4. Go to target-host and see VM status. It is frozen.
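
The steps above can be sketched as the following virsh invocations. This is only a dry-run sketch: the guest name `rhel53-guest` is a hypothetical placeholder (the report does not name the domain), `target-host` is the placeholder from step 2, and the `run` helper is illustrative. Commands are printed rather than executed unless `EXECUTE=1` is set.

```shell
#!/bin/sh
# Hypothetical placeholders: the report does not give the real domain name.
GUEST="${GUEST:-rhel53-guest}"
TARGET="${TARGET:-target-host}"

# Print a command; run it only when EXECUTE=1 is set in the environment.
run() {
    echo "+ $*"
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    fi
}

# Step 2: live-migrate the guest to the target over qemu+tcp.
run virsh migrate --live "$GUEST" "qemu+tcp://$TARGET/system"

# Step 4: ask the target's libvirtd what state it reports for the guest.
run virsh --connect "qemu+tcp://$TARGET/system" domstate "$GUEST"
```

A guest that libvirt reports as running can still be hung internally, so a check from the network side (for example a ping to the guest) is needed as well.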

Actual results:
Guest VM freezes

Expected results:
Guest VM shouldn't freeze

Additional info:
I followed the configuration recommendations given on the KVM Migration page.

Comment 1 Chris Lalancette 2009-08-13 06:57:53 UTC
What specific versions of the qemu and libvirt packages are you running?  The versions that shipped with F-11 had a number of migration problems, which should be fixed in the versions in updates-testing.  There's also a difference in behavior in the very newest qemu packages that will require a libvirt update to address.  So knowing these versions should give us a first step to try to help.

Chris Lalancette

Comment 2 Ritesh Raj Sarraf 2009-08-13 07:23:48 UTC
qemu-kvm-0.10.5-3.fc11.x86_64
libvirt-0.6.2-13.fc11.x86_64
libvirt-python-0.6.2-13.fc11.x86_64


The test was done on an up-to-date Fedora 11, but without the updates-testing repo.

Comment 3 Mark McLoughlin 2009-08-14 18:05:45 UTC
Ritesh: could you attach the guest log file from /var/log/libvirt/qemu on the target host?

You could also try running libvirtd on both sides with LIBVIRT_DEBUG and attaching those log files
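
A minimal sketch of what running libvirtd with debug logging might look like. It assumes the `LIBVIRT_DEBUG=1` environment setting and the daemon's `--verbose` flag behave as documented for libvirt of that era, and that the host manages the daemon with `service`. The `run` helper is illustrative, and commands are only printed unless `EXECUTE=1` is set.

```shell
#!/bin/sh
# Print a command; run it only when EXECUTE=1 is set in the environment.
run() {
    echo "+ $*"
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    fi
}

# Stop the packaged daemon so a foreground instance can take over.
run service libvirtd stop

# Start libvirtd in the foreground with debug logging enabled.
# Assumption: the debug output is written to stderr, so redirect
# stderr to a file if the log is to be attached to the bug.
run env LIBVIRT_DEBUG=1 libvirtd --verbose
```

The same would be done on both the source and the target host before repeating the migration.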

See also http://fedoraproject.org/wiki/Reporting_virtualization_bugs for other things to help with debugging


(In reply to comment #1)
> What specific version of the qemu and libvirt packages are you running?  The
> versions that shipped with F-11 had a number of migration problems, which
> should be fixed in the versions in updates-testing.

What fixes are you talking about?

> There's also a difference
> in behavior in the very newest qemu packages that will require a libvirt update
> to address.

Yep, that's bug #516187 - that qemu change is only in 0.10.6, so that's not the problem here

Comment 4 Ritesh Raj Sarraf 2009-08-15 06:41:51 UTC
(In reply to comment #3)
> Ritesh: could you attach the guest log file from /var/log/libvirt/qemu on the
> target host?
> 

Sure. Attached.

> You could also try running libvirtd on both sides with LIBVIRT_DEBUG and
> attaching those log files
>

Even with libvirtd running in debug mode, the behaviour was no different.
I didn't mention earlier that virt-manager would not allow me to migrate. I think you have a bugzilla for that.
I used virsh to migrate using qemu+tcp.
The VM does get migrated to the target: virsh on the target lists the VM, and libvirtd shows that a process (with the same syntax) has been created.
But as I said in this report, it is the VM itself that is frozen/dead/crashed.



I tried two guests: RHEL5 and Debian kFreeBSD.
RHEL5 was frozen. I suspect the VGA driver. RHEL5 was running in init 5.
For Debian kFreeBSD, I noticed that after the migration the guest had reset itself and was sitting at the FreeBSD bootloader prompt. *BUT* the boot wouldn't proceed from there. So basically, the process is corrupt.

I suspect it could be the video driver that qemu is trying to emulate.

BTW, in good old qemu, Alt + 2 worked. How do I emulate the same in virsh/virt-manager?

Comment 5 Ritesh Raj Sarraf 2009-08-15 06:44:38 UTC
Created attachment 357529 [details]
log file

Shared storage, accessible from both physical hosts, was used to host the VM image files.

To keep things simple and to be sure that this was a real bug, sVirt was disabled for this test.

Comment 6 Ritesh Raj Sarraf 2009-08-15 06:57:56 UTC
The only KVM-relevant messages I can find on the machine are these:

kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdf7146
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
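
For reference, the MSR addresses in these messages can be matched against Intel's architectural MSR list, where 0x186 is IA32_PERFEVTSEL0 and 0xc1 is IA32_PMC0, both performance-monitoring registers. A small sketch that annotates the lines above follows; the `msr_name` helper is illustrative and not part of any tool, and it covers only the two addresses seen here.

```shell
#!/bin/sh
# Map an MSR address from a kvm "unimplemented ... wrmsr" message to its
# Intel architectural name. Only the two addresses seen above are covered.
msr_name() {
    case "$1" in
        0x186) echo "IA32_PERFEVTSEL0" ;;
        0xc1)  echo "IA32_PMC0" ;;
        *)     echo "unknown" ;;
    esac
}

# Extract the MSR address from each message and print it with its name.
while read -r line; do
    msr=$(echo "$line" | sed -n 's/.*wrmsr: \(0x[0-9a-f]*\) data.*/\1/p')
    echo "$msr $(msr_name "$msr")"
done <<'EOF'
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdf7146
kvm: 8243: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
EOF
```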

Comment 7 Ritesh Raj Sarraf 2009-08-15 07:02:32 UTC
Created attachment 357530 [details]
more logs

cpuinfo
lspci -vvv
dmidecode
virsh-capabilities

Comment 8 Chris Lalancette 2009-08-17 08:41:10 UTC
(In reply to comment #3)
> (In reply to comment #1)
> > What specific version of the qemu and libvirt packages are you running?  The
> > versions that shipped with F-11 had a number of migration problems, which
> > should be fixed in the versions in updates-testing.
> 
> What fixes are you talking about?

The ones that went into the F-11 package here:

* Tue May 12 2009 Mark McLoughlin <markmc> - 2:0.10.4-1
- Update to 0.10.4
- Fix yet more qcow2 corruption (#498405)
- AIO cancellation fixes (#497170)
- Fix VPC image size overflow (#491981)
- Fix oops with 2.6.25 virtio guest (#470386)
- Enable pulseaudio driver (#495964, #496627)
- Fix cpuid initialization
- Fix HPET emulation
- Fix storage hotplug error handling
- Migration fixes
- Block range checking fixes
- Make PCI config status register read-only
- Handle newer Xorg keymap names
- Don't leak memory on NIC hot-unplug
- Hook up keypad keys for qemu console emulation
- Correctly run on kernels lacking mmu notifiers
- Support DDIM option ROMs
- Fix PCI NIC error handling

That was shipped post F-11 GA, so I wasn't sure if he was still using the GA packages or not.  Apparently not, so it's got to be something else.

Chris Lalancette

Comment 9 Mark McLoughlin 2009-08-18 11:04:43 UTC
Okay, no real idea what the problem is here. It needs someone to dig down into it.

Thanks for the data

Comment 10 Ritesh Raj Sarraf 2009-08-18 13:59:11 UTC
Apart from the kernel messages (which I think are safe to ignore), I am not sure where to start.

The emulator IMO is not frozen/hung because in the kFreeBSD example, it did accept the key press and then froze.

Okay!! Just to re-cap.

* I'm omitting the RHEL5 VM migration because I couldn't find much info on what really happened to its state.
* On the kFreeBSD VM, after the migration (which is reported as successful on the source), the VM's OS on the target host had been reset. It was at the boot loader prompt. So the migration had actually failed. Proceeding further from that state also failed, which would mean that the process itself was garbage.

So, it looks like a migration problem, where it does migrate the process, but then corrupts it.

Comment 11 Mark McLoughlin 2009-08-18 14:20:18 UTC
One route to get further with this is to try and reproduce without libvirt

http://wiki.libvirt.org/page/QEMUSwitchToLibvirt might be of some help

If we can reproduce without libvirt, it'll help us narrow down the problem
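
A rough sketch of what a libvirt-free reproduction could look like, using QEMU's own `-incoming` option on the destination and the `migrate` monitor command on the source. All values here are hypothetical placeholders rather than details from this report: the port, the image path, and the `-m 1024 -hda` guest options stand in for whatever command line libvirt generated for the real guest. The script only prints the commands to use.

```shell
#!/bin/sh
# Hypothetical placeholders; none of these values come from the report.
TARGET="${TARGET:-target-host}"
PORT="${PORT:-4444}"
IMAGE="${IMAGE:-/path/to/shared/guest.img}"

# Command line for the destination: same guest options as the source,
# plus -incoming so QEMU waits for the migration stream.
incoming_cmd() {
    echo "qemu-kvm -m 1024 -hda $1 -incoming tcp:0:$2"
}

# Monitor command to type on the source QEMU to start the migration.
monitor_cmd() {
    echo "migrate -d tcp:$1:$2"
}

echo "on target:         $(incoming_cmd "$IMAGE" "$PORT")"
echo "on source monitor: $(monitor_cmd "$TARGET" "$PORT")"
echo "on source monitor: info migrate"
```

The source guest would need to be started with the same guest options as the destination so that both sides agree on the machine configuration.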

Comment 12 Andrius Benokraitis 2009-10-26 19:16:02 UTC
Ritesh - any updates here, or will you just be waiting for RHEL 6 now? Will close based on that - please re-open if needed.