Bug 753169

Summary: QEMU driver mistakenly passes a plain file FD to QEMU for migration
Product: Red Hat Enterprise Linux 6 Reporter: Daniel Berrangé <berrange>
Component: libvirt Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 6.3 CC: acathrow, ajia, dallan, dyuan, eblake, juzhang, mzhan, rwu, weizhan
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-0.9.10-1.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-20 06:36:33 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Daniel Berrangé 2011-11-11 14:07:10 UTC
Description of problem:
The QEMU migration 'fd' protocol requires that any FD passed to it properly supports non-blocking I/O, i.e. that it will report EAGAIN when an operation would block.

File handles for regular files or block devices do not respect this.

In the libvirt O_DIRECT codepath we spawn the libvirt_iohelper and pass QEMU a pipe FD. We should use the I/O helper in the non-O_DIRECT codepath too.

Failure to respect this leads to blocking the QEMU main loop, which causes the migration to become non-live.

While this isn't a problem for save-to-file, it *is* particularly bad for the libvirt core dump API, since we want the guest to remain live.
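
The difference between the two kinds of FD can be seen with a minimal sketch, assuming a Linux or other POSIX host with Python 3.5 or later. The helper name fill_until_eagain and the byte limits are illustrative only and have nothing to do with libvirt or QEMU code. A pipe with O_NONBLOCK set reports EAGAIN once its buffer is full, whereas a regular file accepts the flag but never reports EAGAIN, so a writer cannot tell when it is about to stall:

```python
import os
import tempfile


def fill_until_eagain(fd, chunk=b"x" * 65536, limit=64 * 1024 * 1024):
    """Write to fd in non-blocking mode until EAGAIN or until `limit` bytes.

    Returns (bytes_written, saw_eagain).
    """
    os.set_blocking(fd, False)  # sets O_NONBLOCK on the descriptor
    written = 0
    while written < limit:
        try:
            written += os.write(fd, chunk)
        except BlockingIOError:  # Python's mapping of EAGAIN / EWOULDBLOCK
            return written, True
    return written, False


# A pipe with nobody reading: the kernel buffer fills up and write() reports
# EAGAIN, which is what an event loop needs in order to go do other work.
read_end, write_end = os.pipe()
pipe_written, pipe_eagain = fill_until_eagain(write_end)
os.close(read_end)
os.close(write_end)

# A regular file: O_NONBLOCK is accepted but has no effect, so each write()
# runs to completion (possibly waiting on the disk) and never reports EAGAIN.
with tempfile.TemporaryFile() as f:
    file_written, file_eagain = fill_until_eagain(f.fileno(), limit=1024 * 1024)

print("pipe:", pipe_written, "bytes, EAGAIN seen:", pipe_eagain)
print("file:", file_written, "bytes, EAGAIN seen:", file_eagain)
```

On a slow or busy disk, the writes to the regular file are the ones that can stall the caller.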


Version-Release number of selected component (if applicable):
libvirt-0.9.4-23.el6

How reproducible:
Sometimes

Steps to Reproduce:
It is mostly related to

virsh dump / virsh save

but actually measuring the problem is hard. It probably requires a systemtap script that instruments QEMU internals to identify long pauses in the main loop.

Actual results:


Expected results:


Additional info:

Comment 4 Jiri Denemark 2011-11-28 10:27:20 UTC
To test this, you need to run
# virsh dump --live guest guest.dump
and check whether you can still interact with the guest while this virsh dump operation is running. Checking domain's state using virsh or virt-manager. By interacting with the guest, I mean, for example, talking to it via virsh console or through an ssh connection, or even just pinging the guest's IP.
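
One rough way to turn "can I still interact with the guest" into a number is to probe the guest at a fixed interval while the dump runs and look at the longest gap between successful probes. The sketch below is only an assumption about how that could be scripted. probe is a hypothetical callable supplied by the tester (for example, a wrapper around a single ping to the guest's IP), and neither function is part of virsh or libvirt:

```python
import time


def longest_stall(timestamps):
    """Return the largest gap, in seconds, between consecutive successful probes."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return max(gaps, default=0.0)


def probe_loop(probe, duration, interval=0.1):
    """Call probe() every `interval` seconds for `duration` seconds.

    `probe` is a hypothetical callable that returns True when the guest
    answered. Returns the monotonic times of the successful probes.
    """
    successes = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        if probe():
            successes.append(time.monotonic())
        time.sleep(interval)
    return successes
```

Running probe_loop alongside virsh dump --live and comparing longest_stall between two libvirt builds gives a coarse comparison. It only measures guest responsiveness from outside, so it cannot tell why the guest stalled.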

Comment 5 Jiri Denemark 2012-02-06 14:52:25 UTC
Patches were sent upstream for review: https://www.redhat.com/archives/libvir-list/2012-February/msg00293.html

Comment 6 Jiri Denemark 2012-02-08 10:48:16 UTC
This is now fixed upstream by v0.9.10-rc2-3-gc8683f2:

commit c8683f231dd227da8540f3249d7e332ec7a75ad7
Author: Jiri Denemark <jdenemar>
Date:   Mon Feb 6 14:53:24 2012 +0100

    qemu: Always use iohelper for dumping domain core
    
    Qemu uses non-blocking I/O which doesn't play nice with regular file
    descriptors. We need to pass a pipe to qemu instead, which can easily be
    done using iohelper.
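
The idea in the commit message can be illustrated with a small sketch. This is not the libvirt_iohelper implementation. It is a stand-in written for this note, and the names HELPER and dump_via_helper are made up. A helper process drains a pipe into the regular file, so the producer (QEMU in the real setup) only ever writes to a pipe:

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the helper: copy everything from stdin to the file in argv[1].
HELPER = (
    "import shutil, sys\n"
    "with open(sys.argv[1], 'wb') as out:\n"
    "    shutil.copyfileobj(sys.stdin.buffer, out)\n"
)


def dump_via_helper(path, chunks):
    """Write `chunks` to `path` through a pipe drained by a helper process."""
    helper = subprocess.Popen(
        [sys.executable, "-c", HELPER, path], stdin=subprocess.PIPE
    )
    # The producer writes only to the pipe; the helper is the one that
    # touches the regular file and absorbs any blocking disk I/O.
    for chunk in chunks:
        helper.stdin.write(chunk)
    helper.stdin.close()  # EOF tells the helper the dump is finished
    return helper.wait()


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "guest.dump")
    status = dump_via_helper(target, [b"page-1", b"page-2"])
    with open(target, "rb") as f:
        result = f.read()
```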

Comment 9 weizhang 2012-02-14 12:27:05 UTC
I tested with
qemu-kvm-0.12.1.2-2.225.el6.x86_64
kernel-2.6.32-230.el6.x86_64
libvirt-0.9.10-1.el6.x86_64

The live dump behaviour looks similar to libvirt-0.9.4-23.el6.x86_64. I checked by running virsh domstate in a loop during the live dump: at first the guest is live, but it pauses for a short time before the dump finishes. This happens with both libvirt-0.9.10-1.el6.x86_64 and libvirt-0.9.4-23.el6.x86_64, although the pause is shorter with libvirt-0.9.10-1.el6.x86_64. Is that what we want, or should the guest stay running the whole time?

Comment 10 Jiri Denemark 2012-02-14 12:38:44 UTC
The guest will always pause for some time before live dump finishes and the length of this pause depends mainly on what the guest is doing. To be honest I'm not sure what would be the best way of testing this in case more or significantly longer stalls cannot be observed in the guest.

Comment 12 Eric Blake 2012-02-15 16:31:34 UTC
If the dump takes long enough, you can inspect ps and lsof output to see if the destination file descriptor is owned by an instance of libvirt_iohelper, with libvirt_iohelper and qemu sharing a pipe (fixed version), or whether the destination fd is directly owned by qemu (pre-fixed version).

Also, note that the fix for this bug may have caused bug 790668.

Comment 13 weizhang 2012-02-16 03:08:14 UTC
(In reply to comment #12)
> If the dump takes long enough, you can inspect ps and lsof output to see if the
> destination file descriptor is owned by an instance of libvirt_iohelper, with
> libvirt_iohelper and qemu sharing a pipe (fixed version), or whether the
> destination fd is directly owned by qemu (pre-fixed version).


Thanks for Eric's help. I tested with the old and new versions of libvirt and found the difference Eric described:

1. For libvirt-0.9.4-23.el6.x86_64, I checked the fd with lsof during a live dump:
# lsof /root/mig-rhel.dump 
COMMAND  PID USER   FD   TYPE DEVICE  SIZE/OFF    NODE NAME
libvirtd 465 root   22w   REG    8,1 210058004 1573224 /root/mig-rhel.dump
qemu-kvm 559 qemu   27w   REG    8,1 210058004 1573224 /root/mig-rhel.dump

The destination fd is directly owned by qemu

2. For libvirt-0.9.10-1.el6.x86_64
# lsof /root/mig-rhel.dump 
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
libvirt_i 4210 root    1w   REG    8,1 25165824 262271 /root/mig-rhel.dump
# lsof |grep pipe|grep libvirt_i
libvirt_i  4210      root    0r     FIFO                0,8        0t0     396315 pipe
# lsof |grep pipe|grep qemu-kvm
qemu-kvm   4552      qemu   18w     FIFO                0,8        0t0     396315 pipe
...

The destination fd is owned by libvirt_iohelper, and libvirt_iohelper and qemu share a pipe with the same node.

So verification passes on libvirt-0.9.10-1.el6.x86_64.
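
The manual lsof comparison above could be scripted along these lines. This is a sketch only. It assumes the nine-column lsof layout shown in this comment (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME), which other lsof versions or options may not produce, and the function names are invented for illustration:

```python
def pipe_nodes(lsof_output, command_prefix):
    """Return the set of pipe NODE values held by commands with this prefix."""
    nodes = set()
    for line in lsof_output.splitlines():
        fields = line.split()
        # Assumed columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        if (
            len(fields) == 9
            and fields[4] == "FIFO"
            and fields[0].startswith(command_prefix)
        ):
            nodes.add(fields[7])
    return nodes


def shares_pipe(lsof_output, helper="libvirt_i", qemu="qemu-kvm"):
    """True if the helper and QEMU hold at least one pipe with the same node."""
    return bool(pipe_nodes(lsof_output, helper) & pipe_nodes(lsof_output, qemu))


# The two pipe lines from the lsof output above.
sample = """\
libvirt_i  4210      root    0r     FIFO                0,8        0t0     396315 pipe
qemu-kvm   4552      qemu   18w     FIFO                0,8        0t0     396315 pipe
"""
fixed = shares_pipe(sample)
```

Since lsof truncates the command name to libvirt_i in the output above, the sketch matches on that prefix.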

Comment 15 errata-xmlrpc 2012-06-20 06:36:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html