Bug 1086704

Summary: Don't allow aio=native without cache=none
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.0
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Kevin Wolf <kwolf>
Assignee: Giuseppe Scrivano <gscrivan>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dyuan, jdenemar, juzhang, mikolaj, mzhan, rbalakri, sluo, srao, xuzhang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-1.2.7-1.el7
Doc Type: Bug Fix
Story Points: ---
Last Closed: 2015-03-05 07:33:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Description Kevin Wolf 2014-04-11 11:04:40 UTC
When qemu is configured with a block device that has aio=native set, but the
cache mode doesn't use O_DIRECT (i.e. isn't cache=none/directsync or any unnamed
mode with explicit cache.direct=on), then the raw-posix block driver for local
files and block devices will silently fall back to aio=threads.
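
As a rough illustration of the rule described above, the fallback can be modelled as a small predicate. This is only a sketch and not qemu source: the enum and function names are hypothetical, and it covers only the named cache modes (an explicit cache.direct=on is not modelled).

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not qemu identifiers. */
enum cache_mode {
    CACHE_NONE,
    CACHE_DIRECTSYNC,
    CACHE_WRITETHROUGH,
    CACHE_WRITEBACK,
    CACHE_UNSAFE
};

enum aio_mode { AIO_THREADS, AIO_NATIVE };

/* Of the named cache modes, only none and directsync use O_DIRECT. */
static bool cache_uses_o_direct(enum cache_mode cache)
{
    return cache == CACHE_NONE || cache == CACHE_DIRECTSYNC;
}

/* AIO mode actually in effect on the legacy interface: a request for
 * native AIO without O_DIRECT silently becomes threads. */
static enum aio_mode effective_aio(enum aio_mode requested,
                                   enum cache_mode cache)
{
    if (requested == AIO_NATIVE && !cache_uses_o_direct(cache))
        return AIO_THREADS;
    return requested;
}
```

With this model, effective_aio(AIO_NATIVE, CACHE_WRITEBACK) yields AIO_THREADS, which is the surprising case this bug is about.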

The blockdev-add interface rejects such combinations, but qemu can't change the
existing legacy interfaces that libvirt uses today, so the behaviour will
stay there. Still, in the common case such behaviour is surprising and
management should protect the user against it by rejecting or (if at all
possible on the libvirt level) warning against it.

danpb suggested on IRC that libvirt doesn't have such a check yet and I should
file a bug.

Comment 2 Sibiao Luo 2014-04-14 02:39:30 UTC
(In reply to Kevin Wolf from comment #0)
FYI: There is a similar bug, 1086502, filed against the qemu-kvm component, in which QEMU dumped core.

Best Regards,
sluo

Comment 3 Giuseppe Scrivano 2014-07-02 11:45:14 UTC
This patch seems to be enough to show a warning, and it is the solution I favour:

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 4829176..6af76ce 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3449,6 +3449,14 @@ qemuBuildDriveStr(virConnectPtr conn,
             mode = qemuDiskCacheV1TypeToString(disk->cachemode);
         }
 
+        if (disk->iomode == VIR_DOMAIN_DISK_IO_NATIVE &&
+            disk->cachemode != VIR_DOMAIN_DISK_CACHE_DIRECTSYNC &&
+            disk->cachemode != VIR_DOMAIN_DISK_CACHE_DISABLE) {
+            VIR_WARN(_("native I/O needs either no disk cache "
+                       "or directsync cache mode, QEMU will fallback "
+                       "to aio=threads"));
+        }
+
         virBufferAsprintf(&opt, ",cache=%s", mode);
     } else if (disk->shared && !disk->readonly) {
         virBufferAddLit(&opt, ",cache=off");

Is a warning enough in this case, or should VIR_WARN be promoted to virReportError so that libvirt refuses to start the VM?
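
To make the two options concrete, here is a minimal standalone sketch of the check with a switch between "warn" and "reject". It is not libvirt code: the enums, the check_native_io function and the strict flag are hypothetical, and treating an unset cache mode as acceptable is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the disk settings; not libvirt identifiers. */
enum disk_io { DISK_IO_DEFAULT, DISK_IO_NATIVE, DISK_IO_THREADS };

enum disk_cache {
    DISK_CACHE_DEFAULT,
    DISK_CACHE_NONE,
    DISK_CACHE_WRITETHROUGH,
    DISK_CACHE_WRITEBACK,
    DISK_CACHE_DIRECTSYNC,
    DISK_CACHE_UNSAFE
};

enum check_result { CHECK_OK, CHECK_WARNED, CHECK_REJECTED };

/* strict == false: log a warning and let the guest start.
 * strict == true:  log an error and refuse the configuration. */
static enum check_result check_native_io(enum disk_io io,
                                         enum disk_cache cache,
                                         bool strict, FILE *log)
{
    /* Assumption: only an explicitly set cache mode is inspected. */
    if (io != DISK_IO_NATIVE || cache == DISK_CACHE_DEFAULT)
        return CHECK_OK;

    if (cache == DISK_CACHE_NONE || cache == DISK_CACHE_DIRECTSYNC)
        return CHECK_OK;

    fprintf(log, "%s: native I/O needs either no disk cache "
                 "or directsync cache mode\n",
            strict ? "error" : "warning");
    return strict ? CHECK_REJECTED : CHECK_WARNED;
}
```

Calling check_native_io(DISK_IO_NATIVE, DISK_CACHE_WRITEBACK, false, stderr) only logs and returns CHECK_WARNED, while the same call with strict set to true returns CHECK_REJECTED, so the caller can abort the guest start.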

Comment 4 Giuseppe Scrivano 2014-07-09 16:30:07 UTC
Fixed upstream by:

commit 058384003db776c580d0e5a3016a6384e8eb7b92
Author: Giuseppe Scrivano <gscrivan>
Date:   Tue Jul 8 16:08:57 2014 +0200

    qemu: raise an error when using aio=native without cache=none
    
    Qemu will fall back to aio=threads when the cache mode doesn't use
    O_DIRECT, even if aio=native was explicitly set.
    
    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1086704
    
    Signed-off-by: Giuseppe Scrivano <gscrivan>


Fine to close this as UPSTREAM?

Comment 5 Jiri Denemark 2014-07-15 09:30:30 UTC
No, we need to make sure this is properly tested by QA.

Comment 7 Xuesong Zhang 2014-11-24 09:21:40 UTC
Tested with the following package versions; this bug is verified.

Package version:
libvirt-1.2.8-8.el7.x86_64
qemu-kvm-rhev-2.1.2-12.el7.x86_64
kernel-3.10.0-208.el7.x86_64

Testing Matrix:
1. Combine "io='native'" with "cache='writethrough'", then start the guest.
# virsh dumpxml rhel7.1
......
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough' io='native'/>
      <source dev='/dev/sdc'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>
......
# virsh start rhel7.1
error: Failed to start domain rhel7.1
error: unsupported configuration: native I/O needs either no disk cache or directsync cache mode, QEMU will fallback to aio=threads


2. Combine "io='native'" with "cache='writeback'", then start the guest.
# virsh dumpxml rhel7.1
......
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' io='native'/>
      <source dev='/dev/sdc'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>
......
# virsh start rhel7.1
error: Failed to start domain rhel7.1
error: unsupported configuration: native I/O needs either no disk cache or directsync cache mode, QEMU will fallback to aio=threads

3. Combine "io='native'" with "cache='unsafe'", then start the guest.
# virsh dumpxml rhel7.1
......
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='unsafe' io='native'/>
      <source dev='/dev/sdc'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>
......
# virsh start rhel7.1
error: Failed to start domain rhel7.1
error: unsupported configuration: native I/O needs either no disk cache or directsync cache mode, QEMU will fallback to aio=threads


4. With all other combinations of cache type and io type, the guest starts as expected. The combinations tested, for reference:
4.1 "io='native'" and "cache='default'" ----> guest can be started up as expected.
4.2 "io='native'" and "cache='none'" ----> guest can be started up as expected.
4.3 "io='native'" and "cache='directsync'" ----> guest can be started up as expected.
4.4 "io='threads'" and "cache='writethrough'" ----> guest can be started up as expected.
4.5 "io='threads'" and "cache='none'" ----> guest can be started up as expected.
4.6 "io='threads'" and "cache='directsync'" ----> guest can be started up as expected.
4.7 "io='threads'" and "cache='default'" ----> guest can be started up as expected.
4.8 "io='threads'" and "cache='writeback'" ----> guest can be started up as expected.
4.9 "io='threads'" and "cache='unsafe'" ----> guest can be started up as expected.
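
For reference, the outcomes in the matrix above can be written down as a small table-driven sketch. This is an illustration of the observed results only, not libvirt code: the start_rejected and print_matrix helpers are hypothetical, and the strings are the XML attribute values used in the tests.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Attribute values exercised in the verification above. */
static const char *const io_modes[] = { "native", "threads" };
static const char *const cache_modes[] = {
    "default", "none", "directsync", "writethrough", "writeback", "unsafe"
};

/* Hypothetical helper: true if the guest start was rejected for this
 * combination in the tests above. */
static bool start_rejected(const char *io, const char *cache)
{
    if (strcmp(io, "native") != 0)
        return false;
    return strcmp(cache, "writethrough") == 0 ||
           strcmp(cache, "writeback") == 0 ||
           strcmp(cache, "unsafe") == 0;
}

/* Prints one line per combination and returns how many were rejected. */
static int print_matrix(FILE *out)
{
    int rejected = 0;
    size_t n_io = sizeof(io_modes) / sizeof(io_modes[0]);
    size_t n_cache = sizeof(cache_modes) / sizeof(cache_modes[0]);

    for (size_t i = 0; i < n_io; i++) {
        for (size_t j = 0; j < n_cache; j++) {
            bool r = start_rejected(io_modes[i], cache_modes[j]);
            fprintf(out, "io='%s' cache='%s' -> %s\n",
                    io_modes[i], cache_modes[j],
                    r ? "rejected" : "starts");
            if (r)
                rejected++;
        }
    }
    return rejected;
}
```

Calling print_matrix(stdout) lists all twelve combinations; the three rejected ones correspond to tests 1 to 3, and the remaining nine to items 4.1 to 4.9.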

Comment 9 errata-xmlrpc 2015-03-05 07:33:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html

Comment 10 Tomas Jelinek 2015-09-09 09:09:49 UTC
*** Bug 1259398 has been marked as a duplicate of this bug. ***