Bug 994517

Summary: cache=none/O_DIRECT workaround doesn't work for images with backing files
Product: [Community] Virtualization Tools Reporter: Richard W.M. Jones <rjones>
Component: libguestfs Assignee: Richard W.M. Jones <rjones>
Status: CLOSED UPSTREAM QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: unspecified CC: acathrow, mbooth
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1003291 Environment:
Last Closed: 2013-09-01 17:59:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1003291    

Description Richard W.M. Jones 2013-08-07 12:36:25 UTC
Description of problem:

Libguestfs uses cache=none, which improves reliability in the case
of a host crash.  However, cache=none causes qemu to open the image
with O_DIRECT, and not all filesystems support this.  In particular,
tmpfs and ecryptfs don't support it[1].

To work around this, we try to open the image file with O_DIRECT
and don't use cache=none if that fails (see the 'test_cache_none'
function in src/drives.c).
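
A minimal sketch of that kind of probe, written in Python rather
than C for brevity.  The function name supports_o_direct is
hypothetical and this is not the code from src/drives.c; it only
assumes that a filesystem without O_DIRECT support rejects the
open with EINVAL ("Invalid argument").

```python
import errno
import os


def supports_o_direct(path):
    """Return True if `path` can be opened read-only with O_DIRECT."""
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        # This platform does not expose O_DIRECT at all.
        return False
    try:
        fd = os.open(path, os.O_RDONLY | o_direct)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # The filesystem refused O_DIRECT.
            return False
        # Anything else (missing file, permissions) is a real error.
        raise
    os.close(fd)
    return True
```

A caller would add cache=none to the drive parameters only when
the probe returns True.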

However, this workaround does not work when a qcow2 image file
has a backing disk or disks, and those disks are on a different
filesystem type.  libguestfs needs to check that the "top" disk
and all backing disks are located on filesystems that can do
O_DIRECT.  The error message you get from qemu is not especially
helpful either.
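
One way such a whole-chain check could look, as a sketch only.  It
assumes the chain is listed with "qemu-img info --backing-chain
--output=json" and that each entry's "filename" is a path that can
be probed directly.  The helper names and the injected probe
callable are hypothetical, not libguestfs APIs.

```python
import json
import subprocess


def qemu_img_chain_json(image):
    """Ask qemu-img for the image and all its backing files as JSON."""
    result = subprocess.run(
        ["qemu-img", "info", "--backing-chain", "--output=json", image],
        check=True, capture_output=True, text=True)
    return result.stdout


def chain_filenames(json_text):
    """Extract the file names, top disk first, from the JSON array."""
    return [entry["filename"] for entry in json.loads(json_text)]


def can_use_cache_none(filenames, probe):
    """cache=none is only safe if every file in the chain passes the probe."""
    return all(probe(name) for name in filenames)
```

Here probe is any callable that reports whether a single file can
be opened with O_DIRECT, so the decision is made over the top disk
and every backing disk rather than the top disk alone.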

This caused virt-sparsify to fail on a disk which was located in
an ecryptfs home directory.  virt-sparsify creates an overlay
in /tmp (in this case, ext4).  libguestfs checked only the overlay
file and found that /tmp supports O_DIRECT, but since ecryptfs
does not, qemu failed to open the backing file.

(Reported by librarian on IRC)

Version-Release number of selected component (if applicable):

libguestfs 1.23 (but bug present since forever)

How reproducible:

100%

Steps to Reproduce:
1. Create a backing file on a filesystem that doesn't support O_DIRECT.
2. Create a qcow2 overlay on a filesystem that does support O_DIRECT.
3. Try to open the qcow2 overlay in libguestfs.
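
The steps above can be sketched as a small script.  The directory
arguments, file names and the 100M size are placeholders; the
caller has to supply one directory on a filesystem without
O_DIRECT support and one on a filesystem with it.  The last
command is expected to fail only on affected libguestfs versions.

```python
import os
import subprocess


def reproduce_commands(no_direct_dir, direct_dir):
    """Build the three commands matching steps 1-3 (nothing is executed)."""
    backing = os.path.join(no_direct_dir, "backing.img")
    overlay = os.path.join(direct_dir, "overlay.qcow2")
    return [
        # 1. Backing file on the filesystem without O_DIRECT.
        ["qemu-img", "create", "-f", "raw", backing, "100M"],
        # 2. qcow2 overlay on the filesystem with O_DIRECT.
        ["qemu-img", "create", "-f", "qcow2",
         "-b", backing, "-F", "raw", overlay],
        # 3. Open the overlay in libguestfs and launch the appliance.
        ["guestfish", "--ro", "-a", overlay, "run"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, run_all(reproduce_commands("/home/user/Private",
"/tmp")) would match the ecryptfs case described above, assuming
those are the relevant mount points.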

Actual results:

qemu gives an error "Invalid argument" when opening the file.

Expected results:

libguestfs should detect this case and drop the cache=none
flag, allowing qemu to succeed.

It would be nice if qemu actually described the real error
rather than giving a stupid and non-actionable error message.

Additional info:

[1] http://stackoverflow.com/questions/7233977/using-direct-io-with-ecryptfs-and-similar-stackable-file-systems

Comment 1 Richard W.M. Jones 2013-09-01 17:53:58 UTC
I ditched the whole cache=none business in libguestfs >= 1.23.20:

https://github.com/libguestfs/libguestfs/commit/749e947bb0103f19feda0f29b6cbbf3cbfa350da