Bug 689416 - Storage driver should flush host cache after cloning volumes to avoid possible data loss
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-21 12:41 UTC by Daniel Berrangé
Modified: 2011-12-06 10:58 UTC (History)

Fixed In Version: libvirt-0.9.4-5.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 10:58:03 UTC
Target Upstream Version:




Links
System: Red Hat Product Errata
ID: RHBA-2011:1513
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2011-12-06 01:23:30 UTC

Description Daniel Berrangé 2011-03-21 12:41:30 UTC
Description of problem:
After reading this thread:

  http://www.mail-archive.com/qemu-devel@nongnu.org/msg59049.html

It seems that, for extra safety, libvirt's storage driver ought to call fdatasync(fd) after creating any storage volume (particularly when cloning a volume) to ensure the new volume is committed to the underlying storage. The current code appears to allow data loss in the newly created volume if the host crashes.

Alternatively, we might want to use O_DIRECT when initializing storage volumes with explicit data writes.
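
The fdatasync(fd) suggestion above can be illustrated with a minimal sketch. This is not libvirt's code: copy_and_sync() is a hypothetical helper, and the buffer size and file mode are assumptions chosen for illustration. It copies one file to another and reports success only after the destination has been flushed to stable storage.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of libvirt): copy src to dst and
 * flush dst's data to stable storage before reporting success.
 * Returns 0 on success, -1 on any failure. */
int copy_and_sync(const char *src, const char *dst)
{
    char buf[64 * 1024];   /* assumed buffer size, for illustration */
    ssize_t n;
    int ret = -1;
    int in = -1, out = -1;

    if ((in = open(src, O_RDONLY)) < 0)
        goto cleanup;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
        goto cleanup;

    while ((n = read(in, buf, sizeof(buf))) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted read: retry */
            goto cleanup;
        }
        char *p = buf;
        while (n > 0) {            /* handle short writes */
            ssize_t w = write(out, p, (size_t)n);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                goto cleanup;
            }
            p += w;
            n -= w;
        }
    }

    /* Without this call the copied data may still sit only in the
     * host page cache and be lost if the host crashes. */
    if (fdatasync(out) < 0)
        goto cleanup;
    ret = 0;

cleanup:
    if (in >= 0)
        close(in);
    if (out >= 0 && close(out) < 0)
        ret = -1;
    return ret;
}
```

A caller would treat a non-zero return as a failed clone, since a failing fdatasync() means the data cannot be assumed to have reached the disk.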

Version-Release number of selected component (if applicable):
libvirt-0.8.7-11.el6

How reproducible:
Unclear. Data may be lost from the new storage volume if the host crashes immediately after the volume has been copied.

Steps to Reproduce:
1. Create a new raw volume in libvirt, cloning data from an existing volume
2. Kill the host non-gracefully
3. See if data is still present in new volume after reboot.
  
Actual results:


Expected results:


Additional info:

Comment 2 RHEL Program Management 2011-04-04 02:06:05 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Michal Privoznik 2011-08-19 09:38:20 UTC
Moving to POST:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-August/msg00539.html

commit b4cdf3d74920c8c3dc3dbea9a467782238b9f585
Author: Michal Privoznik <mprivozn>
Date:   Thu Aug 18 14:40:03 2011 +0200

    storage: Flush host cache after write
    
    Although we are flushing cache after some critical writes (e.g.
    volume creation), after some others we do not (e.g. volume cloning).
    This patch fixes this issue for volume cloning, writing the
    header of a logical volume, and storage wipe.
    (cherry picked from commit b32f8b19894c2a05e5955056caa45cb4cb0babeb)

Comment 5 Nan Zhang 2011-08-25 03:33:27 UTC
Verified with libvirt-0.9.4-5.el6.x86_64; the issue is fixed. Moving to VERIFIED.


1) Clone a raw volume with libvirt.
# virsh vol-clone foo.img foo-clone.img --pool default
Vol foo-clone.img cloned from foo.img

# qemu-img info /var/lib/libvirt/images/foo-clone.img 
image: /var/lib/libvirt/images/foo-clone.img
file format: raw
virtual size: 6.0G (6442450944 bytes)
disk size: 6.0G

# virsh start foo-clone
Domain foo-clone started

2) In the guest, copy a file over the network, then execute the commands below to make the host crash immediately.
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger

3) After reboot, the copied file is still present in the new volume.

Comment 6 errata-xmlrpc 2011-12-06 10:58:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html

