Bug 689416

Summary: Storage driver should flush host cache after cloning volumes to avoid possible data loss
Product: Red Hat Enterprise Linux 6
Component: libvirt
Version: 6.1
Status: CLOSED ERRATA
Severity: medium
Priority: high
Reporter: Daniel Berrangé <berrange>
Assignee: Michal Privoznik <mprivozn>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dallan, dyuan, eblake, mzhan, nzhang, veillard, yoyzhang
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-0.9.4-5.el6
Doc Type: Bug Fix
Last Closed: 2011-12-06 10:58:03 UTC

Description Daniel Berrangé 2011-03-21 12:41:30 UTC
Description of problem:
After reading this thread:

  http://www.mail-archive.com/qemu-devel@nongnu.org/msg59049.html

It seems that, for extra safety, libvirt's storage driver ought to call fdatasync(fd) after creating any storage volume (particularly when cloning a volume) to ensure the new volume is committed to the underlying storage. As it stands, the current code appears to allow data in the newly created volume to be lost if the host crashes.
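
A minimal sketch of the idea, not libvirt's actual code: the helper name clone_and_sync and the 64 KiB buffer size are assumptions made for illustration. It copies one file to another and reports success only after fdatasync() has returned, so a host crash after a successful return should not lose the copied data.

```c
#define _POSIX_C_SOURCE 200809L  /* for fdatasync() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not a libvirt function): copy src_path to dst_path
 * and return 0 only once the new file's data has been flushed from the
 * host page cache. Returns -1 on any error. */
static int clone_and_sync(const char *src_path, const char *dst_path)
{
    char buf[64 * 1024];
    ssize_t nread = 0;
    int ret = -1;
    int src, dst;

    assert(src_path != NULL && dst_path != NULL);

    src = open(src_path, O_RDONLY);
    if (src < 0)
        return -1;

    dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (dst < 0) {
        close(src);
        return -1;
    }

    while ((nread = read(src, buf, sizeof(buf))) != 0) {
        if (nread < 0) {
            if (errno == EINTR)
                continue;
            goto cleanup;
        }
        /* write() may be short, so loop until the whole chunk is written. */
        for (ssize_t off = 0; off < nread; ) {
            ssize_t n = write(dst, buf + off, (size_t)(nread - off));
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                goto cleanup;
            }
            off += n;
        }
    }

    /* Without this call the copy may exist only in the host page cache. */
    if (fdatasync(dst) < 0)
        goto cleanup;

    ret = 0;

cleanup:
    close(src);
    if (close(dst) < 0)
        ret = -1;
    return ret;
}
```

A caller would treat the clone as complete only when clone_and_sync() returns 0; any other result means the new volume should not be trusted.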

Alternatively, we might want to use O_DIRECT when initializing storage volumes with explicit data writes.
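
A rough sketch of the O_DIRECT alternative, again not taken from libvirt: the helper name zero_fill_direct and the 4096-byte alignment are assumptions, and real code would need to query the alignment the device requires. O_DIRECT needs an aligned buffer and aligned transfer sizes, and some filesystems reject it. On its own it also does not give the guarantees of O_SYNC, so the sketch still calls fdatasync() at the end.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ALIGN 4096               /* assumed alignment, not queried from the device */

/* Hypothetical helper (not a libvirt function): create path and fill it with
 * len zero bytes, bypassing the host page cache. len must be a multiple of
 * ALIGN. Returns 0 on success, -1 on any error. */
static int zero_fill_direct(const char *path, size_t len)
{
    void *buf = NULL;
    int ret = -1;
    int fd;

    assert(path != NULL && len % ALIGN == 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd < 0)
        return -1;               /* e.g. the filesystem does not support O_DIRECT */

    /* O_DIRECT transfers require a suitably aligned user buffer. */
    if (posix_memalign(&buf, ALIGN, ALIGN) != 0) {
        buf = NULL;
        goto cleanup;
    }
    memset(buf, 0, ALIGN);

    for (size_t done = 0; done < len; done += ALIGN) {
        if (write(fd, buf, ALIGN) != ALIGN)
            goto cleanup;
    }

    /* O_DIRECT alone does not guarantee durability, so flush explicitly. */
    if (fdatasync(fd) < 0)
        goto cleanup;

    ret = 0;

cleanup:
    free(buf);
    if (close(fd) < 0)
        ret = -1;
    return ret;
}
```

Compared with a buffered copy followed by fdatasync(), this avoids filling the host page cache with volume data, at the cost of stricter alignment rules and less portability.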

Version-Release number of selected component (if applicable):
libvirt-0.8.7-11.el6

How reproducible:
Unclear, possible data loss from new storage volume if host crashes immediately after the volume has been copied.

Steps to Reproduce:
1. Create a new raw volume in libvirt, cloning data from an existing volume
2. Kill the host non-gracefully
3. See if data is still present in new volume after reboot.
  

Comment 2 RHEL Program Management 2011-04-04 02:06:05 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Michal Privoznik 2011-08-19 09:38:20 UTC
Moving to POST:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-August/msg00539.html

commit b4cdf3d74920c8c3dc3dbea9a467782238b9f585
Author: Michal Privoznik <mprivozn>
Date:   Thu Aug 18 14:40:03 2011 +0200

    storage: Flush host cache after write
    
    Although we are flushing cache after some critical writes (e.g.
    volume creation), after some others we do not (e.g. volume cloning).
    This patch fixes this issue for volume cloning, writing the
    header of a logical volume, and storage wipe.
    (cherry picked from commit b32f8b19894c2a05e5955056caa45cb4cb0babeb)

Comment 5 Nan Zhang 2011-08-25 03:33:27 UTC
Verified with libvirt-0.9.4-5.el6.x86_64; the issue is fixed. Moving to VERIFIED.


1) Clone a raw volume with libvirt.
# virsh vol-clone foo.img foo-clone.img --pool default
Vol foo-clone.img cloned from foo.img

# qemu-img info /var/lib/libvirt/images/foo-clone.img 
image: /var/lib/libvirt/images/foo-clone.img
file format: raw
virtual size: 6.0G (6442450944 bytes)
disk size: 6.0G

# virsh start foo-clone
Domain foo-clone started

2) In the guest, copy a file over the network, then run the commands below to crash the host immediately.
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger

3) After reboot, check the new volume: the copied file is still present.

Comment 6 errata-xmlrpc 2011-12-06 10:58:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html