Bug 1089921 - There will be file lost in guest after do blockcommit when guest with non-cached qcow2 disk as source file
Summary: There will be file lost in guest after do blockcommit when guest with non-cac...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Libvirt Maintainers
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1089924
 
Reported: 2014-04-22 09:00 UTC by Shanzhi Yu
Modified: 2014-04-23 17:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1089924
Environment:
Last Closed: 2014-04-23 12:01:20 UTC
Target Upstream Version:
Embargoed:



Description Shanzhi Yu 2014-04-22 09:00:26 UTC
Description of problem:

Files are lost in the guest after a blockcommit when the guest uses a non-cached (cache='none') qcow2 disk as its source file.

Version-Release number of selected component (if applicable):

qemu-kvm-rhev-1.5.3-60.el7ev.x86_64
libvirt-1.1.1-29.el7.x86_64


How reproducible:

100%

Steps to Reproduce:

1. Prepare a guest with a non-cached qcow2 disk as its source file

# virsh list
 Id    Name                           State
----------------------------------------------------
 17    rhel6                          running

# virsh domblklist rhel6
Target     Source
------------------------------------------------
sda        /var/lib/libvirt/images/base.img

# virsh dumpxml rhel6 | grep disk -A 4

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/base.img'/>
      <target dev='sda' bus='scsi'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>


2. Create an external disk-only snapshot, then log in to the guest and create a
file with the same name as the snapshot

# virsh snapshot-create-as rhel6 s1 --disk-only  
Domain snapshot s1 created

Log in to the guest and create file s1

[guest]# echo "hello s1" > s1

3. Repeat step 2, replacing s1 with s2, s3, and s4

# virsh snapshot-list rhel6
 Name                 Creation Time             State
------------------------------------------------------------
 s1                   2014-04-21 17:35:44 +0800 disk-snapshot
 s2                   2014-04-21 17:36:13 +0800 disk-snapshot
 s3                   2014-04-21 17:36:27 +0800 disk-snapshot
 s4                   2014-04-21 17:36:42 +0800 disk-snapshot

# virsh domblklist rhel6
Target     Source
------------------------------------------------
sda        /var/lib/libvirt/images/base.s4

Log in to the guest and check the files created in steps 2 and 3

[guest]# ll s*
-rw-r--r--. 1 root root     9 Apr 21 18:10 s1
-rw-r--r--. 1 root root     9 Apr 21 18:10 s2
-rw-r--r--. 1 root root     9 Apr 21 18:10 s3
-rw-r--r--. 1 root root     9 Apr 21 18:11 s4


4. Do a blockcommit from s2 to base, then check the disk chain

# virsh blockcommit rhel6 sda --top /var/lib/libvirt/images/base.s2  --base /var/lib/libvirt/images/base.img  --verbose --wait
Block Commit: [100 %]
Commit complete

# qemu-img info  --backing-chain /var/lib/libvirt/images/base.s4

/var/lib/libvirt/images/base.img <-- /var/lib/libvirt/images/base.s3 <-- /var/lib/libvirt/images/base.s4


5. Change the guest's source file from base.s4 to base.img

# virsh domblklist rhel6
Target     Source
------------------------------------------------
sda        /var/lib/libvirt/images/base.img

Log in to the guest and check the files created in steps 2 and 3

[guest]# ll s*
-rw-r--r--. 1 root root 0 Apr 21 18:10 s1
-rw-r--r--. 1 root root 0 Apr 21 18:10 s2

6. Change the guest's source file from base.img to base.s4,
restart the guest, then log in and check the files created in steps 2 and 3

[guest]# ll s*

ls: cannot access s3: Input/output error
-rw-r--r--. 1 root root 0 Apr 21 18:10 s1
-rw-r--r--. 1 root root 0 Apr 21 18:10 s2
srwxr-xr-x. 1 gdm  gdm  0 Apr 21 18:17 s4

7. Do a blockcommit from s3 to base, then check the disk chain

# virsh blockcommit rhel6 sda --top /var/lib/libvirt/images/base.s3 --base /var/lib/libvirt/images/base.img  --verbose --wait
Block Commit: [100 %]
Commit complete

# qemu-img info  --backing-chain /var/lib/libvirt/images/base.s4

/var/lib/libvirt/images/base.img <-- /var/lib/libvirt/images/base.s4

8. Change the guest's source file from base.s4 to base.img,
restart the guest, then log in and check the files created in steps 2 and 3

[guest]# ll s*
-rw-r--r--. 1 root root 0 Apr 21 18:10 s1
-rw-r--r--. 1 root root 0 Apr 21 18:10 s2




Actual results:

In step 5, the contents of files s1 and s2 are lost.
In step 6, file s3 is lost and file s4 has changed into a socket file.
In step 8, file s3 is lost.

Expected results:



Additional info:

base.img is a qcow2 v2 format file with RHEL 6.5 installed.

Comment 1 Jiri Denemark 2014-04-23 12:01:20 UTC
By booting the domain from base.img in step 5, you made any image based on base.img completely useless (i.e., base.s3 and base.s4 contain just garbage). This is because booting from the image changes it (and in case of ext3 even mounting the image which was not cleanly unmounted read-only would change it too).

Comment 2 Shanzhi Yu 2014-04-23 16:48:23 UTC
(In reply to Jiri Denemark from comment #1)
> By booting the domain from base.img in step 5, you made any image based on
> base.img completely useless (i.e., base.s3 and base.s4 contain just
> garbage). This is because booting from the image changes it (and in case of
> ext3 even mounting the image which was not cleanly unmounted read-only would
> change it too).

Jiri,
 
Following your explanation, I can understand that steps 6, 7, and 8 are useless here. But after committing base.s2 to base.img, should base.img include both files s1 and s2? If not, what did blockcommit really do here?

Comment 3 Eric Blake 2014-04-23 17:08:37 UTC
Visually, look at it this way, where XX represents a cluster that refers back to the parent file.

Pre-commit, you have:

base.img AA BB CC DD       # Guest saw AA BB CC DD at this point
base.s1  EE XX FF XX       # Guest saw EE BB FF DD at this point
base.s2  GG HH XX XX       # Guest saw GG HH FF DD at this point

After committing base.s2 into base.img, you have:

base.img GG HH FF DD       # base.img now contains all content from s1 and s2
base.s1  EE XX FF XX       # Reading this image would see EE HH FF DD, but
                           # that never happened - the image is now corrupt
base.s2  GG HH XX XX       # this would still read as GG HH FF DD, but since
                           # it relies on corrupt base.s1, it's risky

so after the commit, the best thing is to declare base.s1 and base.s2 as useless, and delete them.  The point of commit is to shorten the chain by modifying the base image and discarding the snapshots that are no longer needed now that the base image includes the same content.
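
The diagram above can be reproduced with a small toy model. This is only an illustrative sketch in Python, not how qemu implements commit: each image is a list of clusters, None stands in for "XX", and the helper names read_image and commit are made up for this example.

```python
# Toy model of a backing chain. None plays the role of "XX": a cluster
# that is not allocated in this image and falls through to its parent.
# read_image() and commit() are hypothetical helpers, not qemu/libvirt APIs.

def read_image(chain, top):
    """Return what a guest would see when booted from chain[top]."""
    n_clusters = len(chain[0][1])
    view = []
    for i in range(n_clusters):
        # Walk from the top image down until some layer holds the cluster.
        for _, clusters in reversed(chain[: top + 1]):
            if clusters[i] is not None:
                view.append(clusters[i])
                break
    return view


def commit(chain, top, base):
    """Fold layers base+1..top into chain[base] in place, oldest first."""
    base_clusters = chain[base][1]
    for _, clusters in chain[base + 1 : top + 1]:
        for i, data in enumerate(clusters):
            if data is not None:
                base_clusters[i] = data


chain = [
    ("base.img", ["AA", "BB", "CC", "DD"]),
    ("base.s1", ["EE", None, "FF", None]),
    ("base.s2", ["GG", "HH", None, None]),
]

before = read_image(chain, 2)      # guest view through base.s2
commit(chain, top=2, base=0)       # commit base.s2 down into base.img
after_base = read_image(chain, 0)  # base.img now matches the old s2 view
after_s1 = read_image(chain, 1)    # a state the guest never actually had

print(before)      # ['GG', 'HH', 'FF', 'DD']
print(after_base)  # ['GG', 'HH', 'FF', 'DD']
print(after_s1)    # ['EE', 'HH', 'FF', 'DD']
```

The last line shows why base.s1 should be discarded after the commit: it still reads without error, but it mixes its own clusters with clusters that were written to base.img later.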

Comment 4 Eric Blake 2014-04-23 17:20:24 UTC
One other thing to be aware of - you took a --disk-only snapshot, but without requesting --quiesce.  Reverting to that snapshot behaves the same as if you had pulled the power cord from a running machine.  If the OS had not flushed the files to disk prior to when you yank the cord (aka the time when you took the disk snapshot), then changes you made to the filesystem prior to the snapshot may not appear after reverting to that state, because they had not yet been flushed.  Remember that the state of the disk is often inconsistent (lags) in comparison to the state of the file system of a running system, where pulling the power cord abruptly may lose up to several seconds worth of changes as it rolls back to the last known safe journaling point that was actually recorded in the disk (this is intentional - if all file system operations waited for the disk to be consistent, your system would be much slower; the point of journaling filesystems is to cache in-flight file system changes for several seconds and only later catch the disk up to that state, so that a running system has better throughput, which works only as long as you can guarantee that power isn't yanked abruptly).

You probably want to ensure that the guest does sync in between creating a file and taking the snapshot, and/or use the --quiesce flag when creating the snapshot, both in order to ensure that the state of the disk at the time of the snapshot actually contains enough file system state so that reverting to your snapshot will see the files that you are creating in between snapshots.
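
The effect of flushing before a disk-only snapshot can be illustrated with a deliberately simplified model. This is a sketch in Python under strong assumptions: ToyGuestDisk is a hypothetical class, a real guest flushes data and metadata on its own schedule rather than all-or-nothing, and flush() merely stands in for running sync in the guest or for the filesystem freeze requested by --quiesce.

```python
class ToyGuestDisk:
    """Guest writes land in a cache first; only flush() reaches the disk."""

    def __init__(self):
        self.cache = {}  # pending writes that are not yet on disk
        self.disk = {}   # the only state a disk-only snapshot can capture

    def write(self, name, data):
        # The guest sees the file immediately, but the disk does not.
        self.cache[name] = data

    def flush(self):
        # Stand-in for `sync` in the guest or the freeze done by --quiesce.
        self.disk.update(self.cache)
        self.cache.clear()

    def disk_only_snapshot(self):
        # Copies the on-disk state only, like pulling the power cord.
        return dict(self.disk)


guest = ToyGuestDisk()
guest.write("s1", "hello s1")

snap_unflushed = guest.disk_only_snapshot()  # taken before any flush
guest.flush()
snap_flushed = guest.disk_only_snapshot()    # taken after the flush

print(snap_unflushed)  # {}
print(snap_flushed)    # {'s1': 'hello s1'}
```

In this model the file written before the first snapshot is missing from it entirely, while the snapshot taken after the flush contains it.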

