Bug 1772321 - qcow2 image corruption due to incorrect locking in preallocation detection
Summary: qcow2 image corruption due to incorrect locking in preallocation detection
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.1
Assignee: Kevin Wolf
QA Contact: CongLi
URL:
Whiteboard:
Depends On: 1764721
Blocks:
Reported: 2019-11-14 06:28 UTC by Tingting Mao
Modified: 2020-05-05 09:51 UTC (History)

Fixed In Version: qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb
Doc Text:
Clone Of: 1764721
Environment:
Last Closed: 2020-05-05 09:50:55 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2020:2017 | 0 | None | None | None | 2020-05-05 09:51:38 UTC

Description Tingting Mao 2019-11-14 06:28:47 UTC
+++ This bug was initially created as a clone of Bug #1764721 +++

Tested with:
qemu-kvm-4.1.0-14.module+el8.2.0+4677+51176c2e
kernel-4.18.0-148.el8


Steps:
1. Boot a guest from an image that already has an OS installed.
# /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=rhel78-64-virtio-scsi.qcow2,id=tt

2. Run 'savevm' while the guest is writing data
(guest) $ while true ; do dd if=/dev/zero of=ftest bs=1024k count=4000 ; done
(qemu) savevm foo
(qemu) quit

3. Run the test loop
# /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=rhel78-64-virtio-scsi.qcow2,id=tt
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) 
(qemu) 
(qemu) 
(qemu) savevm foo
(qemu) 
(qemu) quit 
[root@lenovo-sr630-02 test]# 
[root@lenovo-sr630-02 test]# while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=rhel78-64-virtio-scsi.qcow2,id=tt; done
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
Error: Error while deleting snapshot on device 'tt': Failed to free the cluster and L1 table: Invalid argument
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
Error: Device 'tt' does not have the requested snapshot 'foo'
(qemu) c
(qemu) stop
(qemu) savevm foo
^Cqcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with active L2 table); further corruption events will be suppressed
Error: Error while writing VM state: Input/output error
(qemu) qemu-kvm: terminating on signal 2
qemu-kvm: -drive file=rhel78-64-virtio-scsi.qcow2,id=tt: qcow2: Image is corrupt; cannot be opened read/write
^C
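
The failure signatures in the transcript above can be picked out of a saved monitor log mechanically. The following is only a minimal sketch: the function name classify_monitor_log and the three labels it prints are hypothetical, and the patterns are taken from the messages in this report, not from any documented list of QEMU errors.

```shell
# Classify a saved QEMU monitor transcript (path in $1).
# Prints "corrupt" if qcow2 marked the image corrupt, "snapshot-error" if
# any monitor command reported "Error: ...", and "clean" otherwise.
# Hypothetical helper; the patterns come from the transcript in this report.
classify_monitor_log() {
    if grep -q 'Marking image as corrupt' "$1"; then
        echo corrupt
    elif grep -q '^Error: ' "$1"; then
        echo snapshot-error
    else
        echo clean
    fi
}
```

To use it, the output of each loop iteration would be captured to a file (for example with tee) and passed to classify_monitor_log; the loop could then stop at the first result other than "clean".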

4. Check the image
# qemu-img check rhel78-64-virtio-scsi.qcow2
......
......
ERROR OFLAG_COPIED data cluster: l2_entry=4f3b0000 refcount=1
ERROR OFLAG_COPIED data cluster: l2_entry=4f3c0000 refcount=1
ERROR OFLAG_COPIED data cluster: l2_entry=4f3d0000 refcount=1
ERROR OFLAG_COPIED data cluster: l2_entry=4f3e0000 refcount=1
ERROR OFLAG_COPIED data cluster: l2_entry=4f3f0000 refcount=1
ERROR OFLAG_COPIED data cluster: l2_entry=4f400000 refcount=1

13656 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.

117027 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
149762/327680 = 45.70% allocated, 20.27% fragmented, 0.00% compressed clusters
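
For scripted runs, the exit status of 'qemu-img check' distinguishes real corruption (as in the output above) from leaked clusters alone. A minimal sketch follows; the function names describe_check_status and check_image are hypothetical, and the status meanings are the ones given in the qemu-img man page.

```shell
# Map a 'qemu-img check' exit status ($1) to a short description.
# Status meanings as documented in the qemu-img man page.
describe_check_status() {
    case $1 in
        0)  echo "consistent" ;;
        1)  echo "check did not complete (internal error)" ;;
        2)  echo "corrupted" ;;
        3)  echo "leaked clusters only" ;;
        63) echo "checks not supported by this format" ;;
        *)  echo "unknown status $1" ;;
    esac
}

# Hypothetical wrapper: run the check on image $1, print the description,
# and return the original exit status to the caller.
check_image() {
    check_status=0
    qemu-img check "$1" >/dev/null 2>&1 || check_status=$?
    describe_check_status "$check_status"
    return "$check_status"
}
```

An image like the one in step 4, which has both errors and leaked clusters, would be expected to report "corrupted" (status 2), since leaks alone are reported with status 3.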



Result:
As above, the image is corrupted.

Comment 2 Kevin Wolf 2019-11-14 08:28:38 UTC
This will be fixed with the rebase on upstream 4.2.

Comment 5 Tingting Mao 2019-11-19 03:37:16 UTC
Retested with the latest qemu-kvm build listed below. The bug no longer reproduces.


Tested with:
qemu-kvm-4.2.0-0.module+el8.2.0+4743+23ad88a2
kernel-4.18.0-148.el8.x86_64


Steps:
1. Create image file
# qemu-img create -f qcow2 -o preallocation=falloc base.qcow2 20G

2. Install guest with it
# /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt -cdrom RHEL7.7-Server-x86_64.iso

3. Shut down the guest after installation, and check the image file
# qemu-img check base.qcow2 
No errors were found on the image.
327680/327680 = 100.00% allocated, 0.00% fragmented, 0.00% compressed clusters

4. Boot guest again from the image file
# /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt

5. Run 'savevm' while 'dd' is running in the guest
(guest) $ while true ; do dd if=/dev/zero of=ftest bs=1024k count=4000 ; done
(qemu) savevm foo
(qemu) quit

6. Run the test loop
# while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt; done
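
The one-liner above runs until interrupted and does not check the image between iterations. A bounded variant that stops at the first failed check could look like the sketch below. This is only an illustration: monitor_script and run_iterations are hypothetical names, and QEMU_BIN, QEMU_IMG and RUN_SECONDS are hypothetical override variables introduced here so the binaries and the run time can be swapped out; the qemu-kvm options are copied from the command above.

```shell
# Emit the monitor commands for one iteration: restore snapshot $1, let the
# guest run for $2 seconds, then stop it, save the snapshot again and quit.
monitor_script() {
    echo "loadvm $1"
    echo c
    sleep "$2"
    echo stop
    echo "savevm $1"
    echo quit
}

# Run $2 iterations against image $1, checking the image after each one.
# Returns 1 as soon as 'qemu-img check' exits with a non-zero status.
run_iterations() {
    iter=1
    while [ "$iter" -le "$2" ]; do
        monitor_script foo "${RUN_SECONDS:-10}" |
            "${QEMU_BIN:-/usr/libexec/qemu-kvm}" -name guest -machine q35 \
                -m 4096 -monitor stdio -vnc :0 -drive "file=$1,id=tt"
        if ! "${QEMU_IMG:-qemu-img}" check "$1" >/dev/null; then
            echo "iteration $iter: qemu-img check reported a problem" >&2
            return 1
        fi
        iter=$((iter + 1))
    done
}
```

For example, 'run_iterations base.qcow2 15' would correspond to the 15 iterations reported in the result below. Note that a non-zero status from 'qemu-img check' also covers leaked clusters, so this variant is stricter than looking for corruption only.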


Result:
For step 6, the loop still works after 15 iterations, and checking the image after quitting the loop shows no errors.
QEMU 4.1.91 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
[The same banner and five-command exchange repeats unchanged for 10 more iterations.]
.......


# qemu-img check base.qcow2 
No errors were found on the image.
334340/327680 = 102.03% allocated, 2.31% fragmented, 0.00% compressed clusters
Image end offset: 21985427456
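
The cluster counts in the check output can be cross-checked with a little arithmetic. The sketch below assumes the qcow2 default cluster size of 64 KiB (the report does not state it, but it matches the 327680 total for a 20G image); the helper name pct is hypothetical.

```shell
cluster_size=65536                          # assumed: qcow2 default, 64 KiB
virtual_size=$((20 * 1024 * 1024 * 1024))   # 20G from the qemu-img create step

# Print a/b as a percentage with two decimals.
pct() { LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 100 * a / b }'; }

echo $((virtual_size / cluster_size))       # 327680 guest-visible clusters
pct 327680 327680                           # 100.00 (after installation)
pct 334340 327680                           # 102.03 (after the savevm loop)
echo $((334340 - 327680))                   # 6660 clusters beyond the virtual size
echo $((21985427456 / cluster_size))        # 335471 clusters up to the image end offset
```

The figure above 100% is not an error in itself. A likely explanation is that savevm stores the VM state in clusters that lie beyond the guest-visible disk size, so they are counted as allocated even though the denominator only covers the 20G virtual disk.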

Comment 7 Tingting Mao 2019-11-29 02:51:52 UTC
Retested with the latest qemu version 'qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb'; the issue no longer occurs.


Tested with:
qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb
kernel-4.18.0-153.el8.x86_64


Steps:
Same as in Comment 5.


Result:
Still works well after about 18 loop iterations.


# while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=rhel77-64-virtio-scsi.qcow2,id=tt; done
QEMU 4.2.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
[The same banner and five-command exchange repeats unchanged for 18 more iterations.]
QEMU 4.2.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo

Comment 8 Ademar Reis 2020-02-05 23:08:18 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 11 errata-xmlrpc 2020-05-05 09:50:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

