Description of problem:
When you use "qemu-img convert -c" to compress an image that contains a Windows installation, the new image file contains leaked clusters and errors. You can repair these errors, and afterwards "qemu-img check -r all" reports that no errors were found. However, if you then run "qemu-img convert -c" again on the repaired image, the process fails after a while ("error while reading sector ...: Input/output error").

Version-Release number of selected component (if applicable):
qemu-img-4.1.0-2.fc31.x86_64

How reproducible:
Always with Windows Server 2016 and Server 2019 images (haven't tried other Windows versions)

Steps to Reproduce:
1. Create a VM for Windows Server 2019 or 2016.
2. After the installation has completed, shut down the VM.
3. Check the image:

    # qemu-img check /var/lib/libvirt/images/win2k16.qcow2
    No errors were found on the image.
    327680/327680 = 100.00% allocated, 0.00% fragmented, 0.00% compressed clusters
    Image end offset: 21478375424

5. Shrink the image with compression (-c):

    # qemu-img convert -p -c -O qcow2 /var/lib/libvirt/images/win2k16.qcow2 /var/lib/libvirt/images/win2k16.shrink1.qcow2
    (100.00/100%)

6. Check the new image file. It contains leaked clusters and errors (for the full output, see the attached shrink1-check.log file):

    # qemu-img check /var/lib/libvirt/images/win2k16.shrink1.qcow2
    ERROR refcount block 0 refcount=5
    ERROR cluster 0 refcount=1 reference=4
    ERROR cluster 2 refcount=1 reference=5
    ERROR cluster 3 refcount=1 reference=5
    ...
    4758 errors were found on the image.
    Data may be corrupted, or further writes to the image may corrupt it.
    4757 leaked clusters were found on the image.
    This means waste of disk space, but no harm to data.
    136598/327680 = 41.69% allocated, 87.61% fragmented, 85.77% compressed clusters
    Image end offset: 4781047808

7. Repair the errors (for the full output, see the attached shrink1-repair.log file):

    # qemu-img check -r all /var/lib/libvirt/images/win2k16.shrink1.qcow2
    ...
    Leaked cluster 72950 refcount=4 reference=0
    Leaked cluster 72951 refcount=3 reference=0
    Rebuilding refcount structure
    Repairing cluster 1 refcount=1 reference=0
    Repairing cluster 2 refcount=5 reference=4
    Repairing cluster 32769 refcount=1 reference=0
    Repairing cluster 65537 refcount=1 reference=0
    Repairing OFLAG_COPIED L2 cluster: l1_index=0 l1_entry=8000000000040000 refcount=3
    Repairing OFLAG_COPIED data cluster: l2_entry=8000000000200000 refcount=4
    ...
    The following inconsistencies were found and repaired:
    4757 leaked clusters
    5694 corruptions
    Double checking the fixed image now...
    No errors were found on the image.
    136598/327680 = 41.69% allocated, 87.61% fragmented, 85.77% compressed clusters
    Image end offset: 4781047808

8. Now run "qemu-img convert" with compression (-c) on the repaired image. The process fails after a while:

    # qemu-img convert -p -c -O qcow2 /var/lib/libvirt/images/win2k16.shrink1.qcow2 /var/lib/libvirt/images/win2k16.shrink2.qcow2
    qemu-img: error while reading sector 17662464: Input/output error

Additional info:
* The problem does not seem to be related to the VirtIO Windows drivers. I could also reproduce it with a SATA disk in the VM.
* The problem does not seem to occur if you shrink the image without compression.
* The problem does not occur if RHEL is installed in the image file.
* I never had this problem on Fedora 30 (qemu-img-3.1.1-2.fc30.x86_64.rpm).
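For anyone who wants to retry this without typing each command, the steps above can be chained in a small shell function. This is only a sketch: the function name repro_compress_twice and the QEMU_IMG override variable are made up for illustration, while the qemu-img subcommands and options are the ones used in the steps above. The two check calls are not treated as fatal, since the first compressed image is expected to report errors.

```shell
#!/bin/sh
# Sketch of the reproduction steps; QEMU_IMG and the function name are
# illustrative assumptions, not part of qemu-img itself.
QEMU_IMG="${QEMU_IMG:-qemu-img}"

repro_compress_twice() {
    src=$1    # original (uncompressed) image
    out1=$2   # first compressed copy
    out2=$3   # second compressed copy, made from the repaired out1

    # Step 3: the source image should be clean before starting.
    "$QEMU_IMG" check "$src" || return 1

    # Step 5: shrink the image with compression.
    "$QEMU_IMG" convert -p -c -O qcow2 "$src" "$out1" || return 1

    # Step 6: check the compressed copy (errors are expected here,
    # so a non-zero exit status does not stop the run).
    "$QEMU_IMG" check "$out1"

    # Step 7: repair the reported errors and leaks.
    "$QEMU_IMG" check -r all "$out1"

    # Step 8: compress again; in the report this is where the
    # "Input/output error" shows up. Its exit status is returned.
    "$QEMU_IMG" convert -p -c -O qcow2 "$out1" "$out2"
}
```

After sourcing the file, an example call with the paths from this report would be: repro_compress_twice /var/lib/libvirt/images/win2k16.qcow2 /var/lib/libvirt/images/win2k16.shrink1.qcow2 /var/lib/libvirt/images/win2k16.shrink2.qcow2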
Created attachment 1628898 shrink1-check.log
Created attachment 1628899 shrink1-repair.log
Could be related to general qcow2 problems, see: https://bugzilla.redhat.com/show_bug.cgi?id=1763519
The problem I reported doesn't seem to be related to BZ#1763519: that BZ was closed after qemu-img-4.1.0-5.fc31.x86_64 was released, but I still see the problem with this version. Note that the image corruption is always reproducible here with Windows images (not Linux images) if you follow the steps I described in #c0.
There's a separate problem with compressed qcow2, which is tracked in another bug. Duping to that one; check the linked qemu update in that bug.

*** This bug has been marked as a duplicate of bug 1768541 ***