Bug 1764721
Summary: | qcow2 image corruption due to incorrect locking in preallocation detection
---|---
Product: | Red Hat Enterprise Linux Advanced Virtualization
Component: | qemu-kvm
qemu-kvm sub component: | General
Status: | CLOSED ERRATA
Severity: | unspecified
Priority: | unspecified
Version: | 8.1
Target Milestone: | rc
Target Release: | 8.1
Hardware: | Unspecified
OS: | Unspecified
Reporter: | Kevin Wolf <kwolf>
Assignee: | Virtualization Maintenance <virt-maint>
QA Contact: | Virtualization Bugs <virt-bugs>
CC: | coli, ddepaula, hreitz, juzhang, kchamart, knoel, leiyang, lhh, virt-maint, ymankad
Fixed In Version: | qemu-kvm-4.1.0-14.module+el8.1.0+4548+ed1300f4
: | 1772321
Last Closed: | 2019-11-06 12:58:42 UTC
Type: | Bug
Bug Blocks: | 1745393, 1772321
Description
Kevin Wolf
2019-10-23 15:53:25 UTC
Failed to reproduce this bug again with the command line below:

Tested with:
qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61

Steps:
1. Create the qcow2 base file
# qemu-img create -f qcow2 base.qcow2 20G

2. Install the guest with the file
# /usr/libexec/qemu-kvm \
    -name 'guest-rhel8.0' \
    -machine q35 \
    -nodefaults \
    -vga qxl \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-3,addr=0x0 \
    -drive id=drive_image1,if=none,snapshot=off,format=qcow2,file=base.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1,serial=SYSTEM_DISK0 \
    -drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=unsafe,media=cdrom,file=RHEL7.7-Server-x86_64.iso \
    -device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=0 \
    -vnc :0 \
    -monitor stdio \
    -m 4096 \
    -smp 8 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:33:6b:72:e4:b7,id=id84cDQ3,netdev=idjiXt3m,bus=pcie.0-root-port-4,addr=0x0 \
    -netdev tap,id=idjiXt3m \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/timao/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -device pcie-root-port,id=pcie.0-root-port-8,slot=8,chassis=8,addr=0x8,bus=pcie.0

3. After installation, shut down the guest and check the image.
# qemu-img check base.qcow2
No errors were found on the image.
25508/327680 = 7.78% allocated, 16.32% fragmented, 0.00% compressed clusters
Image end offset: 1672675328

# qemu-img info base.qcow2
image: base.qcow2
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: 1.56 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4. Boot again from the base image (the command line is the same as in step 2), and run 'dd' in the guest.
(guest) while true ; do dd if=/dev/zero of=test bs=1024k count=4000 ; done

5. Take an internal snapshot while 'dd' is running in the guest
(qemu) savevm foo
(qemu) quit

6. Run the loop
# while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | /usr/libexec/qemu-kvm -name 'guest-rhel8.0' -machine q35 -nodefaults -vga qxl -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-3,addr=0x0 -drive id=drive_image1,if=none,snapshot=off,format=qcow2,file=base.qcow2 -device scsi-hd,id=image1,drive=drive_image1,serial=SYSTEM_DISK0 -vnc :0 -monitor stdio -m 4096 -smp 8; done
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
……

Result:
Did not hit the error even after 20 loop iterations.

Does preallocating the test image help? It seems to be much more reproducible if I create the test image with:

$ qemu-img create -f qcow2 -o preallocation=falloc qtest.qcow2 20G

Max

Reproduced successfully with preallocation=falloc at image creation. Thanks a lot, Max.

Tested with:
qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61

1. Create the image file
# qemu-img create -f qcow2 -o preallocation=falloc base.qcow2 20G

2. Install the guest with it
# /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt -cdrom RHEL7.7-Server-x86_64.iso

3. Shut down after installation, and check the image file
# qemu-img check base.qcow2
No errors were found on the image.
327680/327680 = 100.00% allocated, 0.00% fragmented, 0.00% compressed clusters
Image end offset: 21478375424

# qemu-img info base.qcow2
image: base.qcow2
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: 24 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4. Boot the guest again from the image file
# /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt

5. Run 'savevm' while 'dd' is running in the guest
(guest) $ while true ; do dd if=/dev/zero of=ftest bs=1024k count=4000 ; done
(qemu) savevm foo
(qemu) quit

6. Run the test loop
# while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt; done
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
Error: Error while deleting snapshot on device 'tt': Failed to free the cluster and L1 table: Invalid argument ----------------------------------- Hit error here!
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
Error: Device 'tt' does not have the requested snapshot 'foo'
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
……
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) q q ^Cqemu-kvm: terminating on signal 2

Result:
As above, the error was hit. Checking the image after qemu quit shows that the image is corrupted.
# qemu-img check base.qcow2
…...
…...
Leaked cluster 361093 refcount=3 reference=2
Leaked cluster 361094 refcount=3 reference=2
Leaked cluster 361095 refcount=3 reference=2
Leaked cluster 361096 refcount=3 reference=2
Leaked cluster 367004 refcount=1 reference=0
Leaked cluster 367005 refcount=1 reference=0
Leaked cluster 367006 refcount=1 reference=0
Leaked cluster 367007 refcount=1 reference=0
Leaked cluster 367008 refcount=1 reference=0
Leaked cluster 367009 refcount=1 reference=0
Leaked cluster 367010 refcount=1 reference=0
Leaked cluster 367011 refcount=1 reference=0
Leaked cluster 367018 refcount=1 reference=0

95 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.

229436 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
326148/327680 = 99.53% allocated, 0.27% fragmented, 0.00% compressed clusters
Image end offset: 24072421376

Tried to test this bug as below:

Tested with:
qemu-kvm-4.1.0-14.module+el8.1.0+4548+ed1300f4
kernel-4.18.0-141.el8

Steps are the same as the ones in Comment 15.

Result: Did not hit the issue after 20 loop iterations.

# while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | /usr/libexec/qemu-kvm -name 'guest' -machine q35 -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt; done
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) stop
(qemu) savevm foo
(qemu) quit
……

(In reply to Tingting Mao from comment #17)
> Tried to test this bug as below:
>
> Tested with:
> qemu-kvm-4.1.0-14.module+el8.1.0+4548+ed1300f4
> kernel-4.18.0-141.el8
>
> Steps are the same as the ones in Comment 15.
>
> Result- Did not hit the issue after 20 times loop
>
> # while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo
> savevm foo ; echo quit ) | /usr/libexec/qemu-kvm -name 'guest' -machine q35
> -m 4096 -monitor stdio -vnc :0 -drive file=base.qcow2,id=tt; done
> QEMU 4.1.0 monitor - type 'help' for more information
> (qemu) loadvm foo
> (qemu) c
> (qemu) stop
> (qemu) savevm foo
> (qemu) quit
> ……

And after quitting the loop, there is no corruption in the image.

... ...
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
(qemu) c
(qemu) ^Vstop
(qemu) savevm foo
^C^C^Cq
^V^C(qemu) qemu-kvm: terminating on signal 2
QEMU 4.1.0 monitor - type 'help' for more information
(qemu) loadvm foo
^C(qemu) qemu-kvm: terminating on signal 2

# qemu-img check base.qcow2
No errors were found on the image.
334488/327680 = 102.08% allocated, 3.26% fragmented, 0.00% compressed clusters
Image end offset: 21991522304

Based on Comment 17, Comment 18 and Comment 19, set this bug as verified. Thanks all.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3732
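
The `qemu-img check` verification step used in the comments above can be scripted when the loop is rerun many times. The following is only a minimal sketch and not part of the original report: `summarize_check` is a hypothetical helper name, and it assumes the human-readable output format seen in this bug (a line of the form "N errors were found on the image." and a line of the form "N leaked clusters were found on the image."). It reads the check output on stdin, prints both counts, and returns non-zero if either count is above zero.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): summarize the text output
# of `qemu-img check`, read from stdin.
# Prints "errors=<n> leaked=<n>"; returns 1 if either count is non-zero.
summarize_check() {
    awk '
        # e.g. "95 errors were found on the image."
        /^[0-9]+ errors were found on the image/ { errors = $1 + 0 }
        # e.g. "229436 leaked clusters were found on the image."
        /^[0-9]+ leaked clusters were found on the image/ { leaked = $1 + 0 }
        END {
            # Counts that never appeared in the input are printed as 0.
            printf "errors=%d leaked=%d\n", errors, leaked
            exit ((errors > 0 || leaked > 0) ? 1 : 0)
        }
    '
}
```

A possible use would be `qemu-img check base.qcow2 | summarize_check` after each test loop. According to the qemu-img man page, `qemu-img check` also reports its result through the exit status (2 for a corrupted image, 3 for leaked clusters without corruption), so a script that does not need the counts may prefer to test that status directly instead of parsing text.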