Description of problem:
We see consistent qemu image corruption on a thinly provisioned virtIO disk on iSCSI storage when the first e_no_space event occurs. This happens on the latest qemu-kvm version, kvm-83-164.el5_5.12.

Scenario:
1) Make sure to run with 2 hosts.
2) Create a new VM with a thinly provisioned virtIO disk on iSCSI storage.
3) Start the VM, access the guest shell console (use a Fedora Live CD), and run the following dd command: dd if=/dev/zero of=/dev/vda bs=1M
4) Wait for the e_no_space event to occur and DO NOT perform lvextend (the machine will go to pause).
5) Wait several seconds and perform qemu-img check on that volume.

This is what I get:

[root@red-vdsa ~]# qemu-img check -f qcow2 /rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/7a50c918-26c0-4e76-a3e2-1e942e1c0445/7c5badea-84a6-4e4c-a5d3-7bedd6f039db
ERROR cluster 16378 refcount=1 reference=0
ERROR cluster 16379 refcount=1 reference=0
ERROR cluster 16380 refcount=1 reference=0
ERROR cluster 16381 refcount=1 reference=0
ERROR cluster 16382 refcount=1 reference=0
ERROR cluster 16383 refcount=1 reference=0
6 errors were found on the image.

Please note that running the same scenario with an IDE device results in 0 errors:

[root@red-vdsa ~]# qemu-img check -f qcow2 /rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/59240738-cb07-4229-936e-2e1618e82456/b9f70592-c1c9-4670-8abd-84965611e8e4
No errors were found on the image.
qemu command lines of both VMs:

virtIO VM qemu command:
/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -startdate 2010-06-22T00:01:40 -name virtio-live1 -smp 1,cores=1 -k en-us -m 2048 -boot dcn -net nic,vlan=1,macaddr=00:1a:4a:23:41:0c,model=virtio -net tap,vlan=1,ifname=virtio_10_1,script=no -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/7a50c918-26c0-4e76-a3e2-1e942e1c0445/7c5badea-84a6-4e4c-a5d3-7bedd6f039db,media=disk,if=virtio,cache=off,serial=76-a3e2-1e942e1c0445,boot=on,format=qcow2,werror=stop -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/1aa51232-24d2-447f-a289-64a374326e06/images/11111111-1111-1111-1111-111111111111/Fedora-13-i686-Live.iso,media=cdrom,index=2,if=ide -pidfile /var/vdsm/9c21fad9-0336-477d-9700-87818ea8dce3.pid -vnc 0:10,password -cpu qemu64,+sse2 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-4.2,serial=3063E500-66E9-38C3-BCED-094F72230F15_00:14:5e:17:cf:d4,uuid=9c21fad9-0336-477d-9700-87818ea8dce3 -vmchannel di:0200,unix:/var/vdsm/9c21fad9-0336-477d-9700-87818ea8dce3.guest.socket,server -monitor unix:/var/vdsm/9c21fad9-0336-477d-9700-87818ea8dce3.monitor.socket,server

IDE VM qemu command:
/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -startdate 2010-06-22T00:27:14 -name virtio-live2 -smp 1,cores=1 -k en-us -m 2048 -boot dcn -net nic,vlan=1,macaddr=00:1a:4a:23:41:0d,model=virtio -net tap,vlan=1,ifname=virtio_11_1,script=no -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/59240738-cb07-4229-936e-2e1618e82456/b9f70592-c1c9-4670-8abd-84965611e8e4,media=disk,if=ide,cache=off,index=0,serial=29-936e-2e1618e82456,boot=off,format=qcow2,werror=stop -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/1aa51232-24d2-447f-a289-64a374326e06/images/11111111-1111-1111-1111-111111111111/Fedora-13-i686-Live.iso,media=cdrom,index=2,if=ide -pidfile /var/vdsm/e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1.pid -vnc 0:11,password -cpu qemu64,+sse2 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-4.2,serial=3063E500-66E9-38C3-BCED-094F72230F15_00:14:5e:17:cf:d4,uuid=e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1 -vmchannel di:0200,unix:/var/vdsm/e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1.guest.socket,server -monitor unix:/var/vdsm/e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1.monitor.socket,server

Version-Release number of selected component (if applicable):
List of packages:
kvm-83-164.el5_5.12
kvm-qemu-img-83-164.el5_5.12
lvm2-2.02.56-8.el5_5.4
vdsm22-4.5-62.el5rhev
iscsi-initiator-utils-6.2.0.871-0.16.el5

How reproducible: always
Please provide stdio of the VM as well as the repro steps (configure a VG of size X, etc.). Also test on FC, just to rule out iSCSI strangeness.
Is this only about the qemu-img check messages, or can you see corruption inside the guest? The qemu-img check messages are about leaked clusters; they are harmless and do not indicate corruption.
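For context, the distinction drawn here can be illustrated with a toy classifier (a sketch only; classify_cluster is a hypothetical helper, not part of qemu-img): a cluster whose refcount exceeds the number of references found in the image's L2 tables is merely leaked (wasted space), whereas a cluster referenced more times than its refcount would be real corruption.

```shell
# Toy illustration (not qemu code) of how qemu-img check classifies a
# cluster by comparing its refcount with the number of references found
# while walking the L1/L2 tables. "refcount=1 reference=0" -> leaked.
classify_cluster() {
    refcount=$1
    reference=$2
    if [ "$refcount" -gt "$reference" ]; then
        echo "leaked"    # allocated but unreferenced: wasted space only
    elif [ "$refcount" -lt "$reference" ]; then
        echo "corrupt"   # referenced but under-counted: real damage
    else
        echo "ok"
    fi
}

classify_cluster 1 0   # the case reported above: prints "leaked"
```

This matches the output seen in this bug: every reported cluster has refcount=1, reference=0, i.e. a leak rather than data loss.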
Can we close the bug?
It seems we need to make qemu-img check friendlier, and even have it return a well-known exit code: 0 == ok, 1 == cluster leak (safe), -X == corruption of type X.
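A management-side caller following this convention might look like the sketch below. Note that the exit-code values are the proposal in this comment, not the behaviour of qemu-img in kvm-83, and handle_check_status is a hypothetical name; also, POSIX shells only see exit statuses 0-255, so in practice a small positive range for corruption types may be easier to consume than negative codes.

```shell
# Sketch of a handler for the proposed qemu-img check exit-code scheme
# (0 == ok, 1 == leaked clusters but safe, anything else == corruption).
# These values are the proposal above, not current qemu-img behaviour.
handle_check_status() {
    case "$1" in
        0) echo "image clean" ;;
        1) echo "leaked clusters: wasted space, data safe" ;;
        *) echo "image corrupted (code $1)" ;;
    esac
}

# Intended use, roughly:
#   qemu-img check -f qcow2 "$IMAGE"; handle_check_status $?
```

With such codes, a tool like VDSM could distinguish "pause happened but the image is fine" from genuine corruption without parsing human-readable output.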
*** Bug 555281 has been marked as a duplicate of this bug. ***
Reproduced this bug with kvm-83-164.el5_5.12: errors were found on the iSCSI image when no-space occurred during guest installation with a virtio blk disk.

# qemu-img check -f qcow2 /dev/vgtest/lvtest
ERROR cluster 49982 refcount=1 reference=0
ERROR cluster 49983 refcount=1 reference=0
ERROR cluster 49984 refcount=1 reference=0
ERROR cluster 49985 refcount=1 reference=0
ERROR cluster 49986 refcount=1 reference=0
ERROR cluster 49987 refcount=1 reference=0
ERROR cluster 49988 refcount=1 reference=0
ERROR cluster 49989 refcount=1 reference=0
ERROR cluster 49990 refcount=1 reference=0
ERROR cluster 49991 refcount=1 reference=0
ERROR cluster 49992 refcount=1 reference=0
ERROR cluster 49993 refcount=1 reference=0
12 errors were found on the image.

Verified this bug with kvm-83-207.el5: there are no errors, only warning messages, when following the same reproduction steps.

# qemu-img check -f qcow2 /dev/vgtest1/lvtest1
Leaked cluster 71748 refcount=1 reference=0
Leaked cluster 71749 refcount=1 reference=0
Leaked cluster 71750 refcount=1 reference=0
Leaked cluster 71751 refcount=1 reference=0
Leaked cluster 71752 refcount=1 reference=0
Leaked cluster 71753 refcount=1 reference=0
Leaked cluster 71754 refcount=1 reference=0
Leaked cluster 71755 refcount=1 reference=0
Leaked cluster 71756 refcount=1 reference=0
Leaked cluster 71757 refcount=1 reference=0
Leaked cluster 71758 refcount=1 reference=0
Leaked cluster 71759 refcount=1 reference=0
12 leaked clusters were found on the image.

This means wasted disk space, but no harm to data. This bug has been resolved, so changing status to VERIFIED.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0028.html