Bug 606651 - [kvm] qemu image check returns cluster errors when using virtIO block (thinly provisioned) during e_no_space events (along with EIO errors)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5.z
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: Kevin Wolf
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 555281
Depends On:
Blocks: Rhel5KvmTier1 618206
 
Reported: 2010-06-22 07:33 UTC by Haim
Modified: 2018-11-14 18:48 UTC
CC List: 19 users

Fixed In Version: kvm-83-192.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 612164
Environment:
Last Closed: 2011-01-13 23:36:22 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System:       Red Hat Product Errata
ID:           RHSA-2011:0028
Private:      0
Priority:     normal
Status:       SHIPPED_LIVE
Summary:      Low: kvm security and bug fix update
Last Updated: 2011-01-13 11:03:39 UTC

Description Haim 2010-06-22 07:33:21 UTC
Description of problem:

We have consistent qemu image corruption that occurs on a thinly provisioned virtIO disk on iSCSI storage when the first e_no_space event occurs.
This happens on the latest qemu-kvm version, kvm-83-164.el5_5.12.

Scenario:

1) Make sure to run with 2 hosts.
2) Create a new VM with a virtIO disk (thinly provisioned) on iSCSI storage.
3) Start the VM, access the guest shell console (use a Fedora Live CD) and run
   the following dd command: dd if=/dev/zero of=/dev/vda bs=1M
4) Wait for the e_no_space event to occur and DO NOT perform lvextend (the
   machine will pause).
5) Wait several seconds and run qemu-img check on that volume.

this is what I get: 

[root@red-vdsa ~]# qemu-img check -f qcow2 /rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/7a50c918-26c0-4e76-a3e2-1e942e1c0445/7c5badea-84a6-4e4c-a5d3-7bedd6f039db 
ERROR cluster 16378 refcount=1 reference=0
ERROR cluster 16379 refcount=1 reference=0
ERROR cluster 16380 refcount=1 reference=0
ERROR cluster 16381 refcount=1 reference=0
ERROR cluster 16382 refcount=1 reference=0
ERROR cluster 16383 refcount=1 reference=0
6 errors were found on the image.

Please note that running the same scenario with an IDE device results in 0 errors.

[root@red-vdsa ~]# qemu-img check -f qcow2 /rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/59240738-cb07-4229-936e-2e1618e82456/b9f70592-c1c9-4670-8abd-84965611e8e4
No errors were found on the image.

qemu command lines of both VMs:

virtIO VM qemu command:

/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -startdate 2010-06-22T00:01:40 -name virtio-live1 -smp 1,cores=1 -k en-us -m 2048 -boot dcn -net nic,vlan=1,macaddr=00:1a:4a:23:41:0c,model=virtio -net tap,vlan=1,ifname=virtio_10_1,script=no -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/7a50c918-26c0-4e76-a3e2-1e942e1c0445/7c5badea-84a6-4e4c-a5d3-7bedd6f039db,media=disk,if=virtio,cache=off,serial=76-a3e2-1e942e1c0445,boot=on,format=qcow2,werror=stop -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/1aa51232-24d2-447f-a289-64a374326e06/images/11111111-1111-1111-1111-111111111111/Fedora-13-i686-Live.iso,media=cdrom,index=2,if=ide -pidfile /var/vdsm/9c21fad9-0336-477d-9700-87818ea8dce3.pid -vnc 0:10,password -cpu qemu64,+sse2 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-4.2,serial=3063E500-66E9-38C3-BCED-094F72230F15_00:14:5e:17:cf:d4,uuid=9c21fad9-0336-477d-9700-87818ea8dce3 -vmchannel di:0200,unix:/var/vdsm/9c21fad9-0336-477d-9700-87818ea8dce3.guest.socket,server -monitor unix:/var/vdsm/9c21fad9-0336-477d-9700-87818ea8dce3.monitor.socket,server

IDE VM qemu command:

/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -startdate 2010-06-22T00:27:14 -name virtio-live2 -smp 1,cores=1 -k en-us -m 2048 -boot dcn -net nic,vlan=1,macaddr=00:1a:4a:23:41:0d,model=virtio -net tap,vlan=1,ifname=virtio_11_1,script=no -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/97e567fe-5bf6-4b66-8be2-617bae8995af/images/59240738-cb07-4229-936e-2e1618e82456/b9f70592-c1c9-4670-8abd-84965611e8e4,media=disk,if=ide,cache=off,index=0,serial=29-936e-2e1618e82456,boot=off,format=qcow2,werror=stop -drive file=/rhev/data-center/a7edd3bc-d9cb-4e52-b319-2768775f7067/1aa51232-24d2-447f-a289-64a374326e06/images/11111111-1111-1111-1111-111111111111/Fedora-13-i686-Live.iso,media=cdrom,index=2,if=ide -pidfile /var/vdsm/e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1.pid -vnc 0:11,password -cpu qemu64,+sse2 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-4.2,serial=3063E500-66E9-38C3-BCED-094F72230F15_00:14:5e:17:cf:d4,uuid=e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1 -vmchannel di:0200,unix:/var/vdsm/e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1.guest.socket,server -monitor unix:/var/vdsm/e76c7ce9-b4a6-4d98-b9a7-8d1eb5b75ae1.monitor.socket,server
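
The difference relevant to this bug between the two command lines above is the -drive interface (if=virtio vs. if=ide); both use cache=off and werror=stop, and werror=stop is what pauses the guest when e_no_space hits. A trimmed-down sketch of the virtIO invocation, with placeholder paths and VNC display rather than the ones above:

/usr/libexec/qemu-kvm -m 2048 -smp 1 -M rhel5.5.0 \
    -drive file=/path/to/thin-volume,media=disk,if=virtio,cache=off,boot=on,format=qcow2,werror=stop \
    -drive file=/path/to/Fedora-13-i686-Live.iso,media=cdrom,if=ide \
    -boot d -vnc :10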

Version-Release number of selected component (if applicable): 

list of packages: 
kvm-83-164.el5_5.12
kvm-qemu-img-83-164.el5_5.12
lvm2-2.02.56-8.el5_5.4
vdsm22-4.5-62.el5rhev
iscsi-initiator-utils-6.2.0.871-0.16.el5

How reproducible: always
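
For reference, a condensed host-side sketch of the reproduction; the VG/LV names and sizes below are placeholders, not the ones from this setup:

# create a small LV and put a qcow2 image with a larger virtual size on it,
# so the backing storage fills up before the guest's virtual disk does
lvcreate -L 1G -n lvthin vgtest
qemu-img create -f qcow2 /dev/vgtest/lvthin 10G

# boot the guest from this volume with if=virtio,cache=off,werror=stop
# (see the command lines above), then inside the guest:
dd if=/dev/zero of=/dev/vda bs=1M

# when the e_no_space event pauses the guest, do NOT lvextend the volume;
# wait a few seconds and check the image on the host:
qemu-img check -f qcow2 /dev/vgtest/lvthin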

Comment 1 Yaniv Kaul 2010-06-22 07:56:53 UTC
Please provide stdio of the VM as well as the repro steps (configure a VG of size X, etc.).
Also test on FC, just to rule out iSCSI strangeness.

Comment 4 Kevin Wolf 2010-06-22 12:16:05 UTC
Is this only about the qemu-img check messages or can you see corruption inside the guest?

The qemu-img check messages are about leaked clusters; they are harmless and do not indicate corruption.

Comment 6 Dor Laor 2010-06-23 13:06:12 UTC
Can we close the bug?

Comment 8 Dor Laor 2010-06-24 08:32:22 UTC
Seems like we need to make qemu-img check friendlier and even return a known error code: 0 == OK, 1 == cluster leak (safe), -X == corruption of type X.
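
A sketch of how a management script could act on such exit codes; this reflects the proposal only, not the behaviour of the qemu-img shipped at the time, and $IMAGE is a placeholder:

qemu-img check -f qcow2 "$IMAGE"
case $? in
    0) echo "image is clean" ;;
    1) echo "leaked clusters only: wasted space, data is safe" ;;
    *) echo "image corruption detected" >&2; exit 1 ;;
esac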

Comment 12 Kevin Wolf 2010-07-15 11:54:43 UTC
*** Bug 555281 has been marked as a duplicate of this bug. ***

Comment 16 Shirley Zhou 2010-11-03 05:11:57 UTC
Reproduced this bug with kvm-83-164.el5_5.12: errors were found on the iSCSI image when the no-space condition occurred during guest installation with a virtio-blk disk.
# qemu-img check -f qcow2 /dev/vgtest/lvtest 
ERROR cluster 49982 refcount=1 reference=0
ERROR cluster 49983 refcount=1 reference=0
ERROR cluster 49984 refcount=1 reference=0
ERROR cluster 49985 refcount=1 reference=0
ERROR cluster 49986 refcount=1 reference=0
ERROR cluster 49987 refcount=1 reference=0
ERROR cluster 49988 refcount=1 reference=0
ERROR cluster 49989 refcount=1 reference=0
ERROR cluster 49990 refcount=1 reference=0
ERROR cluster 49991 refcount=1 reference=0
ERROR cluster 49992 refcount=1 reference=0
ERROR cluster 49993 refcount=1 reference=0
12 errors were found on the image.

Verified this bug with kvm-83-207.el5: there are no errors, only warning messages, when following the same reproduction steps.

# qemu-img check -f qcow2 /dev/vgtest1/lvtest1
Leaked cluster 71748 refcount=1 reference=0
Leaked cluster 71749 refcount=1 reference=0
Leaked cluster 71750 refcount=1 reference=0
Leaked cluster 71751 refcount=1 reference=0
Leaked cluster 71752 refcount=1 reference=0
Leaked cluster 71753 refcount=1 reference=0
Leaked cluster 71754 refcount=1 reference=0
Leaked cluster 71755 refcount=1 reference=0
Leaked cluster 71756 refcount=1 reference=0
Leaked cluster 71757 refcount=1 reference=0
Leaked cluster 71758 refcount=1 reference=0
Leaked cluster 71759 refcount=1 reference=0

12 leaked clusters were found on the image.
This means wasted disk space, but no harm to data.
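
As a side note, not required for verification here, the wasted space can be reclaimed by copying the image, since qemu-img convert only writes clusters that are actually referenced (the destination path is a placeholder):

qemu-img convert -f qcow2 -O qcow2 /dev/vgtest1/lvtest1 /tmp/lvtest1-compacted.qcow2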

So this bug has been resolved; changing status to VERIFIED.

Comment 18 errata-xmlrpc 2011-01-13 23:36:22 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0028.html

