Bug 1023874

Summary: [Regression] Prompt error of trigger blkdebug: BLKDBG_CLUSTER_FREE event is not the same as expected
Product: Red Hat Enterprise Linux 6 Reporter: Qunfang Zhang <qzhang>
Component: qemu-kvm Assignee: Hanna Czenczek <hreitz>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: high    
Version: 6.5 CC: areis, bsarathy, chayang, hhuang, hreitz, juzhang, kwolf, mazhang, michen, mkenneth, rbalakri, sluo, virt-maint
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.425.el6 Doc Type: Bug Fix
Last Closed: 2014-10-14 06:53:50 UTC Type: Bug
Bug Blocks: 1056252    

Description Qunfang Zhang 2013-10-28 07:55:52 UTC
Description of problem:
While re-verifying bug 796011, I found that it reproduces again on qemu-kvm-415. Since that bug had been verified as fixed on qemu-kvm-375, I downgraded qemu-kvm and found that the regression was introduced in qemu-kvm-391, the build that fixed bug 848070 ([RHEL 6.5] Add glusterfs support to qemu).

qemu-kvm-390: pass
qemu-kvm-391: fail
qemu-kvm-415: fail


Version-Release number of selected component (if applicable):
kernel-2.6.32-425.el6.x86_64
qemu-kvm-0.12.1.2-2.415.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a blkdebug configuration file and store it at a known location.
# cat /home/blkdebug.cfg
[inject-error]
event = "cluster_free"
#errno = "5"
errno = "28"
immediately = "off"

2. Create a qcow2 image with a small cluster size.

# qemu-img create -f qcow2 -o cluster_size=512 disk.qcow2 10G

3. Create internal snapshot. 
# qemu-img snapshot -c disk_snap disk.qcow2

4. List the internal snapshot:

# qemu-img snapshot -l disk.qcow2 
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         disk_snap                 0 2013-10-28 15:47:26   00:00:00.000

5. Trigger the blkdebug BLKDBG_CLUSTER_FREE event by deleting the snapshot through blkdebug; the injected error should be reported.

# qemu-img snapshot -d disk_snap blkdebug:blkdebug.cfg:disk.qcow2 
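
For repeated runs (for example when comparing qemu-kvm builds), the steps above can be wrapped in a small script. This is only a sketch: the function names write_blkdebug_cfg and reproduce are made up for illustration, the commented-out errno line from step 1 is omitted, and the qemu-img invocations are the ones from steps 2 to 5. Nothing runs until reproduce is called.

```shell
#!/bin/sh
# Sketch of the reproduction steps as shell functions.
# Function names are illustrative and not part of qemu-kvm.

# Step 1: write the blkdebug configuration, which injects errno 28
# (ENOSPC on Linux) on the cluster_free event.
write_blkdebug_cfg() {
    cat > "$1" <<'EOF'
[inject-error]
event = "cluster_free"
errno = "28"
immediately = "off"
EOF
}

# Steps 2-5: create the image, take an internal snapshot, list it,
# then delete it through blkdebug. The last command is expected to
# fail; its output (stdout and stderr) is what the bug is about.
reproduce() {
    cfg=$1
    img=$2
    write_blkdebug_cfg "$cfg" &&
    qemu-img create -f qcow2 -o cluster_size=512 "$img" 10G &&
    qemu-img snapshot -c disk_snap "$img" &&
    qemu-img snapshot -l "$img" &&
    qemu-img snapshot -d disk_snap "blkdebug:$cfg:$img" 2>&1
}
```

After sourcing the script, running "reproduce blkdebug.cfg disk.qcow2" in an empty directory mirrors the file names used in the steps.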


Actual results:
After step 5:
# qemu-img snapshot -d disk_snap blkdebug:blkdebug.cfg:disk.qcow2 
Could not delete snapshot 'disk_snap': -95 (Operation not supported)


Expected results:
# qemu-img snapshot -d disk_snap blkdebug:blkdebug.cfg:disk.qcow2 
qcow2_free_clusters failed: No space left on device
Could not delete snapshot 'disk_snap': -28 (No space left on device)
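
The pass/fail criterion here is the errno in the final line: 28 (ENOSPC on Linux, the value injected by the blkdebug config) is expected, while 95 (EOPNOTSUPP on Linux) is what the regression produces. A minimal sketch of an automated check follows; the helper names snapshot_errno and check_injected_errno are hypothetical, and the pattern assumes the message format shown above.

```shell
# Extract the numeric errno from qemu-img output containing a line like
#   Could not delete snapshot 'disk_snap': -28 (No space left on device)
# Prints the errno without the sign, or nothing if no line matches.
snapshot_errno() {
    printf '%s\n' "$1" |
        sed -n "s/^Could not delete snapshot '[^']*': -\([0-9][0-9]*\) (.*)\$/\1/p"
}

# Succeeds only if the reported errno equals the injected one.
# Arguments: expected errno, qemu-img output.
check_injected_errno() {
    expected=$1
    output=$2
    [ "$(snapshot_errno "$output")" = "$expected" ]
}
```

With the output of step 5 captured in a variable, "check_injected_errno 28 "$output"" returns success for the expected result and failure for the actual result reported here.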


Additional info:

Comment 1 Kevin Wolf 2013-10-28 14:32:35 UTC
I bisected this to downstream commit f0d6b82f ('block: Produce zeros when
protocols reading beyond end of file'). As it's not completely obvious,
understanding how this change broke the error message would require some more
investigation.

In any case, this is about internal snapshots, which we consider unsupported,
and a rather minor impact (reporting "Operation not supported" instead of the real
error). While this is a regression technically, I don't think there's a reason
to treat this as a blocker as long as no customer complains.

I recommend moving this bug to 6.6.

(Also CCing Max who did some upstream fixes related to the patch in question)

Comment 7 Hanna Czenczek 2014-03-21 03:27:30 UTC
I personally find it much more interesting that I'm not able to use blkdebug on top of qcow2 at all – qemu-io always returns EIO for me and the BDS chain does not seem to involve qcow2 at all (which explains the ENOTSUP for deleting the snapshot).
I tracked the problem to bdrv_check_byte_request(), which checks the request against the length of the block device. bdrv_getlength() always returns 0 for blkdebug, which is not surprising considering that blkdebug does not implement that function downstream. Backporting e1302255 fixes this, so qemu-io can finally read successfully from a qcow2 image through blkdebug.
This fixes the problem described in this bugzilla as well – I guess format detection simply did not work with blkdebug always reporting an empty image file and subsequently snapshot operations could not be executed.
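
To illustrate why a reported length of 0 makes every read fail, here is a simplified model of a byte-range check against the device length. It is not the QEMU code: the function name model_check_byte_request is hypothetical, and the exact conditions in bdrv_check_byte_request() may differ. It only shows the arithmetic, with -5 standing for -EIO on Linux.

```shell
# Simplified model of a byte-range check against a device length.
# Arguments: length offset size (all in bytes).
# Prints 0 if the request fits inside the device, -5 (-EIO) otherwise.
model_check_byte_request() {
    len=$1
    offset=$2
    size=$3
    if [ "$offset" -lt 0 ] || [ "$offset" -gt "$len" ] ||
       [ $((len - offset)) -lt "$size" ]; then
        echo -5
    else
        echo 0
    fi
}
```

With a length of 0, as reported for blkdebug before the backport, "model_check_byte_request 0 0 512" prints -5. With any length that covers the request, for example an arbitrary 262144 bytes, the same read prints 0.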

Comment 8 Miroslav Rezanina 2014-04-29 06:01:27 UTC
Fix included in qemu-kvm-0.12.1.2-2.425.el6

Comment 10 Qunfang Zhang 2014-06-25 11:18:29 UTC
Verified as passing on qemu-kvm-0.12.1.2-2.428.el6.x86_64 using the same steps as in comment 0.

Result:

After step 5:

[root@localhost qzhang]# qemu-img snapshot -d disk_snap blkdebug:blkdebug.cfg:disk.qcow2 
qcow2_free_clusters failed: No space left on device
Could not delete snapshot 'disk_snap': -28 (No space left on device)

So, this bug is fixed.

Comment 11 errata-xmlrpc 2014-10-14 06:53:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1490.html