Bug 1114793

Summary:	drive-mirror with "mode":"existing" fails poorly if destination is not large enough
Product:	[Fedora] Fedora	Reporter:	Eric Blake <eblake>
Component:	qemu	Assignee:	Fedora Virtualization Maintainers <virt-maint>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	24	CC:	amit.shah, berrange, cfergeau, crobinso, dwmw2, itamar, pbonzini, rjones, scottt.tw, virt-maint
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Clones:	1115572 (view as bug list)		Environment:
Last Closed:	2016-05-02 20:44:28 UTC	Type:	Bug
Bug Blocks:	1115572

Description Eric Blake 2014-07-01 03:53:22 UTC

Description of problem:
https://lists.gnu.org/archive/html/qemu-devel/2014-06/msg07377.html
I tested on F20 with fedora-virt-preview, but suspect RHEL/RHEV may benefit from cloning this bug.  It would be nice when doing a diskcopy into an existing file if qemu would automatically resize the destination to be large enough, or at a bare minimum fail up front if the size is wrong. But the current behavior is to silently and successfully start the job, then fail when the destination is out of space; if management misses the 'BLOCK_JOB_COMPLETED with error' event, there is NO indication that the job failed or why.
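Until qemu performs this check itself, a management script could approximate the "fail up front" behavior before starting the copy. The following is only a minimal sketch: dest_large_enough is a hypothetical helper name, not part of qemu or libvirt, and it assumes a raw destination file whose on-disk length must be at least the guest-visible disk size in bytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of qemu or libvirt): succeed only if the
# destination file is at least as long as the required number of bytes.
# Assumes a raw destination, where file length equals usable capacity.
dest_large_enough() {
    required=$1
    dest=$2
    # wc -c prints the file length in bytes; tr strips the padding
    # that some wc implementations add around the number.
    actual=$(wc -c < "$dest" | tr -d ' ')
    [ "$actual" -ge "$required" ]
}
```

In the reproducer below, the guest disk is 10M, so a caller could run `dest_large_enough $((10 * 1024 * 1024)) copy.img || exit 1` before `virsh blockcopy`. The empty file created by `touch copy.img` would fail this check immediately. The required size can also be read from the "virtual-size" field of `qemu-img info --output=json snap2.img` rather than hard-coded.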

Version-Release number of selected component (if applicable):
qemu-kvm-2.0.0-7.fc20.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Run this script:
#!/bin/sh
cd /tmp

rm -f base.img snap1.img snap2.img copy.img
virsh destroy testvm1 2>/dev/null

# base.img <- snap1.img <- snap2.img
qemu-img create -f raw base.img 10M
qemu-img create -f qcow2 -b base.img -o backing_fmt=raw snap1.img
qemu-img create -f qcow2 -b snap1.img -o backing_fmt=qcow2 snap2.img
# set up blank space to hold the copy
touch copy.img
# cp base.img copy.img # uncomment this to see expected results

virsh create /dev/stdin <<EOF
<domain type='kvm'>
 <name>testvm1</name>
 <memory unit='MiB'>256</memory>
 <vcpu>1</vcpu>
 <os>
   <type arch='x86_64'>hvm</type>
 </os>
 <devices>
   <disk type='file' device='disk'>
     <driver name='qemu' type='qcow2'/>
     <source file='$PWD/snap2.img'/>
     <target dev='vda' bus='virtio'/>
   </disk>
   <graphics type='vnc'/>
 </devices>
</domain>
EOF

# check for events
virsh event testvm1 block-job --loop --timeout 10 &
pid=$!
sleep 1
# run the blockcopy
virsh blockcopy testvm1 vda --wait --verbose --raw /tmp/copy.img --reuse-external
echo job started
sleep 5
virsh blockjob testvm1 vda --abort
wait $pid


Actual results:
Block Copy: [  0 %]event 'block-job' for domain testvm1: Block Copy for /tmp/snap2.img failed

Now in mirroring phase
job started
event loop timed out
events received: 1

error: Requested operation is not valid: No active operation on device: drive-virtio-disk0



Expected results:
Block Copy: [  0 %]event 'block-job' for domain testvm1: Block Copy for /tmp/snap2.img ready
Block Copy: [100 %]
Now in mirroring phase
job started
event 'block-job' for domain testvm1: Block Copy for /tmp/snap2.img completed

event loop timed out
events received: 2


Additional info:

Comment 1 Cole Robinson 2014-07-02 14:40:19 UTC

Fedora qemu bugs have much less visibility than those filed against RHEL. Since your mention of this issue on the mailing list didn't get a response yet, I'd suggest cloning or fully moving this issue to RHEL where resources are more likely to be allocated.

Comment 2 Jaroslav Reznik 2015-03-03 16:05:23 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 3 Cole Robinson 2016-05-02 20:44:28 UTC

According to https://bugzilla.redhat.com/show_bug.cgi?id=1114962#c6 this is fixed with latest qemu, so closing against f24