Bug 1092117

Summary:	live incremental migration of vm with common shared base, size(disk) > size(base) transfers unallocated sectors, explodes disk on dest
Product:	Red Hat Enterprise Linux 6	Reporter:	Chris Buben <cbuben>
Component:	qemu-kvm	Assignee:	Kevin Wolf <kwolf>
Status:	CLOSED ERRATA	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	high	Docs Contact:
Priority:	high
Version:	6.5	CC:	areis, brandon_nolte, bsarathy, cbuben, chayang, dornelas, jamills, jen, jhunsaker, juzhang, knoel, kwolf, lmiksik, lyarwood, mazhang, michen, mkenneth, mrezanin, pbonzini, qzhang, rbalakri, shu, virt-maint
Target Milestone:	rc	Keywords:	Regression, ZStream
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	qemu-kvm-0.12.1.2-2.431.el6	Doc Type:	Bug Fix
Doc Text:	In certain scenarios, when performing live incremental migration, the disk size could be expanded considerably due to the transfer of unallocated sectors past the end of the base image. With this update, the bdrv_is_allocated() function has been fixed to no longer return "True" for unallocated sectors, and the disk size no longer changes after performing live incremental migration.	Story Points:	---
Clone Of:
Clones:	1109715 1110681 1130582		Environment:
Last Closed:	2014-10-14 06:58:26 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1109715, 1110681

Description Chris Buben 2014-04-28 19:00:24 UTC

Description of problem:

Live incremental migration of an instance with a common shared base (qcow2), where the top-level disk in the chain (qcow2) is larger than the shared base, appears to transfer unallocated sectors past the end of the base.  The disk on the destination side is therefore "exploded", and does not preserve the relative sparsity of the source image.

The version of qemu-kvm in EL 6.4 preserved unallocated sectors during migration in this scenario; the latest release of qemu-kvm does not.

Version-Release number of selected component (if applicable):

0.12.1.2-2.415.el6_5.8

How reproducible:

Always.

Steps to Reproduce:

1. Use a 4.0 GB virtual size image as a backing file.

+ qemu-img info backing.qcow2
image: backing.qcow2
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 916M
cluster_size: 65536

2. Create a "source.qcow2", size 10G, backing_file=backing.qcow2

+ qemu-img create -f qcow2 -o backing_file=backing.qcow2 source.qcow2 10G

3. Create a "dest.qcow2", size 10G, backing_file=backing.qcow2

4. Look at disk size of dest.  It should be very small compared to the virtual size.

+ qemu-img info dest.qcow2
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 196K
cluster_size: 65536
backing file: backing.qcow2

5. Start a vm using dest.qcow2, set to receive an incoming migration.

+ /usr/libexec/qemu-kvm -monitor stdio -drive if=virtio,cache=none,file=dest.qcow2 -incoming tcp:0:4444

6. Start an instance using source.qcow2, invoke live migration to dest vm.

+ /usr/libexec/qemu-kvm -monitor stdio -drive if=virtio,cache=none,file=source.qcow2
VNC server running on `::1:5901'
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) migrate_set_speed 8192m
(qemu) migrate -d -i tcp:127.0.0.1:4444

7. Stop both vms.

8. Examine dest with qemu-img. The disk size should be roughly the same as it was before the migration; instead, it is ~6GB.

+ qemu-img info dest.qcow2
qemu: terminating on signal 15 from pid 539
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 6.0G
cluster_size: 65536
backing file: backing.qcow2

9. Examine dest with qemu-img map. It appears that all sectors past the end of the base are now allocated in dest.qcow2.
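
The before/after comparison in steps 4 and 8 can be scripted. The sketch below is not part of the original repro: it reads `qemu-img info` text on stdin, extracts the `disk size` field in the format shown above (e.g. `196K`, `6.0G`), and converts it to bytes. The function names `disk_size_field` and `to_bytes` are hypothetical, and the conversion assumes the 1024-based suffixes that qemu-img prints.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original report).

# Print the value of the "disk size:" line from qemu-img info output on stdin.
disk_size_field() {
    awk -F': ' '/^disk size:/ { print $2; exit }'
}

# Convert a human-readable size such as 196K, 2.3M or 6.0G to bytes.
# Assumes binary (1024-based) unit suffixes.
to_bytes() {
    echo "$1" | awk '{
        n = $0 + 0                      # leading numeric part
        u = substr($0, length($0), 1)   # trailing unit letter, if any
        m = 1
        if (u == "K") m = 1024
        else if (u == "M") m = 1024 * 1024
        else if (u == "G") m = 1024 * 1024 * 1024
        else if (u == "T") m = 1024 * 1024 * 1024 * 1024
        printf "%.0f\n", n * m
    }'
}
```

For example, `to_bytes "$(qemu-img info dest.qcow2 | disk_size_field)"` would yield a byte count that can be recorded before the migration and compared afterwards.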

Actual results:

The disk size is roughly equal to size(top) - size(base).  In the example above, the disk size is exploded to 6GB due to the transfer of unallocated sectors to dest.

Expected results:

The disk size stays approximately the same as it was before the migration (step 4).

(same output, but from EL 6.4 stock)

+ qemu-img info dest.qcow2
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 2.3M
cluster_size: 65536
backing file: backing.qcow2

Additional info:

Please see https://gist.github.com/cbuben/8076509bd0c657d7e6ca for some repro code / output.

Comment 2 Chris Buben 2014-04-28 19:30:02 UTC

Clarification:

Actual results:

The actual disk size is roughly equal to virtual_size(top) - virtual_size(base).  In the example above, the disk size is exploded to ~6GB (10GB-4GB) apparently due to the transfer of unallocated sectors past the end of base to dest.
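
As a worked check of that estimate, the arithmetic can be done with the byte counts that `qemu-img info` printed in the description. This is only an illustration; the variable names are made up for the example.

```shell
# Byte counts taken from the qemu-img info output in the description.
top_bytes=10737418240    # virtual size of source.qcow2 / dest.qcow2 (10G)
base_bytes=4294967296    # virtual size of backing.qcow2 (4.0G)

# Region of the overlay that lies past the end of the backing file.
past_end_bytes=$((top_bytes - base_bytes))
past_end_gib=$((past_end_bytes / 1024 / 1024 / 1024))

echo "$past_end_bytes bytes = $past_end_gib GiB"
```

The result, 6442450944 bytes (6 GiB), matches the 6.0G disk size reported for dest.qcow2 after the migration.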

Comment 3 Qunfang Zhang 2014-04-29 08:30:08 UTC

Reproduced this bug on qemu-kvm-0.12.1.2-2.424.el6.x86_64, and this issue does not exist on RHEL6.4-z qemu-kvm-0.12.1.2-2.355.el6_4.9.x86_64. 

Steps:

1. Create a backing.qcow2 image and install a guest in it.

# qemu-img info backing.qcow2 
image: backing.qcow2
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 2.0G
cluster_size: 65536

2. Create source.qcow2 and dest.qcow2 images with a virtual size of 10G. Both images use backing.qcow2 as their backing file.

[root@t1 test]# qemu-img create -f qcow2 -o backing_file=backing.qcow2  source.qcow2 10G
Formatting 'source.qcow2', fmt=qcow2 size=10737418240 backing_file='backing.qcow2' encryption=off cluster_size=65536 
[root@t1 test]# 
[root@t1 test]# 
[root@t1 test]# qemu-img info source.qcow2 
image: source.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 196K
cluster_size: 65536
backing file: backing.qcow2
[root@t1 test]# 
[root@t1 test]# 
[root@t1 test]# qemu-img create -f qcow2 -o backing_file=backing.qcow2  dest.qcow2 10G
Formatting 'dest.qcow2', fmt=qcow2 size=10737418240 backing_file='backing.qcow2' encryption=off cluster_size=65536 
[root@t1 test]# 
[root@t1 test]# 
[root@t1 test]# qemu-img info dest.qcow2 
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 196K
cluster_size: 65536
backing file: backing.qcow2

3. Boot the guest from source.qcow2 on a RHEL 6 host, then start a second qemu-kvm on dest.qcow2 in listening mode ("-incoming tcp:0:5800"). Then perform the live incremental migration.

#  /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu SandyBridge -m 2G -smp 2,sockets=1,cores=2,threads=1,maxcpus=16 -enable-kvm -name win7-32 -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 -k en-us -rtc base=localtime,clock=host,driftfix=slew -nodefaults -monitor stdio -qmp tcp:0:6666,server,nowait -boot menu=on  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor unix:/tmp/monitor-unix,nowait,server -drive file=/home/test/source.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:2e:28:1c,bus=pci.0,addr=0x4 -vga std -vnc :10 -usb -device usb-tablet 
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) 
(qemu)  
(qemu) migrate_set_speed 100M 
(qemu) 
(qemu) migrate -d -i tcp:0:5800

Result: 
After migration, the disk size of dest.qcow2 is 6G, which is even larger than the base image (4G).

[root@t1 test]# qemu-img info source.qcow2 
image: source.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 4.6M
cluster_size: 65536
backing file: backing.qcow2
[root@t1 test]# 
[root@t1 test]# 
[root@t1 test]# 
[root@t1 test]# qemu-img info dest.qcow2 
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 6.0G
cluster_size: 65536
backing file: backing.qcow2

==================

Re-tested on a RHEL 6.4 host (qemu-kvm-0.12.1.2-2.355.el6_4.9.x86_64); the issue does not occur there.

After migration:
# qemu-img info dest.qcow2 
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 32M
cluster_size: 65536
backing file: backing.qcow2


# qemu-img info source.qcow2 
image: source.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 4.4M
cluster_size: 65536
backing file: backing.qcow2

Comment 6 Ademar Reis 2014-04-29 21:01:15 UTC

Chris, thanks for taking the time to enter a bug report with us. We appreciate
the feedback and look to use reports such as this to guide our efforts at
improving our products. That being said, we're not able to guarantee the
timeliness or suitability of a resolution for issues entered here because this
is not a mechanism for requesting support.

If this issue is critical or in any way time sensitive, please raise a ticket
through your regular Red Hat support channels to make certain  it receives the
proper attention and prioritization to assure a timely resolution.

For information on how to contact the Red Hat production support team, please
visit: https://www.redhat.com/support/process/production/#howto

Comment 8 Chris Buben 2014-04-29 22:15:46 UTC

Thanks Ademar.  I figure this is the standard disclaimer given to any reporter whose e-mail address doesn't end with @redhat.com?

Thanks to the RH team for quick response and repro.  Our team will raise this issue (and reference this bz) via our support contract as well.

Comment 11 Ademar Reis 2014-05-02 12:39:23 UTC

(In reply to Chris Buben from comment #8)
> Thanks Ademar.  I figure this is the standard disclaimer given to any
> reporter whose e-mail address doesn't end with @redhat.com?
> 
> Thanks to the RH team for quick response and repro.  Our team will raise
> this issue (and reference this bz) via our support contract as well.

Yes Chris, it's a standard response. Having an actual customer case open in the customer portal helps us prioritize the bugs. Thanks for escalating it.

Comment 13 Kevin Wolf 2014-05-05 14:41:31 UTC

I can confirm the bug; it reproduced on the first attempt. Bisecting the
problem led to the patch "block: return BDRV_BLOCK_ZERO past end of backing
file" (upstream commit f0ad5712, RHEL 6 commit 2a217cc0). The problem is that
the block allocation status past the end of the backing file is wrong (all
blocks are reported to be allocated).

The same bug can be triggered using the following commands:

$ ./qemu-img create -f qcow2 /tmp/backing.qcow2 1G
Formatting '/tmp/backing.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 lazy_refcounts=off 
$ ./qemu-img create -f qcow2 -b /tmp/backing.qcow2 /tmp/overlay.qcow2 2G
Formatting '/tmp/overlay.qcow2', fmt=qcow2 size=2147483648 backing_file='/tmp/backing.qcow2' encryption=off cluster_size=65536 lazy_refcounts=off 
$ ./qemu-io -c 'alloc 1G 64k' /tmp/overlay.qcow2
65536/65536 sectors allocated at offset 1 GiB

The next step is checking what the best fix is without breaking other use
cases.
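
The qemu-io check above lends itself to a scripted pass/fail test. The following is only a sketch: the function name `allocated_sectors` is hypothetical, and it assumes the `N/M sectors allocated at offset ...` output format shown above. On a build without the bug, one would expect the count for a range past the end of the backing file in a freshly created overlay to be 0, although that output is not shown in this report.

```shell
# Hypothetical helper: extract N from "N/M sectors allocated at offset ...".
allocated_sectors() {
    sed -n 's|^\([0-9][0-9]*\)/[0-9][0-9]* sectors allocated at offset .*|\1|p'
}

# Example using the line captured above from the affected build.
n=$(echo '65536/65536 sectors allocated at offset 1 GiB' | allocated_sectors)
if [ "$n" -gt 0 ]; then
    echo "range reported as allocated ($n sectors)"
else
    echo "range reported as unallocated"
fi
```

In practice the input would come from piping `./qemu-io -c 'alloc 1G 64k' /tmp/overlay.qcow2` into the helper instead of the hard-coded line.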

Comment 15 Brandon Nolte 2014-05-16 15:00:18 UTC

Great troubleshooting; thank you for your assistance on this issue.

Has it been decided how this fix will be implemented?
Is there a time frame for when we can expect it to be available?



Regards,
Brandon Nolte

Comment 24 Jeff Nelson 2014-06-17 11:43:09 UTC

Fix included in qemu-kvm-0.12.1.2-2.428.el6

Comment 28 Miroslav Rezanina 2014-07-23 10:35:12 UTC

Fix included in qemu-kvm-0.12.1.2-2.431.el6

Comment 30 Kevin Wolf 2014-08-06 07:51:25 UTC

*** Bug 1118185 has been marked as a duplicate of this bug. ***

Comment 31 Qunfang Zhang 2014-08-13 03:01:14 UTC

Verified this bug on qemu-kvm-rhev-0.12.1.2-2.436.el6.x86_64:

1. Create a backing.qcow2 image and install a guest in it.

# qemu-img info backing.qcow2 
image: backing.qcow2
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 2.0G
cluster_size: 65536

2. Create source.qcow2 and dest.qcow2 images with a virtual size of 10G. Both images use backing.qcow2 as their backing file.

[root@localhost test]#  qemu-img create -f qcow2 -o backing_file=backing.qcow2  source.qcow2 10G
Formatting 'source.qcow2', fmt=qcow2 size=10737418240 backing_file='backing.qcow2' encryption=off cluster_size=65536 
[root@localhost test]# qemu-img info source.qcow2 
image: source.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 196K
cluster_size: 65536
backing file: backing.qcow2
[root@localhost test]# 
[root@localhost test]# 
[root@localhost test]# qemu-img create -f qcow2 -o backing_file=backing.qcow2  dest.qcow2 10G
Formatting 'dest.qcow2', fmt=qcow2 size=10737418240 backing_file='backing.qcow2' encryption=off cluster_size=65536 
[root@localhost test]# 
[root@localhost test]# 
[root@localhost test]# qemu-img info dest.qcow2 
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 196K
cluster_size: 65536
backing file: backing.qcow2


3. Boot the guest from source.qcow2 on a RHEL 6 host, then start a second qemu-kvm on dest.qcow2 in listening mode ("-incoming tcp:0:5800"). Then perform the live incremental migration.

[root@localhost test]# /usr/libexec/qemu-kvm -M rhel6.6.0 -cpu SandyBridge -m 2G -smp 2,sockets=1,cores=2,threads=1,maxcpus=16 -enable-kvm -name rhel6.6 -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 -k en-us -rtc base=localtime,clock=host,driftfix=slew -nodefaults -monitor stdio -qmp tcp:0:6666,server,nowait -boot menu=on  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor unix:/tmp/monitor-unix,nowait,server -drive file=/root/test/source.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:2e:28:1c,bus=pci.0,addr=0x4 -vga std -vnc :10 -usb -device usb-tablet
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) 
(qemu) migrate_set_speed 100M 
(qemu) 
(qemu) migrate -d -i tcp:0:5800
(qemu) info migrate

Result:
After migration, check the source.qcow2 and dest.qcow2 image size:

[root@localhost test]# qemu-img info source.qcow2 
image: source.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 9.1M
cluster_size: 65536
backing file: backing.qcow2
[root@localhost test]# 
[root@localhost test]# 
[root@localhost test]# qemu-img info dest.qcow2 
image: dest.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 36M
cluster_size: 65536
backing file: backing.qcow2

The disk size of dest.qcow2 is now 36M and no longer exceeds the 4G base image.

Comment 32 Qunfang Zhang 2014-08-13 03:04:56 UTC

I also tested Kevin's script provided in bug 1109715:

(1) On the old qemu-kvm-rhev-0.12.1.2-2.430.el6.x86_64:

[root@localhost test]# sh rhel6-test.sh 
Formatting '/tmp/backing.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 
Formatting '/tmp/test.qcow2', fmt=qcow2 size=1073741824 backing_file='/tmp/backing.qcow2' encryption=off cluster_size=65536 
wrote 65536/65536 bytes at offset 0
64 KiB, 1 ops; 0.0000 sec (636.708 KiB/sec and 9.9486 ops/sec)
wrote 65536/65536 bytes at offset 134217728
64 KiB, 1 ops; 0.0000 sec (1.378 MiB/sec and 22.0415 ops/sec)
VNC server running on `::1:5900'
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) __com.redhat_drive-mirror ide0-hd0 /tmp/copy.qcow2
Formatting '/tmp/copy.qcow2', fmt=qcow2 size=67108864 backing_file='/tmp/backing.qcow2' backing_fmt='qcow2' encryption=off cluster_size=65536 
(qemu) __com.redhat_drive-reopen ide0-hd0 /tmp/copy.qcow2
(qemu) quit
Pattern verification failed at offset 0, 65536 bytes
read 65536/65536 bytes at offset 0
64 KiB, 1 ops; 0.0000 sec (5.549 GiB/sec and 90909.0909 ops/sec)
read failed: Input/output error

(2) On the latest qemu-kvm-rhev-0.12.1.2-2.436.el6.x86_64:

[root@localhost test]# sh rhel6-test.sh 
Formatting '/tmp/backing.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 
Formatting '/tmp/test.qcow2', fmt=qcow2 size=1073741824 backing_file='/tmp/backing.qcow2' encryption=off cluster_size=65536 
wrote 65536/65536 bytes at offset 0
64 KiB, 1 ops; 0.0000 sec (579.322 KiB/sec and 9.0519 ops/sec)
wrote 65536/65536 bytes at offset 134217728
64 KiB, 1 ops; 0.0000 sec (1.172 MiB/sec and 18.7491 ops/sec)
VNC server running on `::1:5900'
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) __com.redhat_drive-mirror ide0-hd0 /tmp/copy.qcow2
Formatting '/tmp/copy.qcow2', fmt=qcow2 size=1073741824 backing_file='/tmp/backing.qcow2' backing_fmt='qcow2' encryption=off cluster_size=65536 
(qemu) __com.redhat_drive-reopen ide0-hd0 /tmp/copy.qcow2
(qemu) quit
read 65536/65536 bytes at offset 0
64 KiB, 1 ops; 0.0000 sec (657.895 MiB/sec and 10526.3158 ops/sec)
read 65536/65536 bytes at offset 134217728
64 KiB, 1 ops; 0.0000 sec (753.012 MiB/sec and 12048.1928 ops/sec)
	
Based on the above, the separate issue seen with the old build (mentioned in bug 1109715 comment 14) no longer occurs.

Comment 33 Qunfang Zhang 2014-08-13 03:10:17 UTC

Hi, Kevin

According to comment 31 and comment 32, this bug is verified as fixed with both the original test case and your script.

Since you previously suggested that we run some functional tests for live snapshot and block mirroring, I would like to confirm the following with you:
(1) We are currently running a round of live snapshot and block mirroring functional tests *manually* on the latest rhel6.5-z build for bug 1109715.
(2) If (1) passes without any new regressions, could we run only *autotest* storage VM migration testing for rhel6.6 instead of another round of manual testing? The difference is that autotest covers only some basic test cases for live snapshot, block mirroring and image streaming, which are a subset of the manual cases. These features are not 100% automated.

Thanks,
Qunfang

Comment 34 Kevin Wolf 2014-08-13 13:46:39 UTC

If 6.5.z passes the manual testing, I think it is reasonable to run some relaxed
automated testing for 6.6. They are similar enough that I think the 6.5.z result
gives us some confidence for 6.6 as well.

Comment 35 Qunfang Zhang 2014-08-14 01:00:07 UTC

(In reply to Kevin Wolf from comment #34)
> If 6.5.z passes the manual testing, I think it is reasonable to run some
> relaxed automated testing for 6.6. They are similar enough that I think the
> 6.5.z result gives us some confidence for 6.6 as well.

Okay, thank you for the feedback!

Comment 39 errata-xmlrpc 2014-10-14 06:58:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1490.html