Bug 1248996 - Larger size of qcow2 image file on target host after migration with non-shared storage with full disk copy
Summary: Larger size of qcow2 image file on target host after migration with non-shared storage with full disk copy
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.0
Assignee: Virtualization Maintenance
QA Contact: leidwang@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-07-31 09:57 UTC by Fangge Jin
Modified: 2021-01-06 10:16 UTC
CC List: 18 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-15 07:35:45 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
libvirtd.log on source and target host (514.74 KB, application/gzip)
2020-02-07 09:17 UTC, yafu

Description Fangge Jin 2015-07-31 09:57:22 UTC
Description:
The guest image file is qcow2, with an Allocation/Capacity of 3.04 GiB/9.00 GiB. After live migration with non-shared storage and full disk copy (--copy-storage-all), the Allocation/Capacity of the guest image file on the target host is 9.00 GiB/9.00 GiB.

Version:
libvirt-1.2.17-3.el7.x86_64
qemu-kvm-rhev-2.3.0-13.el7.x86_64
kernel-3.10.0-300.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
0.Prepare two hosts: source(RHEL7.2) and target(RHEL7.2)

1.Prepare a running guest on source host with qcow2 image file:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/rhel7-5.qcow2'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

2.Check the vol-info of guest image file on source host:
# virsh vol-info /var/lib/libvirt/images/rhel7-5.qcow2
Name:           rhel7-5.qcow2
Type:           file
Capacity:       9.00 GiB
Allocation:     3.04 GiB

3.Migrate the guest to target host(Note: don't pre-create storage on target):
# virsh migrate --live rhel7-5 qemu+ssh://10.66.4.141/system --copy-storage-all --verbose
Migration: [100 %]

4.Check the vol-info of guest image file on target host:
# virsh vol-info /var/lib/libvirt/images/rhel7-5.qcow2
Name:           rhel7-5.qcow2
Type:           file
Capacity:       9.00 GiB
Allocation:     9.00 GiB

Actual results:
Comparing the results of step 2 and step 4 shows that the Allocation size equals the Capacity on the target host.

Expected results:
The Allocation size of the guest image file on the target host should be the same as on the source host.

Comment 2 Peter Krempa 2015-07-31 11:27:30 UTC

*** This bug has been marked as a duplicate of bug 1219541 ***

Comment 4 Vasiliy G Tolstov 2017-05-10 19:05:26 UTC
qemu 2.6.0 and libvirt 3.3.0 show the identical issue.

Comment 5 John Snow 2017-05-11 14:13:29 UTC
I've investigated some of the root causes of this and posted a long explanation in a related BZ, https://bugzilla.redhat.com/show_bug.cgi?id=1219541

Comment 6 John Snow 2017-11-30 02:11:43 UTC
I will attempt to patch this upstream for 2.12.

The workaround, which is best practice from the libvirt POV, is to pre-create the image on the destination before attempting the mirror. With a properly modern qcow2 image, this will not occur.
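To make that concrete — purely as an illustration, reusing the path and the 9 GiB capacity from the reproduction steps in this report — pre-creating the destination image could look like:

    # qemu-img create -f qcow2 -o compat=1.1 /var/lib/libvirt/images/rhel7-5.qcow2 9G

compat=1.1 (the default in current qemu-img) selects the qcow2v3 format, which supports zero clusters.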

Having said that, here is a reproduction of my explanation of this bug, copied from https://bugzilla.redhat.com/show_bug.cgi?id=1219541#c23

Why does zero-copy with raw work (using BLKDISCARD, no less) but fail with qcow2 0.10? (It fails both to simply leave clusters unallocated AND to allocate its zero clusters efficiently by using the raw file's efficient zero-writing mechanisms.)

Firstly, mirror currently uses mirror_do_zero_or_discard in the mirror iteration process. This is almost always going to do "zero" instead of "discard" because, as far as I understand it, bdrv_get_block_status_above is almost always going to return either DATA or ZERO. (both would have to be false for mirror to choose DISCARD.)

Then, we invoke this sequence of write operations:

blk_aio_pwrite_zeroes
blk_aio_prwv
blk_aio_write_entry
blk_co_pwritev
bdrv_co_pwritev
bdrv_co_do_zero_pwritev
bdrv_aligned_pwritev
bdrv_co_do_pwrite_zeroes

Here's where things start getting juicy.

We will attempt to call drv->bdrv_co_pwrite_zeroes, in this case qcow2_co_pwrite_zeroes.
Then we'll call qcow2_zero_clusters, which... does not really like the fact that we're trying to do zero writes on a QCOW2 0.10 image.

We return -ENOTSUP, back up to bdrv_co_do_zero_pwritev, which will then fill a bounce buffer with literal zeroes and continue its journey with bdrv_driver_pwritev -- losing the semantic information that this is a zeroes write. Inevitably, eventually, the qcow2 driver will pass the data along to its backing driver (file-posix, most likely*) and instead of detecting the efficient write, will write out the dumb, big buffer of zeroes.

There are a few ways to optimize this in various ways:

(1) If we have no backing file, qcow2's write zeroes could literally just ignore the write if the clusters are already unmapped. It's the same net effect.

(2) If we cannot guarantee the safety of the above, we can allocate L2 entries as per usual, but forward the write_zeroes request down the stack. This way, the raw driver can decide if it is able to punch holes in the file to still accomplish sparse zero allocation with 0.10 images.

(3) Mirror could be improved to understand when it is able to discard target clusters instead of even attempting zero writes which may or may not get optimized to discards, provided that mirror was given unmap=true. (If the target has no backing file and has the zero_init property, simply unmapping should be sufficient here.)
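
(For reference only: the existing drive-mirror QMP command already exposes an unmap flag, default true, which is the knob option 3 would build on. A hypothetical invocation — the device name and target path below are made up — could look like:

    { "execute": "drive-mirror", "arguments": { "device": "drive_image1", "target": "/path/to/target.qcow2", "format": "qcow2", "sync": "full", "mode": "existing", "unmap": true } }
)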

Comment 8 John Snow 2020-01-23 23:10:03 UTC
I need to re-investigate and see if this is still a problem upstream; and/or confirm that libvirt still uses qcow2 0.10 images by default.

For now, I humbly suggest that as a workaround you pre-create qcow2 images as migration targets whenever possible, to ensure you are using qcow2v3 images, which will not balloon on migration.
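
(Hedged aside, not something I have re-verified here: the compat level of an existing image can be checked with qemu-img info, and a 0.10 image can normally be upgraded in place with qemu-img amend — with the guest shut down, and ideally on a copy first. The path below is the one from the original report and is only illustrative:

    # qemu-img info /var/lib/libvirt/images/rhel7-5.qcow2 | grep compat
    # qemu-img amend -f qcow2 -o compat=1.1 /var/lib/libvirt/images/rhel7-5.qcow2

compat: 0.10 indicates qcow2v2; compat: 1.1 indicates qcow2v3.)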

Comment 9 aihua liang 2020-02-04 06:15:18 UTC
Hi, John
 
   Tested on qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904; I don't hit this issue any more.
   Do we need to verify it on RHEL7, or is RHEL8 enough?
   If RHEL8 is enough, I will close it as CURRENTRELEASE after receiving your reply.

 Test Steps:
1. Start dst with qemu cmds:
       /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1 \
    -m 14336  \
    -smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
    -cpu 'EPYC',+kvm_pv_unhalt  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20200203-033416-61dmcn93,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20200203-033416-61dmcn92,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idy8YPXp \
    -chardev socket,path=/var/tmp/serial-serial0-20200203-033416-61dmcn92,server,nowait,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20200203-033416-61dmcn92,path=/var/tmp/seabios-20200203-033416-61dmcn92,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20200203-033416-61dmcn92,iobase=0x402 \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/mirror.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3 \
    -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
    -device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-6 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:6c:ca:b7:36:85,id=idz4QyVp,netdev=idNnpx5D,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idNnpx5D,vhost=on \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :1  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -monitor stdio \
    -qmp tcp:0:3001,server,nowait \
    -incoming tcp:0:5000 \

2. Expose dst mirror target
    { "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet", "data": { "host": "10.73.196.71", "port": "3333" } } } }
    { "execute": "nbd-server-add", "arguments": { "device": "drive_image1", "writable": true } }

3. Start src with qemu cmds:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1 \
    -m 14336  \
    -smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
    -cpu 'EPYC',+kvm_pv_unhalt  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20200203-033416-61dmcn92,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20200203-033416-61dmcn92,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idy8YPXp \
    -chardev socket,path=/var/tmp/serial-serial0-20200203-033416-61dmcn92,server,nowait,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20200203-033416-61dmcn92,path=/var/tmp/seabios-20200203-033416-61dmcn92,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20200203-033416-61dmcn92,iobase=0x402 \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3 \
    -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
    -device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-6 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:6c:ca:b7:36:85,id=idz4QyVp,netdev=idNnpx5D,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idNnpx5D,vhost=on \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -monitor stdio \
    -qmp tcp:0:3000,server,nowait \

4. Add target node
    {"execute":"blockdev-add","arguments":{"driver":"nbd","node-name":"mirror","server":{"type":"inet","host":"10.73.196.71","port":"3333"},"export":"drive_image1"}}

5. Mirror from src to dst
    { "execute": "blockdev-mirror", "arguments": { "device": "drive_image1","target": "mirror", "sync": "full","job-id":"j1" } }
{"timestamp": {"seconds": 1580795869, "microseconds": 981273}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}}
{"timestamp": {"seconds": 1580795869, "microseconds": 981373}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"return": {}}
{"timestamp": {"seconds": 1580795921, "microseconds": 511968}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1580795921, "microseconds": 512201}, "event": "BLOCK_JOB_READY", "data": {"device": "j1", "len": 21475033088, "offset": 21475033088, "speed": 0, "type": "mirror"}}

6. Set migration capabilities in both src and dst.
    {"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]}}

7. Migrate from src to dst
    {"execute": "migrate","arguments":{"uri": "tcp:10.73.196.71:5000"}}

8. Cancel block jobs
    {"execute":"block-job-cancel","arguments":{"device":"j1"}}

9. Continue migration
    {"execute":"migrate-continue","arguments":{"state":"pre-switchover"}}

10. Quit vm in both src and dst
    (qemu)quit

11. Check image info of both src and dst.
    # qemu-img info /home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2 
image: /home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: 4.55 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

    # qemu-img info /home/mirror.qcow2
image: /home/mirror.qcow2
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: 4.55 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false


Thanks,
Aliang

Comment 11 Ademar Reis 2020-02-05 22:32:37 UTC
Setting the QEMU sub-component to "General" to avoid breakage of tools using the API. Please change it to the appropriate one if necessary in your next triage

Comment 12 yafu 2020-02-07 09:16:31 UTC
Tested with libvirt-6.0.0-4.module+el8.2.0+5642+838f3513.x86_64 and qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904.x86_64. With the qcow2 image pre-created as the migration target, the disk size on the target host is still about twice the disk size on the source host (see the side note after the test steps).

Test steps:
1.Start a guest with disk setting as follows:
<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='threads'/>
      <source file='/var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2' index='1'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <alias name='ua-1035e984-8238-46e1-bf56-b546246e3335'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
  </disk>

2.# qemu-img info /var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2 -U
image: /var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 739 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

3.Pre-create image on the target host:
# qemu-img create -f qcow2 /var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2 10G

4.Do storage migration:
# virsh migrate yafu  qemu+ssh://X.X.X.X/system --live --verbose --copy-storage-all 
root.X.X's password: 
Migration: [100 %]

5.Check the disk size of image on the target host:
# qemu-img info /var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2 -U
image: /var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 1.55 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

6.Please see the libvirtd.log of both the source and target hosts in the attachment.
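
Side note (only a sketch, not part of the test steps above): if the extra allocation on the target matters, it can usually be reclaimed offline by re-copying the image, since qemu-img convert skips all-zero areas by default. The output filename here is made up:

# qemu-img convert -f qcow2 -O qcow2 /var/lib/libvirt/images/RHEL-8.2-x86_64-latest.qcow2 /var/lib/libvirt/images/RHEL-8.2-compacted.qcow2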

Comment 13 yafu 2020-02-07 09:17:29 UTC
Created attachment 1661625 [details]
libvirtd.log on source and target host

Comment 18 leidwang@redhat.com 2020-11-18 09:54:32 UTC
Hi Fangge,

This bug will be auto-closed within 30 days. Could you please confirm whether you still encounter this issue?

Thanks.

Comment 19 Fangge Jin 2020-11-18 10:00:15 UTC
(In reply to leidwang from comment #18)
> Hi Fangge,
> 
> This bug will be auto-closed within 30 days. Could you please confirm
> whether you still encounter this issue?
> 
> Thanks.

Yes, I can still reproduce it with:

qemu-kvm-5.1.0-13.module+el8.3.0+8424+e82f331d.x86_64
libvirt-6.6.0-8.virtcov.el8.x86_64

Comment 20 leidwang@redhat.com 2020-11-18 12:05:36 UTC
Hi Ademar,

This bug will be auto-closed within 30 days. Could you please confirm whether we have a plan to fix it?

Thanks.

Comment 26 RHEL Program Management 2020-12-15 07:35:45 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 27 leidwang@redhat.com 2021-01-06 10:16:52 UTC
Agreed to close this bug as WONTFIX, and set qe test_coverage-.

