Bug 1018451 - 'qemu-img info' take too much time with 'cluster_size=512,preallocation=metadata' in the first time
'qemu-img info' take too much time with 'cluster_size=512,preallocation=metad...
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm (Show other bugs)
6.5
x86_64 Linux
high Severity medium
: rc
: ---
Assigned To: Max Reitz
Virtualization Bugs
: Regression, ZStream
Depends On:
Blocks: 1029327
  Show dependency treegraph
 
Reported: 2013-10-12 03:27 EDT by Xu Han
Modified: 2014-10-14 02:53 EDT (History)
16 users (show)

See Also:
Fixed In Version: qemu-kvm-0.12.1.2-2.419.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-10-14 02:53:06 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Xu Han 2013-10-12 03:27:19 EDT
Description of problem:
If create a image with option 'cluster_size=512,preallocation=metadata'.
'qemu-img info' take too much time in the first time. 

compare RHEL6.4 GA RHEL6.5 and RHEL7.0
RHEL6.4 GA
# rpm -qa | grep qemu-img
qemu-img-0.12.1.2-2.355.el6.x86_64
# time qemu-img info disk.img
image: disk.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 132M
cluster_size: 512

real	0m0.004s
user	0m0.002s
sys	0m0.001s

RHEL6.5
# rpm -qa | grep qemu-img
qemu-img-0.12.1.2-2.412.el6.x86_64
# time qemu-img info disk.img
image: disk.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 132M
cluster_size: 512

real	0m22.373s
user	0m0.004s
sys	0m0.945s

RHEL7.0
# rpm -qa | grep qemu-img
qemu-img-1.5.3-8.el7.x86_64
# time qemu-img info disk.img
image: disk.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 813M
cluster_size: 512

real	0m0.147s
user	0m0.003s
sys	0m0.000s

Base on this result above, QE set this as regression bug.   


Version-Release number of selected component (if applicable):
qemu-img-0.12.1.2-2.412.el6.x86_64

How reproducible:
100%


Steps to Reproduce:
1. create image with 'cluster_size=512,preallocation=metadata'.
# qemu-img create -f qcow2 -o cluster_size=512,preallocation=metadata disk.img 4G

2. query image info.
# time qemu-img info disk.img

3. query image info again.
# time qemu-img info disk.img


Actual results:
step2:
image: disk.img
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 528M
cluster_size: 512

real	2m10.142s
user	0m0.006s
sys	0m1.890s

step3:
image: disk.img
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 530M
cluster_size: 512

real	0m0.011s
user	0m0.003s
sys	0m0.006s


Expected results:
'qemu-img info' < 1sec

Additional info:
about RHEL6.4 GA host, create a 1G image will take more then 20mins.
Comment 1 juzhang 2013-10-12 03:45:43 EDT
Hi Kevin,

Could you have a look this issue?

From QE POV.
Compared to RHEL6.4 GA, 'qemu-img info' take too much time with 'cluster_size=512,preallocation=metadata' in first time. So, QE adds "regression" Keywords and suggest fixing on RHEL6.5.
Comment 2 Kevin Wolf 2013-10-15 05:33:45 EDT
Can you track this down to a specific build?

I tried the current RHEL 6.5 qemu-img and can't see the problem (on Fedora 18,
though, so if the problem is in the kernel or any other part, it's expected
that I can't see it here). The strace output of qemu-img looked reasonable as
well, so it didn't try to read big chunks of data.

Perhaps you can provide an strace log for qemu-img info as well?
Comment 4 Xu Han 2013-10-16 04:43:44 EDT
qemu-img-0.12.1.2-2.409.el6.x86_64 start to hit this issue.

From qemu-img-0.12.1.2-2.398.el6.x86_64 to qemu-img-0.12.1.2-2.408.el6.x86_64 hit other issue, not sure if is the same issue.
The result is query information of image can get output message immediately, but process quit after 1 min.


BTW, all test goes on kernel-2.6.32-421.el6.x86_64.
Comment 6 Max Reitz 2013-10-18 16:14:03 EDT
Older versions of qemu-img create seem to frequently flush the whole image to disk. This takes a considerable amount of time, resulting in the 20 min you observed for a 4G image on RHEL 6.4. This was obviously fixed anywhere until version 0.12.1-2-2.409. Now, qemu-img create apparently doesn't flush the whole image to disk but only some parts of it, thereby being notably faster.

This in turn means that the image won't be flushed to disk after qemu-img create. The first flush operation will therefore take some time. qemu-img info automatically flushes the image file when closing it, therefore, the first invocation takes considerably more time than subsequent ones (with Linux remembering the difference between the cached and on-disk state, which is obviously inexistent for subsequent qemu-img info calls).

Syncing the image (e.g. using sync) after qemu-img create makes the first qemu-img invocation just as fast as any successive ones (of course, the sync itself now takes the time the flush operation took in the first qemu-img info execution before).

The remaining question is why this only appears with 512 byte clusters, but not with the 64k standard cluster size: The reason is that the image file contains sparse areas which don't actually occupy disk space and therefore don't have to be flushed. Metadata (e.g., L2 tables) are filled with data during preallocation, thus, it has to be written to disk on a flush, in contrast to data clusters, which are empty right after the image creation. Obviously, images with a smaller cluster size require more metadata for the same amount of virtual disk space, therefore, flushing an empty image with a small cluster size right after creating it takes more time than flushing an empty image with a large cluster size.

All in all, I would not call this a bug. The image has to be flushed to disk at some point and whether this is still in qemu-img create or at the first flush afterwards (which might even occur during an essentially read-only operation such as qemu-img info), doesn't really make a difference.
Comment 7 Xu Han 2013-10-22 02:01:48 EDT
According to Comment 6, QE accept that it is not a bug.
Comment 8 Ademar Reis 2013-10-23 04:56:26 EDT
(In reply to Max Reitz from comment #6)
> Older versions of qemu-img create seem to frequently flush the whole image
> to disk. This takes a considerable amount of time, resulting in the 20 min
> you observed for a 4G image on RHEL 6.4. This was obviously fixed anywhere
> until version 0.12.1-2-2.409. Now, qemu-img create apparently doesn't flush
> the whole image to disk but only some parts of it, thereby being notably
> faster.
> 
> This in turn means that the image won't be flushed to disk after qemu-img
> create. The first flush operation will therefore take some time. qemu-img
> info automatically flushes the image file when closing it, therefore, the
> first invocation takes considerably more time than subsequent ones (with
> Linux remembering the difference between the cached and on-disk state, which
> is obviously inexistent for subsequent qemu-img info calls).
> 
> Syncing the image (e.g. using sync) after qemu-img create makes the first
> qemu-img invocation just as fast as any successive ones (of course, the sync
> itself now takes the time the flush operation took in the first qemu-img
> info execution before).
> 
> The remaining question is why this only appears with 512 byte clusters, but
> not with the 64k standard cluster size: The reason is that the image file
> contains sparse areas which don't actually occupy disk space and therefore
> don't have to be flushed. Metadata (e.g., L2 tables) are filled with data
> during preallocation, thus, it has to be written to disk on a flush, in
> contrast to data clusters, which are empty right after the image creation.
> Obviously, images with a smaller cluster size require more metadata for the
> same amount of virtual disk space, therefore, flushing an empty image with a
> small cluster size right after creating it takes more time than flushing an
> empty image with a large cluster size.
> 
> All in all, I would not call this a bug. The image has to be flushed to disk
> at some point and whether this is still in qemu-img create or at the first
> flush afterwards (which might even occur during an essentially read-only
> operation such as qemu-img info), doesn't really make a difference.

Thanks for the detailed analysis. Based on that, I was tempted to close it as NOTABUG, but after thinking more about it and also discussing with Kevin, it became clear to us that this is not expected from the user perspective.

The first flush may even happen during the first boot of the VM, and users won't realize this is because of the image flush. They'll perceive it as qemu slowness and will be puzzled (just as all of us were).

A proper fix would be flushing the image at the end of qemu-img create. Waiting longer at qemu-img create is expected behavior.

So I'm adding the Regression flag back.

I still wonder why RHEL7 is not affected and also why Kevin couldn't reproduce it using RHEL6.5 packages on a Fedora-18 environment.
Comment 11 Max Reitz 2013-10-23 14:41:48 EDT
(In reply to Ademar de Souza Reis Jr. from comment #8)
> I still wonder why RHEL7 is not affected and also why Kevin couldn't
> reproduce it using RHEL6.5 packages on a Fedora-18 environment.

At least for upstream master, I think this is because qemu-img info doesn't flush the image anymore. If you just open the image in qemu-io and close it right again (forcing a flush) or execute "sync", the issue will appear again. The same holds true for using the image as a qemu block drive when closing qemu.
Comment 22 Qunfang Zhang 2014-07-01 06:39:52 EDT
Reproduced this bug on qemu-kvm-0.12.1.2-2.415.el6.x86_64, the first time "qemu-img info" will take long time. 

[root@localhost home]# time qemu-img create -f qcow2 -o cluster_size=512,preallocation=metadata disk.img 4G; time qemu-img info disk.img
Formatting 'disk.img', fmt=qcow2 size=4294967296 encryption=off cluster_size=512 preallocation='metadata' 


real	0m11.431s
user	0m3.537s
sys	0m10.176s

image: disk.img
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 528M
cluster_size: 512

real	10m56.390s
user	0m0.003s
sys	0m1.411s


Verified this bug on qemu-kvm-0.12.1.2-2.428.el6.x86_64. Now the time will be spent on the "qemu-img create" install of "qemu-img info".

[root@localhost home]# time qemu-img create -f qcow2 -o cluster_size=512,preallocation=metadata disk.img 4G; time qemu-img info disk.img
Formatting 'disk.img', fmt=qcow2 size=4294967296 encryption=off cluster_size=512 preallocation='metadata' 


real	11m3.516s
user	1m24.906s
sys	0m6.317s
image: disk.img
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 535M
cluster_size: 512

real	0m0.007s
user	0m0.002s
sys	0m0.004s

Also test with default cluster_size with qemu-kvm-0.12.1.2-2.428.el6.x86_64:

[root@localhost home]# time qemu-img create -f qcow2 -o preallocation=metadata disk2.img 4G; time qemu-img info disk2.img
Formatting 'disk2.img', fmt=qcow2 size=4294967296 encryption=off cluster_size=65536 preallocation='metadata' 

real	0m0.220s
user	0m0.008s
sys	0m0.007s
image: disk2.img
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 784K
cluster_size: 65536

real	0m0.003s
user	0m0.002s
sys	0m0.001s

Based on above, the issue is fixed.
Comment 23 errata-xmlrpc 2014-10-14 02:53:06 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1490.html

Note You need to log in before you can comment on or make changes to this bug.