Bug 1357919 - Copying a raw sparse file takes a lot of time compared to a qcow sparse file using qemu-img (depends on Gluster bugs)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Gluster
Version: 4.0.1.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Gobinda Das
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1306396
Blocks:
 
Reported: 2016-07-19 15:10 UTC by Raz Tamir
Modified: 2021-02-02 02:36 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-15 13:55:40 UTC
oVirt Team: Gluster
Embargoed:
vharihar: needinfo-
pm-rhel: ovirt-4.4?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
  vdsm.log (2.70 MB, text/plain) - 2016-07-19 15:10 UTC, Raz Tamir
  raw-sparse-libgfapi-trace-output (3.27 MB, text/plain) - 2017-06-02 05:57 UTC, Krutika Dhananjay
  raw-sparse-xfs-strace-output (120.45 KB, text/plain) - 2017-06-02 05:59 UTC, Krutika Dhananjay

Description Raz Tamir 2016-07-19 15:10:06 UTC
Created attachment 1181674 [details]
vdsm.log

Description of problem:
When trying to copy disks on a file-based domain, there is a difference between raw and cow sparse disks of the same size:

raw, sparse 10GB on file
[host images]# du -a *
0       55d380fe-cfc2-42b5-884b-f5b8d6da0be7/dc01ddd5-8501-4c63-ad51-0e19e3854046
1028    55d380fe-cfc2-42b5-884b-f5b8d6da0be7/dc01ddd5-8501-4c63-ad51-0e19e3854046.lease
4       55d380fe-cfc2-42b5-884b-f5b8d6da0be7/dc01ddd5-8501-4c63-ad51-0e19e3854046.meta
1036    55d380fe-cfc2-42b5-884b-f5b8d6da0be7

Copy disk:
DEBUG::2016-07-19 17:09:27,761::qemuimg::224::QemuImg::(__init__) /usr/bin/taskset --cpu-list 0-1 /usr/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/ce185612-2017-4ca1-a76f-ee701f73bd33/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/55d380fe-cfc2-42b5-884b-f5b8d6da0be7/dc01ddd5-8501-4c63-ad51-0e19e3854046 -O raw /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_storage__jenkins__ge5__nfs__0/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/c2d5b640-5142-4c8a-b4f6-e1b750a2e742/137d8028-ad59-47e4-acf7-2aaf7b541fa1 (cwd None)
DEBUG::2016-07-19 17:09:27,776::image::140::Storage.Image::(_wait_for_qemuimg_operation) waiting for qemu-img operation to complete
...
DEBUG::2016-07-19 17:11:47,782::image::147::Storage.Image::(_wait_for_qemuimg_operation) qemu-img operation progress: 100.0%
DEBUG::2016-07-19 17:11:47,782::image::149::Storage.Image::(_wait_for_qemuimg_operation) qemu-img operation has completed
DEBUG::2016-07-19 17:11:47,783::utils::870::vds.stopwatch::(stopwatch) Copy volume dc01ddd5-8501-4c63-ad51-0e19e3854046: 140.01 seconds


************************************************


cow, sparse 10GB on file
[host images]# du -a *
200      976ee819-50b5-4156-bd3c-84aca4e50483/14a85f54-73d2-48cd-bb79-02a2f027afd7
1028     976ee819-50b5-4156-bd3c-84aca4e50483/14a85f54-73d2-48cd-bb79-02a2f027afd7.lease
4        976ee819-50b5-4156-bd3c-84aca4e50483/14a85f54-73d2-48cd-bb79-02a2f027afd7.meta
1236     976ee819-50b5-4156-bd3c-84aca4e50483

Copy disk:
DEBUG::2016-07-19 17:05:07,036::qemuimg::224::QemuImg::(__init__) /usr/bin/taskset --cpu-list 0-1 /usr/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/qemu-img convert -p -t none -T none -f qcow2 /rhev/data-center/ce185612-2017-4ca1-a76f-ee701f73bd33/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/976ee819-50b5-4156-bd3c-84aca4e50483/14a85f54-73d2-48cd-bb79-02a2f027afd7 -O qcow2 -o compat=0.10 /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_storage__jenkins__ge5__nfs__0/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/6d9be53e-56b0-42cd-9694-275461a09151/91768f6c-d3c2-4c02-9fd6-7c37c0e36fdd (cwd None)
DEBUG::2016-07-19 17:05:07,051::image::140::Storage.Image::(_wait_for_qemuimg_operation) waiting for qemu-img operation to complete
bcb205cc-94f3-4ad6-ab89-7babb17abedf::DEBUG::2016-07-19 17:05:07,118::image::147::Storage.Image::(_wait_for_qemuimg_operation) qemu-img operation progress: 100.0%
DEBUG::2016-07-19 17:05:07,118::image::149::Storage.Image::(_wait_for_qemuimg_operation) qemu-img operation has completed
DEBUG::2016-07-19 17:05:07,119::utils::870::vds.stopwatch::(stopwatch) Copy volume 14a85f54-73d2-48cd-bb79-02a2f027afd7: 0.06 seconds

The difference in execution time is strange because both images have the same actual and provisioned size.
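
For comparison, the two kinds of source image can be recreated and inspected outside oVirt with qemu-img (a minimal sketch; paths are illustrative):

# Create a 10GB raw sparse image and a 10GB thin qcow2 image:
qemu-img create -f raw   /mnt/test/raw-sparse.img 10G
qemu-img create -f qcow2 /mnt/test/thin.qcow2     10G

# Compare apparent size vs. blocks actually allocated on disk:
ls -ls /mnt/test/raw-sparse.img /mnt/test/thin.qcow2   # first column = allocated blocks
du -a  /mnt/test                                       # same view as the du output above
qemu-img info /mnt/test/raw-sparse.img                 # "virtual size" vs. "disk size"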



Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Create 2 disks on a file-based domain, both 10GB sparse, one in cow format and one in raw format
2. Copy each disk to the same domain it exists on

Actual results:


Expected results:


Additional info:

Comment 1 Yaniv Kaul 2016-07-20 07:03:53 UTC
1. So the bug is that raw/sparse is slow? This is not what the title says.
2. Can it be reproduced with qemu-img alone, without VDSM/oVirt? If so, it's a qemu bug, no?
3. Are you sure that the NFS server supports sparse? I'd try with NFSv4.2. See http://www.snia.org/sites/default/files/NFS_4.2_Final.pdf for more details.
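
A quick way to check point 3, assuming a root shell on the hypervisor (the mount path below is a placeholder for the real NFS mount): look at the negotiated NFS version and try punching a hole in a scratch file.

# Show the negotiated NFS protocol version for the mount (look for vers=4.2):
nfsstat -m
grep nfs /proc/mounts

# Try punching a hole on the NFS mount; servers without sparse-file support
# are expected to reject this:
dd if=/dev/zero of=/rhev/data-center/mnt/SERVER:_export/scratch.img bs=1M count=4
fallocate --punch-hole --offset 0 --length 1M /rhev/data-center/mnt/SERVER:_export/scratch.img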

Comment 2 Raz Tamir 2016-07-20 08:39:42 UTC
1. True
2. 
qCOW:
[root@green-vdsa tmp]# time /usr/bin/qemu-img convert -p -t none -T none -f qcow2 /rhev/data-center/ce185612-2017-4ca1-a76f-ee701f73bd33/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/6b998279-5cda-448f-a1a3-f65aefbd4252/c9185df1-1d4a-4af5-ae1c-4b9c68edf5be -O qcow2 -o compat=0.10 /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_storage__jenkins__ge5__nfs__0/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/tmp/tmp_vol
    (100.00/100%)

real    0m0.058s
user    0m0.020s
sys     0m0.017s


raw:
[root@green-vdsa tmp]# time /usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/ce185612-2017-4ca1-a76f-ee701f73bd33/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/35a17d2f-bbc3-444d-ae55-07df3936b247/f3cbae20-bd43-4e65-a2ff-829fdc1a6083 -O raw /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_storage__jenkins__ge5__nfs__0/5cf6dd4b-65db-413a-8226-95bac7ee9378/images/tmp/tmp_vol
    (100.00/100%)

real    2m18.914s
user    0m2.201s
sys     0m2.586s

3. I will check this, but this issue seems to be related to the image format, raw vs. qcow, and both are sparse, so if the server didn't support sparseness it shouldn't work for either format

Comment 3 Allon Mureinik 2016-07-20 09:10:18 UTC
Niels - both files here are empty, but offhand it looks like in the raw case qemu-img reads the "entire" size, including blocks that aren't really allocated.

Should your recent work on SEEK_DATA address this?

Comment 4 Yaniv Kaul 2016-07-20 09:38:30 UTC
(In reply to ratamir from comment #2)

> 
> 3. I will check this, but this issue seems to be related to the image format,
> raw vs. qcow, and both are sparse, so if the server didn't support sparseness
> it shouldn't work for either format

The above statement is plain wrong. qcow2 is not sparse, it's thin - the FILE grows as you add content to it.
The raw sparse format tries to take advantage of the underlying file system's sparseness feature, allowing to 'punch hole' in it where there's no need for allocation.
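
The practical consequence for qemu-img: with qcow2 the allocation map lives in the image format itself, while with raw the only source of allocation information is the filesystem (via SEEK_DATA/SEEK_HOLE, as Allon notes in comment 3). A small way to see this, with illustrative paths:

qemu-img map /mnt/test/thin.qcow2        # allocation comes from qcow2 metadata, independent of the filesystem
qemu-img map /mnt/test/raw-sparse.img    # for raw, qemu-img has to ask the filesystem which ranges are holes
qemu-img map --output=json /mnt/test/raw-sparse.img   # machine-readable variant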

Comment 5 Raz Tamir 2016-07-20 11:30:26 UTC
(In reply to Yaniv Kaul from comment #4)
> (In reply to ratamir from comment #2)
> 
> > 
> > 3. I will check this, but this issue seems to be related to the image format,
> > raw vs. qcow, and both are sparse, so if the server didn't support sparseness
> > it shouldn't work for either format
> 
> The above statement is plain wrong. qcow2 is not sparse, it's thin - the
> FILE grows as you add content to it.
> The raw sparse format tries to take advantage of the underlying file
> system's sparseness feature, allowing to 'punch hole' in it where there's no
> need for allocation.

You are right,
I should have used the term thin-provisioned disks instead of sparse because of the double meaning.

Comment 6 Niels de Vos 2016-11-22 15:14:19 UTC
(In reply to Allon Mureinik from comment #3)
> Niels - both files here are empty, but offhand it looks like in the raw case
> qemu-img reads the "entire" size, including blocks that aren't really
> allocated.
> 
> Should your recent work on SEEK_DATA address this?

Sorry for the late reply, Allon!

When a recent qemu-img with the gluster block-driver is used to copy a sparse file, the sparseness of the file should be maintained. There are some caveats though.

Older kernel versions (including RHEL) cannot do SEEK_DATA/HOLE over FUSE mounts. Support for SEEK_DATA/HOLE is being added to FUSE in RHEL through bug 1306396.

Unfortunately, volumes that have sharding enabled do not support SEEK_DATA/HOLE yet either... Bug 1301647 for upstream Gluster tracks (the lack of) progress there.

NFS added support for SEEK_DATA/HOLE recently (see comment #1). Not all NFS-servers support this, nor all NFS-clients. Debugging with 'rpcdebug' or capturing a network trace can tell you (or me) if SEEK_DATA/HOLE is used.

'cp' from coreutils does not use SEEK_DATA/HOLE, but the lower-level filesystem interfaces. The coreutils developers were planning to use lseek() instead of filesystem-specific ioctl()s. I did not follow progress there and do not know if patches for this exist already.
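
A hedged way to check from the client side whether SEEK_DATA/SEEK_HOLE actually works through a given mount (assumes xfs_io from xfsprogs, which despite its name can be pointed at files on other filesystems; paths are illustrative):

# Make a sparse test file on the mount under test:
truncate -s 10G /mnt/test/sparse-probe.img

# Ask where the first hole is; if hole detection works through this mount,
# the reported offset should be at (or near) 0 rather than only at end-of-file:
xfs_io -c "seek -h 0" /mnt/test/sparse-probe.img

# Alternatively, watch the copy itself for SEEK_DATA calls and their results:
strace -f -e trace=lseek qemu-img convert -p -t none -T none \
    -f raw /mnt/test/sparse-probe.img -O raw /mnt/test/sparse-copy.img 2>&1 | grep SEEK_DATA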

Comment 7 Maor 2017-02-12 18:09:51 UTC
(In reply to Niels de Vos from comment #6)
> (In reply to Allon Mureinik from comment #3)
....
> 
> NFS added support for SEEK_DATA/HOLE recently (see comment #1). Not all
> NFS-servers support this, nor all NFS-clients. Debugging with 'rpcdebug' or
> capturing a network trace can tell you (or me) if SEEK_DATA/HOLE is used.

Raz, can you please share the rpcdebug output?
Also, based on your test of qemu-img alone, shouldn't this be a qemu bug?
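
For reference, a sketch of how that debugging data could be gathered (interface and file names are illustrative):

# Enable NFS client debugging in the kernel log, run the slow raw-sparse copy,
# then turn it off again and read the log:
rpcdebug -m nfs -s all
# ... run the qemu-img convert of the raw sparse image here ...
rpcdebug -m nfs -c all
journalctl -k | tail -n 200    # or: dmesg

# Or capture the NFS traffic for later inspection (NFS uses port 2049):
tcpdump -i any -w nfs-seek.pcap port 2049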

Comment 8 Raz Tamir 2017-02-13 14:16:35 UTC
(In reply to Maor from comment #7)
> (In reply to Niels de Vos from comment #6)
> > (In reply to Allon Mureinik from comment #3)
> ....
> > 
> > NFS added support for SEEK_DATA/HOLE recently (see comment #1). Not all
> > NFS-servers support this, nor all NFS-clients. Debugging with 'rpcdebug' or
> > capturing a network trace can tell you (or me) if SEEK_DATA/HOLE is used.
> 
> Raz, can you please share the rpcdebug output?
I don't have the output of rpcdebug because this bug is 6 months old.

> Also, based on your test of qemu-img alone, shouldn't this be a qemu bug?
This might be a qemu bug, but the product uses it, so I opened it under ovirt-engine.

Comment 9 Yaniv Lavi 2017-02-13 14:20:42 UTC
Moving to gluster to review impact on HCI.

Comment 10 Sahina Bose 2017-02-23 07:19:20 UTC
Krutika, would Bug 1301647 help address this?

Comment 11 Krutika Dhananjay 2017-03-24 07:25:55 UTC
(In reply to Sahina Bose from comment #10)
> Krutika, would Bug 1301647 help address this?

I'm not sure yet.

Here's some data I collected from my tests:

1. Like Raz rightly reported, on a FUSE mount the time taken for the qemu-img command to complete was much higher for raw files than for qcow2 files. In my run, raw files took ~40 seconds to complete, while the qcow2 case took less than a second.

2. I disabled shard and ran the tests again. Not much difference, except that the command finished slightly faster for raw than in the sharded case.

3. So with shard still disabled, I ran the same test with libgfapi and it was *much* slower than (2) for raw files.

This is the command I used for gfapi, if it helps:

[root@rhs-srv-07 ~]# truncate -s 10G /mnt/vm9.img  (this I did from the FUSE mount, but that's OK!)
                                                                                                                                                                                           
[root@rhs-srv-07 ~]# strace -ff -T -o /tmp/raw-no-shard-and-gfapi-2.log /usr/bin/qemu-img convert -p -t none -T none -f raw gluster://10.8.152.17/rep/vm9.img -O raw gluster://10.8.152.17/rep/vm9-output.img


Niels,

Is SEEK_HOLE/SEEK_DATA supposed to work with libgfapi?

I'm using glusterfs-3.8.10 FWIW.

-Krutika

PS: I do have strace output collected from all 3 runs for analysis. I'll do that some time next week, and based on Niels' confirmation on the needinfo.

Comment 12 Krutika Dhananjay 2017-05-18 07:34:01 UTC
Niels,

Did you get a chance to see comment #11?

-Krutika

Comment 13 Niels de Vos 2017-05-18 08:59:11 UTC
SEEK_HOLE/SEEK_DATA is expected to work with libgfapi from glusterfs-3.8.10. Because libgfapi is completely userspace, doing an strace will not show much detail. You will need to use ltrace instead.

  $ ltrace -f -x glfs* -o qemu-img.ltrace.log qemu-img ...

Some xlators (shard and disperse?) do not support SEEK_HOLE/SEEK_DATA, and they will/should return EINVAL or similar. qemu-img may fall-back to not using glfs_seek() in that case.

Also note that the version of QEMU is important. The enhancement of glfs_seek() with SEEK_DATA/SEEK_HOLE was included in upstream QEMU 2.7.0. I do not know if this was backported to qemu-kvm-rhev (or what version was used while testing).
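
Since both the QEMU and the gluster versions matter here, a quick way to record what is actually installed on the test host (package names vary between builds, so this is only a sketch):

qemu-img --version                            # glfs_seek() with SEEK_DATA/HOLE needs upstream QEMU >= 2.7.0, per the comment above
rpm -qa | grep -E '^(qemu|glusterfs)' | sort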

Comment 14 Krutika Dhananjay 2017-06-02 05:56:21 UTC
(In reply to Niels de Vos from comment #13)
> SEEK_HOLE/SEEK_DATA is expected to work with libgfapi from glusterfs-3.8.10.
> Because libgfapi is completely userspace, doing an strace will not show much
> details. You will need to use ltrace instead.
> 
>   $ ltrace -f -x glfs* -o qemu-img.ltrace.log qemu-img ...
> 
> Some xlators (shard and disperse?) do not support SEEK_HOLE/SEEK_DATA, and
> they will/should return EINVAL or similar. qemu-img may fall-back to not
> using glfs_seek() in that case.

Thanks. I tried ltrace and it wasn't particularly useful because it prints addresses in place of actual parameters.

This time I ran this test on a plain replicate volume (no shard!) over libgfapi and captured trace output.

There was no improvement in time-to-completion. It took over a minute.
In the trace output, I see no occurrence of the seek fop. Instead there are countless reads issued on the src file (presumably it is trying to find the data by reading instead of doing SEEK_DATA). When I run the same thing on XFS, it completes in under a second and there is a call to seek with the SEEK_DATA parameter.

Needless to say, I had to write https://review.gluster.org/17437 to make trace watch for the seek FOP. And of course the version of gluster is latest master, which means it has all of your SEEK fop enhancements.
This means that the issue at this point is not even about shard not supporting the SEEK fop.

I'm attaching both trace output and the strace output files for your reference.

Care to take a look?

-Krutika

> 
> Also note that the version of QEMU is important. The enhancement of
> glfs_seek() with SEEK_DATA/SEEK_HOLE was included in upstream QEMU 2.7.0. I
> do not know if this was backported to qemu-kvm-rhev (or what version was
> used while testing).
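
A rough way to compare the two traces attached in the following comments is simply to count seek operations versus reads in each (filenames as attached; exact counts will differ per run):

grep -ci 'seek_data' raw-sparse-xfs-strace-output    # present in the XFS strace, per comment 14
grep -ci 'seek' raw-sparse-libgfapi-trace-output     # no seek FOP in the gfapi trace, per comment 14
grep -ci 'read' raw-sparse-libgfapi-trace-output     # the "countless reads" on the source file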

Comment 15 Krutika Dhananjay 2017-06-02 05:57:39 UTC
Created attachment 1284314 [details]
raw-sparse-libgfapi-trace-output

Comment 16 Krutika Dhananjay 2017-06-02 05:59:30 UTC
Created attachment 1284315 [details]
raw-sparse-xfs-strace-output

Comment 17 Krutika Dhananjay 2017-07-17 09:44:09 UTC
Niels,

Did you get a chance to see comment #14?

Comment 18 Krutika Dhananjay 2017-12-05 12:12:09 UTC
Hi Raz,

Could you please confirm if the steps to recreate the issue in https://bugzilla.redhat.com/show_bug.cgi?id=1357919#c11 are correct?
Lot of the analysis done in debugging this issue is based on comment 11. So I want to be sure I'm on the right track.

-Krutika

Comment 19 Niels de Vos 2017-12-05 14:02:53 UTC
On Tue, Dec 05, 2017 at 05:51:40PM +0530, Krutika Dhananjay wrote:
> So I checked the strace output from fuse point of view and this is what i
> saw:
>
> 1. In the raw sparse case, there was an attempt by the application to do
> seek_data which was returned with EOPNOTSUPPORTED error. So after this the
> app reads 2MB at a time from the src file.
> The larger the src file, the more time it takes for qemu-img convert to
> return as a result. The file itself was 10GB in size in my case.

That makes sense, and is expected in some cases. It is possible that the
FUSE kernel module does not support SEEK_HOLE/SEEK_DATA, and does not
pass it through to the Gluster client. Bug 1306396 was used for the
enhancement of FUSE in kernel-3.10.0-579.el7 and newer.

Which version of the kernel was used for testing?

> 2. However, when i create qcow2 file using `qemu-img create -f qcow2 ...`
> the file itself is only 193KB in size and with data.
> So when I attempt `qemu-img convert..` for copying, it only needs to read
> 193K of data and write to destination.

Well, yes, qemu-img knows the qcow2 format so it will only copy/convert
the data blocks that are in use.

> @Niels,
> Request you to confirm if the command completes in under 1s in libgfapi
> case (which it did not when id looked into it last on 2017-07-17 and i'd
> even left a needinfo on you back then too) and whether seek_data was being
> called and returned with success with gfapi. I've done my bit from fuse pov.

Unfortunately the ltrace output did not show much (maybe the debuginfo
was missing?). I'll try to reproduce it on one of the machines that
Kasturi provided.
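
To answer the kernel question on the test host, with the threshold taken from the comment above:

uname -r         # FUSE SEEK_DATA/SEEK_HOLE needs kernel-3.10.0-579.el7 or newer on RHEL 7 (bug 1306396)
rpm -q kernel    # lists every installed kernel package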

Comment 20 Raz Tamir 2017-12-05 16:41:17 UTC
(In reply to Krutika Dhananjay from comment #18)
> Hi Raz,
> 
> Could you please confirm if the steps to recreate the issue in
> https://bugzilla.redhat.com/show_bug.cgi?id=1357919#c11 are correct?
> Lot of the analysis done in debugging this issue is based on comment 11. So
> I want to be sure I'm on the right track.
> 
> -Krutika

Hi Krutika,

I didn't test it on gluster, so except for step 1, which describes the result of this bug, I can't really confirm whether it works with gluster as you described.

Comment 21 Krutika Dhananjay 2017-12-06 07:47:38 UTC
(In reply to Raz Tamir from comment #20)
> (In reply to Krutika Dhananjay from comment #18)
> > Hi Raz,
> > 
> > Could you please confirm if the steps to recreate the issue in
> > https://bugzilla.redhat.com/show_bug.cgi?id=1357919#c11 are correct?
> > Lot of the analysis done in debugging this issue is based on comment 11. So
> > I want to be sure I'm on the right track.
> > 
> > -Krutika
> 
> Hi Krutika,
> 
> I didn't test it on gluster, so except for step 1, which describes the result
> of this bug, I can't really confirm whether it works with gluster as you described.

Ok in that case I'm assuming that the way to create a raw sparse src disk (which I will copy later) is by selecting the "Preallocated" option from the drop down menu of the "New Virtual Disk" tab. And similarly for qcow2 it would be by selecting "Thin Provision".

If this is not correct, then please describe for me how you created the images "55d380fe-cfc2-42b5-884b-f5b8d6da0be7/dc01ddd5-8501-4c63-ad51-0e19e3854046" and "976ee819-50b5-4156-bd3c-84aca4e50483/14a85f54-73d2-48cd-bb79-02a2f027afd7" in comment #1.

I am not familiar with what "Create 2 disks on file based domain" means in the "steps to reproduce" section.

-Krutika

Comment 22 Raz Tamir 2017-12-06 12:07:17 UTC
(In reply to Krutika Dhananjay from comment #21)
> (In reply to Raz Tamir from comment #20)
> > (In reply to Krutika Dhananjay from comment #18)
> > > Hi Raz,
> > > 
> > > Could you please confirm if the steps to recreate the issue in
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1357919#c11 are correct?
> > > Lot of the analysis done in debugging this issue is based on comment 11. So
> > > I want to be sure I'm on the right track.
> > > 
> > > -Krutika
> > 
> > Hi Krutika,
> > 
> > I didn't test it on gluster, so except for step 1, which describes the result
> > of this bug, I can't really confirm whether it works with gluster as you described.
> 
> Ok in that case I'm assuming that the way to create a raw sparse src disk
> (which I will copy later) is by selecting the "Preallocated" option from the
> drop down menu of the "New Virtual Disk" tab. And similarly for qcow2 it
> would be by selecting "Thin Provision".

No,

In this case you will create a preallocated disk, which is not what we want.
We need to create 2 thin disks: 1 qcow and 1 raw.
You can do it by executing a REST API call to /api/disks with the body:

<disk>
    <alias>my_disk_name</alias>
    <format>raw</format>  <<<====== create 1 with 'raw' and 1 with 'cow'
    <provisioned_size>1073741824</provisioned_size>
    <sparse>true</sparse>
    <storage_domains>
        <storage_domain id="b692a210-8e2c-4862-b408-b2dbfa3a6d6f">
            <name>nfs_storage_domain_name</name>
            <type>data</type>
            <data_centers>
                <data_center id="480d7e0f-4587-41d7-a045-a0fd8a41b6af"/>
            </data_centers>
        </storage_domain>
    </storage_domains>
</disk>

A few things to ensure:
1) the storage domain ID must be an ID of a FILE storage domain
2) <sparse> must be 'true'

> 
> If this is not correct, then please describe for me how you created the
> images
> "55d380fe-cfc2-42b5-884b-f5b8d6da0be7/dc01ddd5-8501-4c63-ad51-0e19e3854046"
> and
> "976ee819-50b5-4156-bd3c-84aca4e50483/14a85f54-73d2-48cd-bb79-02a2f027afd7"
> in comment #1.
> 
> I am not familiar with what "Create 2 disks on file based domain" means in
> the "steps to reproduce" section.

This is the 1st point under "A few things to ensure"

Let me know if you need further assistance 
> 
> -Krutika
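
A sketch of how that request could be sent with curl (engine FQDN, credentials and CA path are placeholders; disk.xml holds the <disk> body from the comment above):

curl -X POST \
     -u 'admin@internal:PASSWORD' \
     --cacert /path/to/engine-ca.pem \
     -H 'Content-Type: application/xml' \
     --data @disk.xml \
     https://ENGINE_FQDN/ovirt-engine/api/disks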

Comment 23 Sandro Bonazzola 2019-01-28 09:34:07 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 24 Sahina Bose 2019-01-30 12:20:58 UTC
Is there anything left here - do we need to raise any dependent gluster bugs?

Comment 27 Sahina Bose 2020-02-06 11:01:52 UTC
Sas, any update on test results?

Comment 28 SATHEESARAN 2020-08-03 12:40:19 UTC
I have tested with RHV 4.4.1 with RHGS 3.5.2-async ( glusterfs-6.0-37.2.el8rhgs )
This system has QEMU version.
qemu-kvm-common-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
qemu-img-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
qemu-kvm-block-iscsi-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
qemu-kvm-block-gluster-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
qemu-kvm-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
libvirt-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64

[root@]# time qemu-img convert -p -t none -T none -f raw /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/image_raw_10gb.img -O raw /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/copy_image_raw_10gb.img 
    (100.00/100%)

real	0m13.215s
user	0m1.225s
sys	0m0.632s
[root@]# time qemu-img convert -p -t none -T none -f qcow2 /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/image_qcow2_10gb.img -O qcow2 /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/copy_image_qcow2_10gb.img 
    (100.00/100%)

real	0m0.140s
user	0m0.008s
sys	0m0.007s

Observation:
qemu-img convert of the raw image on the gluster mount takes 13.215 seconds, whereas the copy of the qcow2 image takes 0.140 seconds

I repeated the test on the local filesystem, and there the two formats take almost the same time.
[root@ ]# time qemu-img convert -p -t none -T none -f raw /test/image_raw_10gb.img -O raw /test/copy_image_raw_10gb.img
    (100.00/100%)

real	0m0.013s
user	0m0.006s
sys	0m0.003s

[root@ ]# time qemu-img convert -p -t none -T none -f qcow2 /test/image_qcow2_10gb.img -O qcow2 /test/copy_image_qcow2_10gb.img
    (100.00/100%)

real	0m0.044s
user	0m0.005s
sys	0m0.008s

Comment 29 SATHEESARAN 2020-08-03 13:00:19 UTC
(In reply to SATHEESARAN from comment #28)
> I have tested with RHV 4.4.1 with RHGS 3.5.2-async (
> glusterfs-6.0-37.2.el8rhgs )
> This system has QEMU version.
> qemu-kvm-common-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
> qemu-img-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
> qemu-kvm-block-iscsi-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
> qemu-kvm-block-gluster-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
> qemu-kvm-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
> libvirt-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64
> 
> [root@]# time qemu-img convert -p -t none -T none -f raw
> /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:
> _testvol/test/image_raw_10gb.img -O raw
> /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:
> _testvol/test/copy_image_raw_10gb.img 
>     (100.00/100%)
> 
> real	0m13.215s
> user	0m1.225s
> sys	0m0.632s
> [root@]# time qemu-img convert -p -t none -T none -f qcow2
> /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:
> _testvol/test/image_qcow2_10gb.img -O qcow2
> /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:
> _testvol/test/copy_image_qcow2_10gb.img 
>     (100.00/100%)
> 
> real	0m0.140s
> user	0m0.008s
> sys	0m0.007s
> 
> Observation:
> qemu-img convert of the raw image on the gluster mount takes 13.215 seconds,
> whereas the copy of the qcow2 image takes 0.140 seconds
> 
> I repeated the test on the local filesystem, and there the two formats take almost the same time.
> [root@ ]# time qemu-img convert -p -t none -T none -f raw
> /test/image_raw_10gb.img -O raw /test/copy_image_raw_10gb.img
>     (100.00/100%)
> 
> real	0m0.013s
> user	0m0.006s
> sys	0m0.003s
> 
> [root@ ]# time qemu-img convert -p -t none -T none -f qcow2
> /test/image_qcow2_10gb.img -O qcow2 /test/copy_image_qcow2_10gb.img
>     (100.00/100%)
> 
> real	0m0.044s
> user	0m0.005s
> sys	0m0.008s

When I performed the above test on the gluster volume, one of the bricks was down.
Now I see that the raw image conversion takes around 9 seconds:

[root@rhsqa-grafton7-nic2 test]# time qemu-img convert -p -t none -T none -f raw /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/image_raw_10gb.img -O raw /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/copy_image_raw_10gb.img
    (100.00/100%)

real	0m9.162s
user	0m1.095s
sys	0m0.635s
[root@rhsqa-grafton7-nic2 test]# time qemu-img convert -p -t none -T none -f qcow2 /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/image_qcow2_10gb.img -O qcow2 /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/copy_image_qcow2_10gb.img
    (100.00/100%)

real	0m0.157s
user	0m0.004s
sys	0m0.007s

Comment 30 SATHEESARAN 2020-08-03 13:30:44 UTC
I have captured gluster volume profile information for each qemu-img convert.

When doing qemu-img convert of raw image of size 10G
-------------------------------------------------------

Copy command
-------------
[root@ ]# time qemu-img convert -p -t none -T none -f raw /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/image_raw_10gb.img -O raw /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/copy_image_raw_10gb.img
    (100.00/100%)

real	0m9.476s
user	0m1.032s
sys	0m0.563s

Corresponding Profile Info
-----------------------------
Interval 1 Stats:
   Block Size:                512b+                1024b+                2048b+ 
 No. of Reads:                    2                     0                     0 
No. of Writes:                    6                     1                  4096 
 
   Block Size:               4096b+              131072b+ 
 No. of Reads:                    3                    24 
No. of Writes:                    1                     0 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us              1      FORGET
      0.00       0.00 us       0.00 us       0.00 us             16     RELEASE
      0.00       0.00 us       0.00 us       0.00 us              6  RELEASEDIR
      0.00      49.33 us      49.33 us      49.33 us              1     OPENDIR
      0.00     101.09 us     101.09 us     101.09 us              1     XATTROP
      0.00      57.80 us      56.01 us      59.59 us              2     INODELK
      0.00     244.49 us     244.49 us     244.49 us              1      CREATE
      0.00      91.75 us      70.59 us     105.24 us              4       FSTAT
      0.01      47.82 us      18.81 us      73.38 us             11       FLUSH
      0.01      60.02 us      31.48 us      98.18 us             14      STATFS
      0.01      98.04 us      61.58 us     150.94 us             10        OPEN
      0.02      52.99 us      18.78 us      81.28 us             25          LK
      0.13      40.58 us      13.70 us      96.48 us            260     ENTRYLK
      0.24      36.45 us      10.90 us     112.97 us            523    FINODELK
      0.37     158.49 us      56.21 us     331.41 us            182      LOOKUP
      1.32     805.46 us     197.02 us   35241.80 us            129       MKNOD
      4.73      80.98 us      25.84 us    7904.08 us           4617    FXATTROP
      5.11    3104.62 us      52.38 us   16037.36 us            130       FSYNC
     39.12     753.46 us      88.99 us   54071.79 us           4100       WRITE
     48.93     134.50 us      17.43 us    1307.33 us          28728        READ
 
    Duration: 61 seconds
   Data Read: 3159040 bytes
Data Written: 11543040 bytes

When doing qemu-img convert for qcow2 image of 10G
----------------------------------------------------
Copy command
-------------
[root@ ]# time qemu-img convert -p -t none -T none -f qcow2 /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/image_qcow2_10gb.img -O qcow2 /rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com\:_testvol/test/copy_image_qcow2_10gb.img
    (100.00/100%)

real	0m0.100s
user	0m0.007s
sys	0m0.005s


Corresponding profile info
---------------------------
Interval 4 Stats:
   Block Size:                  8b+                 128b+                 512b+ 
 No. of Reads:                    0                     0                     0 
No. of Writes:                    2                     1                     2 
 
   Block Size:              65536b+              131072b+ 
 No. of Reads:                    0                     8 
No. of Writes:                    3                     1 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us              2      FORGET
      0.00       0.00 us       0.00 us       0.00 us             12     RELEASE
      0.00       0.00 us       0.00 us       0.00 us              2  RELEASEDIR
      0.00      41.58 us      27.94 us      49.62 us              4     INODELK
      0.00      85.89 us      63.58 us     108.21 us              2     OPENDIR
      0.00     212.11 us     212.11 us     212.11 us              1      CREATE
      0.00      32.87 us      23.22 us      48.80 us              8    FINODELK
      0.01     104.40 us      77.33 us     147.23 us              4       FSTAT
      0.01      43.06 us      20.86 us      70.50 us             12       FLUSH
      0.01     144.71 us      66.76 us     211.78 us              4    READDIRP
      0.01      55.63 us      32.78 us      97.15 us             12      STATFS
      0.01     357.70 us     299.21 us     416.19 us              2       MKNOD
      0.02      88.17 us      52.53 us     116.51 us             11        OPEN
      0.02     111.90 us      49.74 us     161.72 us              9    FXATTROP
      0.02     352.33 us     186.05 us     499.31 us              4       FSYNC
      0.07      54.27 us      17.40 us      94.99 us             73          LK
      0.11     537.13 us      72.69 us    1152.33 us             12        READ
      0.17     126.21 us      57.02 us     229.55 us             78      LOOKUP
      0.87    5538.23 us      87.80 us   48687.26 us              9       WRITE
     44.09    8978.97 us      11.64 us   46613.03 us            280     ENTRYLK
     54.56   23394.71 us     107.95 us   50770.65 us            133      UNLINK
 
    Duration: 28 seconds
   Data Read: 1048576 bytes
Data Written: 328884 bytes
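
For completeness, profile snapshots like the ones above can be collected with the standard gluster profiling commands (volume name taken from the mount path used in the tests):

gluster volume profile testvol start
# ... run the qemu-img convert under test ...
gluster volume profile testvol info     # prints cumulative and per-interval stats like those pasted above
gluster volume profile testvol stop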

Comment 31 Gobinda Das 2020-09-02 07:22:38 UTC
Vinayak, could you please check the test results and profile info?

Comment 37 Yaniv Kaul 2020-10-15 13:31:46 UTC
This bug was reported 4 years ago, with no customer cases attached. Please close / move to upstream Gluster?

Comment 38 Gobinda Das 2020-10-15 13:55:40 UTC
For now I will be closing this bug; if any customer hits such an issue, we will look into it.
I will open an upstream gluster bug to track this further.

