1945401 – RFE: Add support for specifying cluster_size/subcluster allocation for qcow2 volumes via virStorageVolCreateXML

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux Advanced Virtualization
Classification:	Red Hat
Component:	libvirt
Sub Component:
Version:	---
Hardware:	All
OS:	All
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	8.5
Assignee:	Pavel Hrdina
QA Contact:	Meina Li
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1973269

Reported:	2021-03-31 19:26 UTC by Russell Cattelan
Modified:	2023-03-14 14:31 UTC
CC List:	12 users
Fixed In Version:	libvirt-7.4.0-1.el8
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1973269 (view as bug list)
Environment:
Last Closed:	2021-11-16 07:52:31 UTC
Type:	Feature Request
Target Upstream Version:	7.4.0
Embargoed:
Dependent Products:
Flags:	pm-rhel: mirror+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2021:4684	0	None	None	None	2021-11-16 07:53:15 UTC

Description Russell Cattelan 2021-03-31 19:26:12 UTC

Description of problem:

Currently libvirt does not have a way to specify the cluster_size option when creating qcow2 files.

IBM cloud is using a 2M cluster size for all qcow2 files.

The plan is to put a wrapper around qemu-img create that adds a static cluster_size=2M option.
This is obviously not ideal; we would like to be able to set the cluster size dynamically.

A future ask would be support for subcluster allocation options as well.


It might also be good to have an option in the libvirt XML to add arbitrary args to the qemu-img create command.
(That might be more of a security risk, though.)

Comment 1 Peter Krempa 2021-04-01 07:03:45 UTC

(In reply to Russell Cattelan from comment #0)
> Description of problem:
> 
> Currently libvirt does no have a way to specify the cluster_size option when
> creating qcow2 files.
> 
> IBM cloud is using a 2M cluster size for all qcow2 files.

Could you please elaborate on which operations you require this for?

> The plan is put a wrapper around qemu-img create adding in a static option
> of cluster_size = 2M.

Certain operations that create images while the VM is running, such as snapshots, use the blockdev-create QMP command, so adding a wrapper around qemu-img won't help.

> This is obviously not ideal and would to be able to dynamically set the
> cluster size.
> 
> Future ask would also be able to support sub cluster allocation options as
> well.

With the same operations as the cluster size?

> It might also be good to have an option to the libvirt xml to arbitrary add
> args to the qemu-img create cmd.
> (That might more of a security risk tho)

As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.

Comment 2 Russell Cattelan 2021-04-01 14:58:19 UTC

Basically, we first create a volume, and for that we were calling qemu-img create.

This is a note from the wrapper script we are working on to add the cluster_size option:

#  For example:
#  create --object secret,id=sec0,file=/media/memoryCache/qemuTmpFile_bd0c1048-e053-4654-a84b-11c909df6e39 -f qcow2 -F qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 /mnt/cci_root/cci_vol8_ccistordal1099a/f057e1a6-1317-4e69-ba33-2c3e6e486686/r134-7d52a51e-0a2b-4c3e-80e5-8a7e1bef5c0c.qcow2 20G
#
#  The cluster_size is missing from the above command due to https://jiracloud.swg.usma.ibm.com:8443/browse/SC-1905.
#  This wrapper will add it if necessary (see the -o option below):
#  create --object secret,id=sec0,file=/media/memoryCache/qemuTmpFile_bd0c1048-e053-4654-a84b-11c909df6e39 -f qcow2 -F qcow2 -o cluster_size=2M,encrypt.format=luks,encrypt.key-secret=sec0 /mnt/cci_root/cci_vol8_ccistordal1099a/f057e1a6-1317-4e69-ba33-2c3e6e486686/r134-7d52a51e-0a2b-4c3e-80e5-8a7e1bef5c0c.qcow2 20G
#
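
For illustration, a minimal sketch of the argument rewriting that such a wrapper might do. The function name and the default value are hypothetical, and the sketch only handles a separate `-o` argument as in the example above (not `-oKEY=VAL` or repeated `-o` options).

```python
def add_cluster_size(argv, cluster_size="2M"):
    """Return a copy of a qemu-img argv with cluster_size added for `create`.

    Hypothetical helper: argv is the argument list without the program name.
    """
    args = list(argv)
    if not args or args[0] != "create":
        return args  # only the `create` subcommand is rewritten

    for i, arg in enumerate(args[:-1]):
        if arg == "-o":
            opts = args[i + 1].split(",")
            # Leave the options alone if the caller already chose a cluster size.
            if not any(o.startswith("cluster_size=") for o in opts):
                args[i + 1] = ",".join([f"cluster_size={cluster_size}"] + opts)
            return args

    # No -o present: add one right after the subcommand.
    return args[:1] + ["-o", f"cluster_size={cluster_size}"] + args[1:]
```

As comment 1 points out, a rewrite like this only affects code paths that actually execute qemu-img.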

> As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.

Can you expand on this a bit? Are you suggesting libvirt might not be the best way to initialize the qcow2 files?
Maybe there is another method we should be exploring?

Comment 3 Peter Krempa 2021-04-01 15:03:52 UTC

(In reply to Russell Cattelan from comment #2)

[...]

> 
> > As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.
> 
> Can you expand on this a bit are you suggesting libvirt might not be the
> best way to initialize the qcow2 files?
> Maybe there is another method we should be exploring?

What I meant is that for operations which create images while the VM is running, such as external snapshots (virDomainSnapshotCreateXML), the virDomainBlockCopy API, or the virDomainBackupBegin API, libvirt doesn't use qemu-img. It uses a QMP command to format the images, so any wrapper script hacks will be ignored.
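
To make the distinction concrete, here is a rough sketch of the shape of a QMP blockdev-create command for a qcow2 image. It is not what libvirt literally sends: the job ID and node name are made-up placeholders, and only a few of the qcow2 creation options are shown.

```python
import json


def blockdev_create_qcow2(job_id, file_node, size_bytes, cluster_size=None):
    """Build a QMP blockdev-create command for a qcow2 image (illustrative only)."""
    options = {
        "driver": "qcow2",
        "file": file_node,   # node name of the already-opened storage (assumed)
        "size": size_bytes,  # virtual size in bytes
    }
    if cluster_size is not None:
        options["cluster-size"] = cluster_size  # in bytes
    return {
        "execute": "blockdev-create",
        "arguments": {"job-id": job_id, "options": options},
    }


# Hypothetical job ID and node name, 20 GiB image with 2 MiB clusters.
cmd = blockdev_create_qcow2("create-snap0", "snap0-storage", 20 * 1024**3, 2 * 1024**2)
print(json.dumps(cmd, indent=2))
```

Because this request travels over the monitor socket to a running QEMU process, no qemu-img binary is executed and a wrapper around it never sees the request.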

Comment 4 Russell Cattelan 2021-04-02 05:20:32 UTC

(In reply to Peter Krempa from comment #3)
> (In reply to Russell Cattelan from comment #2)
> 
> [...]
> 
> > 
> > > As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.
> > 
> > Can you expand on this a bit are you suggesting libvirt might not be the
> > best way to initialize the qcow2 files?
> > Maybe there is another method we should be exploring?
> 
> I wanted to say that for operations which are creating images while the VM
> is running such as external snapshots (virDomainSnapshotCreateXML),  the
> virDomainBlockCopy API, or the virDomainBackupBeginAPI libvirt doesn't use
> qemu-img, but uses a QMP command for formatting images, so any wrapper
> script hacks will be ignored.

This is the creation of the qcow2 file before the VM spins up.
We are using the libvirt Go package:
https://github.com/libvirt/libvirt

	// libvirt volume creation...
	if err := lm.libvirtCreateQCOW2(ctx, command); err != nil {
		return fmt.Errorf("failed to create volume %s with libvirt: %s", fullPathToDevice, err)
	}
	return nil


which eventually seems to land here in libvirt
https://gitlab.com/libvirt/libvirt/-/blob/master/src/storage/storage_util.c#L1111


Unless I'm missing something, that code does not seem to have a way to add the cluster_size option.

Comment 5 Geoff Macartney 2021-04-02 20:49:42 UTC

We should be careful to keep the two concerns here separate, and not allow one to confuse the other.

This issue https://bugzilla.redhat.com/show_bug.cgi?id=1945401 is being raised to request the addition of support for setting the cluster_size option when creating qcow2 files using libvirt:

per https://libvirt.org/formatstorage.html#StorageVol

<volume>
    <target>
        <path>/mnt/.../.../.../r_123xyz.qcow2</path>
        <format type="qcow2"/>
    </target>
</volume>

In this case qemu-img is used in the libvirtd backend to initialize the file. 

*However* the StorageVol schema doesn't support any way to supply the cluster_size information to qemu-img, so that it could do `qemu-img create -o cluster_size=2M`, for example. cluster_size is an important option for performance purposes [1], and we believe libvirt should support it.

I do note that the https://libvirt.org/formatstorage.html#StorageVolTarget supports a "features" element, but at present it only supports the one feature, "lazy_refcounts":

    "Format-specific features. Only used for qcow2 now. Valid sub-elements are:
     <lazy_refcounts/> - allow delayed reference counter updates. Since 1.1.0"


The question of the "wrapper script" above is related, but not the subject of this issue. I'll make another comment on that below.

To add support for cluster_size I can think of a couple of options:

1.   Add a <cluster_size> element to the <features>.

For example
<features>
    <cluster_size>2M</cluster_size>
</features>

One objection to this would be that "cluster_size" is not at all a generic concern, and only applies to qcow2 files. (Then again the same objection applies to lazy_refcounts.)

2. Add an <options> element to <features>:

<features>
    <options>cluster_size=2M</options>
</features>

On the plus side, this has more potential for generic use, and could be used to supply options not just for qcow2 but for other formats in future.
On the minus side, this is less constrained by the schema and has more potential for abuse.

I initially tended towards option 2, but I now favour option 1 as more secure.


These are only suggestions of course. 

What do you think:
1. Can cluster_size be supported?
2. What would be the best way to do it?



Refs
[1] https://www.ibm.com/cloud/blog/how-to-tune-qemu-l2-cache-size-and-qcow2-cluster-size
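
As a worked example of the performance point, the sketch below estimates how much L2 cache is needed to map a whole image at different cluster sizes. The formula (8 bytes of L2 entry per cluster) is taken from QEMU's qcow2 cache documentation, not from this report, and it assumes standard L2 entries without extended_l2. The 20G size matches the example image in comment 2.

```python
L2_ENTRY_BYTES = 8  # standard qcow2 L2 entry; assumes extended_l2 is off


def l2_cache_bytes(virtual_size, cluster_size):
    """L2 cache needed to cover the full virtual disk, in bytes."""
    return virtual_size * L2_ENTRY_BYTES // cluster_size


GiB = 1024**3
for cluster in (64 * 1024, 2 * 1024 * 1024):
    need = l2_cache_bytes(20 * GiB, cluster)
    print(f"cluster_size={cluster:>8} B -> L2 cache {need // 1024} KiB")
```

Under these assumptions a 20 GiB image needs 2560 KiB of L2 cache with 64 KiB clusters, but only 80 KiB with 2 MiB clusters.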

Comment 6 Geoff Macartney 2021-04-02 20:56:08 UTC

(In reply to Peter Krempa from comment #3)
> (In reply to Russell Cattelan from comment #2)
> 
> I wanted to say that for operations which are creating images while the VM
> is running such as external snapshots (virDomainSnapshotCreateXML),  the
> virDomainBlockCopy API, or the virDomainBackupBeginAPI libvirt doesn't use
> qemu-img, but uses a QMP command for formatting images, so any wrapper
> script hacks will be ignored.

Just on the question of the wrapper script - this is a workaround for use on a temporary basis until we could get the fix for this issue into libvirt.

Your comment is very interesting. We only need to set cluster_size on a volume init; https://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolCreateXML.

As far as I understand qemu-img IS used in that case. I am hoping that although other scenarios such as virDomainSnapshotCreateXML do not use qemu-img, this will not matter as it is used in the case we are concerned about. Do you think this sounds right?

This is a side issue to the purpose of https://bugzilla.redhat.com/show_bug.cgi?id=1945401 but it's good to hear about your point, and we should consider it when thinking about this workaround.

Comment 7 Peter Krempa 2021-04-09 11:59:58 UTC

(In reply to Geoff Macartney from comment #6)
> (In reply to Peter Krempa from comment #3)
> > (In reply to Russell Cattelan from comment #2)
> > 
> > I wanted to say that for operations which are creating images while the VM
> > is running such as external snapshots (virDomainSnapshotCreateXML),  the
> > virDomainBlockCopy API, or the virDomainBackupBeginAPI libvirt doesn't use
> > qemu-img, but uses a QMP command for formatting images, so any wrapper
> > script hacks will be ignored.
> 
> Just on the question of the wrapper script - this is a workaround for use on
> a temporary basis until we could get the fix for this issue into libvirt.
> 
> Your comment is very interesting. We only need to set cluster_size on a
> volume init;
> https://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolCreateXML.
> 
> As far as I understand qemu-img IS used in that case. I am hoping that
> although other scenarios such as virDomainSnapshotCreateXML do not use
> qemu-img, this will not matter as it is used in the case we are concerned
> about. Do you think this sounds right?
> 
> This is a side issue to the purpose of
> https://bugzilla.redhat.com/show_bug.cgi?id=1945401 but it's good to hear
> about your point, and we should consider it when thinking about this
> workaround.

Okay, so if I understand correctly you want to be able to specify the cluster size and possibly subcluster allocation when creating a new storage volume via virStorageVolCreateXML. I'll express that in the summary so that it's obvious.

If you also need to be able to specify the cluster size explicitly with the APIs operating on a live VM (virDomainSnapshotCreateXML/virDomainBlockCopy/virDomainBackupBegin), or to preserve the subcluster allocation setting with virDomainSnapshotCreateXML (which currently preserves only the cluster size), please file a separate new bug for each request.

Comment 8 Geoff Macartney 2021-04-12 09:00:30 UTC

(In reply to Peter Krempa from comment #7)
> 
> Okay, so if I understand correctly you want to be able to specify the
> cluster size and possibly subcluster allocation when creating a new storage
> volume via virStorageVolCreateXML. I'll express that in the summary so that
> it's obvious.

That's right, thanks for updating the summary to clarify this.

> If you'll also need to be able to specify the cluster size explicitly with
> the APIs operating on a live VM
> (virDomainSnapshotCreateXML/virDomainBlockCopy/virDomainBackupBeginAPI) or
> also preserve the subcluster allocation setting with
> virDomainSnapshotCreateXML (this one preserves cluster size only nowadays)
> please file a new separate bug for any of your requests.

I'm not sure about that myself. We'll need to discuss this and get back to you.


What do you think of my remarks in comment #5?

Comment 9 Geoff Macartney 2021-04-22 09:27:59 UTC

Note corresponding libvirt issue: https://gitlab.com/libvirt/libvirt/-/issues/154

Comment 10 Geoff Macartney 2021-04-22 09:29:56 UTC

About 
> If you'll also need to be able to specify the cluster size explicitly with the APIs operating on a live VM (virDomainSnapshotCreateXML/virDomainBlockCopy/virDomainBackupBeginAPI) or also preserve the subcluster allocation setting with virDomainSnapshotCreateXML (this one preserves cluster size only nowadays) please file a new separate bug for any of your requests.

on further thought I don't believe we do need to do this, at least not at the present time.

Comment 12 Geoff Macartney 2021-05-12 18:26:14 UTC

Hello, I see from the above comments that this has been marked as high priority. Is someone already looking into the changes needed? 

I was thinking I would be interested to sketch out a change to fix this. If someone is already looking into it, or soon will be, then I don't think my efforts would be worthwhile. But, if not, would it be of any use for me to give that a go? (Bearing in mind that I have never developed on libvirt before.)

Comment 13 Geoff Macartney 2021-05-13 13:44:17 UTC

(In reply to Geoff Macartney from comment #12)
> But, if not, would it be of any use for me to
> give that a go? (Bearing in mind that I have never developed on libvirt
> before.)

ah, I see the patch posted and mentioned here: https://gitlab.com/libvirt/libvirt/-/issues/154#note_574868529

that's great!

Comment 14 Pavel Hrdina 2021-05-21 12:19:46 UTC

Upstream commits:

93344aed27 storage_file: add support to probe cluster_size from QCOW2 images
3e1d2c93a3 storage: add support for QCOW2 cluster_size option

Comment 15 Pavel Hrdina 2021-05-21 12:31:40 UTC

Will be included in RHEL-AV-8.5.0 by the next rebase to libvirt-7.4.0.

Comment 16 Geoff Macartney 2021-05-21 13:24:33 UTC

That's marvellous, thanks Pavel.

> Will be included in RHEL-AV-8.5.0 by the next rebase to libvirt-7.4.0.

Do you know what the expected release date is for RHEL-AV-8.5.0?

Comment 17 Meina Li 2021-06-11 06:28:04 UTC

Test Version:
libvirt-7.4.0-1.module+el8.5.0+11218+83343022.x86_64
qemu-kvm-6.0.0-18.module+el8.5.0+11243+5269aaa1.x86_64
Test scenarios:
S1: Volume lifecycle test with cluster size in dir pool
vol-create/vol-dumpxml/vol-info/vol-resize/vol-clone/vol-upload/vol-download     passed
vol-create-from: filed a new Bug 1970753
S2: Start the guest with volume disk created with cluster size  ---passed
S3: Create volume with cluster size in netfs pool   ---passed
S4: [negative] Create volume with invalid value for cluster size   ---passed
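
For context on what counts as an invalid value in S4: the report does not list the values that were tested. The sketch below encodes the constraint QEMU documents for qcow2 (a power of two between 512 bytes and 2 MiB). This is an assumption about what the negative test exercises, and libvirt's own validation may differ.

```python
MIN_CLUSTER_SIZE = 512              # bytes, qcow2 lower bound per QEMU docs
MAX_CLUSTER_SIZE = 2 * 1024 * 1024  # bytes, qcow2 upper bound per QEMU docs


def is_valid_qcow2_cluster_size(size_bytes):
    """True if size_bytes is a power of two within the qcow2 limits."""
    in_range = MIN_CLUSTER_SIZE <= size_bytes <= MAX_CLUSTER_SIZE
    power_of_two = size_bytes > 0 and (size_bytes & (size_bytes - 1)) == 0
    return in_range and power_of_two


for candidate in (131072, 3000, 4 * 1024 * 1024):
    print(candidate, is_valid_qcow2_cluster_size(candidate))
```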

Comment 24 Meina Li 2021-06-21 08:05:56 UTC

Verified Version:
libvirt-7.4.0-1.module+el8.5.0+11218+83343022.x86_64
qemu-kvm-6.0.0-19.module+el8.5.0+11385+6e7d542e.x86_64

Verified Steps:
S1: Volume lifecycle test with cluster size in dir pool
1. Prepare a dir pool and a volume xml with cluster size.
# virsh pool-list --all
 Name      State    Autostart
-------------------------------
 default   active   yes
# cat volume.xml 
  <volume>
    <name>cluster.img</name>
    <key>/var/lib/libvirt/images/cluster.img</key>
    <capacity unit="G">2</capacity>
    <allocation>294912</allocation>
    <target>
      <path>/var/lib/libvirt/images/cluster.img</path>
      <format type='qcow2'/>
      <clusterSize unit='KiB'>128</clusterSize>
    </target>
 </volume>
2. Create the volume with vol-create and check cluster_size.
Step: # virsh vol-create --pool default volume.xml 
Vol cluster created from volume.xml
# virsh vol-list default
 Name            Path
--------------------------------------------------------
 cluster.img         /var/lib/libvirt/images/cluster.img
 lmn.qcow2       /var/lib/libvirt/images/lmn.qcow2
# qemu-img info /var/lib/libvirt/images/cluster.img 
Result:image: /var/lib/libvirt/images/cluster.img
file format: qcow2
virtual size: 2 GiB (2147483648 bytes)
disk size: 388 KiB
cluster_size: 131072
...
3. Check vol-dumpxml and vol-info.
# virsh vol-dumpxml --pool default cluster.img | grep clusterSize
    <clusterSize unit='B'>131072</clusterSize>
# virsh vol-info --pool default cluster.img 
Name:           cluster.img
Type:           file
Capacity:       2.00 GiB
Allocation:     388.00 KiB
4. Resize the volume.
# virsh vol-resize --pool default --vol cluster.img 5G
Size of volume 'cluster.img' successfully changed to 5G
# virsh vol-info --pool default cluster.img 
Name:           cluster.img
Type:           file
Capacity:       5.00 GiB
Allocation:     260.00 KiB
5. Clone the volume and check the cluster_size.
# virsh vol-clone --pool default cluster.img cluster-clone.img
Vol cluster-clone.img cloned from cluster.img
# virsh vol-dumpxml --pool default cluster-clone.img | grep clusterSize
    <clusterSize unit='B'>131072</clusterSize>
6. Upload/download an image with different cluster_size.
Step:# qemu-img create -f qcow2 -o cluster_size=256k /var/lib/libvirt/images/upload.qcow2 5G
Formatting '/var/lib/libvirt/images/upload.qcow2', fmt=qcow2 cluster_size=262144 extended_l2=off compression_type=zlib size=5368709120 lazy_refcounts=off refcount_bits=16
# virsh vol-upload cluster.img /var/lib/libvirt/images/upload.qcow2 default 
# virsh vol-dumpxml --pool default cluster.img | grep clusterSize
    <clusterSize unit='B'>262144</clusterSize>
# virsh vol-download --pool default cluster.img /tmp/download.qcow2
# qemu-img info /tmp/download.qcow2 
Result:image: /tmp/download.qcow2
file format: qcow2
virtual size: 5 GiB (5368709120 bytes)
disk size: 772 KiB
cluster_size: 262144
7. Delete the volume.
# virsh vol-delete --pool default cluster.img 
Vol cluster.img deleted
# virsh vol-list default
 Name             Path
----------------------------------------------------------
 lmn.qcow2        /var/lib/libvirt/images/lmn.qcow2
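
The volume XML used in S1 can also be generated programmatically. The following is a small sketch with hypothetical helper names; it builds the same `<clusterSize>` element and shows why `vol-dumpxml` reports 131072 with unit 'B' for an input of 128 KiB. Only the units that appear in this report are handled.

```python
import xml.etree.ElementTree as ET

# Units used for clusterSize in this report; other units are out of scope here.
UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024**2}


def volume_xml(name, capacity_g, cluster_value, cluster_unit):
    """Build a qcow2 storage volume XML with a <clusterSize> element."""
    vol = ET.Element("volume")
    ET.SubElement(vol, "name").text = name
    ET.SubElement(vol, "capacity", unit="G").text = str(capacity_g)
    target = ET.SubElement(vol, "target")
    ET.SubElement(target, "format", type="qcow2")
    ET.SubElement(target, "clusterSize", unit=cluster_unit).text = str(cluster_value)
    return ET.tostring(vol, encoding="unicode")


def cluster_size_bytes(xml_text):
    """Read <clusterSize> back from volume XML and normalise it to bytes."""
    node = ET.fromstring(xml_text).find("./target/clusterSize")
    return int(node.text) * UNIT_BYTES[node.get("unit", "B")]


xml_text = volume_xml("cluster.img", 2, 128, "KiB")
print(xml_text)
print(cluster_size_bytes(xml_text))
```

The resulting string could be saved to a file for `virsh vol-create`, or passed to virStorageVolCreateXML through a language binding.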

S2: Start the guest with volume disk created with cluster size
1. Prepare a dir pool and a volume xml with cluster size.
# virsh pool-list --all
 Name      State    Autostart
-------------------------------
 default   active   yes
# cat volume.xml 
  <volume>
    <name>cluster.img</name>
    <key>/var/lib/libvirt/images/cluster.img</key>
    <capacity unit="G">2</capacity>
    <allocation>294912</allocation>
    <target>
      <path>/var/lib/libvirt/images/cluster.img</path>
      <format type='qcow2'/>
      <clusterSize unit='KiB'>2048</clusterSize>
    </target>
 </volume>
2. Create the volume with vol-create.
# virsh vol-create --pool default volume.xml 
Vol cluster created from volume.xml
# virsh vol-list default
 Name            Path
--------------------------------------------------------
 cluster.img         /var/lib/libvirt/images/cluster.img
 lmn.qcow2       /var/lib/libvirt/images/lmn.qcow2
3. Start a guest with the following volume disk xml.
Step: # virsh edit guest
...
  <disk type='volume' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source pool='default' volume='cluster.img'/>
    <target dev='vdb' bus='virtio'/>
  </disk>
...
# virsh start guest
Domain 'lmn' started
# virsh domblklist guest
Result:
 Target   Source
---------------------------------------------
 vda      /var/lib/libvirt/images/lmn.qcow2
 vdb      cluster.img

S3: Create volume with cluster size in netfs pool
1. Prepare a netfs pool.
# cat netfs.pool 
<pool type='netfs'>
  <name>netfs</name>
  <uuid>49bcb8df-c550-4c60-b384-a0aed202c9ca</uuid>
  <source>
    <host name='10.66.87.196'/>
    <dir path='/home/nfs'/>
    <format type='nfs'/>
  </source>
  <target>
    <path>/tmp/pool-test</path>
  </target>
</pool>
# virsh pool-create netfs.pool
Pool netfs created from netfs.pool
# virsh pool-list
 Name      State    Autostart
-------------------------------
 default   active   yes
 netfs     active   no
2. Create a volume with cluster size.
# cat nfs-vol.xml 
  <volume>
    <name>test.img</name>
    <key>/var/lib/libvirt/images/test.img</key>
    <capacity unit="G">2</capacity>
    <allocation>294912</allocation>
    <target>
      <path>/tmp/pool-test/test.img</path>
      <format type='qcow2'/>
      <clusterSize unit='MiB'>2</clusterSize>
    </target>
 </volume>
# virsh vol-create --pool netfs nfs-vol.xml 
Vol test.img created from nfs-vol.xml
3. Check the volume info.
Step:# virsh vol-dumpxml --pool netfs test.img | grep clusterSize 
    <clusterSize unit='B'>2097152</clusterSize>
# qemu-img info /tmp/pool-test/test.img 
Result:image: /tmp/pool-test/test.img
file format: qcow2
virtual size: 2 GiB (2147483648 bytes)
disk size: 388 KiB
cluster_size: 2097152
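
The checks above read cluster_size from the `qemu-img info` output by eye. A script can extract it in the same way; the sketch below parses the human-readable text exactly as pasted in S3. The function name is hypothetical. `qemu-img info --output=json` is likely a sturdier source for automation, but it is not used in this report.

```python
def parse_cluster_size(info_text):
    """Return cluster_size in bytes from `qemu-img info` text, or None if absent."""
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cluster_size":
            return int(value.strip())
    return None


# Output copied from step 3 of S3.
sample = """image: /tmp/pool-test/test.img
file format: qcow2
virtual size: 2 GiB (2147483648 bytes)
disk size: 388 KiB
cluster_size: 2097152
"""
print(parse_cluster_size(sample))
```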

Comment 26 errata-xmlrpc 2021-11-16 07:52:31 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684
