Description of problem:

Currently libvirt does not have a way to specify the cluster_size option when creating qcow2 files.

IBM Cloud is using a 2M cluster size for all qcow2 files.

The plan is to put a wrapper around qemu-img create that adds a static option of cluster_size=2M.
This is obviously not ideal; we would like to be able to set the cluster size dynamically.

A future ask would be support for subcluster allocation options as well.

It might also be good to have an option in the libvirt XML to add arbitrary args to the qemu-img create command.
(That might be more of a security risk, though.)
(In reply to Russell Cattelan from comment #0)
> Description of problem:
>
> Currently libvirt does no have a way to specify the cluster_size option when
> creating qcow2 files.
>
> IBM cloud is using a 2M cluster size for all qcow2 files.

Could you please elaborate on which operations you require this for?

> The plan is put a wrapper around qemu-img create adding in a static option
> of cluster_size = 2M.

Certain operations that create images, such as a snapshot taken while the VM is running, use the blockdev-create QMP command, so adding a wrapper around qemu-img won't help.

> This is obviously not ideal and would to be able to dynamically set the
> cluster size.
>
> Future ask would also be able to support sub cluster allocation options as
> well.

With the same operations as the cluster size?

> It might also be good to have an option to the libvirt xml to arbitrary add
> args to the qemu-img create cmd.
> (That might more of a security risk tho)

As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.
Basically, when we first create a volume we call qemu-img create.
This is a note from the wrapper script we are working on to add in the cluster_size option:

# For example:
# create --object secret,id=sec0,file=/media/memoryCache/qemuTmpFile_bd0c1048-e053-4654-a84b-11c909df6e39 -f qcow2 -F qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 /mnt/cci_root/cci_vol8_ccistordal1099a/f057e1a6-1317-4e69-ba33-2c3e6e486686/r134-7d52a51e-0a2b-4c3e-80e5-8a7e1bef5c0c.qcow2 20G
#
# The cluster_size is missing from the above command due to https://jiracloud.swg.usma.ibm.com:8443/browse/SC-1905.
# This wrapper will add it if necessary (see the -o option below):
# create --object secret,id=sec0,file=/media/memoryCache/qemuTmpFile_bd0c1048-e053-4654-a84b-11c909df6e39 -f qcow2 -F qcow2 -o cluster_size=2M,encrypt.format=luks,encrypt.key-secret=sec0 /mnt/cci_root/cci_vol8_ccistordal1099a/f057e1a6-1317-4e69-ba33-2c3e6e486686/r134-7d52a51e-0a2b-4c3e-80e5-8a7e1bef5c0c.qcow2 20G
#

> As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.

Can you expand on this a bit? Are you suggesting libvirt might not be the best way to initialize the qcow2 files?
Maybe there is another method we should be exploring?
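For illustration only, the argument rewrite described in the wrapper note could look roughly like the Python sketch below. This is not the actual wrapper script; the function names are hypothetical, and the only values taken from this thread are the `-o` option syntax and the 2M default. It leaves anything other than `create -f qcow2` untouched and does not override a cluster_size the caller already passed.

```python
from typing import List, Optional


def _value_after(args: List[str], flag: str) -> Optional[str]:
    """Return the argument that follows `flag`, or None if the flag is absent."""
    for i, arg in enumerate(args[:-1]):
        if arg == flag:
            return args[i + 1]
    return None


def add_cluster_size(args: List[str], cluster_size: str = "2M") -> List[str]:
    """Return a copy of qemu-img arguments with cluster_size added for qcow2 creates.

    Hypothetical helper: `args` is everything after the `qemu-img` program name.
    """
    out = list(args)
    # Only rewrite `create -f qcow2 ...`; pass every other invocation through.
    if not out or out[0] != "create" or _value_after(out, "-f") != "qcow2":
        return out
    for i, arg in enumerate(out[:-1]):
        if arg == "-o":
            opts = out[i + 1]
            names = [o.split("=", 1)[0] for o in opts.split(",")]
            if "cluster_size" in names:
                return out  # the caller already chose a cluster size
            out[i + 1] = f"cluster_size={cluster_size},{opts}"
            return out
    # No -o option at all: add one directly after the subcommand.
    out[1:1] = ["-o", f"cluster_size={cluster_size}"]
    return out
```

A real wrapper would then exec the real qemu-img binary with the returned list.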
(In reply to Russell Cattelan from comment #2)

[...]

>
> > As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.
>
> Can you expand on this a bit are you suggesting libvirt might not be the
> best way to initialize the qcow2 files?
> Maybe there is another method we should be exploring?

I wanted to say that for operations which create images while the VM is running, such as external snapshots (virDomainSnapshotCreateXML), the virDomainBlockCopy API, or the virDomainBackupBegin API, libvirt doesn't use qemu-img but uses a QMP command for formatting images, so any wrapper script hacks will be ignored.
(In reply to Peter Krempa from comment #3)
> (In reply to Russell Cattelan from comment #2)
>
> [...]
>
> >
> > > As noted above, we don't use qemu-img in many cases (unless you use the storage driver APIs) and don't plan to increase the usage.
> >
> > Can you expand on this a bit are you suggesting libvirt might not be the
> > best way to initialize the qcow2 files?
> > Maybe there is another method we should be exploring?
>
> I wanted to say that for operations which are creating images while the VM
> is running such as external snapshots (virDomainSnapshotCreateXML), the
> virDomainBlockCopy API, or the virDomainBackupBeginAPI libvirt doesn't use
> qemu-img, but uses a QMP command for formatting images, so any wrapper
> script hacks will be ignored.

This is the creation of the qcow2 file before the VM spins up.
We are using the libvirt Go package https://github.com/libvirt/libvirt

    // libvirt volume creation...
    if err := lm.libvirtCreateQCOW2(ctx, command); err != nil {
        return fmt.Errorf("failed to create volume %s with libvirt: %s", fullPathToDevice, err)
    }
    return nil

which eventually seems to land here in libvirt:
https://gitlab.com/libvirt/libvirt/-/blob/master/src/storage/storage_util.c#L1111

Unless I'm missing something, that code does not seem to have a way to add the cluster_size option?
We should be careful to keep the two concerns here separate, and not allow the one to confuse the other.

This issue https://bugzilla.redhat.com/show_bug.cgi?id=1945401 is being raised to request the addition of support for setting the cluster_size option when creating qcow2 files using libvirt, per https://libvirt.org/formatstorage.html#StorageVol:

<volume>
  <target>
    <path>/mnt/.../.../.../r_123xyz.qcow2</path>
    <format type="qcow2"/>
  </target>
</volume>

In this case qemu-img is used in the libvirtd backend to initialize the file. *However* the StorageVol schema doesn't support any way to supply the cluster_size information to qemu-img, so that it could do `qemu-img create -o cluster_size=2M`, for example.

cluster_size is an important option for performance purposes [1], and we believe libvirt should support it.

I do note that https://libvirt.org/formatstorage.html#StorageVolTarget supports a "features" element, but at present it only supports the one feature, "lazy_refcounts":

"Format-specific features. Only used for qcow2 now. Valid sub-elements are:
<lazy_refcounts/> - allow delayed reference counter updates. Since 1.1.0"

The question of the "wrapper script" above is related, but not the subject of this issue. I'll make another comment on that below.

To add support for cluster_size I can think of a couple of options:

1. Add a <cluster_size> element to the <features>. For example

<features>
  <cluster_size>2M</cluster_size>
</features>

One objection to this would be that "cluster_size" is not at all a generic concern, and only applies to qcow2 files. (Then again, the same objection applies to lazy_refcounts.)

2. Add an <options> element to <features>:

<features>
  <options>cluster_size=2M</options>
</features>

On the plus side, this has more potential for generic use, and could be used to supply options not just for qcow2 but for other formats in future. On the minus side, this is less constrained by the schema and has more potential for abuse.

I initially tended towards option 2, but I now favour option 1 as more secure.

These are only suggestions of course. What do you think:
1. Can cluster_size be supported?
2. What would be the best way to do it?

Refs
[1] https://www.ibm.com/cloud/blog/how-to-tune-qemu-l2-cache-size-and-qcow2-cluster-size
(In reply to Peter Krempa from comment #3)
> (In reply to Russell Cattelan from comment #2)
>
> I wanted to say that for operations which are creating images while the VM
> is running such as external snapshots (virDomainSnapshotCreateXML), the
> virDomainBlockCopy API, or the virDomainBackupBeginAPI libvirt doesn't use
> qemu-img, but uses a QMP command for formatting images, so any wrapper
> script hacks will be ignored.

Just on the question of the wrapper script - this is a workaround for use on a temporary basis until we can get the fix for this issue into libvirt.

Your comment is very interesting. We only need to set cluster_size on a volume init; https://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolCreateXML.

As far as I understand, qemu-img IS used in that case. I am hoping that although other scenarios such as virDomainSnapshotCreateXML do not use qemu-img, this will not matter as it is used in the case we are concerned about. Do you think this sounds right?

This is a side issue to the purpose of https://bugzilla.redhat.com/show_bug.cgi?id=1945401 but it's good to hear about your point, and we should consider it when thinking about this workaround.
(In reply to Geoff Macartney from comment #6)
> (In reply to Peter Krempa from comment #3)
> > (In reply to Russell Cattelan from comment #2)
> >
> > I wanted to say that for operations which are creating images while the VM
> > is running such as external snapshots (virDomainSnapshotCreateXML), the
> > virDomainBlockCopy API, or the virDomainBackupBeginAPI libvirt doesn't use
> > qemu-img, but uses a QMP command for formatting images, so any wrapper
> > script hacks will be ignored.
>
> Just on the question of the wrapper script - this is a workaround for use on
> a temporary basis until we could get the fix for this issue into libvirt.
>
> Your comment is very interesting. We only need to set cluster_size on a
> volume init;
> https://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolCreateXML.
>
> As far as I understand qemu-img IS used in that case. I am hoping that
> although other scenarios such as virDomainSnapshotCreateXML do not use
> qemu-img, this will not matter as it is used in the case we are concerned
> about. Do you think this sounds right?
>
> This is a side issue to the purpose of
> https://bugzilla.redhat.com/show_bug.cgi?id=1945401 but it's good to hear
> about your point, and we should consider it when thinking about this
> workaround.

Okay, so if I understand correctly, you want to be able to specify the cluster size and possibly subcluster allocation when creating a new storage volume via virStorageVolCreateXML. I'll express that in the summary so that it's obvious.

If you also need to be able to specify the cluster size explicitly with the APIs operating on a live VM (virDomainSnapshotCreateXML/virDomainBlockCopy/virDomainBackupBegin), or to preserve the subcluster allocation setting with virDomainSnapshotCreateXML (which currently preserves only the cluster size), please file a separate bug for each such request.
(In reply to Peter Krempa from comment #7)
>
> Okay, so if I understand correctly you want to be able to specify the
> cluster size and possibly subcluster allocation when creating a new storage
> volume via virStorageVolCreateXML. I'll express that in the summary so that
> it's obvious.

That's right, thanks for updating the summary to clarify this.

> If you'll also need to be able to specify the cluster size explicitly with
> the APIs operating on a live VM
> (virDomainSnapshotCreateXML/virDomainBlockCopy/virDomainBackupBeginAPI) or
> also preserve the subcluster allocation setting with
> virDomainSnapshotCreateXML (this one preserves cluster size only nowadays)
> please file a new separate bug for any of your requests.

I'm not sure about that myself. We'll need to discuss this and get back to you.

What do you think of my remarks in comment #5?
Note corresponding libvirt issue: https://gitlab.com/libvirt/libvirt/-/issues/154
About

> If you'll also need to be able to specify the cluster size explicitly with the APIs operating on a live VM (virDomainSnapshotCreateXML/virDomainBlockCopy/virDomainBackupBeginAPI) or also preserve the subcluster allocation setting with virDomainSnapshotCreateXML (this one preserves cluster size only nowadays) please file a new separate bug for any of your requests.

On further thought, I don't believe we need to do this, at least not at the present time.
Hello, I see from the above comments that this has been marked as high priority. Is someone already looking into the changes needed? I was thinking I would be interested to sketch out a change to fix this. If someone is already looking into it, or soon will be, then I don't think my efforts would be worthwhile. But, if not, would it be of any use for me to give that a go? (Bearing in mind that I have never developed on libvirt before.)
(In reply to Geoff Macartney from comment #12) > But, if not, would it be of any use for me to > give that a go? (Bearing in mind that I have never developed on libvirt > before.) ah, I see the patch posted and mentioned here: https://gitlab.com/libvirt/libvirt/-/issues/154#note_574868529 that's great!
Upstream commits:

93344aed27 storage_file: add support to probe cluster_size from QCOW2 images
3e1d2c93a3 storage: add support for QCOW2 cluster_size option
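As a rough illustration of what probing cluster_size from a qcow2 image involves (this is not the code from the commits above), the sketch below reads the value from the image header. It assumes the header layout from the qcow2 format specification in the QEMU documentation: a 4-byte magic "QFI\xfb", a big-endian 32-bit version at offset 4, and a big-endian 32-bit cluster_bits field at offset 20, with the cluster size being 1 << cluster_bits. The function name is hypothetical.

```python
import struct
from typing import Optional

QCOW2_MAGIC = b"QFI\xfb"


def probe_cluster_size(header: bytes) -> Optional[int]:
    """Return the cluster size in bytes from the start of a qcow2 image.

    `header` must hold at least the first 24 bytes of the file.
    Returns None if the data does not look like a qcow2 (v2 or later) header.
    """
    if len(header) < 24 or header[:4] != QCOW2_MAGIC:
        return None
    (version,) = struct.unpack_from(">I", header, 4)
    if version < 2:
        return None
    # cluster_bits sits after backing_file_offset (8 bytes) and
    # backing_file_size (4 bytes), i.e. at byte offset 20.
    (cluster_bits,) = struct.unpack_from(">I", header, 20)
    return 1 << cluster_bits
```

For a real file, something like `probe_cluster_size(open(path, "rb").read(24))` would feed it the header bytes.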
Will be included in RHEL-AV-8.5.0 by the next rebase to libvirt-7.4.0.
That's marvellous, thanks Pavel.

> Will be included in RHEL-AV-8.5.0 by the next rebase to libvirt-7.4.0.

Do you know what the expected release date is for RHEL-AV-8.5.0?
Test Version:
libvirt-7.4.0-1.module+el8.5.0+11218+83343022.x86_64
qemu-kvm-6.0.0-18.module+el8.5.0+11243+5269aaa1.x86_64

Test scenarios:
S1: Volume lifecycle test with cluster size in dir pool
    vol-create/vol-dumpxml/vol-info/vol-resize/vol-clone/vol-upload/vol-download ---passed
    vol-create-from: filed new Bug 1970753
S2: Start the guest with volume disk created with cluster size ---passed
S3: Create volume with cluster size in netfs pool ---passed
S4: [negative] Create volume with invalid value for cluster size ---passed
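For the negative scenario (S4), a client could pre-check a value before sending the volume XML. The sketch below is only an illustration and is not how libvirt validates the element. It assumes the limits QEMU applies to qcow2 images, which are not stated in this bug: the cluster size must be a power of two between 512 bytes and 2 MiB. The helper name is hypothetical.

```python
MIN_CLUSTER_SIZE = 512               # bytes; assumed qcow2 lower limit in QEMU
MAX_CLUSTER_SIZE = 2 * 1024 * 1024   # bytes; assumed qcow2 upper limit in QEMU


def is_valid_qcow2_cluster_size(size_bytes: int) -> bool:
    """Return True if size_bytes is a power of two within the assumed qcow2 limits."""
    if not MIN_CLUSTER_SIZE <= size_bytes <= MAX_CLUSTER_SIZE:
        return False
    # A power of two has exactly one bit set, so clearing the lowest set bit gives 0.
    return (size_bytes & (size_bytes - 1)) == 0
```

With these limits, the 128 KiB, 256 KiB and 2 MiB values used in the verification steps all pass, while a value such as 3 KiB or 4 MiB would be rejected.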
Verified Version:
libvirt-7.4.0-1.module+el8.5.0+11218+83343022.x86_64
qemu-kvm-6.0.0-19.module+el8.5.0+11385+6e7d542e.x86_64

Verified Steps:

S1: Volume lifecycle test with cluster size in dir pool

1. Prepare a dir pool and a volume xml with cluster size.
# virsh pool-list --all
 Name      State    Autostart
-------------------------------
 default   active   yes

# cat volume.xml
<volume>
  <name>cluster.img</name>
  <key>/var/lib/libvirt/images/cluster.img</key>
  <capacity unit="G">2</capacity>
  <allocation>294912</allocation>
  <target>
    <path>/var/lib/libvirt/images/cluster.img</path>
    <format type='qcow2'/>
    <clusterSize unit='KiB'>128</clusterSize>
  </target>
</volume>

2. Create the volume with vol-create and check cluster_size.
Step:
# virsh vol-create --pool default volume.xml
Vol cluster created from volume.xml

# virsh vol-list default
 Name          Path
--------------------------------------------------------
 cluster.img   /var/lib/libvirt/images/cluster.img
 lmn.qcow2     /var/lib/libvirt/images/lmn.qcow2

# qemu-img info /var/lib/libvirt/images/cluster.img
Result:
image: /var/lib/libvirt/images/cluster.img
file format: qcow2
virtual size: 2 GiB (2147483648 bytes)
disk size: 388 KiB
cluster_size: 131072
...

3. Check vol-dumpxml and vol-info.
# virsh vol-dumpxml --pool default cluster.img | grep clusterSize
    <clusterSize unit='B'>131072</clusterSize>

# virsh vol-info --pool default cluster.img
Name:           cluster.img
Type:           file
Capacity:       2.00 GiB
Allocation:     388.00 KiB

4. Resize the volume.
# virsh vol-resize --pool default --vol cluster.img 5G
Size of volume 'cluster.img' successfully changed to 5G

# virsh vol-info --pool default cluster.img
Name:           cluster.img
Type:           file
Capacity:       5.00 GiB
Allocation:     260.00 KiB

5. Clone the volume and check the cluster_size.
# virsh vol-clone --pool default cluster.img cluster-clone.img
Vol cluster-clone.img cloned from cluster.img

# virsh vol-dumpxml --pool default cluster-clone.img | grep clusterSize
    <clusterSize unit='B'>131072</clusterSize>

6. Upload/download an image with different cluster_size.
Step:
# qemu-img create -f qcow2 -o cluster_size=256k /var/lib/libvirt/images/upload.qcow2 5G
Formatting '/var/lib/libvirt/images/upload.qcow2', fmt=qcow2 cluster_size=262144 extended_l2=off compression_type=zlib size=5368709120 lazy_refcounts=off refcount_bits=16

# virsh vol-upload cluster.img /var/lib/libvirt/images/upload.qcow2 default

# virsh vol-dumpxml --pool default cluster.img | grep clusterSize
    <clusterSize unit='B'>262144</clusterSize>

# virsh vol-download --pool default cluster.img /tmp/download.qcow2

# qemu-img info /tmp/download.qcow2
Result:
image: /tmp/download.qcow2
file format: qcow2
virtual size: 5 GiB (5368709120 bytes)
disk size: 772 KiB
cluster_size: 262144

7. Delete the volume.
# virsh vol-delete --pool default cluster.img
Vol cluster.img deleted

# virsh vol-list default
 Name        Path
----------------------------------------------------------
 lmn.qcow2   /var/lib/libvirt/images/lmn.qcow2

S2: Start the guest with volume disk created with cluster size

1. Prepare a dir pool and a volume xml with cluster size.
# virsh pool-list --all
 Name      State    Autostart
-------------------------------
 default   active   yes

# cat volume.xml
<volume>
  <name>cluster.img</name>
  <key>/var/lib/libvirt/images/cluster.img</key>
  <capacity unit="G">2</capacity>
  <allocation>294912</allocation>
  <target>
    <path>/var/lib/libvirt/images/cluster.img</path>
    <format type='qcow2'/>
    <clusterSize unit='KiB'>2048</clusterSize>
  </target>
</volume>

2. Create the volume with vol-create.
# virsh vol-create --pool default volume.xml
Vol cluster created from volume.xml

# virsh vol-list default
 Name          Path
--------------------------------------------------------
 cluster.img   /var/lib/libvirt/images/cluster.img
 lmn.qcow2     /var/lib/libvirt/images/lmn.qcow2

3. Start a guest with the following volume disk xml.
Step:
# virsh edit guest
...
<disk type='volume' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source pool='default' volume='cluster.img'/>
  <target dev='vdb' bus='virtio'/>
</disk>
...

# virsh start guest
Domain 'lmn' started

# virsh domblklist guest
Result:
 Target   Source
---------------------------------------------
 vda      /var/lib/libvirt/images/lmn.qcow2
 vdb      cluster.img

S3: Create volume with cluster size in netfs pool

1. Prepare a netfs pool.
# cat netfs.pool
<pool type='netfs'>
  <name>netfs</name>
  <uuid>49bcb8df-c550-4c60-b384-a0aed202c9ca</uuid>
  <source>
    <host name='10.66.87.196'/>
    <dir path='/home/nfs'/>
    <format type='nfs'/>
  </source>
  <target>
    <path>/tmp/pool-test</path>
  </target>
</pool>

# virsh pool-create netfs.pool
Pool netfs created from netfs.pool

# virsh pool-list
 Name      State    Autostart
-------------------------------
 default   active   yes
 netfs     active   no

2. Create a volume with cluster size.
# cat nfs-vol.xml
<volume>
  <name>test.img</name>
  <key>/var/lib/libvirt/images/test.img</key>
  <capacity unit="G">2</capacity>
  <allocation>294912</allocation>
  <target>
    <path>/tmp/pool-test/test.img</path>
    <format type='qcow2'/>
    <clusterSize unit='MiB'>2</clusterSize>
  </target>
</volume>

# virsh vol-create --pool netfs nfs-vol.xml
Vol test.img created from nfs-vol.xml

3. Check the volume info.
Step:
# virsh vol-dumpxml --pool netfs test.img | grep clusterSize
    <clusterSize unit='B'>2097152</clusterSize>

# qemu-img info /tmp/pool-test/test.img
Result:
image: /tmp/pool-test/test.img
file format: qcow2
virtual size: 2 GiB (2147483648 bytes)
disk size: 388 KiB
cluster_size: 2097152
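For anyone generating the volume XML programmatically rather than by hand, a minimal Python sketch using only the standard library might look like the following. The element and attribute names (<clusterSize unit='...'>, <format type='qcow2'/>, <capacity unit="G">) are taken from the XML in the steps above; the function names and the unit table are hypothetical. The to_bytes helper reproduces the byte values that vol-dumpxml reports with unit='B'.

```python
import xml.etree.ElementTree as ET

# Binary unit multipliers for the units used in the steps above.
UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2}


def to_bytes(value: int, unit: str) -> int:
    """Convert a clusterSize value to bytes, as vol-dumpxml reports it."""
    return value * UNIT_BYTES[unit]


def volume_xml(name: str, path: str, capacity_g: int,
               cluster_size: int, cluster_unit: str = "KiB") -> str:
    """Build a qcow2 volume definition carrying a <clusterSize> element."""
    vol = ET.Element("volume")
    ET.SubElement(vol, "name").text = name
    capacity = ET.SubElement(vol, "capacity", unit="G")
    capacity.text = str(capacity_g)
    target = ET.SubElement(vol, "target")
    ET.SubElement(target, "path").text = path
    ET.SubElement(target, "format", type="qcow2")
    cluster = ET.SubElement(target, "clusterSize", unit=cluster_unit)
    cluster.text = str(cluster_size)
    return ET.tostring(vol, encoding="unicode")
```

The resulting string could be written to a file and passed to virsh vol-create, or handed to virStorageVolCreateXML through a libvirt binding.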
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4684