Description of problem:

The default value of QCOW_OVERHEAD_FACTOR is 1.1:

    75ee4266 vdsm/storage/blockVolume.py     (Fred Rolland    2015-09-17 17:18:16 +0300  44) QCOW_OVERHEAD_FACTOR = 1.1
    3cbc0c08 lib/vdsm/storage/blockVolume.py (Vojtech Juranek 2019-07-11 22:32:11 +0200 347)     alloc_size = int(initial_size * c)

The default QCOW_OVERHEAD_FACTOR wastes some storage: when a user creates a 1 TB disk, 1.1 TB is allocated.
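For illustration, a minimal sketch of the current allocation. Only QCOW_OVERHEAD_FACTOR and the `int(initial_size * c)` arithmetic are from blockVolume.py quoted above; the helper name and the printout are illustrative:

```python
# Sketch of vdsm's initial allocation with the current 1.1 estimate.
QCOW_OVERHEAD_FACTOR = 1.1

GiB = 1024**3
TiB = 1024**4

def alloc_size(initial_size, factor=QCOW_OVERHEAD_FACTOR):
    # Same arithmetic as blockVolume.py: alloc_size = int(initial_size * c)
    return int(initial_size * factor)

allocated = alloc_size(1 * TiB)
wasted = allocated - 1 * TiB
print(f"allocated: {allocated / TiB:.2f} TiB, wasted: {wasted / GiB:.1f} GiB")
# allocated: 1.10 TiB, wasted: 102.4 GiB
```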
1.1 was an estimate from a time when we did not have a way to compute the actual value. Both vdsm and engine use this estimate in various calculations. In some flows we use "qemu-img measure" to allocate the right amount of space, but in other flows we still use the old estimate.

Steps to fix this issue:

1. Engine assumes that vdsm allocates 10% extra

In some flows engine assumes that vdsm will allocate 10% more, and reduces the size by 10%. If we fix vdsm, we may break engine code that assumes the 10% extra allocation.

    public static long computeCowImageNeededSize(VolumeFormat sourceFormat, long actualSize) {
        // When vdsm creates a COW volume with provided initial size the size is multiplied by 1.1 to prevent a
        // case in which we won't have enough space. If the source is already COW we don't need the additional
        // space.
        return sourceFormat == VolumeFormat.COW
                ? Double.valueOf(Math.ceil(actualSize / StorageConstants.QCOW_OVERHEAD_FACTOR)).longValue()
                : actualSize;
    }

The only user of this is CloneImageGroupVolumesStructureCommand, when it does not measure images. If we fix this command to always measure images when the destination is qcow2 format on block storage, we can remove the bad code assuming vdsm over-allocation.

2. Consider the required size for bitmaps

Bitmap size depends on the granularity of the bitmap, which is 1 bit per 64k by default. For the largest supported image (8t) every bitmap can take 16 MiB, so 64 bitmaps need 1g. For the smallest image (1g) bitmap size is 2k, so 64 bitmaps need 128k.

So I think we can use:

    virtual_size * 1.01 + 64 * (virtual_size / 64k / 8)

Example: for a 1 TiB image we can allow:

    >>> (1024**4 * 1.01 + 64 * (1024**4 // (64 * 1024) // 8)) / 1024**4
    1.0101220703125
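The proposed formula can be sketched as a small helper. The 1% factor and the 64-bitmap allowance are from the proposal above; the function name and constant names are hypothetical:

```python
# Sketch of the proposed allocation: 1% qcow2 metadata overhead plus
# room for 64 dirty bitmaps at the default 64 KiB granularity.
BITMAP_GRANULARITY = 64 * 1024  # 1 bit covers 64 KiB by default
MAX_BITMAPS = 64

def proposed_alloc_size(virtual_size):
    # One bit per 64 KiB cluster, packed into bytes.
    bitmap_size = virtual_size // BITMAP_GRANULARITY // 8
    return int(virtual_size * 1.01 + MAX_BITMAPS * bitmap_size)

# For a 1 TiB image the effective factor drops from 1.1 to about 1.0101:
print(proposed_alloc_size(1024**4) / 1024**4)
```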
Benny, what do you think about the proposed fix?
I mentioned one of 5 users of QCOW_OVERHEAD_FACTOR - I think this is the worst one. We also have to check the other users:

    /**
     * direction  storage  transfer  disk    ticket size
     * =================================================================================
     * upload     block    raw       *       virtual size
     * upload     block    -         raw     virtual size (lv size)
     * upload     block    -         qcow2   actual size (lv size)
     * ---------------------------------------------------------------------------------
     * upload     file     raw       *       virtual size
     * upload     file     -         raw     virtual size
     * upload     file     -         qcow2   virtual size + cow overhead
     * ---------------------------------------------------------------------------------
     * upload (ui) *       -         *       uploaded file size
     * ---------------------------------------------------------------------------------
     * download   block    raw       *       virtual size (using /map)
     * download   block    -         raw     virtual size
     * download   block    -         qcow2   image-end-offset (returned by
     *                                       "qemu-img check" - not implemented yet)
     * ---------------------------------------------------------------------------------
     * download   file     raw       *       virtual size (using /map)
     * download   file     -         raw     virtual size
     * download   file     -         qcow2   file size
     * ---------------------------------------------------------------------------------
     */
    private long getTransferSize() {
        DiskImage image = getDiskImage();

        if (getTransferBackend() == ImageTransferBackend.NBD) {
            // NBD always uses virtual size (raw format)
            return image.getSize();
        }

        if (getParameters().getTransferType() == TransferType.Download) {
            if (image.getVolumeFormat() == VolumeFormat.RAW) {
                return image.getSize();
            }
            if (image.getVolumeFormat() == VolumeFormat.COW) {
                return getImageApparentSize(image);
            }
            // Shouldn't happen
            throw new RuntimeException(String.format(
                    "Invalid volume format: %s", image.getVolumeFormat()));
        } else if (getParameters().getTransferType() == TransferType.Upload) {
            if (getParameters().getTransferClientType().isBrowserTransfer()) {
                return getParameters().getTransferSize();
            }
            if (image.getVolumeFormat() == VolumeFormat.RAW) {
                return image.getSize();
            }

            // COW volume format
            boolean isBlockDomain = image.getStorageTypes().get(0).isBlockDomain();
            if (isBlockDomain) {
                return image.getActualSizeInBytes();
            }

            // Needed to allow uploading fully allocated qcow (BZ#1697294)
            // Also, adding qcow header overhead to support small files.
            return (long) Math.ceil(image.getSize() * StorageConstants.QCOW_OVERHEAD_FACTOR)
                    + StorageConstants.QCOW_HEADER_OVERHEAD;
        }

This should be easy to adapt to the new factor, regardless of new/old vdsm. The size is relevant only when using the legacy image transfer, not the NBD backend. In this case we can get the size of the image from storage before the transfer instead of guessing.

    /**
     * Calculates the required space in the storage domain for creating cloned DiskImages with collapse.
     * When creating a COW volume the actual used space will be the needed space * QCOW_OVERHEAD_FACTOR,
     * as implemented currently in the VDSM code.
     */
    private double getTotalSizeForClonedDisk(DiskImage diskImage) {
        double sizeForDisk = ImagesHandler.getTotalActualSizeOfDisk(diskImage, storageDomain.getStorageStaticData());

        if (diskImage.getVolumeFormat() == VolumeFormat.COW) {
            sizeForDisk = Math.ceil(StorageConstants.QCOW_OVERHEAD_FACTOR * sizeForDisk);
        }
        return sizeForDisk;
    }

I think this should be removed and Volume.measure used instead.

    /**
     * Verify there's enough space in the storage domain for creating cloned DiskImages with snapshots without
     * collapse. Space should be allocated according to the volumes type and format, and allocation policy,
     * according to the following table:
     *
     *      | File Domain                  | Block Domain
     * -----|------------------------------|-----------------
     * qcow | 1.1 * used space             | 1.1 * used space
     * -----|------------------------------|-----------------
     * raw  | preallocated: disk capacity  | disk capacity
     *      | sparse: used space           |
     */
    private double getTotalSizeForDisksWithSnapshots(Collection<DiskImage> diskImages) {
        return getTotalSizeForDisksByMethod(diskImages, diskImage -> {
            double sizeForDisk = diskImage.getSize();
            if (storageDomain.getStorageType().isFileDomain() && diskImage.getVolumeType() == VolumeType.Sparse
                    || diskImage.getVolumeFormat() == VolumeFormat.COW) {
                sizeForDisk = diskImage.getActualDiskWithSnapshotsSizeInBytes();
            }
            if (diskImage.getVolumeFormat() == VolumeFormat.COW) {
                sizeForDisk = Math.ceil(StorageConstants.QCOW_OVERHEAD_FACTOR * sizeForDisk);
            }
            return sizeForDisk;
        });
    }

This can use Volume.measure.

    private double getRequiredSizeForMerge(SubchainInfo subchain, ActionType snapshotActionType) {
        DiskImage baseSnapshot = subchain.getBaseImage();
        DiskImage topSnapshot = subchain.getTopImage();

        // The snapshot is the root snapshot
        if (Guid.isNullOrEmpty(baseSnapshot.getParentId())) {
            if (baseSnapshot.getVolumeFormat() == VolumeFormat.RAW) {
                // Raw/Block can only be preallocated thus we are necessarily overlapping
                // with existing data
                if (baseSnapshot.getVolumeType() == VolumeType.Preallocated) {
                    return 0.0;
                }
                return Math.min(topSnapshot.getActualSizeInBytes() / StorageConstants.QCOW_OVERHEAD_FACTOR,
                        baseSnapshot.getSize() - baseSnapshot.getActualSizeInBytes());
            }
        }

        // The required size for the extension of the volume we merge into.
        // If the actual size of top is larger than the actual size of base we
        // will be overlapping, hence we extend by the lower of the two.
        return Math.min(topSnapshot.getSize() * StorageConstants.QCOW_OVERHEAD_FACTOR
                        - baseSnapshot.getActualSizeInBytes(),
                topSnapshot.getActualSizeInBytes());
    }

This assumes vdsm over-allocation - bad!

I'm not sure we can measure the merge size, since measure works on either the entire chain or a single image. When merging we need to measure only top and base, and we must measure them together.

We can measure if the base does not have a parent:

    base <- top

We cannot measure the sub-chain base <- top with current vdsm code:

    parent <- base <- top

But we can change vdsm to support this.
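For the call sites above that can be converted, a hedged sketch of what replacing the 1.1 estimate with an actual measurement looks like. vdsm exposes this via the Volume.measure verb; the helper below only illustrates the underlying qemu-img call, and the function names are hypothetical:

```python
# Illustrative only: measure the worst-case qcow2 size instead of
# guessing with QCOW_OVERHEAD_FACTOR. Note the limitation discussed
# above: this measures a single image or its whole backing chain,
# not an arbitrary base <- top sub-chain.
import json
import subprocess

def measure_cmd(image_path, output_format="qcow2"):
    # qemu-img measure reports the size required to convert the image
    # (including its backing chain) to the given output format.
    return ["qemu-img", "measure", "--output", "json",
            "-O", output_format, image_path]

def measure_required_size(image_path):
    out = subprocess.check_output(measure_cmd(image_path))
    # The JSON output contains "required" and "fully-allocated" keys.
    return json.loads(out)["required"]
```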
To support measuring volumes before merge, we need bug 2064907.
The impact is not that high, and this is not the result of a recent change, so it is unlikely to be fixed in 4.5.1. Since it still makes sense to fix, moving to GitHub: https://github.com/oVirt/vdsm/issues/207
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days