Bug 2041352
Summary: | Default QCOW_OVERHEAD_FACTOR may waste too many extra disks for big storage | |
---|---|---|---
Product: | [oVirt] vdsm | Reporter: | Mike Cao <bcao>
Component: | Core | Assignee: | Albert Esteve <aesteve>
Status: | CLOSED UPSTREAM | QA Contact: | Lukas Svaty <lsvaty>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 4.40.0 | CC: | ahadas, bcao, bugs, bzlotnik, lsurette, nsoffer, srevivo, vjuranek, ycui
Target Milestone: | --- | Flags: | pm-rhel: ovirt-4.5?
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | All | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2022-05-24 16:48:00 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 2064907 | |
Bug Blocks: | | |
Description
Mike Cao
2022-01-17 07:22:37 UTC
1.1 was an estimate when we did not have a way to compute the actual value.
Both vdsm and engine use this estimate in various calculations. In some flows
we use qemu-img measure to allocate the right amount of space, but in some
flows we still use the old estimate.
Steps to fix this issue:
1. Engine assumes that vdsm allocates 10% extra
In some flows the engine assumes that vdsm will allocate 10% more, and reduces
the size by 10%. If we fix vdsm, we may break engine code that assumes 10%
extra allocation.
    public static long computeCowImageNeededSize(VolumeFormat sourceFormat, long actualSize) {
        // When vdsm creates a COW volume with provided initial size the size is multiplied by 1.1 to prevent a
        // case in which we won't have enough space. If the source is already COW we don't need the additional
        // space.
        return sourceFormat == VolumeFormat.COW
                ? Double.valueOf(Math.ceil(actualSize / StorageConstants.QCOW_OVERHEAD_FACTOR)).longValue()
                : actualSize;
    }
The only user of this is CloneImageGroupVolumesStructureCommand, when it does
not measure images.
If we fix this command to always measure images when the destination is
qcow2 format on block storage, we can remove the bad code assuming vdsm
over-allocation.
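The engine-side reversal of the assumed over-allocation can be sketched in Python (a minimal sketch, not the engine code; the constant mirrors StorageConstants.QCOW_OVERHEAD_FACTOR):

```python
import math

# Mirrors StorageConstants.QCOW_OVERHEAD_FACTOR in the engine (1.1).
QCOW_OVERHEAD_FACTOR = 1.1

def compute_cow_image_needed_size(source_is_cow: bool, actual_size: int) -> int:
    """Sketch of computeCowImageNeededSize: if the source is already COW,
    divide out the 10% that vdsm is assumed to add on allocation."""
    if source_is_cow:
        return math.ceil(actual_size / QCOW_OVERHEAD_FACTOR)
    return actual_size

# For an 11 GiB COW allocation the engine asks for only 10 GiB,
# betting that vdsm will multiply by 1.1 again.
GiB = 1024**3
print(compute_cow_image_needed_size(True, 11 * GiB))  # 10 GiB
```

If vdsm stops over-allocating, this division silently under-sizes the volume, which is why the command should measure instead.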
2. Consider the required size for bitmaps
Bitmap size depends on the granularity of the bitmap, which is 1 bit per
64 KiB by default. For the largest supported image (8 TiB) every bitmap
can take 16 MiB, so 64 bitmaps need 1 GiB. For the smallest image (1 GiB)
the bitmap size is 2 KiB, so 64 bitmaps need 128 KiB.
So I think we can use:
virtual_size * 1.01 + 64 * (virtual_size / 64 KiB / 8)
Example: for 1 TiB image we can allow:
>>> (1024**4 * 1.01 + 64 * (1024**4 // (64 * 1024) // 8)) / 1024**4
1.0101220703125
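The arithmetic in step 2 can be checked with a short Python sketch (illustrative names, not vdsm code; 64 bitmaps and 64 KiB granularity as assumed above):

```python
KiB, MiB, GiB, TiB = 1024, 1024**2, 1024**3, 1024**4

BITMAP_GRANULARITY = 64 * KiB  # default: 1 dirty bit per 64 KiB of data
MAX_BITMAPS = 64

def bitmap_size(virtual_size: int) -> int:
    """Bytes needed for one dirty bitmap covering virtual_size."""
    return virtual_size // BITMAP_GRANULARITY // 8

def required_size(virtual_size: int) -> float:
    """Proposed allocation: 1% qcow2 metadata overhead plus room
    for MAX_BITMAPS bitmaps."""
    return virtual_size * 1.01 + MAX_BITMAPS * bitmap_size(virtual_size)

print(bitmap_size(8 * TiB) == 16 * MiB)   # largest image: 16 MiB per bitmap
print(64 * bitmap_size(8 * TiB) == GiB)   # 64 bitmaps: 1 GiB
print(bitmap_size(GiB) == 2 * KiB)        # smallest image: 2 KiB per bitmap
print(required_size(TiB) / TiB)           # 1.0101220703125
```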
Benny, what do you think about the proposed fix?

I mentioned one of 5 users of QCOW_OVERHEAD_FACTOR - I think this is the worst one. We have to check also the other users:

    /**
     * direction    storage  transfer  disk    ticket size
     * =================================================================================
     * upload       block    raw       *       virtual size
     * upload       block    -         raw     virtual size (lv size)
     * upload       block    -         qcow2   actual size (lv size)
     * ---------------------------------------------------------------------------------
     * upload       file     raw       *       virtual size
     * upload       file     -         raw     virtual size
     * upload       file     -         qcow2   virtual size + cow overhead
     * ---------------------------------------------------------------------------------
     * upload (ui)  *        -         *       uploaded file size
     * ---------------------------------------------------------------------------------
     * download     block    raw       *       virtual size (using /map)
     * download     block    -         raw     virtual size
     * download     block    -         qcow2   image-end-offset (returned by
     *                                         "qemu-img check" - not implemented yet)
     * ---------------------------------------------------------------------------------
     * download     file     raw       *       virtual size (using /map)
     * download     file     -         raw     virtual size
     * download     file     -         qcow2   file size
     * ---------------------------------------------------------------------------------
     */
    private long getTransferSize() {
        DiskImage image = getDiskImage();

        if (getTransferBackend() == ImageTransferBackend.NBD) {
            // NBD always uses virtual size (raw format)
            return image.getSize();
        }

        if (getParameters().getTransferType() == TransferType.Download) {
            if (image.getVolumeFormat() == VolumeFormat.RAW) {
                return image.getSize();
            }
            if (image.getVolumeFormat() == VolumeFormat.COW) {
                return getImageApparentSize(image);
            }
            // Shouldn't happen
            throw new RuntimeException(String.format(
                    "Invalid volume format: %s", image.getVolumeFormat()));
        } else if (getParameters().getTransferType() == TransferType.Upload) {
            if (getParameters().getTransferClientType().isBrowserTransfer()) {
                return getParameters().getTransferSize();
            }
            if (image.getVolumeFormat() == VolumeFormat.RAW) {
                return image.getSize();
            }
            // COW volume format
            boolean isBlockDomain = image.getStorageTypes().get(0).isBlockDomain();
            if (isBlockDomain) {
                return image.getActualSizeInBytes();
            }
            // Needed to allow uploading fully allocated qcow (BZ#1697294)
            // Also, adding qcow header overhead to support small files.
            return (long) Math.ceil(image.getSize() * StorageConstants.QCOW_OVERHEAD_FACTOR)
                    + StorageConstants.QCOW_HEADER_OVERHEAD;
        }

This should be easy to adapt to the new factor, regardless of new/old vdsm. The size is relevant only when using legacy image transfer, not the NBD backend. In this case we can get the size of the image from storage before the transfer instead of guessing.

    /**
     * Calculates the required space in the storage domain for creating cloned DiskImages with collapse.
     * When creating COW volume the actual used space will be the needed space * QCOW_OVERHEAD_FACTOR
     * as implemented currently in the VDSM code.
     */
    private double getTotalSizeForClonedDisk(DiskImage diskImage) {
        double sizeForDisk = ImagesHandler.getTotalActualSizeOfDisk(
                diskImage, storageDomain.getStorageStaticData());
        if (diskImage.getVolumeFormat() == VolumeFormat.COW) {
            sizeForDisk = Math.ceil(StorageConstants.QCOW_OVERHEAD_FACTOR * sizeForDisk);
        }
        return sizeForDisk;
    }

I think this should be removed and use Volume.measure instead.

    /**
     * Verify there's enough space in the storage domain for creating cloned DiskImages with snapshots
     * without collapse. Space should be allocated according to the volumes type and format, and
     * allocation policy, according to the following table:
     *
     *      | File Domain                  | Block Domain
     * -----|------------------------------|-----------------
     * qcow | 1.1 * used space             | 1.1 * used space
     * -----|------------------------------|-----------------
     * raw  | preallocated: disk capacity  | disk capacity
     *      | sparse: used space           |
     */
    private double getTotalSizeForDisksWithSnapshots(Collection<DiskImage> diskImages) {
        return getTotalSizeForDisksByMethod(diskImages, diskImage -> {
            double sizeForDisk = diskImage.getSize();
            if (storageDomain.getStorageType().isFileDomain()
                        && diskImage.getVolumeType() == VolumeType.Sparse
                    || diskImage.getVolumeFormat() == VolumeFormat.COW) {
                sizeForDisk = diskImage.getActualDiskWithSnapshotsSizeInBytes();
            }
            if (diskImage.getVolumeFormat() == VolumeFormat.COW) {
                sizeForDisk = Math.ceil(StorageConstants.QCOW_OVERHEAD_FACTOR * sizeForDisk);
            }
            return sizeForDisk;
        });
    }

This can use Volume.measure.

    private double getRequiredSizeForMerge(SubchainInfo subchain, ActionType snapshotActionType) {
        DiskImage baseSnapshot = subchain.getBaseImage();
        DiskImage topSnapshot = subchain.getTopImage();

        // The snapshot is the root snapshot
        if (Guid.isNullOrEmpty(baseSnapshot.getParentId())) {
            if (baseSnapshot.getVolumeFormat() == VolumeFormat.RAW) {
                // Raw/Block can only be preallocated thus we are necessarily overlapping
                // with existing data
                if (baseSnapshot.getVolumeType() == VolumeType.Preallocated) {
                    return 0.0;
                }
                return Math.min(
                        topSnapshot.getActualSizeInBytes() / StorageConstants.QCOW_OVERHEAD_FACTOR,
                        baseSnapshot.getSize() - baseSnapshot.getActualSizeInBytes());
            }
        }

        // The required size for the extension of the volume we merge into.
        // If the actual size of top is larger than the actual size of base we
        // will be overlapping, hence we extend by the lower of the two.
        return Math.min(
                topSnapshot.getSize() * StorageConstants.QCOW_OVERHEAD_FACTOR
                        - baseSnapshot.getActualSizeInBytes(),
                topSnapshot.getActualSizeInBytes());
    }

This assumes vdsm over-allocation - bad!

I'm not sure we can measure the merge size, since measure works on either the entire chain or a single image. When merging we need to measure only top and base, and we must measure them together.

We can measure if the base does not have a parent:

    base <- top

We cannot measure the sub chain base <- top with the current vdsm code:

    parent <- base <- top

But we can change vdsm to support this.

To support measuring volumes before merge, we need bug 2064907.

The impact is not that high and it is not the result of a recent change, so it is unlikely to be fixed in 4.5.1, but as it makes sense, moving to GitHub: https://github.com/oVirt/vdsm/issues/207

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days
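The merge branch that assumes vdsm over-allocation boils down to this Python sketch (illustrative names, not the engine code; the 1.1 factor is exactly the assumption the fix would remove):

```python
QCOW_OVERHEAD_FACTOR = 1.1  # the assumed vdsm over-allocation factor

def required_size_for_merge(top_virtual: int, top_actual: int,
                            base_actual: int) -> float:
    """Sketch of the non-root branch of getRequiredSizeForMerge: extend the
    base volume by the lower of (estimated merged size minus base's current
    allocation) and the actual size of top."""
    return min(top_virtual * QCOW_OVERHEAD_FACTOR - base_actual, top_actual)

# Example: a 10 GiB top holding 2 GiB of data merges into a base holding 6 GiB.
# 10 GiB * 1.1 - 6 GiB = 5 GiB, but top only holds 2 GiB, so extend by 2 GiB.
GiB = 1024**3
print(required_size_for_merge(10 * GiB, 2 * GiB, 6 * GiB))  # 2 GiB
```

A correct bound would come from measuring top and base together instead of applying the factor, which is what bug 2064907 would enable.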