Description of problem:
The user can create a preallocated QCOW2 disk just fine. However, if the user creates a snapshot of the disk and then deletes the snapshot cold, a reduce is incorrectly called on the "preallocated" volume and it is shrunk back to a thin size.

Version-Release number of selected component (if applicable):
rhvm-4.4.10.7-0.4.el8ev.noarch
vdsm-4.40.100.2-1.el8ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with no disks.
2. Virtual Machines -> VM -> Disks -> New. Set the disk to preallocated and enable incremental backup:
   Allocation Policy: PREALLOCATED
   [x] Enable Incremental Backup
   Size: 100G
3. Click OK; the disk is created as follows:

2022-05-03 17:30:55,629-04 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (default task-5) [64953a65-7b73-4435-94bd-d3d7bf0658f9] START, CreateVolumeVDSCommand( CreateVolumeVDSCommandParameters:{
  storagePoolId='41d28ffe-c36a-11ec-8f0b-525400990001',
  ignoreFailoverLimit='false',
  storageDomainId='39f872a5-5983-45c3-ad2b-15a7fcc25b3f',
  imageGroupId='63ecd0c0-612c-4d5e-ba68-63bd38d84b11',
  imageSizeInBytes='107374182400',  <-------------
  volumeFormat='COW',  <-------------
  newImageId='9d4a796c-d624-47f0-a362-4c3a8d63387f',
  imageType='Preallocated',  <-------------
  newImageDescription='{"DiskAlias":"VOLTEST_Disk1","DiskDescription":""}',
  imageInitialSizeInBytes='0',
  imageId='00000000-0000-0000-0000-000000000000',
  sourceImageGroupId='00000000-0000-0000-0000-000000000000',
  shouldAddBitmaps='false'}), log id: 6e1c3764

4.
All good until here; the disk is fully allocated, see the size of the LV:

  9d4a796c-d624-47f0-a362-4c3a8d63387f 39f872a5-5983-45c3-ad2b-15a7fcc25b3f -wi------- 100.00g IU_63ecd0c0-612c-4d5e-ba68-63bd38d84b11,MD_20,PU_00000000-0000-0000-0000-000000000000

And the database and the SD metadata are correct too:

engine=# select size,volume_type,volume_format from images where image_guid = '9d4a796c-d624-47f0-a362-4c3a8d63387f';
     size     | volume_type | volume_format
--------------+-------------+---------------
 107374182400 |           1 |             4

CAP=107374182400
CTIME=1651613456
DESCRIPTION={"DiskAlias":"VOLTEST_Disk1","DiskDescription":""}
DISKTYPE=DATA
DOMAIN=39f872a5-5983-45c3-ad2b-15a7fcc25b3f
FORMAT=COW
GEN=0
IMAGE=63ecd0c0-612c-4d5e-ba68-63bd38d84b11
LEGALITY=LEGAL
PUUID=00000000-0000-0000-0000-000000000000
TYPE=PREALLOCATED
VOLTYPE=LEAF

5. Now we create a snapshot of the VM. Our base image is preallocated and the new one is thin, as expected:

  653f262a-ad0c-401d-b91e-d1885a7d4eaa 39f872a5-5983-45c3-ad2b-15a7fcc25b3f -wi-------   1.00g IU_63ecd0c0-612c-4d5e-ba68-63bd38d84b11,MD_21,PU_9d4a796c-d624-47f0-a362-4c3a8d63387f
  9d4a796c-d624-47f0-a362-4c3a8d63387f 39f872a5-5983-45c3-ad2b-15a7fcc25b3f -wi------- 100.00g IU_63ecd0c0-612c-4d5e-ba68-63bd38d84b11,MD_20,PU_00000000-0000-0000-0000-000000000000

              image_guid              |     size     | volume_type | volume_format
--------------------------------------+--------------+-------------+---------------
 9d4a796c-d624-47f0-a362-4c3a8d63387f | 107374182400 |           1 |             4
 653f262a-ad0c-401d-b91e-d1885a7d4eaa | 107374182400 |           2 |             4

6. Here comes the problem. Delete that snapshot.
During finalize, VDSM reduces the size of the LV:

2022-05-03 17:39:43,617-0400 INFO (jsonrpc/3) [vdsm.api] START finalizeMerge(spUUID='41d28ffe-c36a-11ec-8f0b-525400990001', subchainInfo={'base_id': '9d4a796c-d624-47f0-a362-4c3a8d63387f', 'top_id': '653f262a-ad0c-401d-b91e-d1885a7d4eaa', 'sd_id': '39f872a5-5983-45c3-ad2b-15a7fcc25b3f', 'img_id': '63ecd0c0-612c-4d5e-ba68-63bd38d84b11'}) from=::ffff:192.168.1.78,35924, flow_id=f86e1aa6-3ca1-4543-ad35-afc4ff9308ed, task_id=da79e5d0-251f-42bd-b47e-0d6605cb04d0 (api:48)
....
2022-05-03 17:39:44,467-0400 INFO (tasks/3) [storage.Volume] Request to reduce LV 9d4a796c-d624-47f0-a362-4c3a8d63387f of image 63ecd0c0-612c-4d5e-ba68-63bd38d84b11 in VG 39f872a5-5983-45c3-ad2b-15a7fcc25b3f with size = 1073741824 allowActive = False (blockVolume:683)
2022-05-03 17:39:44,467-0400 INFO (tasks/3) [storage.LVM] Reducing LV 39f872a5-5983-45c3-ad2b-15a7fcc25b3f/9d4a796c-d624-47f0-a362-4c3a8d63387f to 1024 megabytes (force=False) (lvm:1739)

And we now have a volume that is supposed to be preallocated but is thin in size. Our 100G disk went back to 1G:

  9d4a796c-d624-47f0-a362-4c3a8d63387f 39f872a5-5983-45c3-ad2b-15a7fcc25b3f -wi------- 1.00g IU_63ecd0c0-612c-4d5e-ba68-63bd38d84b11,MD_20,PU_00000000-0000-0000-0000-000000000000

Note that the metadata still says preallocated; it's just that the size of the volume was shrunk, making it effectively thin again regardless of what the DB and SD metadata say.

It seems shrink is called here, without checking whether the base volume is preallocated:

lib/vdsm/storage/merge.py
224 def finalize(subchain):
...
269     if subchain.base_vol.chunked():
270         _shrink_base_volume(subchain, optimal_size)  <----

Given that for block volumes we have:

    def chunked(self):
        return self.getFormat() == sc.COW_FORMAT  <----

it is going to shrink based on the format (COW), not the type (preallocated).

Actual results:
Preallocated volume is made thin again.

Expected results:
Preallocated volume stays preallocated.
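The faulty decision above can be modeled in a few lines. This is a simplified sketch, not the real vdsm classes: `Volume` here is a hypothetical stand-in, and the constants mirror the `volume_format`/`volume_type` values shown in the database output (4 = COW, 1 = preallocated, 2 = sparse).

```python
# Simplified model of the shrink decision in lib/vdsm/storage/merge.py.
# "Volume" is a hypothetical stand-in for the vdsm block volume class.
COW_FORMAT = 4        # matches volume_format = 4 in the images table
PREALLOCATED_VOL = 1  # matches volume_type = 1
SPARSE_VOL = 2        # matches volume_type = 2

class Volume:
    def __init__(self, volume_format, volume_type):
        self.format = volume_format
        self.type = volume_type

    def chunked(self):
        # Current (buggy) logic: only the format is checked, so a
        # preallocated COW volume is wrongly treated as chunked.
        return self.format == COW_FORMAT

base = Volume(COW_FORMAT, PREALLOCATED_VOL)
# finalize() shrinks whenever chunked() is True, so the 100G
# preallocated base LV is reduced to its "optimal" (thin) size.
assert base.chunked()
```

This reproduces the report's analysis: the decision keys off format alone, so the preallocated base volume qualifies for the shrink.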
Nir, I opened this against VDSM, as the engine is just calling finalize and the shrink is triggered by VDSM during finalize.
(In reply to Germano Veit Michel from comment #2)
> Nir I opened against VDSM as the engine is just calling finalize and the
> shrink is triggered by VDSM during finalize.

Right, the engine does not call reduce directly in cold merge, so this must be fixed in vdsm.

The engine does support calling the Volume.reduce() API (e.g. via the API/SDK) on a preallocated volume, which is wrong. After fixing vdsm this will be harmless, but the engine can skip the vdsm API call entirely, since a preallocated volume never needs to be reduced.
Should be an easy fix: when reducing a volume, also consider volume.type:

149     # Allocation policy (PREALLOCATED or SPARSE)
150     self.type = type
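A minimal sketch of the suggested fix, under the same simplified model as before (hypothetical `Volume` class, constants taken from the `volume_format`/`volume_type` values in the report), not the actual patch: treat a volume as chunked only if it is both COW and sparse.

```python
# Sketch of the suggested fix: consider the allocation type, not
# just the format, before shrinking. "Volume" is a hypothetical
# stand-in for the vdsm block volume class.
COW_FORMAT = 4
PREALLOCATED_VOL = 1
SPARSE_VOL = 2

class Volume:
    def __init__(self, volume_format, volume_type):
        self.format = volume_format
        self.type = volume_type

    def chunked(self):
        # Fixed logic: a volume is a shrink candidate only if it is
        # COW *and* sparse; preallocated volumes are left at full size.
        return self.format == COW_FORMAT and self.type == SPARSE_VOL

prealloc_base = Volume(COW_FORMAT, PREALLOCATED_VOL)
sparse_snap = Volume(COW_FORMAT, SPARSE_VOL)
assert not prealloc_base.chunked()  # 100G base LV is not reduced
assert sparse_snap.chunked()        # thin volumes can still be reduced
```

With this guard in place, finalize() skips _shrink_base_volume() for the preallocated base, while sparse COW volumes keep the existing behavior.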
Tracker to improve the discrepancy tool to find these issues: https://bugzilla.redhat.com/show_bug.cgi?id=2081559

I still feel there might be other ways to hit this, so the above will help to find them.
Alert, I simplified the fix description; I don't think the implementation details are helpful to the reader.
Verified. The volume stays preallocated and lvdevices shows the expected size.

Versions:
engine-4.5.1.1-0.14.el8ev
vdsm-4.50.1.2-1.el8ev.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHV RHEL Host (ovirt-host) [ovirt-4.5.1] update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:5583
Due to QE capacity, we are not going to cover this issue in our automation