Description of problem:

When OCP runs on RHV, it can use RHV disks as PVCs.

1. OCP asks RHV to create disks in a somewhat *weird* way, which may deserve a bug of its own. It appears to want a "Preallocated" QCOW2, but sends sparse and sets the initial size to match the capacity. For example, for a 100G disk:

    <disk>
      [...]
      <format>cow</format>
      <initial_size>107374182400</initial_size>
      <provisioned_size>107374182400</provisioned_size>
    </disk>

2. This makes the engine send the following to VDSM, with 100G as both capacity and initial size:

2022-05-25 09:45:42,026+0200 INFO (jsonrpc/6) [vdsm.api] START createVolume( ... size='107374182400' volFormat=4 preallocate=2 initialSize='107374182400' ... )

3. So VDSM ends up here, in blockVolume.py:

319 def calculate_volume_alloc_size(
...
342         if preallocate == sc.SPARSE_VOL:
343             # Sparse qcow2
344             if initial_size:
345                 # TODO: if initial_size == max_size, we exceed the max_size
346                 # here. This should be fixed, but first we must check that
347                 # engine is not assuming that vdsm will increase initial size
348                 # like this.
349                 alloc_size = int(initial_size * QCOW_OVERHEAD_FACTOR)

4. Given that:

QCOW_OVERHEAD_FACTOR = 1.1

5. It creates a 110G LV to hold a 100G qcow2. That sounds excessive and does not match what a Preallocated COW does (i.e. create preallocated with incremental backup enabled, which creates a 100G LV here and may be a bug on the opposite end?).

6. See the discrepancy:

2022-05-25 09:45:42,330+0200 INFO (tasks/7) [storage.Volume] Request to create COW volume /rhev/data-center/mnt/blockSD/6552eb00-37b6-4985-a588-fc19ef44e3ec/images/acf65db3-51ad-467d-b86d-61d463afe8d8/0d45176f-3cdb-44f3-a1f6-a59ab907a745 with capacity = 107374182400 (blockVolume:517)

7.
Half a second later, from the getInfo on the VolumeCreate flow:

2022-05-25 09:45:44,821+0200 INFO (jsonrpc/7) [storage.VolumeManifest] 6552eb00-37b6-4985-a588-fc19ef44e3ec/acf65db3-51ad-467d-b86d-61d463afe8d8/0d45176f-3cdb-44f3-a1f6-a59ab907a745 info is {
    'type': 'SPARSE',
    'format': 'COW',
    'disktype': 'DATA',
    'voltype': 'LEAF',
    'capacity': '107374182400',
    ...
    'apparentsize': '118111600640',
    'truesize': '118111600640',
} (volume:278)

I don't really know what to expect here, but this seems off. OCP may need some work on how it calls the RHV API, but RHV on its own also does not look correct and consistent. I wonder whether an initial size equal to the capacity shouldn't make RHV do a preallocated qcow. And there is a TODO in the code :)

Version-Release number of selected component (if applicable):
vdsm-4.50.0.13-1.el8ev.x86_64
rhvm-4.5.0.7-0.9.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:

disk = disks_service.add(
    types.Disk(
        name='mydisk',
        description='My disk',
        format=types.DiskFormat.COW,
        provisioned_size=100 * 2**30,
        initial_size=100 * 2**30,
        storage_domains=[
            types.StorageDomain(
                name='iSCSI'
            )
        ]
    )
)

Actual results:
* Possible waste of storage space
* More allocation than "Preallocated COW" if initial size and size are the same for a "sparse" disk

Expected results:
* Does it need this much overhead on a 100G disk? If it's a 1T disk it will allocate an extra 100G, and so on.
* Should match preallocated COW allocation?
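For reference, the numbers in the logs above line up exactly with the overhead factor. This is a minimal Python sketch of the sparse-qcow2 branch quoted from blockVolume.py; the variable names and structure here are illustrative, not VDSM's actual code:

```python
# Sketch of the sparse-qcow2 allocation math from the quoted
# calculate_volume_alloc_size branch; values are from the logs above.
QCOW_OVERHEAD_FACTOR = 1.1

capacity = 100 * 2**30        # 107374182400 bytes, the requested 100G
initial_size = capacity       # OCP sends initial_size == capacity

# Sparse volume with an initial size: the allocation is scaled by the
# overhead factor, so it exceeds the capacity (the TODO in the quoted
# code points at exactly this case).
alloc_size = int(initial_size * QCOW_OVERHEAD_FACTOR)

print(alloc_size)  # 118111600640, matching apparentsize/truesize above
```

Note that 118111600640 is precisely the apparentsize/truesize reported by getInfo, i.e. the extra ~10G on this 100G disk comes from this one multiplication.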
@germano regarding the OCP on RHV weird disk creation, could you elaborate on that? Also, can you let us know what OCP version the customer is running here?
(In reply to Janos Bonic from comment #1)
> @germano regarding the OCP on RHV weird disk creation, could you
> elaborate on that?

Hi Janos,

I think the problem may be in the go-ovirt-client: it sets both Provisioned and Initial size to the same value (size), which is passed from the ovirt-csi driver.

diskBuilder := ovirtsdk4.NewDiskBuilder().
    ProvisionedSize(int64(size)).
    InitialSize(int64(size)).
    StorageDomainsOfAny(storageDomain).
    Format(ovirtsdk4.DiskFormat(format))

https://github.com/oVirt/go-ovirt-client/blob/main/disk_create.go#L119

> Also, can you let us know what OCP version the customer is running here?

I'll ask, I just saw the weird request arriving in the logs. I guess you are more interested in the ovirt-csi driver version?
By the way, maybe we should split that from the VDSM issue to avoid confusion. Let me know what you think and we can open a new bug for it.
(In reply to Janos Bonic from comment #1)
> @germano regarding the OCP on RHV weird disk creation, could you
> elaborate on that?
>
> Also, can you let us know what OCP version the customer is running here?

OCP version is 4.10.14
(In reply to Germano Veit Michel from comment #0)
> 4. Given that:
>
> QCOW_OVERHEAD_FACTOR = 1.1
>
> 5. It creates a 110G LV to hold a 100G qcow2. Sounds excessive and does not
> match a Preallocated COW (i.e. create preallocated with incremental
> backup enabled - which creates a 100G LV here and may be a bug on the
> opposite end?).
>
> 6. See the discrepancy:
>
> 2022-05-25 09:45:42,330+0200 INFO (tasks/7) [storage.Volume] Request to
> create COW volume
> /rhev/data-center/mnt/blockSD/6552eb00-37b6-4985-a588-fc19ef44e3ec/images/
> acf65db3-51ad-467d-b86d-61d463afe8d8/0d45176f-3cdb-44f3-a1f6-a59ab907a745
> with capacity = 107374182400 (blockVolume:517)
>
> 7. Half a second later, from the getInfo on the VolumeCreate flow:
>
> 2022-05-25 09:45:44,821+0200 INFO (jsonrpc/7) [storage.VolumeManifest]
> 6552eb00-37b6-4985-a588-fc19ef44e3ec/acf65db3-51ad-467d-b86d-61d463afe8d8/
> 0d45176f-3cdb-44f3-a1f6-a59ab907a745 info is {
>     'type': 'SPARSE',
>     'format': 'COW',
>     'disktype': 'DATA',
>     'voltype': 'LEAF',
>     'capacity': '107374182400',
>     ...
>     'apparentsize': '118111600640',
>     'truesize': '118111600640',
> } (volume:278)

Yeah, that is expected. We have a bug asking to improve this (bz 2041352), but it was not prioritized and we moved it upstream (https://github.com/oVirt/vdsm/issues/207). Let's separate that out from this bug.

Other than that, everything else seems to be related to the input we get from OCP, so we think this bug should move to OCP on RHV.
Sure, we can use this to track the OCP side then, which actually seems to be a bug in the Go SDK.
(In reply to Germano Veit Michel from comment #2)
> (In reply to Janos Bonic from comment #1)
> > @germano regarding the OCP on RHV weird disk creation, could you
> > elaborate on that?
>
> Hi Janos,
>
> I think the problem may be in the go-ovirt-client: it sets both
> Provisioned and Initial size to the same value (size), which is passed from
> the ovirt-csi driver.
>
> diskBuilder := ovirtsdk4.NewDiskBuilder().
>     ProvisionedSize(int64(size)).
>     InitialSize(int64(size)).
>     StorageDomainsOfAny(storageDomain).
>     Format(ovirtsdk4.DiskFormat(format))
>
> https://github.com/oVirt/go-ovirt-client/blob/main/disk_create.go#L119

And this is with the default settings, I suppose? I.e. Sparse set to true, Format set to COW.

Why do we set the initial size to the provisioned size? Isn't that a change in behavior/regression from previous code? This should have created a "normal" thin provisioned volume, not a preallocated one...
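The idea floated in the description (an initial size equal to the capacity should arguably be treated as a request for a preallocated qcow) could be sketched as below. To be clear, neither VDSM nor the clients behave this way today; choose_allocation is a hypothetical helper, not an existing API:

```python
def choose_allocation(provisioned_size, initial_size=None):
    """Hypothetical helper: pick an allocation policy for a COW disk.

    If the caller requests an initial size covering the full capacity,
    a preallocated qcow2 would avoid the 10% sparse overhead entirely;
    otherwise the disk stays sparse and grows on demand.
    """
    if initial_size is not None and initial_size >= provisioned_size:
        return "preallocated"  # capacity-sized LV, no overhead factor
    return "sparse"            # thin provisioned, grows as needed

print(choose_allocation(100 * 2**30, 100 * 2**30))  # preallocated
print(choose_allocation(100 * 2**30))               # sparse
```

Under this rule the reproducer in comment #0 (initial_size == provisioned_size == 100G) would get a 100G LV, matching what "Preallocated COW" produces today.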
The OCP on RHV issue is now tracked in Jira under https://issues.redhat.com/browse/OCPRHV-813
closing based on comment 9