Created attachment 1662758 [details]
vdsm log

Description of problem:
When creating snapshots and deleting (live merging) old ones in a loop, the process can fail with the exception below. After that the disk is in an illegal state and cannot be deleted.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.8.2-1.el7.noarch
vdsm-4.30.40-3.git7d8f244.el7.x86_64

How reproducible:
Randomly (hit it only once)

Steps to Reproduce:
1. create an initial snapshot
2. run in a loop:
2.1 create a snapshot
2.2 delete (live merge) the original, older snapshot

Hit by running the following script:
https://github.com/vjuranek/ovirt-bug-reproducers/blob/master/BZ1796415/snapshots.py

Actual results:
the merge of the snapshot fails and the disk is in an illegal state

Expected results:
the older snapshot is merged/deleted

Additional info:

engine log exception:

2020-02-07 13:19:44,329-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [374307c4-f4dd-42fc-9a4e-32a43f656c57] Failed in 'MergeVDS' method
2020-02-07 13:19:44,362-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [374307c4-f4dd-42fc-9a4e-32a43f656c57] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM centos-7.7 command MergeVDS failed: Merge failed
2020-02-07 13:19:44,362-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [374307c4-f4dd-42fc-9a4e-32a43f656c57] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand' return value 'StatusOnlyReturn [status=Status [code=52, message=Merge failed]]'
2020-02-07 13:19:44,362-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [374307c4-f4dd-42fc-9a4e-32a43f656c57] HostName = centos-7.7
2020-02-07 13:19:44,362-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand]
(EE-ManagedThreadFactory-commandCoordinator-Thread-9) [374307c4-f4dd-42fc-9a4e-32a43f656c57] Command 'MergeVDSCommand(HostName = centos-7.7, MergeVDSCommandParameters:{hostId='b74ec1cb-d9f7-4e04-b673-06a35ebc7ad2', vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9', storagePoolId='0b600aa6-42a1-11ea-b6bf-525400f551af', storageDomainId='75bcadd7-d61f-4c53-ad0e-259460d8b10d', imageGroupId='5d407f5f-a758-4b60-9633-c41285ac024c', imageId='a9d83317-fefa-4182-9f33-a443628d773f', baseImageId='5346ebcd-80c6-4696-b376-1fd16cff6d34', topImageId='18c43c63-6610-4aca-80d7-d5139f536e68', bandwidth='0'})' execution failed: VDSGenericException: VDSErrorException: Failed to MergeVDS, error = Merge failed, code = 52
2020-02-07 13:19:44,362-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [374307c4-f4dd-42fc-9a4e-32a43f656c57] FINISH, MergeVDSCommand, return: , log id: 52195393
2020-02-07 13:19:44,362-05 ERROR [org.ovirt.engine.core.bll.MergeCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [374307c4-f4dd-42fc-9a4e-32a43f656c57] Engine exception thrown while sending merge command: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to MergeVDS, error = Merge failed, code = 52 (Failed with error mergeErr and code 52)
        at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) [bll.jar:]

vdsm error:

2020-02-07 13:19:48,975-0500 ERROR (tasks/5) [storage.TaskManager.Task] (Task='1108934c-f381-4128-91d8-b08ce8d6759f') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1967, in deleteVolume
    vol.delete(postZero=postZero, force=force, discard=discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 585, in delete
    self.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1292, in validateDelete
    self._manifest.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 545, in validateDelete
    raise se.VolumeImageHasChildren(self)
VolumeImageHasChildren: Cannot delete volume which has children (non-ethical): ['sdUUID: 75bcadd7-d61f-4c53-ad0e-259460d8b10d', u'imgUUID: 5d407f5f-a758-4b60-9633-c41285ac024c', u'volUUID: 18c43c63-6610-4aca-80d7-d5139f536e68']

This is caused by a wrong image chain stored both in the engine DB and in the volume metadata. In both cases the chain should be

5346ebcd-80c6-4696-b376-1fd16cff6d34 -> 18c43c63-6610-4aca-80d7-d5139f536e68 -> a9d83317-fefa-4182-9f33-a443628d773f

Engine:

engine=# select image_guid::text, size, parentid, volume_type, active from images where image_guid in ('5346ebcd-80c6-4696-b376-1fd16cff6d34', '18c43c63-6610-4aca-80d7-d5139f536e68', 'a9d83317-fefa-4182-9f33-a443628d773f');
              image_guid              |    size    |               parentid               | volume_type | active
--------------------------------------+------------+--------------------------------------+-------------+--------
 a9d83317-fefa-4182-9f33-a443628d773f | 1073741824 | 18c43c63-6610-4aca-80d7-d5139f536e68 |           2 | t
 18c43c63-6610-4aca-80d7-d5139f536e68 | 1073741824 | 5346ebcd-80c6-4696-b376-1fd16cff6d34 |           2 | f
 5346ebcd-80c6-4696-b376-1fd16cff6d34 | 1073741824 | 00000000-0000-0000-0000-000000000000 |           2 | f

vdsm:

[root@localhost ~]# lvs -o tags 75bcadd7-d61f-4c53-ad0e-259460d8b10d/a9d83317-fefa-4182-9f33-a443628d773f
  LV Tags
  IU_5d407f5f-a758-4b60-9633-c41285ac024c,MD_1,PU_18c43c63-6610-4aca-80d7-d5139f536e68
[root@localhost ~]# lvs -o tags 75bcadd7-d61f-4c53-ad0e-259460d8b10d/18c43c63-6610-4aca-80d7-d5139f536e68
  LV Tags
  IU_5d407f5f-a758-4b60-9633-c41285ac024c,MD_5,PU_5346ebcd-80c6-4696-b376-1fd16cff6d34
[root@localhost ~]# lvs -o tags 75bcadd7-d61f-4c53-ad0e-259460d8b10d/5346ebcd-80c6-4696-b376-1fd16cff6d34
  LV Tags
  IU_5d407f5f-a758-4b60-9633-c41285ac024c,MD_8,PU_00000000-0000-0000-0000-000000000000

In reality, the chains seen by qemu-img are

5346ebcd-80c6-4696-b376-1fd16cff6d34 -> a9d83317-fefa-4182-9f33-a443628d773f
5346ebcd-80c6-4696-b376-1fd16cff6d34 -> 18c43c63-6610-4aca-80d7-d5139f536e68

[root@localhost images]# qemu-img info 5d407f5f-a758-4b60-9633-c41285ac024c/a9d83317-fefa-4182-9f33-a443628d773f
image: 5d407f5f-a758-4b60-9633-c41285ac024c/a9d83317-fefa-4182-9f33-a443628d773f
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 0
cluster_size: 65536
backing file: 5346ebcd-80c6-4696-b376-1fd16cff6d34 (actual path: 5d407f5f-a758-4b60-9633-c41285ac024c/5346ebcd-80c6-4696-b376-1fd16cff6d34)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

[root@localhost images]# qemu-img info 5d407f5f-a758-4b60-9633-c41285ac024c/18c43c63-6610-4aca-80d7-d5139f536e68
image: 5d407f5f-a758-4b60-9633-c41285ac024c/18c43c63-6610-4aca-80d7-d5139f536e68
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 0
cluster_size: 65536
backing file: 5346ebcd-80c6-4696-b376-1fd16cff6d34 (actual path: 5d407f5f-a758-4b60-9633-c41285ac024c/5346ebcd-80c6-4696-b376-1fd16cff6d34)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

[root@localhost images]# qemu-img info 5d407f5f-a758-4b60-9633-c41285ac024c/5346ebcd-80c6-4696-b376-1fd16cff6d34
image: 5d407f5f-a758-4b60-9633-c41285ac024c/5346ebcd-80c6-4696-b376-1fd16cff6d34
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
First, when we talk about chains, let's use the same notation as used in the vdsm logs, otherwise it is a huge mess. We always show the base on the left and the top on the right, and the arrows point from child to parent.

Based on the VM XML, the actual chain is:

5346ebcd-80c6-4696-b376-1fd16cff6d34 <- a9d83317-fefa-4182-9f33-a443628d773f (top)

    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/mnt/blockSD/75bcadd7-d61f-4c53-ad0e-259460d8b10d/images/5d407f5f-a758-4b60-9633-c41285ac024c/a9d83317-fefa-4182-9f33-a443628d773f'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore type='block' index='1'>
        <format type='qcow2'/>
        <source dev='/rhev/data-center/mnt/blockSD/75bcadd7-d61f-4c53-ad0e-259460d8b10d/images/5d407f5f-a758-4b60-9633-c41285ac024c/5346ebcd-80c6-4696-b376-1fd16cff6d34'/>
        <backingStore/>
      </backingStore>
      <target dev='sda' bus='scsi'/>
      <serial>5d407f5f-a758-4b60-9633-c41285ac024c</serial>
      <alias name='ua-5d407f5f-a758-4b60-9633-c41285ac024c'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

The output of qemu-img info in comment 0 is not very helpful, can you attach the output of:

qemu-img info --backing-chain /dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/a9d83317-fefa-4182-9f33-a443628d773f

Based on the LV tags, the chain is:

5346ebcd-80c6-4696-b376-1fd16cff6d34 <- 18c43c63-6610-4aca-80d7-d5139f536e68 <- a9d83317-fefa-4182-9f33-a443628d773f (top)

This also matches the engine database. So I guess what vdsm and engine show is the original chain, and we had a successful merge, so the qemu chain was updated.

Note that vdsm does not update the LV tags when the merge completes, since LV tags can be changed only on the SPM. Therefore vdsm updates the parent UUID only in the volume metadata, which is missing here. Please extract the volume metadata using the MD slots MD_1, MD_5, MD_8.
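For reference, the chain implied by LV tags like the ones above can be reconstructed mechanically: every volume carries a PU_<parentUUID> tag, and the base volume's parent is the all-zeros UUID. A minimal sketch (hypothetical helper, not vdsm code; the tag strings are copied from the lvs output in comment 0):

```python
# Hypothetical helper (not vdsm code): rebuild the volume chain from the
# "LV Tags" strings shown by "lvs -o tags". Each volume carries a
# PU_<parentUUID> tag; the base volume's parent is the all-zeros UUID.
ZERO_UUID = "00000000-0000-0000-0000-000000000000"

def parent_of(tags):
    # tags: the comma-separated "LV Tags" string for one volume
    for tag in tags.split(","):
        if tag.startswith("PU_"):
            return tag[3:]
    raise ValueError("no PU_ tag found")

def chain_from_tags(volumes):
    # volumes: {volUUID: tags string}; returns the chain, base first
    children = {parent_of(tags): vol for vol, tags in volumes.items()}
    chain = []
    cur = children.get(ZERO_UUID)  # the base volume
    while cur is not None:
        chain.append(cur)
        cur = children.get(cur)
    return chain

volumes = {
    "a9d83317-fefa-4182-9f33-a443628d773f":
        "IU_5d407f5f-a758-4b60-9633-c41285ac024c,MD_1,"
        "PU_18c43c63-6610-4aca-80d7-d5139f536e68",
    "18c43c63-6610-4aca-80d7-d5139f536e68":
        "IU_5d407f5f-a758-4b60-9633-c41285ac024c,MD_5,"
        "PU_5346ebcd-80c6-4696-b376-1fd16cff6d34",
    "5346ebcd-80c6-4696-b376-1fd16cff6d34":
        "IU_5d407f5f-a758-4b60-9633-c41285ac024c,MD_8,"
        "PU_00000000-0000-0000-0000-000000000000",
}
print(" <- ".join(chain_from_tags(volumes)))
```

Running this on the tags above prints the three-volume chain in the base-on-the-left notation, matching the engine database.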
Also, where is the log from the test tool?
https://github.com/vjuranek/ovirt-bug-reproducers/blob/master/BZ1796415/snapshots.py

What was the last successful step? Did we have any error in the test script?
Looking in the vdsm log, the last merge seen is:

2020-02-07 07:01:26,768-0500 INFO (jsonrpc/7) [api.virt] START merge(drive={u'imageID': u'5d407f5f-a758-4b60-9633-c41285ac024c', u'volumeID': u'a9d83317-fefa-4182-9f33-a443628d773f', u'domainID': u'75bcadd7-d61f-4c53-ad0e-259460d8b10d', u'poolID': u'0b600aa6-42a1-11ea-b6bf-525400f551af'}, baseVolUUID=u'5346ebcd-80c6-4696-b376-1fd16cff6d34', topVolUUID=u'18c43c63-6610-4aca-80d7-d5139f536e68', bandwidth=u'0', jobUUID=u'a78c0528-bdd2-4389-b186-2cac7dbeb6d6') from=::ffff:192.168.122.123,43428, flow_id=e3f8a7ff-2c46-40de-b205-8b9bcc15e3b8, vmId=a5a486b6-4ccb-40fb-9c47-ca1885cd87a9 (api:48)

This merge fails immediately with:

2020-02-07 07:01:26,786-0500 ERROR (jsonrpc/7) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') merge: Cannot find volume 18c43c63-6610-4aca-80d7-d5139f536e68 in drive sdb's volume chain (vm:5872)

since the volume is not in the chain at this point. The previous merge with the same topVolUUID was at:

2020-02-07 06:26:07,303-0500 INFO (jsonrpc/2) [api.virt] START merge(drive={u'imageID': u'5d407f5f-a758-4b60-9633-c41285ac024c', u'volumeID': u'a9d83317-fefa-4182-9f33-a443628d773f', u'domainID': u'75bcadd7-d61f-4c53-ad0e-259460d8b10d', u'poolID': u'0b600aa6-42a1-11ea-b6bf-525400f551af'}, baseVolUUID=u'5346ebcd-80c6-4696-b376-1fd16cff6d34', topVolUUID=u'18c43c63-6610-4aca-80d7-d5139f536e68', bandwidth=u'0', jobUUID=u'7a47233e-4ec1-4827-b1b5-1ef8c156fecb') from=::ffff:192.168.122.123,43428, flow_id=dc6b111a-6d5a-4893-a312-115d89b0095b, vmId=a5a486b6-4ccb-40fb-9c47-ca1885cd87a9 (api:48)

Following the merge flow using the jobUUID prefix:

$ xzgrep 7a47233e vdsm.log.xz

2020-02-07 06:26:07,303-0500 INFO (jsonrpc/2) [api.virt] START merge(drive={u'imageID': u'5d407f5f-a758-4b60-9633-c41285ac024c', u'volumeID': u'a9d83317-fefa-4182-9f33-a443628d773f', u'domainID': u'75bcadd7-d61f-4c53-ad0e-259460d8b10d', u'poolID': u'0b600aa6-42a1-11ea-b6bf-525400f551af'}, baseVolUUID=u'5346ebcd-80c6-4696-b376-1fd16cff6d34',
topVolUUID=u'18c43c63-6610-4aca-80d7-d5139f536e68', bandwidth=u'0', jobUUID=u'7a47233e-4ec1-4827-b1b5-1ef8c156fecb') from=::ffff:192.168.122.123,43428, flow_id=dc6b111a-6d5a-4893-a312-115d89b0095b, vmId=a5a486b6-4ccb-40fb-9c47-ca1885cd87a9 (api:48)
2020-02-07 06:26:07,376-0500 INFO (jsonrpc/2) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Starting merge with jobUUID=u'7a47233e-4ec1-4827-b1b5-1ef8c156fecb', original chain=5346ebcd-80c6-4696-b376-1fd16cff6d34 < 18c43c63-6610-4aca-80d7-d5139f536e68 < a9d83317-fefa-4182-9f33-a443628d773f (top), disk='sdb', base='sdb[2]', top='sdb[1]', bandwidth=0, flags=8 (vm:5942)
2020-02-07 06:26:07,428-0500 DEBUG (jsonrpc/2) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Checking job 7a47233e-4ec1-4827-b1b5-1ef8c156fecb (vm:5739)
2020-02-07 06:26:07,436-0500 DEBUG (jsonrpc/2) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Job 7a47233e-4ec1-4827-b1b5-1ef8c156fecb live info: {'end': 1073741824L, 'bandwidth': 0L, 'type': 3, 'cur': 1114112L} (vm:5789)
2020-02-07 06:26:08,741-0500 DEBUG (periodic/3) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Checking job 7a47233e-4ec1-4827-b1b5-1ef8c156fecb (vm:5739)

The libvirt job was gone here:

2020-02-07 06:26:09,175-0500 INFO (periodic/3) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Libvirt job 7a47233e-4ec1-4827-b1b5-1ef8c156fecb was terminated (vm:5798)

So we started a cleanup thread:

2020-02-07 06:26:09,175-0500 INFO (periodic/3) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Starting cleanup thread for job: 7a47233e-4ec1-4827-b1b5-1ef8c156fecb (vm:5806)

But note that the libvirt BLOCK_JOB completion event was received only here:

2020-02-07 06:26:09,349-0500 INFO (libvirt/events) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Block job COMMIT for drive /rhev/data-center/mnt/blockSD/75bcadd7-d61f-4c53-ad0e-259460d8b10d/images/5d407f5f-a758-4b60-9633-c41285ac024c/a9d83317-fefa-4182-9f33-a443628d773f has completed (vm:5835)
So the following code ran before libvirt submitted the completion event, and may have seen an incorrect chain when getting the chain from libvirt.

2020-02-07 06:26:09,232-0500 INFO (merge/7a47233e) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Synchronizing volume chain after live merge (job 7a47233e-4ec1-4827-b1b5-1ef8c156fecb) (vm:6251)

Here we see that the libvirt chain still includes the volume 18c43c63-6610-4aca-80d7-d5139f536e68:

2020-02-07 06:26:09,329-0500 DEBUG (merge/7a47233e) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') vdsm chain: ['5346ebcd-80c6-4696-b376-1fd16cff6d34', '18c43c63-6610-4aca-80d7-d5139f536e68', 'a9d83317-fefa-4182-9f33-a443628d773f'], libvirt chain: ['5346ebcd-80c6-4696-b376-1fd16cff6d34', '18c43c63-6610-4aca-80d7-d5139f536e68', 'a9d83317-fefa-4182-9f33-a443628d773f'] (vm:6026)

So imageSyncVolumeChain did not update the vdsm metadata - correctly, given the chain it was passed:

2020-02-07 06:26:09,330-0500 INFO (merge/7a47233e) [vdsm.api] START imageSyncVolumeChain(sdUUID=u'75bcadd7-d61f-4c53-ad0e-259460d8b10d', imgUUID=u'5d407f5f-a758-4b60-9633-c41285ac024c', volUUID=u'a9d83317-fefa-4182-9f33-a443628d773f', newChain=['5346ebcd-80c6-4696-b376-1fd16cff6d34', '18c43c63-6610-4aca-80d7-d5139f536e68', 'a9d83317-fefa-4182-9f33-a443628d773f']) from=internal, task_id=1211d26c-19c4-426a-a839-71df0c6db424 (api:48)
2020-02-07 06:26:09,629-0500 INFO (merge/7a47233e) [storage.Image] Current chain=5346ebcd-80c6-4696-b376-1fd16cff6d34 < 18c43c63-6610-4aca-80d7-d5139f536e68 < a9d83317-fefa-4182-9f33-a443628d773f (top) (image:1221)
2020-02-07 06:26:09,630-0500 INFO (merge/7a47233e) [vdsm.api] FINISH imageSyncVolumeChain return=None from=internal, task_id=1211d26c-19c4-426a-a839-71df0c6db424 (api:54)
2020-02-07 06:26:09,631-0500 INFO (merge/7a47233e) [virt.vm] (vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Synchronization completed (job 7a47233e-4ec1-4827-b1b5-1ef8c156fecb) (vm:6260)
2020-02-07 06:26:23,742-0500 INFO (periodic/4) [virt.vm]
(vmId='a5a486b6-4ccb-40fb-9c47-ca1885cd87a9') Cleanup thread <vdsm.virt.vm.LiveMergeCleanupThread object at 0x7fcba419e3d0> successfully completed, untracking job 7a47233e-4ec1-4827-b1b5-1ef8c156fecb (base=5346ebcd-80c6-4696-b376-1fd16cff6d34, top=18c43c63-6610-4aca-80d7-d5139f536e68) (vm:5749)

So I think what happens is:

- Vdsm assumes that once the libvirt block job no longer exists, the block job is done, and getting the volume chain from libvirt via the VM XML will show the correct chain.

- Libvirt does not show the correct chain at this point.

- Only after libvirt sends the BLOCK_JOB completion event does it show the correct backing chain.

Vdsm is not using the completion event to detect completion, since events are not reliable: if vdsm is restarted during the merge, it can lose the event. So vdsm uses the existence of the libvirt job as a reliable way to detect that the merge completed or failed, and then checks the chain to understand the result of the merge.

Peter, can we change libvirt so the block job is removed only after the backing chain is modified? A possible solution is to keep the block job around in a special "completed" state, and remove it only after the completion event is sent, or after some delay (e.g. 1 second).
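The detection strategy described above can be condensed into a few lines (a hypothetical sketch, not actual vdsm code): poll until the libvirt job disappears, then infer the result from the chain. The race is visible in the comment of the final check.

```python
# Hypothetical condensation of the detection flow described above (not
# vdsm code). The merge is considered finished once the libvirt job
# disappears; the result is then inferred from the volume chain. The bug:
# on old libvirt the chain read here can still be stale, because the XML
# is updated only when libvirt processes qemu's completion event.
import time

def wait_for_merge(job_exists, get_chain, top_vol, interval=1.0):
    while job_exists():
        time.sleep(interval)
    # Job is gone: if the top volume is still in the chain, the merge
    # either failed or (the racy case) the chain is just not synced yet.
    return top_vol not in get_chain()

# Toy stand-ins for the libvirt calls:
polls = iter([True, True, False])
ok = wait_for_merge(lambda: next(polls),
                    lambda: ["base", "leaf"],  # top already removed
                    top_vol="top", interval=0)
print(ok)  # True: merge looks successful
```

In the failing run above, get_chain() still returned the pre-merge chain at this point, so the success case was misclassified.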
(In reply to Nir Soffer from comment #3)
> So I think what happens is:
>
> - Vdsm assumes that once libvirt block job does not exist, the block job
>   is done, and getting the volume chain from libvirt using the vm xml will
>   show the correct chain
>
> - Libvirt does not show the correct chain at this point.
>
> - After libvirt send the BLOCK_JOB completion event, it shows a correct
>   backing chain.

The official statement would be to wait for the block job completion event.

> Vdsm is not using the completion event for detecting completion, since events
> are not reliable. If vdsm is restarted during the merge, it can loose the
> event.

How are they not reliable? Checking the job info only on restart would be far more reliable.

> So vdsm is using the existence of the libvirt job as a reliable way to detect
> that merge was completed or failed, and then check the chain to understand
> the result of the merge.
>
> Peter, can we change libvirt so block job is removed only after the backing
> chain is modified?

With a pre-blockdev qemu/libvirt combination, qemu doesn't report any state, so the API returns nothing, because qemu doesn't report the job while libvirt hasn't finished the processing yet. This is fixed with libvirt-5.10 and qemu-4.2, where we ask qemu to keep the job around until it's processed by the event handler, which then also fires the BLOCK_JOB_COMPLETE libvirt event; so the job API returns the job until libvirt is done processing it. The job state more or less reports proper progress statistics.

> A possible solution is to keep the block job around in special "completed"
> state,

In the pre-blockdev era we could keep the job around, but there's no sane way to report any useful progress information at that point, so the data reported to satisfy the semantics of the API would be plain garbage. I'm not sure, though, how far such a thing will be backportable. It depends on internal infrastructure which was done for blockdev support, and that might be too complex to backport.
> and remove it only after the completion event is sent, or after some delay
> (e.g. 1 second).

Delaying it arbitrarily is not acceptable.
> The output of qemu-img info in comment 0 is not very helpful, can you attach
> output of:
>
> qemu-img info --backing-chain
> /dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/a9d83317-fefa-4182-9f33-
> a443628d773f

[root@localhost ~]# qemu-img info --backing-chain /dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/a9d83317-fefa-4182-9f33-a443628d773f
image: /dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/a9d83317-fefa-4182-9f33-a443628d773f
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 0
cluster_size: 65536
backing file: 5346ebcd-80c6-4696-b376-1fd16cff6d34 (actual path: /dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/5346ebcd-80c6-4696-b376-1fd16cff6d34)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/5346ebcd-80c6-4696-b376-1fd16cff6d34
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

> Please extract the volume metadata using the md slots MD_1, MD_5, MD_8.
# dd if=/dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/metadata bs=512 count=1 skip=$((1024 * 1024 + 1 * 8192)) iflag=direct,skip_bytes
CAP=1073741824
CTIME=1581074755
DESCRIPTION=
DISKTYPE=DATA
DOMAIN=75bcadd7-d61f-4c53-ad0e-259460d8b10d
FORMAT=COW
GEN=0
IMAGE=5d407f5f-a758-4b60-9633-c41285ac024c
LEGALITY=LEGAL
PUUID=18c43c63-6610-4aca-80d7-d5139f536e68
TYPE=SPARSE
VOLTYPE=LEAF
EOF

# dd if=/dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/metadata bs=512 count=1 skip=$((1024 * 1024 + 5 * 8192)) iflag=direct,skip_bytes
CAP=1073741824
CTIME=1581074658
DESCRIPTION=
DISKTYPE=DATA
DOMAIN=75bcadd7-d61f-4c53-ad0e-259460d8b10d
FORMAT=COW
GEN=0
IMAGE=5d407f5f-a758-4b60-9633-c41285ac024c
LEGALITY=LEGAL
PUUID=5346ebcd-80c6-4696-b376-1fd16cff6d34
TYPE=SPARSE
VOLTYPE=INTERNAL
EOF

# dd if=/dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/metadata bs=512 count=1 skip=$((1024 * 1024 + 8 * 8192)) iflag=direct,skip_bytes
CAP=1073741824
CTIME=1581028699
DESCRIPTION={"DiskAlias":"test-cow","DiskDescription":""}
DISKTYPE=DATA
DOMAIN=75bcadd7-d61f-4c53-ad0e-259460d8b10d
FORMAT=COW
GEN=0
IMAGE=5d407f5f-a758-4b60-9633-c41285ac024c
LEGALITY=LEGAL
PUUID=00000000-0000-0000-0000-000000000000
TYPE=SPARSE
VOLTYPE=INTERNAL
EOF

> Also, where is the log from the test tool?
> https://github.com/vjuranek/ovirt-bug-reproducers/blob/master/BZ1796415/snapshots.py
> What was the last successful step? Did we have any error in the test script?

Unfortunately I didn't save it. But there wasn't anything interesting: there were no errors, it hung waiting for the snapshot to be deleted.
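The skip values in the dd commands above all come from one formula. A small sketch (the layout constants, a 1 MiB leading area followed by 8192-byte metadata slots with 512 bytes read per slot, are taken from those commands, not from the vdsm source):

```python
# Sketch of the offsets used by the dd commands above. Assumption: the
# metadata LV starts with a 1 MiB area, followed by 8192-byte slots; the
# MD_<slot> LV tag gives the slot index.
MD_AREA_START = 1024 * 1024
SLOT_SIZE = 8192

def md_offset(slot):
    # byte offset of metadata slot MD_<slot>
    return MD_AREA_START + slot * SLOT_SIZE

for slot in (1, 5, 8):
    print("MD_%d -> skip=%d" % (slot, md_offset(slot)))
```

The printed values match the skip=$((1024 * 1024 + N * 8192)) expressions in the three dd commands.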
(In reply to Peter Krempa from comment #4) > (In reply to Nir Soffer from comment #3) > > Vdsm is not using the completion event for detecting completion, since events > > are not reliable. If vdsm is restarted during the merge, it can loose the > > event. > > How are they not reliable? You can miss an event if vdsm is restarted at the right time. And when you restart, the existence of the job your only hint about the job. We started to use events only last month to make it easier to debug issue. I think we can use them to complement the job monitoring code. > > So vdsm is using the existence of the libvirt job as a reliable way to detect > > that merge was completed or failed, and then check the chain to understand > > the > > result of the merge. > > > > > > Peter, can we change libvirt so block job is removed only after the backing > > chain is modified? > > With pre-blockdev qemu/libvirt combination Yes, this is a bout libvirt in RHEL 7.7 (4.5?) > qemu doesn't report any state, so > the API returns nothing because qemu doesn't report the job while libvirt > didn't finish the processing yet. What do you mean not reporting the job? We assume that the libvirt block jobs are reported in virtDomainBlockJobInfo after virDomainBlockCommit was called, and that not job which is not reported by libvirt will never be reported again. We also assume that libvirt xml and block jobs are synced; if block job does not exist because a volume was removed from the chain in qemu, getting libvirt xml at this point will not show the volume in the backing chain. Are these assumptions correct for libvirt 4.5? libvirt 6.0?
(In reply to Vojtech Juranek from comment #5)
> > The output of qemu-img info in comment 0 is not very helpful, can you attach
> > output of:
> >
> > qemu-img info --backing-chain
> > /dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/a9d83317-fefa-4182-9f33-
> > a443628d773f
...

Makes sense.

> > Please extract the volume metadata using the md slots MD_1, MD_5, MD_8.
>
> # dd if=/dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/metadata bs=512 count=1
> skip=$((1024 * 1024 + 1 * 8192)) iflag=direct,skip_bytes
...
> PUUID=18c43c63-6610-4aca-80d7-d5139f536e68
> VOLTYPE=LEAF
...
> # dd if=/dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/metadata bs=512 count=1
> skip=$((1024 * 1024 + 5 * 8192)) iflag=direct,skip_bytes
...
> PUUID=5346ebcd-80c6-4696-b376-1fd16cff6d34
> VOLTYPE=INTERNAL
...
> # dd if=/dev/75bcadd7-d61f-4c53-ad0e-259460d8b10d/metadata bs=512 count=1
> skip=$((1024 * 1024 + 8 * 8192)) iflag=direct,skip_bytes
...
> PUUID=00000000-0000-0000-0000-000000000000
> VOLTYPE=INTERNAL
...

This matches what we see in the logs about the vdsm chain and the libvirt chain. Vdsm correctly did not update the chain.

> > Also, where is the log from the test tool?
> > https://github.com/vjuranek/ovirt-bug-reproducers/blob/master/BZ1796415/snapshots.py
> > What was the last successful step? Did we have any error in the test script?
>
> unfortunately I didn't save it. But there wasn't anything interesting, there
> were no errors, it hanged waiting for the snapshost to be deleted

We probably need to improve the test script to detect that the snapshot delete operation failed, instead of waiting forever. We also need an example for the SDK showing how to do this properly.
(In reply to Nir Soffer from comment #6)
> (In reply to Peter Krempa from comment #4)
> > (In reply to Nir Soffer from comment #3)
[...]
> > > Peter, can we change libvirt so block job is removed only after the backing
> > > chain is modified?
> >
> > With pre-blockdev qemu/libvirt combination
>
> Yes, this is a bout libvirt in RHEL 7.7 (4.5?)

Hmm, I can't guarantee that it will be possible to fix that version.

> > qemu doesn't report any state, so
> > the API returns nothing because qemu doesn't report the job while libvirt
> > didn't finish the processing yet.
>
> What do you mean not reporting the job?
>
> We assume that the libvirt block jobs are reported in virtDomainBlockJobInfo
> after virDomainBlockCommit was called, and that not job which is not reported by
> libvirt will never be reported again.

I don't quite understand what you meant. Anyway, the old control flow is:

1) block-commit is started in qemu; its state can be tracked via virtDomainBlockJobInfo

2) block-commit finishes, and qemu fires an event. Insert race here: query-block-jobs, which is used by virtDomainBlockJobInfo, returns nothing, because the job is finished from qemu's point of view, and thus virtDomainBlockJobInfo also reports nothing

3) libvirt processes the event produced by qemu and updates its state

(every step above is considered atomic)

With the new control flow we ask qemu to keep the job around until step 3 is reached, so you either see the job state or the updated XML.

> We also assume that libvirt xml and block jobs are synced; if block job does
> not exist because a volume was removed from the chain in qemu, getting libvirt
> xml at this point will not show the volume in the backing chain.
>
> Are these assumptions correct for libvirt 4.5? libvirt 6.0?

For libvirt 4.5 they are correct only if you take the event as the end of the job. The fact that virtDomainBlockJobInfo doesn't return anything does not necessarily mean libvirt finished processing the XML update.
That is fixed with libvirt-6 and qemu-4.2, because we ask qemu to keep the job around until we start processing it (so that from the API point of view the update is atomic), and so we can even report the current job state.
(In reply to Peter Krempa from comment #8)

Based on Peter's response, RHV 4.4 should be fine on RHEL 8.2 without any change.

If we want to fix this race in RHV 4.3, we need to wait for the block job event before we continue with the job cleanup. If we don't get the block job completion event (maybe vdsm was just restarted), we need to time out after a reasonable timeout and continue the cleanup as done in the current code. The attached patches are the first step towards such a fix.

Germano, do we have cases with this issue? How important is it to fix this in 4.3?
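The 4.3 fix idea above can be sketched as follows (a hypothetical illustration, not the attached patches; the timeout value is an assumption):

```python
# Sketch of the proposed 4.3 behavior (hypothetical, not the actual vdsm
# patches): after the libvirt job disappears, wait for the BLOCK_JOB
# completion event before cleanup, and fall back to cleanup after a
# timeout so a missed event (e.g. due to a vdsm restart) cannot block
# the cleanup forever.
import threading

EVENT_TIMEOUT = 30.0  # seconds; the "reasonable timeout" is an assumption

def wait_for_completion_event(event, timeout=EVENT_TIMEOUT):
    # event.set() would be called from the BLOCK_JOB event handler
    if event.wait(timeout):
        return "event"    # libvirt finished updating the chain
    return "timeout"      # continue cleanup as the current code does

done = threading.Event()
done.set()  # simulate: the completion event already arrived
print(wait_for_completion_event(done, timeout=0.1))  # event
```

Either way the cleanup proceeds; the event path just guarantees that the chain read afterwards is up to date.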
Moving back to New. The bug status was changed by broken CI scripts.
*** Bug 1814318 has been marked as a duplicate of this bug. ***
Created attachment 1716204 [details]
logs

We saw it again in our automation tests.

Versions:
ovirt-engine-4.3.11.4-0.1.el7.noarch
vdsm-4.30.50-1.el7ev.x86_64 (rhv-4.3.11-8)

Attaching relevant logs.
Does it reproduce in 4.4? As per the previous comments it should not, so once we confirm that, we can close this bug as fixed.
(In reply to Michal Skrivanek from comment #17)
> does it reproduce in 4.4? As per previous comments it should not, so once we
> confirm we can close this bug as fixed
(In reply to Avihai from comment #18)
> (In reply to Michal Skrivanek from comment #17)
> > does it reproduce in 4.4? As per previous comments it should not, so once we
> > confirm we can close this bug as fixed

Evelina, did we encounter this issue in 4.4 runs?
(In reply to Avihai from comment #19)
> (In reply to Avihai from comment #18)
> > (In reply to Michal Skrivanek from comment #17)
> > > does it reproduce in 4.4? As per previous comments it should not, so once we
> > > confirm we can close this bug as fixed
>
> Evelina, did we encounter this issue in 4.4 runs?

No, we can close this as fixed.
Removing the remaining needinfo.