Created attachment 1759567
Excerpt from vdsm.log

Description of problem:

We are running an oVirt 4.3.10 production cluster with 9 hosts and 5 datastore domains, 4 of which are Gluster domains. oVirt showed hourly error messages:

26.02.2021 02:00:48 VDSM command SetVolumeDescriptionVDS failed: Could not acquire resource. Probably resource factory threw an exception.: ()
26.02.2021 02:00:48 Failed to update OVF disks 5aa438e3-8d22-4b6c-bccf-a843151ca0be, OVF data isn't updated on those OVF stores (Data Center datacenter01, Storage Domain vmstore13).
26.02.2021 02:00:48 Failed to update VMs/Templates OVF data for Storage Domain vmstore13 in Data Center datacenter01.

Only one domain ("vmstore13") was affected. Trying to update the OVFs manually from the engine web GUI led to the same result. The VMs with disks on the affected domain were running fine, and snapshots were working. I tried to move the SPM role to another host, which succeeded, but the error messages persisted.

For each error, the vdsm log on the SPM host contained something like:

2021-02-26 03:00:57,701+0100 INFO (jsonrpc/2) [vdsm.api] START setVolumeDescription(sdUUID=u'9f731135-f5d9-4609-9e3b-fa9cae75e314', spUUID=u'33e8dc9e-8bc8-11ea-bd76-00163e741033', imgUUID=u'5aa438e3-8d22-4b6c-bccf-a843151ca0be', volUUID=u'0795e58c-4960-413a-a0b4-e8a6d547fda5', description=u'{"Updated":false,"Last Updated":"Wed Feb 24 17:48:17 CET 2021","Storage Domains":[{"uuid":"9f731135-f5d9-4609-9e3b-fa9cae75e314"}],"Disk Description":"OVF_STORE"}', options=None) from=::ffff:10.70.1.1,46968, flow_id=1f314676, task_id=9101db01-b4f0-447e-a5a9-b6af76278d55 (api:48)
2021-02-26 03:00:57,712+0100 ERROR (jsonrpc/2) [storage.VolumeManifest] [Errno 116] Stale file handle (fileVolume:155)

I have attached the relevant part of vdsm.log.

Finally I managed to fix it by running

cat /rhev/data-center/mnt/glusterSD/10.70.7.17\:_vmstore13/9f731135-f5d9-4609-9e3b-fa9cae75e314/images/5aa438e3-8d22-4b6c-bccf-a843151ca0be/0795e58c-4960-413a-a0b4-e8a6d547fda5.meta

on the host the Gluster file system was mounted from (vmhost17, IP 10.70.7.17). The first attempt returned "file not found". I repeated the same command, this time successfully, and the problem went away.

Version-Release number of selected component (if applicable):
oVirt 4.3.10

How reproducible:
Not reproducible

Additional info:
I posted the problem on the users list, and Nir Soffer asked me to file a bug because "we may need to improve storage monitoring with Gluster to handle [Errno 116] Stale file handle".
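For illustration only, the manual workaround above (read the .meta file, get an error, read it again) can be sketched as a small retry loop in Python. This is not vdsm code and not a proposed fix. The function name read_with_estale_retry and its parameters are hypothetical, and treating "file not found" (ENOENT) as retryable is an assumption based on the behaviour seen with the cat command.

```python
import errno
import time


def read_with_estale_retry(path, attempts=3, delay=0.5):
    """Read a file, retrying if access fails with ESTALE or ENOENT.

    Hypothetical helper: it mimics repeating 'cat <file>' by hand when the
    first access on a Gluster mount fails and a later one succeeds.
    """
    # Assumption: both errors may be transient on the affected mount.
    retryable = (errno.ESTALE, errno.ENOENT)
    last_error = None
    for _ in range(attempts):
        try:
            with open(path, "rb") as f:
                return f.read()
        except OSError as e:
            if e.errno not in retryable:
                raise  # unrelated error, do not hide it
            last_error = e
            time.sleep(delay)
    # Every attempt failed with a retryable error: report the last one.
    raise last_error
```

Called with the path of the affected .meta file, this returns the file contents as bytes once a read succeeds, or re-raises the last error if every attempt fails.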
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
Most of these issues have been fixed in oVirt 4.4, so I strongly recommend upgrading to the latest version.
Jurgen, please let me know if you are still facing the same issue with the latest version. If so, please attach the engine and vdsm logs.
Closing this bug because we are not able to reproduce it and there is no new information from the reporter. Please feel free to re-open this bug if you encounter the same issue with a newer version.
Sorry for the late answer. We have not updated to oVirt 4.4 yet and will not for at least a few months.