Created attachment 1677968 [details]
VDSM DEBUG Log from SPM

Description of problem:

The engine fails to activate/detach a storage domain when a second directory with a similar name exists in the same mountpoint:

[root@ovirt3 gluster1:_data__fast4]# ll
total 1
drwxrwxr-x. 5 vdsm kvm 48 Nov 10 13:42 578bca3d-6540-41cd-8e0e-9e3047026484
drwxrwxr-x. 5 vdsm kvm 48 Nov 10 13:42 578bca3d-6540-41cd-8e0e-9e3047026484-NEW

[root@ovirt3 gluster1:_data__fast]# ll
total 1
drwxrwxrwx. 6 vdsm kvm 59 Apr 11 06:30 396604d9-2a9e-49cd-9563-fdc79981f67b
drwxr-xr-x. 5 vdsm kvm 48 Nov 19 21:32 396604d9-2a9e-49cd-9563-fdc79981f67b-OLD

Version-Release number of selected component (if applicable):

vdsm-api-4.30.43-1.el7.noarch
vdsm-4.30.43-1.el7.x86_64
vdsm-http-4.30.43-1.el7.noarch
vdsm-hook-openstacknet-4.30.43-1.el7.noarch
vdsm-yajsonrpc-4.30.43-1.el7.noarch
vdsm-jsonrpc-4.30.43-1.el7.noarch
vdsm-hook-fcoe-4.30.43-1.el7.noarch
vdsm-hook-vhostmd-4.30.43-1.el7.noarch
vdsm-network-4.30.43-1.el7.x86_64
vdsm-hook-vmfex-dev-4.30.43-1.el7.noarch
vdsm-hook-ethtool-options-4.30.43-1.el7.noarch
vdsm-python-4.30.43-1.el7.noarch
vdsm-client-4.30.43-1.el7.noarch
vdsm-common-4.30.43-1.el7.noarch
vdsm-gluster-4.30.43-1.el7.x86_64

How reproducible:

Always - 4 storage domains were affected.

Steps to Reproduce:
1. Set a storage domain into maintenance.
2. Create a copy of its directory named <SD uuid>-NEW.
3. Swap the old and the new directories:
   mv <SD uuid> <SD uuid>-OLD
   mv <SD uuid>-NEW <SD uuid>
4. Try to activate the domain.

Actual results:

Activating the domain fails with this traceback:

2020-04-11 06:27:17,434+0300 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='8ad73c21-cc9c-44ec-ab83-ced16d0bf748') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in activateStorageDomain
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1261, in activateStorageDomain
    pool.activateSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1138, in activateSD
    dom = sdCache.produce(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
    domain.getRealDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
    return findMethod(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'396604d9-2a9e-49cd-9563-fdc79981f67b',)

Expected results:

The engine should take the <SD uuid> from the DB and look for an EXACT match, not for anything that merely contains the <SD uuid>.

Additional info:

Debug logs attached.
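A minimal sketch of the exact-match lookup this report asks for (hypothetical code, not the actual engine/vdsm implementation; the function name and example paths are made up for illustration):

    import os

    def find_domain_dir(mountpoint, sd_uuid):
        # Build the one path the UUID implies and test it, instead of
        # globbing for anything UUID-shaped; a sibling directory such
        # as <sd_uuid>-OLD can never satisfy this lookup.
        path = os.path.join(mountpoint, sd_uuid, "dom_md")
        return path if os.path.isdir(path) else None

    print(find_domain_dir(
        "/rhev/data-center/mnt/glusterSD/gluster1:_data__fast",
        "396604d9-2a9e-49cd-9563-fdc79981f67b"))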
This is caused by searching for storage domain directories using this pattern:

    /rhev/data-center/mnt/glusterSD/*-*-*-*-*/dom_md

We expect to find exactly one item per mountpoint, since this is the directory structure we create. We don't support user-created files or directories in a storage domain mount.

To perform the operations described in comment 0, you can use this directory structure instead:

    578bca3d-6540-41cd-8e0e-9e3047026484
    new/578bca3d-6540-41cd-8e0e-9e3047026484

    396604d9-2a9e-49cd-9563-fdc79981f67b
    old/396604d9-2a9e-49cd-9563-fdc79981f67b

But I would avoid this. Instead, you can do it on the server side. Assuming the directory structure on the NFS server is:

    /export/
        data_fast4/
            578bca3d-6540-41cd-8e0e-9e3047026484/
                dom_md/

create the copy of the domain at:

    /export/
        data_fast4/
            578bca3d-6540-41cd-8e0e-9e3047026484/
                dom_md/
        data_fast4-new/
            578bca3d-6540-41cd-8e0e-9e3047026484/
                dom_md/

With this, oVirt cannot be affected, since it does not see /export/data_fast4-new, but the file system operations on the server side are the same.

Closing since this is not a bug.
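For illustration, the over-match described above can be shown with a few lines of Python (a sketch, not the vdsm source; the mountpoint path is taken from comment 0):

    import glob
    import os

    mountpoint = "/rhev/data-center/mnt/glusterSD/gluster1:_data__fast"

    # "*" also matches hyphens, so "<uuid>-OLD" matches "*-*-*-*-*"
    # exactly like "<uuid>" does: the glob yields two dom_md paths
    # where exactly one is expected, and activation fails.
    matches = glob.glob(os.path.join(mountpoint, "*-*-*-*-*", "dom_md"))
    print(matches)

    # A nested copy such as old/<uuid>/dom_md is invisible to this
    # pattern: "old" contains no hyphens and the copy sits one level
    # too deep, so the glob still finds a single item per mountpoint.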