Description of problem:
This was reported by an upstream user. He was using an NFS ISO storage domain and, by mistake, manually created a symlink to an ISO file outside that storage domain, on the host that exports the NFS share for the ISO storage domain. This is of course wrong: the NFS server does not follow the symlink, so the NFS client (our VDSM host) tries to resolve it locally, and unless the same file exists at the same path on every host, the result is a broken symlink.

This leads to the actual bug: if just one file is not accessible (the broken symlink in this case), the whole StorageDomain.getFileStats call raises an exception, the whole task is aborted, and the engine shows an empty storage domain even though other files are there. Catching that exception and showing the other valid files would probably make it more usable.

Currently this appears in the events area:
Refresh image list failed for domain(s): testiso (All file type). Please check domain activity.

Thread-40273::DEBUG::2015-11-11 10:03:44,309::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StorageDomain.getFileStats' in bridge with {u'caseSensitive': False, u'pattern': u'*.iso', u'storagedomainID': u'1ffcd41b-b204-456f-be0e-1b22cd94da9f'}
Thread-40273::DEBUG::2015-11-11 10:03:44,310::task::595::Storage.TaskManager.Task::(_updateState) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::moving from state init -> state preparing
Thread-40273::INFO::2015-11-11 10:03:44,310::logUtils::44::dispatcher::(wrapper) Run and protect: getFileStats(sdUUID=u'1ffcd41b-b204-456f-be0e-1b22cd94da9f', pattern=u'*.iso', caseSensitive=False, options=None)
Thread-40273::DEBUG::2015-11-11 10:03:44,310::resourceManager::198::Storage.ResourceManager.Request::(__init__) ResName=`Storage.1ffcd41b-b204-456f-be0e-1b22cd94da9f`ReqID=`08685966-75bf-471f-b911-a5a43430f9f7`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '2317' at 'getFileStats'
Thread-40273::DEBUG::2015-11-11 10:03:44,311::resourceManager::542::Storage.ResourceManager::(registerResource) Trying to register resource 'Storage.1ffcd41b-b204-456f-be0e-1b22cd94da9f' for lock type 'shared'
Thread-40273::DEBUG::2015-11-11 10:03:44,311::resourceManager::601::Storage.ResourceManager::(registerResource) Resource 'Storage.1ffcd41b-b204-456f-be0e-1b22cd94da9f' is free. Now locking as 'shared' (1 active user)
Thread-40273::DEBUG::2015-11-11 10:03:44,311::resourceManager::238::Storage.ResourceManager.Request::(grant) ResName=`Storage.1ffcd41b-b204-456f-be0e-1b22cd94da9f`ReqID=`08685966-75bf-471f-b911-a5a43430f9f7`::Granted request
Thread-40273::DEBUG::2015-11-11 10:03:44,311::task::827::Storage.TaskManager.Task::(resourceAcquired) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::_resourcesAcquired: Storage.1ffcd41b-b204-456f-be0e-1b22cd94da9f (shared)
Thread-40273::DEBUG::2015-11-11 10:03:44,311::task::993::Storage.TaskManager.Task::(_decref) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::ref 1 aborting False
Thread-40273::ERROR::2015-11-11 10:03:44,313::task::866::Storage.TaskManager.Task::(_setError) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2324, in getFileStats
    caseSensitive=caseSensitive)
  File "/usr/share/vdsm/storage/fileSD.py", line 271, in getFileList
    filesList = self.oop.simpleWalk(basedir)
  File "/usr/share/vdsm/storage/outOfProcess.py", line 373, in simpleWalk
    if osPath.isdir(fullpath) and not osPath.islink(fullpath):
  File "/usr/share/vdsm/storage/outOfProcess.py", line 286, in isdir
    res = self._iop.stat(path)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 414, in stat
    resdict = self._sendCommand("stat", {"path": path}, self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 391, in _sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 13] Permission denied
Thread-40273::DEBUG::2015-11-11 10:03:44,338::task::885::Storage.TaskManager.Task::(_run) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::Task._run: a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea (u'1ffcd41b-b204-456f-be0e-1b22cd94da9f', u'*.iso', False) {} failed - stopping task
Thread-40273::DEBUG::2015-11-11 10:03:44,338::task::1217::Storage.TaskManager.Task::(stop) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::stopping in state preparing (force False)
Thread-40273::DEBUG::2015-11-11 10:03:44,338::task::993::Storage.TaskManager.Task::(_decref) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::ref 1 aborting True
Thread-40273::INFO::2015-11-11 10:03:44,338::task::1171::Storage.TaskManager.Task::(prepare) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::aborting: Task is aborted: u'[Errno 13] Permission denied' - code 100
Thread-40273::DEBUG::2015-11-11 10:03:44,338::task::1176::Storage.TaskManager.Task::(prepare) Task=`a1b68de1-2bdb-4aac-a40f-c980d3ecc5ea`::Prepare: aborted: [Errno 13] Permission denied

Version-Release number of selected component (if applicable):
4.17.10

How reproducible:
100%

Steps to Reproduce:
1. Create an NFS ISO storage domain and add a valid ISO to it
2. Manually create a broken symlink within that NFS share
3. Try to refresh the ISO storage domain images list

Actual results:
The whole storage domain appears empty; the other, valid images are hidden.

Expected results:
A specific error is reported for the broken image, specifying its name; the other valid images remain usable.

Additional info:
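
The expected behaviour above could look roughly like the following. This is only a minimal sketch, not VDSM's actual code: it uses plain os calls instead of the out-of-process ioprocess helper seen in the traceback, and the function name list_files_tolerant and the shape of the returned dictionaries are hypothetical. The idea is to catch the OSError per entry, remember which path failed, and keep walking.

```python
import fnmatch
import os
import stat


def list_files_tolerant(basedir, pattern="*.iso", case_sensitive=False):
    """Walk basedir and return (stats, errors).

    Hypothetical helper, not part of VDSM. 'stats' maps each readable
    regular file matching 'pattern' to its size and ctime; 'errors'
    maps each path that could not be read to the OS error message.
    """
    stats = {}
    errors = {}
    pending = [basedir]
    while pending:
        current = pending.pop()
        try:
            names = sorted(os.listdir(current))
        except OSError as e:
            # An unreadable directory is reported, not fatal.
            errors[current] = e.strerror or str(e)
            continue
        for name in names:
            fullpath = os.path.join(current, name)
            try:
                # lstat() does not follow symlinks, so a symlinked
                # directory is never descended into.
                if stat.S_ISDIR(os.lstat(fullpath).st_mode):
                    pending.append(fullpath)
                    continue
                if case_sensitive:
                    matched = fnmatch.fnmatchcase(name, pattern)
                else:
                    matched = fnmatch.fnmatchcase(name.lower(), pattern.lower())
                if not matched:
                    continue
                # stat() follows symlinks and raises OSError on a
                # broken or inaccessible one.
                st = os.stat(fullpath)
            except OSError as e:
                # Record the failing entry by name and keep going.
                errors[fullpath] = e.strerror or str(e)
                continue
            if stat.S_ISREG(st.st_mode):
                stats[fullpath] = {"size": st.st_size, "ctime": int(st.st_ctime)}
    return stats, errors
```

With something along these lines, the caller could return the valid images to the engine and use the errors mapping to report each broken entry by name, instead of aborting the whole task on the first failure.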
Liron, haven't you looked into a similar case in the past?
This is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1130024
We could consider improving the user experience in that case (for example, by displaying a meaningful message). On the other hand, this issue is fairly rare. Allon, up to you.
*** This bug has been marked as a duplicate of bug 1130024 ***