Created attachment 761265
logs (vdsm, sanlock, engine, server, /var/log/messages...)

Description of problem:
Detaching of an NFS ISO storage domain attached to a localfs data center failed with the following error in the vdsm.log

Thread-724::ERROR::2013-06-14 03:47:01,974::task::850::TaskManager.Task::(_setError) Task=`8a00d0fe-11d4-4399-b868-97af5d8b64e5`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 41, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 732, in detachStorageDomain
    pool.detachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1006, in detachSD
    dom.acquireClusterLock(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 474, in acquireClusterLock
    self._clusterLock.acquire(hostID)
  File "/usr/share/vdsm/storage/clusterlock.py", line 111, in acquire
    raise se.AcquireLockFailure(self._sdUUID, rc, out, err)
AcquireLockFailure: Cannot obtain lock: "id=827f4f3e-c396-4f23-89e7-71ecbe99d3b5, rc=1, out=[], err=['panic: [7653] handler: IO op too long: (Success)']"

engine.log:
2013-06-14 03:46:57,614 INFO [org.ovirt.engine.core.bll.storage.DetachStorageDomainFromPoolCommand] (ajp-/127.0.0.1:8702-7) [405e320f] Detach storage domain: after connect
2013-06-14 03:46:57,615 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DetachStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-7) [405e320f] START, DetachStorageDomainVDSCommand( storagePoolId = 28345cb6-b7cb-4763-ae5c-6ee6e654aebf, ignoreFailoverLimit = false, compatabilityVersion = null, storageDomainId = 827f4f3e-c396-4f23-89e7-71ecbe99d3b5, masterDomainId = 00000000-0000-0000-0000-000000000000, masterVersion = 2, force = false), log id: 62c53e6a
2013-06-14 03:47:01,988 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp-/127.0.0.1:8702-7) [405e320f] Failed in DetachStorageDomainVDS method
2013-06-14 03:47:01,990 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp-/127.0.0.1:8702-7) [405e320f] Error code AcquireLockFailure and error message IRSGenericException: IRSErrorException: Failed to DetachStorageDomainVDS, error = Cannot obtain lock: "id=827f4f3e-c396-4f23-89e7-71ecbe99d3b5, rc=1, out=[], err=['panic: [7653] handler: IO op too long: (Success)']"
2013-06-14 03:47:01,999 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (ajp-/127.0.0.1:8702-7) [405e320f] IrsBroker::Failed::DetachStorageDomainVDS due to: IRSErrorException: IRSGenericException: IRSErrorException: Failed to DetachStorageDomainVDS, error = Cannot obtain lock: "id=827f4f3e-c396-4f23-89e7-71ecbe99d3b5, rc=1, out=[], err=['panic: [7653] handler: IO op too long: (Success)']"

Version-Release number of selected component (if applicable):
sf18

How reproducible:
Happened in automated tests:
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.2/job/3.2-storage_sanity-localfs-rest/149
I cannot reproduce it now, so it might have been only a temporary problem with the ISO domain - if you think it is the case, please just close the bug.

Steps to Reproduce:
1. create a localfs data center with two local storage domains
2. attach NFS ISO storage domain to the data center and activate it
4. deactivate non-master storage domains (data and ISO)
5. try to detach non-master storage domains

Actual results:
Detach of the ISO domain failed

Expected results:
Domain should be detached

Additional info:

This has nothing to do with SANLock. The underlying problem is simply that the host cannot change the metadata of the ISO domain without first obtaining the lock, and here it failed to obtain it. Either the domain was in use by another DC at the same time, or there was a storage issue. Either way, this is indeed transient.
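
Since the failure looks transient, an automated test could retry the detach a few times before reporting it. The following is only a minimal sketch of that idea, not vdsm or engine code: `detach_with_retry`, the `detach` callable, and the local `AcquireLockFailure` class are hypothetical stand-ins introduced for illustration, and the attempt count and delay are arbitrary assumptions.

```python
import time


class AcquireLockFailure(Exception):
    """Illustrative stand-in for the error seen in vdsm.log (not the vdsm class)."""


def detach_with_retry(detach, attempts=3, delay=5.0, sleep=time.sleep):
    """Call detach() until it succeeds or the attempts are used up.

    detach is a hypothetical zero-argument callable that issues the detach
    request and raises AcquireLockFailure when the lock cannot be obtained.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return detach()
        except AcquireLockFailure:
            if attempt == attempts:
                # Still failing after every attempt, so probably not transient.
                raise
            # Give the other lock holder or the storage time to recover.
            sleep(delay)
```

If the last attempt still fails, the exception propagates unchanged, so a persistent problem is reported rather than hidden by the retries.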