Bug 974558 - Detaching NFS ISO domain from localfs data center fails with 'panic: [7653] handler: IO op too long: (Success)'
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.2.0
Assignee: Nobody's working on this, feel free to take it
QA Contact:
URL:
Whiteboard: storage
Depends On:
Blocks:
Reported: 2013-06-14 12:30 UTC by Katarzyna Jachim
Modified: 2016-02-10 18:38 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-06-20 11:48:40 UTC
oVirt Team: Storage
Target Upstream Version:


Attachments
logs (vdsm, sanlock, engine, server, /var/log/messages...) (3.59 MB, application/x-compressed-tar)
2013-06-14 12:30 UTC, Katarzyna Jachim

Description Katarzyna Jachim 2013-06-14 12:30:23 UTC
Created attachment 761265
logs (vdsm, sanlock, engine, server, /var/log/messages...)

Description of problem:

Detaching an NFS ISO storage domain attached to a localfs data center failed with the following error in vdsm.log:

Thread-724::ERROR::2013-06-14 03:47:01,974::task::850::TaskManager.Task::(_setError) Task=`8a00d0fe-11d4-4399-b868-97af5d8b64e5`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 41, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 732, in detachStorageDomain
    pool.detachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1006, in detachSD
    dom.acquireClusterLock(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 474, in acquireClusterLock
    self._clusterLock.acquire(hostID)
  File "/usr/share/vdsm/storage/clusterlock.py", line 111, in acquire
    raise se.AcquireLockFailure(self._sdUUID, rc, out, err)
AcquireLockFailure: Cannot obtain lock: "id=827f4f3e-c396-4f23-89e7-71ecbe99d3b5, rc=1, out=[], err=['panic: [7653] handler: IO op too long: (Success)']"

engine.log:
2013-06-14 03:46:57,614 INFO  [org.ovirt.engine.core.bll.storage.DetachStorageDomainFromPoolCommand] (ajp-/127.0.0.1:8702-7) [405e320f]  Detach storage domain: after connect
2013-06-14 03:46:57,615 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DetachStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-7) [405e320f] START, DetachStorageDomainVDSCommand( storagePoolId = 28345cb6-b7cb-4763-ae5c-6ee6e654aebf, ignoreFailoverLimit = false, compatabilityVersion = null, storageDomainId = 827f4f3e-c396-4f23-89e7-71ecbe99d3b5, masterDomainId = 00000000-0000-0000-0000-000000000000, masterVersion = 2, force = false), log id: 62c53e6a
2013-06-14 03:47:01,988 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp-/127.0.0.1:8702-7) [405e320f] Failed in DetachStorageDomainVDS method
2013-06-14 03:47:01,990 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp-/127.0.0.1:8702-7) [405e320f] Error code AcquireLockFailure and error message IRSGenericException: IRSErrorException: Failed to DetachStorageDomainVDS, error = Cannot obtain lock: "id=827f4f3e-c396-4f23-89e7-71ecbe99d3b5, rc=1, out=[], err=['panic: [7653] handler: IO op too long: (Success)']"
2013-06-14 03:47:01,999 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (ajp-/127.0.0.1:8702-7) [405e320f] IrsBroker::Failed::DetachStorageDomainVDS due to: IRSErrorException: IRSGenericException: IRSErrorException: Failed to DetachStorageDomainVDS, error = Cannot obtain lock: "id=827f4f3e-c396-4f23-89e7-71ecbe99d3b5, rc=1, out=[], err=['panic: [7653] handler: IO op too long: (Success)']"
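When triaging many runs, it can help to pull the domain ID, return code, and stderr text out of the "Cannot obtain lock" message programmatically. The following is only a minimal sketch: `parse_lock_failure` and its regular expression are hypothetical helpers written against the message format shown in the logs above, not part of vdsm or the engine.

```python
import re

# Hypothetical pattern, derived only from the message format seen in the
# vdsm.log and engine.log excerpts in this report (an assumption).
LOCK_FAILURE_RE = re.compile(
    r'Cannot obtain lock: "id=(?P<sd_uuid>[0-9a-f-]+), rc=(?P<rc>-?\d+), '
    r'out=\[(?P<out>.*?)\], err=\[(?P<err>.*)\]"'
)


def parse_lock_failure(line):
    """Return the fields of an AcquireLockFailure message, or None."""
    match = LOCK_FAILURE_RE.search(line)
    if match is None:
        return None
    return {
        "sd_uuid": match.group("sd_uuid"),
        "rc": int(match.group("rc")),
        # Drop the quotes that wrap the single stderr entry.
        "err": match.group("err").strip("'"),
    }
```

Lines that do not contain the message return `None`, so the helper can be applied to every line of a log file and the results filtered.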



Version-Release number of selected component (if applicable): sf18


How reproducible: Happened in automated tests:
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.2/job/3.2-storage_sanity-localfs-rest/149

I cannot reproduce it now, so it might have been only a temporary problem with the ISO domain. If you think that is the case, please just close the bug.


Steps to Reproduce:
1. create a localfs data center with two local storage domains
2. attach NFS ISO storage domain to the data center and activate it
3. deactivate non-master storage domains (data and ISO)
4. try to detach non-master storage domains

Actual results:
Detach of the ISO domain failed

Expected results:
Domain should be detached

Additional info:

Comment 1 Ayal Baron 2013-06-20 11:48:40 UTC
This has nothing to do with SANLock. The underlying problem is simply that the host cannot change the metadata of the ISO domain without first obtaining the lock. The domain may have been in use by another DC at the same time, which caused the lock acquisition to fail, or there was a storage issue. Either way, this is indeed transient.
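
Since the failure is described as transient, an automated test could retry the detach a few times before reporting an error. The sketch below is only an illustration under stated assumptions: `retry_on_lock_failure` is a hypothetical helper, the `AcquireLockFailure` class is a local stand-in for the vdsm exception named in the traceback, and `operation` is whatever callable the test uses to issue the detach.

```python
import time


class AcquireLockFailure(Exception):
    """Local stand-in for the vdsm exception of the same name (assumption)."""


def retry_on_lock_failure(operation, attempts=3, delay=1.0, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on lock failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except AcquireLockFailure as exc:
            last_error = exc
            # Wait before the next try, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay * (2 ** attempt))
    raise last_error
```

Any other exception propagates immediately, so only lock acquisition failures are treated as retryable.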

