Created attachment 1257813 [details] Logs_On_Google_Drive Description of problem: We noticed inconsistent behavior with re-creating an SD of the same ISCSI luns used previously (after detachment). VDSM fails to add/dettach ISCSI storage domain saying " getDmId deviceMultipathName) OSError: [Errno 19] Could not find dm device named `[unknown]`" ### ! Note ! - Logs https://drive.google.com/drive/folders/0B8V1DXeGhPPWRE5PNmtJYmhEQ3M?usp=sharing ### Version-Release number of selected component (if applicable): Engine: ovirt-engine-4.0.7.1-0.1.el7ev.noarch VDSM Version: vdsm-4.19.4-1.el7ev.x86_64 How reproducible: Consistent Steps to Reproduce: Create multi lun iscsi storage domain successfully 1) Dettach above sd domain and remove domain successfully, now vdsm log showed this error [1] 2) Now with storage domain no longer showing/existing in Webadmin UI I then re-created the storage domain via API and the storage domain attachment phase fails (before it passed) 3) Webadmin UI domains then shows this SD as detached; when attempting manual attachment / activation it fails with [2] Actual results: SD not connected/dettached properly. [1] Error after dettaching 2017-02-21 15:12:30,955 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='54963613-a35f-4cb3-8816-ba88395f7439') Unexpected error (task:870) Traceback (most recent call last): File "/usr/share/vdsm/storage/task.py", line 877, in _run return fn(*args, **kargs) File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper res = f(*args, **kwargs) File "/usr/share/vdsm/storage/hsm.py", line 847, in detachStorageDomain pool.detachSD(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 1012, in detachSD dom = sdCache.produce(sdUUID) File "/usr/share/vdsm/storage/sdc.py", line 112, in produce domain.getRealDomain() File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain return self._cache._realProduce(self._sdUUID) File "/usr/share/vdsm/storage/sdc.py", line 136, in _realProduce domain = self._findDomain(sdUUID) File "/usr/share/vdsm/storage/sdc.py", line 155, in _findDomain dom = findMethod(sdUUID) File "/usr/share/vdsm/storage/blockSD.py", line 1599, in findDomain return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID)) File "/usr/share/vdsm/storage/blockSD.py", line 958, in __init__ lvm.checkVGBlockSizes(sdUUID, (self.logBlkSize, self.phyBlkSize)) File "/usr/share/vdsm/storage/lvm.py", line 1124, in checkVGBlockSizes _checkpvsblksize(pvs, vgBlkSize) File "/usr/share/vdsm/storage/lvm.py", line 1101, in _checkpvsblksize pvBlkSize = _getpvblksize(pv) File "/usr/share/vdsm/storage/lvm.py", line 1095, in _getpvblksize dev = devicemapper.getDmId(os.path.basename(pv)) File "/usr/lib/python2.7/site-packages/vdsm/storage/devicemapper.py", line 42, in getDmId deviceMultipathName) OSError: [Errno 19] Could not find dm device named `[unknown]` 2017-02-21 15:12:30,993 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='54963613-a35f-4cb3-8816-ba88395f7439') aborting: Task is aborted: u'[Errno 19] Could not find dm device named `[unknown]`' - code 100 (task:1175) 2017-02-21 15:12:30,994 ERROR (jsonrpc/7) [storage.Dispatcher] [Errno 19] Could not find dm device named `[unknown]` (dispatcher:80) Traceback (most recent call last): File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper result = ctask.prepare(func, *args, **kwargs) File "/usr/share/vdsm/storage/task.py", line 105, in wrapper return m(self, *a, **kw) File "/usr/share/vdsm/storage/task.py", line 1183, in prepare raise self.error OSError: [Errno 19] Could not find dm device named `[unknown]` [2] Failed Upon re-attachment attempt 2017-02-21 15:17:01,277 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='7a74fefc-d2f6-44e4-8ab0-57ae2f02077f') Unexpected error (task:870) Traceback (most recent call last): File "/usr/share/vdsm/storage/task.py", line 877, in _run return fn(*args, **kargs) File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper res = f(*args, **kwargs) File "/usr/share/vdsm/storage/hsm.py", line 1218, in activateStorageDomain pool.activateSD(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 1089, in activateSD dom = sdCache.produce(sdUUID) File "/usr/share/vdsm/storage/sdc.py", line 112, in produce domain.getRealDomain() File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain return self._cache._realProduce(self._sdUUID) File "/usr/share/vdsm/storage/sdc.py", line 136, in _realProduce domain = self._findDomain(sdUUID) File "/usr/share/vdsm/storage/sdc.py", line 155, in _findDomain dom = findMethod(sdUUID) File "/usr/share/vdsm/storage/blockSD.py", line 1599, in findDomain return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID)) File "/usr/share/vdsm/storage/blockSD.py", line 958, in __init__ lvm.checkVGBlockSizes(sdUUID, (self.logBlkSize, self.phyBlkSize)) File "/usr/share/vdsm/storage/lvm.py", line 1124, in checkVGBlockSizes _checkpvsblksize(pvs, vgBlkSize) File "/usr/share/vdsm/storage/lvm.py", line 1101, in _checkpvsblksize pvBlkSize = _getpvblksize(pv) File "/usr/share/vdsm/storage/lvm.py", line 1095, in _getpvblksize dev = devicemapper.getDmId(os.path.basename(pv)) File "/usr/lib/python2.7/site-packages/vdsm/storage/devicemapper.py", line 42, in getDmId deviceMultipathName) OSError: [Errno 19] Could not find dm device named `[unknown]` 2017-02-21 15:17:01,279 INFO (jsonrpc/4) [storage.TaskManager.Task] (Task='7a74fefc-d2f6-44e4-8ab0-57ae2f02077f') aborting: Task is aborted: u'[Errno 19] Could not find dm device named `[unknown]`' - code 100 (task:1175) 2017-02-21 15:17:01,281 ERROR (jsonrpc/4) [storage.Dispatcher] [Errno 19] Could not find dm device named `[unknown]` (dispatcher:80) Traceback (most recent call last): File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper result = ctask.prepare(func, *args, **kwargs) File "/usr/share/vdsm/storage/task.py", line 105, in wrapper return m(self, *a, **kw) File "/usr/share/vdsm/storage/task.py", line 1183, in prepare raise self.error OSError: [Errno 19] Could not find dm device named `[unknown]` [3] Error from Engine side 2017-02-21 15:17:02,285 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.ActivateStorageDomainVDSCommand] (org.ovirt.thread.pool-6-thread-38) [60b905af] Failed in 'ActivateStorageDomainVDS' method 2017-02-21 15:17:02,290 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-38) [60b905af] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM command failed: Error in storage domain action: (u'sdUUID=54913ad9-6641-4cb6-93f4-041662855a51, spUUID=e805f94f-228a-48fd-803f-950882473d1c',) 2017-02-21 15:17:02,290 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (org.ovirt.thread.pool-6-thread-38) [60b905af] IrsBroker::Failed::ActivateStorageDomainVDS: IRSGenericException: IRSErrorException: Failed to ActivateStorageDomainVDS, error = Error in storage domain action: (u'sdUUID=54913ad9-6641-4cb6-93f4-041662855a51, spUUID=e805f94f-228a-48fd-803f-950882473d1c',), code = 350 2017-02-21 15:17:02,306 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ActivateStorageDomainVDSCommand] (org.ovirt.thread.pool-6-thread-38) [60b905af] FINISH, ActivateStorageDomainVDSCommand, log id: 779f70dd 2017-02-21 15:17:02,306 ERROR [org.ovirt.engine.core.bll.storage.domain.ActivateStorageDomainCommand] (org.ovirt.thread.pool-6-thread-38) [60b905af] Command 'org.ovirt.engine.core.bll.storage.domain.ActivateStorageDomainCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.irsbroker.IRSErrorException: IRSGenericException: IRSErrorException: Failed to ActivateStorageDomainVDS, error = Error in storage domain action: (u'sdUUID=54913ad9-6641-4cb6-93f4-041662855a51, spUUID=e805f94f-228a-48fd-803f-950882473d1c',), code = 350 (Failed with error StorageDomainActionError and code 350) 2017-02-21 15:17:02,312 INFO [org.ovirt.engine.core.bll.storage.domain.ActivateStorageDomainCommand] (org.ovirt.thread.pool-6-thread-38) [60b905af] Command [id=9f49b4cd-bbb9-4ae7-82bc-150803f1c6f0]: Compensating CHANGED_STATUS_ONLY of org.ovirt.engine.core.common.businessentities.StoragePoolIsoMap; snapshot: EntityStatusSnapshot:{id='StoragePoolIsoMapId:{storagePoolId='e805f94f-228a-48fd-803f-950882473d1c', storageId='54913ad9-6641-4cb6-93f4-041662855a51'}', status='Maintenance'}. 2017-02-21 15:17:02,318 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-38) [60b905af] Correlation ID: 12af981e, Job ID: 1b36e5cd-b0b4-47dc-ad2b-45118d1ce2ad, Call Stack: null, Custom Event ID: -1, Message: Failed to activate Storage Domain iscsi_sd_test (Data Center DC_REAL_1) by admin@internal-authz 2017-02-21 15:17:16,307 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-6-thread-9) [] Adding domain '54913ad9-6641-4cb6-93f4-041662855a51' to the domains in maintenance cache Expected results: Successful domain attachment/dettachment. Additional info: ##Initial Feedback from Dev## This means that a PV is missing. In this case lvm return the special name "[unknown]" instead of the pv name. We have special code for detecting this so we can fail early in a clear way when a PV is missing, but it seems that we have some code path that do not detect this and instead of failing early with a clear message, try to get the device dm id which will obviously fail because "[unknown]" is not a dm device name. The root cause of course is a missing PV, if you look at the output of vgs and pvs commands you should see a "partial" vg and a pv with the name "[unknown]". Inside the logs located # https://drive.google.com/drive/folders/0B8V1DXeGhPPWRE5PNmtJYmhEQ3M?usp=sharing There is info.out folder showing PVs Vgs
This is most likely a dup of a dup of a dup. We don't treat dirty LUNs nicely. I'd ask storage team what the original bug about this is.
I think you are referring to bug 1253640
*** This bug has been marked as a duplicate of bug 1253640 ***