Bug 994582 - [vdsm] cannot activate/detach an ISO domain after first detachment failed
Summary: [vdsm] cannot activate/detach an ISO domain after first detachment failed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Federico Simoncelli
QA Contact: Elad
URL:
Whiteboard: storage
Depends On:
Blocks: 974849 rhev3.5beta 1156165
 
Reported: 2013-08-07 14:24 UTC by Elad
Modified: 2016-02-10 16:56 UTC (History)
11 users

Fixed In Version: vt1.3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-16 19:10:34 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
logs (2.01 MB, application/x-gzip)
2013-08-07 14:24 UTC, Elad
no flags

Description Elad 2013-08-07 14:24:28 UTC
Created attachment 783953 [details]
logs

Description of problem:
After vdsm crashed during detachStorageDomain, it is unable to activate/detach/remove the domain.

Version-Release number of selected component (if applicable):
vdsm-4.12.0-rc3.13.git06ed3cc.el6ev.x86_64

How reproducible:
Depends on the phase at which vdsm crashed during detachStorageDomain.

Steps to Reproduce:
On a data center (block, with one host in the cluster in my case) with a connected storage pool and an ISO domain (local in my case):
- detach the ISO domain from the pool and stop vdsm right after
- start vdsm and wait for host to take SPM
- try to activate/detach the domain 
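
For reference, a rough sketch of driving the same steps through vdsm's XML-RPC API instead of the webadmin UI. This is a sketch only: it assumes vdsm's vdscli.connect() helper and that detachStorageDomain takes (sdUUID, spUUID, msdUUID, masterVersion); the master-domain UUID and version are placeholders, the other UUIDs are the ones from this bug:

# Hedged sketch only -- assumes the vdscli helper and the verb signature above.
import subprocess
from vdsm import vdscli

ISO_SD = '8ccbd167-a48c-4afd-ab3f-a08f69492486'   # ISO domain from this bug
POOL = '072c2d76-8886-47ab-a1f9-d97f834115af'     # storage pool from this bug
MSD = '<master-domain-uuid>'                      # placeholder
MASTER_VER = 1                                    # placeholder

srv = vdscli.connect()  # local vdsm

# 1. detach the ISO domain from the pool and stop vdsm right after
srv.detachStorageDomain(ISO_SD, POOL, MSD, MASTER_VER)
subprocess.call(['service', 'vdsmd', 'stop'])

# 2. start vdsm and wait for the host to take SPM (driven by the engine)
subprocess.call(['service', 'vdsmd', 'start'])

# 3. retrying the detach now returns the StorageDomainNotInPool error below
print(srv.detachStorageDomain(ISO_SD, POOL, MSD, MASTER_VER))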

Actual results:
vdsm fails to perform those actions:

Thread-427::ERROR::2013-08-07 17:11:32,361::task::850::TaskManager.Task::(_setError) Task=`ee8d1d86-54d7-48ee-9f4d-a8b52ab2890f`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 783, in detachStorageDomain
    pool.detachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1048, in detachSD
    self.validateAttachedDomain(dom)
  File "/usr/share/vdsm/storage/sp.py", line 515, in validateAttachedDomain
    raise se.StorageDomainNotInPool(self.spUUID, dom.sdUUID)
StorageDomainNotInPool: Storage domain not in pool: 'domain=8ccbd167-a48c-4afd-ab3f-a08f69492486, pool=072c2d76-8886-47ab-a1f9-d97f834115af'



Expected results:
After a failure in detachStorageDomain, vdsm should roll back or roll forward.

Additional info:
logs

Comment 1 Ayal Baron 2013-09-01 13:22:56 UTC
The domain itself has been detached (the domain MD has been updated), so activateStorageDomain naturally cannot succeed.
However, a full detach consists of 2 operations:
1. update the domain metadata (remove pool=...)
2. update the master domain
To update the master domain in this state you need to call forcedDetachSD.
The SPM cannot decide to do this on its own, as it is not necessarily clear that this is the intent (vs. attach, for example).
This is one of those problems that would disappear once we no longer have a pool.
Anyway, if this can be fixed, it is only on the engine side.
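
To make the sequencing concrete, here is a toy model (not the actual vdsm code; detach_sd/force_detach_sd are invented names mirroring detachSD/forcedDetachSD) of the two updates and of the state a crash between them leaves behind:

# Toy model only -- real vdsm updates on-disk metadata, not these dicts.

class StorageDomainNotInPool(Exception):
    pass

def detach_sd(pool, dom):
    # A regular detach first validates the attachment (validateAttachedDomain).
    if dom['pool'] != pool['uuid']:
        raise StorageDomainNotInPool(pool['uuid'], dom['uuid'])
    dom['pool'] = None                    # 1. update domain (remove pool=...)
    pool['domains'].remove(dom['uuid'])   # 2. update master domain

def force_detach_sd(pool, dom):
    # forcedDetachSD-style recovery: fix the master domain without the check.
    if dom['uuid'] in pool['domains']:
        pool['domains'].remove(dom['uuid'])
    dom['pool'] = None

# Crash happened after step 1 but before step 2:
pool = {'uuid': '072c2d76-8886-47ab-a1f9-d97f834115af',
        'domains': ['8ccbd167-a48c-4afd-ab3f-a08f69492486']}
dom = {'uuid': '8ccbd167-a48c-4afd-ab3f-a08f69492486', 'pool': None}

try:
    detach_sd(pool, dom)        # fails exactly like the traceback above
except StorageDomainNotInPool:
    force_detach_sd(pool, dom)  # only this completes the detach

assert dom['uuid'] not in pool['domains']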

Comment 2 Allon Mureinik 2014-05-07 09:00:31 UTC
Fede, shouldn't the memory-based pool backend take care of this one too?

Comment 3 Federico Simoncelli 2014-06-26 08:18:35 UTC
(In reply to Allon Mureinik from comment #2)
> Fede, shouldn't the memory-based pool backend take care of this one too?

Yes, this will be automatically fixed by the memory based pool backend.

Comment 4 Allon Mureinik 2014-06-26 09:55:56 UTC
(In reply to Federico Simoncelli from comment #3)
> (In reply to Allon Mureinik from comment #2)
> > Fede, shouldn't the memory-based pool backend take care of this one too?
> 
> Yes, this will be automatically fixed by the memory based pool backend.

According to this statement, the fix for bug 1058022 should have solved this.
Moving to MODIFIED.

Comment 5 Elad 2014-08-26 12:10:12 UTC
After vdsm fails during detachment of an ISO domain, once vdsm starts again and takes SPM, detaching the ISO domain again succeeds.

vdsm gets the status of the domains in the pool while reconnecting to it. If the domain had already moved to detached, it doesn't appear in the domainsMap in connectStoragePool; if it wasn't detached, it appears as: '3e85ed9c-16d8-4e76-89cc-d533bcd41b79': 'attached'

Re-attaching the domain to the pool again succeeds.

Verified using upstream ovirt-3.5 RC1.1
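
A rough sketch of the behaviour described above, assuming the engine hands the domainsMap to connectStoragePool with the memory-based pool backend (connect_storage_pool and pool_cache are made-up names for illustration):

# Illustration only -- with the memory-based backend the engine is the source
# of truth for pool membership, so the map it passes in is taken as-is.
def connect_storage_pool(pool_cache, sp_uuid, domains_map):
    # A domain whose detach half-completed before the crash is simply not in
    # domains_map, so no stale attachment is rebuilt and it can be re-attached.
    pool_cache[sp_uuid] = dict(domains_map)
    return pool_cache[sp_uuid]

cache = {}

# Domain was not detached before the crash -> still reported as attached:
print(connect_storage_pool(cache, '072c2d76-8886-47ab-a1f9-d97f834115af',
                           {'3e85ed9c-16d8-4e76-89cc-d533bcd41b79': 'attached'}))

# Domain already moved to detached -> not passed in at all:
print(connect_storage_pool(cache, '072c2d76-8886-47ab-a1f9-d97f834115af', {}))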

Comment 6 Allon Mureinik 2015-02-16 19:10:34 UTC
RHEV-M 3.5.0 has been released, closing this bug.

