Bug 853748

Summary: [vdsm] unable to detach block storage domain in partial state
Product: Red Hat Enterprise Linux 6
Reporter: vvyazmin <vvyazmin>
Component: vdsm
Assignee: Dan Kenigsberg <dkenigsb>
Status: CLOSED NOTABUG
QA Contact: Haim <hateya>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.3
CC: abaron, bazulay, hateya, iheim, lpeer, yeylon, ykaul
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-09-03 06:37:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description          Flags
## Logs vdsm, rhevm  none

Description vvyazmin@redhat.com 2012-09-02 15:39:45 UTC
Created attachment 609124 [details]
## Logs vdsm, rhevm

Description of problem:
Unable to detach a block storage domain that is in a partial state (one of its PVs is missing).


Version-Release number of selected component (if applicable):
Verified on RHEVM 3.1 - SI16

RHEVM: rhevm-3.1.0-14.el6ev.noarch
VDSM: vdsm-4.9.6-31.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.4.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.1.x86_64
SANLOCK: sanlock-2.3-3.el6_3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create an iSCSI DC with 2 hosts.
2. Create the first Storage Domain (Master), SD-01.
3. Create a second domain, SD-02, from 4 LUNs (example: LUN-01, LUN-02, LUN-03, LUN-04).
4. Create a third domain, SD-03, from 4 LUNs, one of which is already used in SD-02 (example: LUN-05, LUN-06, LUN-03, LUN-07), and accept the warning.
5. Put SD-03 into “Maintenance” and then “Detach” it.
6. Put SD-02 into “Maintenance” and then “Detach” it.
  
Actual results:
“Detach” of the second domain, SD-02, fails.
During the “Detach” action on SD-02, the SPM role moves from host to host.

Expected results:
“Detach” of the second domain, SD-02, succeeds.

Additional info:

[root@cougar08 ~]# vgs
  Couldn't find device with uuid JsoOxl-4axo-oHAl-B18k-Q7kl-RTdj-KsV2kf.
  VG                                   #PV #LV #SN Attr   VSize   VFree 
  d0727c0c-4542-4319-9132-d0f0017f82be   4   6   0 wz--n-  18.50g 14.62g
  da68e840-61b2-4e7a-b4b3-92810b24a4af   4   6   0 wz-pn-  18.50g 14.62g
  f386dedb-24f9-465c-8718-099d026fbf8c   1   9   0 wz--n-  49.62g 36.75g
  vg0                                    1   3   0 wz--n- 465.27g     0 
[root@cougar08 ~]# vgs  d0727c0c-4542-4319-9132-d0f0017f82be da68e840-61b2-4e7a-b4b3-92810b24a4af -o+pv_name
  Couldn't find device with uuid JsoOxl-4axo-oHAl-B18k-Q7kl-RTdj-KsV2kf.
  VG                                   #PV #LV #SN Attr   VSize  VFree  PV                           
  d0727c0c-4542-4319-9132-d0f0017f82be   4   6   0 wz--n- 18.50g 14.62g /dev/mapper/3514f0c56958002bb
  d0727c0c-4542-4319-9132-d0f0017f82be   4   6   0 wz--n- 18.50g 14.62g /dev/mapper/3514f0c56958002c9
  d0727c0c-4542-4319-9132-d0f0017f82be   4   6   0 wz--n- 18.50g 14.62g /dev/mapper/3514f0c56958002bc
  d0727c0c-4542-4319-9132-d0f0017f82be   4   6   0 wz--n- 18.50g 14.62g /dev/mapper/3514f0c56958002c2
  da68e840-61b2-4e7a-b4b3-92810b24a4af   4   6   0 wz-pn- 18.50g 14.62g /dev/mapper/3514f0c56958002ba
  da68e840-61b2-4e7a-b4b3-92810b24a4af   4   6   0 wz-pn- 18.50g 14.62g /dev/mapper/3514f0c56958002c1
  da68e840-61b2-4e7a-b4b3-92810b24a4af   4   6   0 wz-pn- 18.50g 14.62g unknown device               
  da68e840-61b2-4e7a-b4b3-92810b24a4af   4   6   0 wz-pn- 18.50g 14.62g /dev/mapper/3514f0c56958002bd
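
Note on the output above: the `p` in the fourth position of the Attr column (`wz-pn-`) marks the VG as partial, meaning at least one of its PVs is missing. A minimal Python sketch that flags partial VGs, assuming `vgs` text output in the layout shown above. The function name `find_partial_vgs` is hypothetical and is not part of vdsm:

```python
def find_partial_vgs(vgs_output):
    """Return the set of VG names whose Attr column marks them as partial.

    Assumes the column order shown above: VG #PV #LV #SN Attr VSize VFree.
    """
    partial = set()
    for line in vgs_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and short warning lines.
        if len(fields) < 7 or fields[0] == "VG":
            continue
        # Data rows have a numeric #PV column; other text does not.
        if not fields[1].isdigit():
            continue
        name, attr = fields[0], fields[4]
        # Fourth attribute character is 'p' when the VG is partial.
        if len(attr) >= 4 and attr[3] == "p":
            partial.add(name)
    return partial
```

Applied to the first `vgs` listing above, this would report only da68e840-61b2-4e7a-b4b3-92810b24a4af.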


Thread-10740::ERROR::2012-09-02 18:23:04,275::sdc::145::Storage.StorageDomainCache::(_findDomain) Error while looking for domain `da68e840-61b2-4e7a-b4b3-92810b24a4af`
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 140, in _findDomain
    return mod.findDomain(sdUUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 1095, in findDomain
    return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/blockSD.py", line 308, in __init__
    lvm.checkVGBlockSizes(sdUUID, (self.logBlkSize, self.phyBlkSize))
  File "/usr/share/vdsm/storage/lvm.py", line 909, in checkVGBlockSizes
    _checkpvsblksize(pvs, vgBlkSize)
  File "/usr/share/vdsm/storage/lvm.py", line 887, in _checkpvsblksize
    pvBlkSize = _getpvblksize(pv)
  File "/usr/share/vdsm/storage/lvm.py", line 882, in _getpvblksize
    dev = devicemapper.getDmId(os.path.basename(pv))
  File "/usr/share/vdsm/storage/devicemapper.py", line 37, in getDmId
    raise OSError(errno.ENODEV, "Could not find dm device named `%s`" % deviceMultipathName)
OSError: [Errno 19] Could not find dm device named `unknown device`
Thread-10740::ERROR::2012-09-02 18:23:04,282::task::853::TaskManager.Task::(_setError) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 638, in detachStorageDomain
    pool.detachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 986, in detachSD
    dom = sdCache.produce(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 110, in produce
    dom.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 123, in _realProduce
    dom = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 147, in _findDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('da68e840-61b2-4e7a-b4b3-92810b24a4af',)
Thread-10740::DEBUG::2012-09-02 18:23:04,283::task::872::TaskManager.Task::(_run) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::Task._run: 8598e46d-1f54-46b9-9592-bec943a22833 ('da68e840-61b2-4e7a-b4b3-92810b24a4af', '8f310cee-22b4-482d-b5b2-6f5d3131dd67', '00000000-0000-0000-0000-000000000000', 1) {} failed - stopping task
Thread-10740::DEBUG::2012-09-02 18:23:04,283::task::1199::TaskManager.Task::(stop) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::stopping in state preparing (force False)
Thread-10740::DEBUG::2012-09-02 18:23:04,284::task::978::TaskManager.Task::(_decref) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::ref 1 aborting True
Thread-10740::INFO::2012-09-02 18:23:04,284::task::1157::TaskManager.Task::(prepare) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::aborting: Task is aborted: 'Storage domain does not exist' - code 358
Thread-10740::DEBUG::2012-09-02 18:23:04,284::task::1162::TaskManager.Task::(prepare) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::Prepare: aborted: Storage domain does not exist
Thread-10740::DEBUG::2012-09-02 18:23:04,284::task::978::TaskManager.Task::(_decref) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::ref 0 aborting True
Thread-10740::DEBUG::2012-09-02 18:23:04,285::task::913::TaskManager.Task::(_doAbort) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::Task._doAbort: force False
Thread-10740::DEBUG::2012-09-02 18:23:04,285::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-10740::DEBUG::2012-09-02 18:23:04,285::task::588::TaskManager.Task::(_updateState) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::moving from state preparing -> state aborting
Thread-10740::DEBUG::2012-09-02 18:23:04,286::task::537::TaskManager.Task::(__state_aborting) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::_aborting: recover policy none
Thread-10740::DEBUG::2012-09-02 18:23:04,286::task::588::TaskManager.Task::(_updateState) Task=`8598e46d-1f54-46b9-9592-bec943a22833`::moving from state aborting -> state failed
Thread-10740::DEBUG::2012-09-02 18:23:04,286::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.da68e840-61b2-4e7a-b4b3-92810b24a4af': < ResourceRef 'Storage.da68e840-61b2-4e7a-b4b3-92810b24a4af', isValid: 'True' obj: 'None'>, 'Storage.8f310cee-22b4-482d-b5b2-6f5d3131dd67': < ResourceRef 'Storage.8f310cee-22b4-482d-b5b2-6f5d3131dd67', isValid: 'True' obj: 'None'>}
Thread-10740::DEBUG::2012-09-02 18:23:04,287::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-10740::DEBUG::2012-09-02 18:23:04,287::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'Storage.da68e840-61b2-4e7a-b4b3-92810b24a4af'
Thread-10740::DEBUG::2012-09-02 18:23:04,288::resourceManager::553::ResourceManager::(releaseResource) Released resource 'Storage.da68e840-61b2-4e7a-b4b3-92810b24a4af' (0 active users)
Thread-10740::DEBUG::2012-09-02 18:23:04,288::resourceManager::558::ResourceManager::(releaseResource) Resource 'Storage.da68e840-61b2-4e7a-b4b3-92810b24a4af' is free, finding out if anyone is waiting for it.
Thread-10740::DEBUG::2012-09-02 18:23:04,288::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.da68e840-61b2-4e7a-b4b3-92810b24a4af', Clearing records.
Thread-10740::DEBUG::2012-09-02 18:23:04,289::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'Storage.8f310cee-22b4-482d-b5b2-6f5d3131dd67'
Thread-10740::DEBUG::2012-09-02 18:23:04,289::resourceManager::553::ResourceManager::(releaseResource) Released resource 'Storage.8f310cee-22b4-482d-b5b2-6f5d3131dd67' (0 active users)
Thread-10740::DEBUG::2012-09-02 18:23:04,289::resourceManager::558::ResourceManager::(releaseResource) Resource 'Storage.8f310cee-22b4-482d-b5b2-6f5d3131dd67' is free, finding out if anyone is waiting for it.
Thread-10740::DEBUG::2012-09-02 18:23:04,290::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.8f310cee-22b4-482d-b5b2-6f5d3131dd67', Clearing records.
Thread-10740::ERROR::2012-09-02 18:23:04,290::dispatcher::66::Storage.Dispatcher.Protect::(run) {'status': {'message': "Storage domain does not exist: ('da68e840-61b2-4e7a-b4b3-92810b24a4af',)", 'code': 358}}
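
Note on the log above: the two tracebacks suggest how a missing PV surfaces as "Storage domain does not exist". LVM reports the PV name as the literal string `unknown device`, the device-mapper lookup raises OSError(ENODEV) for that name, and the domain lookup then reports the domain as not found. A simplified, self-contained Python sketch of that chain follows. All names below are stand-ins for illustration, and the dictionary replaces the real device-mapper query; this is not vdsm's actual code:

```python
import errno
import os


class StorageDomainDoesNotExist(Exception):
    """Stand-in for the vdsm storage exception of the same name."""


def get_dm_id(known_devices, name):
    # Hypothetical lookup table in place of a real device-mapper query.
    try:
        return known_devices[name]
    except KeyError:
        raise OSError(errno.ENODEV,
                      "Could not find dm device named `%s`" % name)


def find_domain(sd_uuid, pv_names, known_devices):
    # Every PV of the VG must resolve to a dm device. A single failure
    # is reported to the caller as "domain does not exist".
    try:
        return [get_dm_id(known_devices, os.path.basename(pv))
                for pv in pv_names]
    except OSError:
        raise StorageDomainDoesNotExist(sd_uuid)
```

Under these assumptions, a PV list containing `unknown device` makes `find_domain` raise StorageDomainDoesNotExist even though the other PVs are present, which matches the error code 358 returned to the caller above.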

Comment 2 Ayal Baron 2012-09-03 06:37:10 UTC
To detach a faulty domain, one must use forcedDetachStorageDomain.