Bug 831340

Summary: [RHEV] Exception "Cannot find master domain: 'spUUID=<UID>, msdUUID=None'" during export domain attach
Product: Red Hat Enterprise Linux 6
Component: vdsm
Version: 6.2
Status: CLOSED DUPLICATE
Reporter: Raul Cheleguini <rcheleguini>
Assignee: Ayal Baron <abaron>
QA Contact: Haim <hateya>
CC: abaron, bazulay, iheim, jmunilla, yeylon, ykaul
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-06-13 06:33:31 UTC

Description Raul Cheleguini 2012-06-12 20:00:54 UTC
Description of problem:

RHEV-Manager will return the exception "Cannot find master domain (Error code: 304)" in the following situations:

* The NFS share where the export domain was created has been removed or moved.
* The directory created by RHEV (example: /exports/e2ccb5ec-9d37-424a-90ef-f0f0403798e3) has been renamed, deleted or moved.

The SPM will look for the export domain among block storage domains and will ignore file-based storage domains.
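The failure mode described above can be sketched as follows. This is a hypothetical simplification and not vdsm's actual code: the assumption here is that a file-based domain is located by a directory whose name equals the domain UUID, so renaming that directory makes the domain "not found" even though dom_md/metadata is still intact.

```python
import os

def find_file_domain(mount_root, sdUUID):
    """Hypothetical sketch: locate a file-based storage domain by a
    directory named after its UUID. If the directory was renamed,
    the lookup fails even though the metadata file still exists,
    which surfaces as the misleading error 304."""
    path = os.path.join(mount_root, sdUUID)
    if os.path.isdir(os.path.join(path, "dom_md")):
        return path
    return None  # domain considered missing despite intact metadata
```

Under this model, after `mv cb23b081-8e96-4196-8b39-df967075081c test` the lookup by UUID returns nothing, matching the behavior reported below.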


Exception on SPM/vdsmd log:

Thread-128053::DEBUG::2012-06-08 14:21:54,314::lvm::374::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = ''; <rc> = 0
Thread-128053::DEBUG::2012-06-08 14:21:54,317::lvm::466::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-128053::ERROR::2012-06-08 14:21:54,323::task::868::TaskManager.Task::(_setError) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 876, in _run
  File "/usr/share/vdsm/storage/spm.py", line 115, in run
  File "/usr/share/vdsm/storage/spm.py", line 1078, in public_attachStorageDomain
  File "/usr/share/vdsm/storage/sp.py", line 912, in refresh
  File "/usr/share/vdsm/storage/sp.py", line 842, in __rebuild
  File "/usr/share/vdsm/storage/sp.py", line 1150, in getMasterDomain
StoragePoolMasterNotFound: Cannot find master domain: 'spUUID=e2ccb5ec-9d37-424a-90ef-f0f0403798e3, msdUUID=None'


Version-Release number of selected component (if applicable):

vdsm-4.9-112.6.el6_2.x86_64
rhevm-3.0.2_0001-2.el6.x86_64


How reproducible / Steps to Reproduce:

1) Add an NFS share as an export domain; it will be attached to the DC automatically.

2) Detach the export domain.

3) Remove or rename the directory structure completely on the backend NFS server, for example:

rhev3m ~]# tree /export-domain/
/export-domain/
└── cb23b081-8e96-4196-8b39-df967075081c
    ├── dom_md
    │   ├── ids
    │   ├── inbox
    │   ├── leases
    │   ├── metadata
    │   └── outbox
    ├── images
    └── master
        ├── tasks
        └── vms

rhev3m export-domain]# mv cb23b081-8e96-4196-8b39-df967075081c test
rhev3m export-domain]# tree .
.
└── test
    ├── dom_md
    │   ├── ids
    │   ├── inbox
    │   ├── leases
    │   ├── metadata
    │   └── outbox
    ├── images
    └── master
        ├── tasks
        └── vms

Directory names seem to be essential to vdsmd: after renaming the directory whose name is the export domain UUID, vdsmd ignores the metadata file and returns the unexpected exception.

4) Try to attach the NFS share to the DC again.


Actual results:

The attach operation is canceled and the exception "Cannot find master domain (Error code: 304)" is returned. This is confusing, since the error occurs even though the Data Center has an active master domain.


Expected results:

RHEV should return a more coherent exception, something like "Cannot find export domain...", or use "dom_md/metadata" as a secondary reference.
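The suggested fallback could work along these lines. This is purely illustrative, not vdsm code: instead of relying on the directory name, scan every dom_md/metadata file under the mount and match the domain UUID recorded inside it (the "SDUUID=" key name is assumed here, not confirmed by this report).

```python
import glob
import os

def find_domain_by_metadata(mount_root, sdUUID):
    """Illustrative fallback: scan all dom_md/metadata files and
    match the SDUUID field inside them (key name assumed), so a
    renamed domain directory can still be identified."""
    pattern = os.path.join(mount_root, "*", "dom_md", "metadata")
    for meta in glob.glob(pattern):
        with open(meta) as f:
            for line in f:
                if line.strip() == "SDUUID=%s" % sdUUID:
                    # metadata matched; recover the (renamed) domain dir
                    return os.path.dirname(os.path.dirname(meta))
    return None
```

With such a check in place, the renamed "test" directory from the reproduction steps would still be recognized as the export domain, and a more specific error could be raised when no metadata matches at all.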

Comment 2 Haim 2012-06-13 06:33:31 UTC

*** This bug has been marked as a duplicate of bug 820557 ***