Bug 1247942

Summary: [hosted-engine-setup] Additional host deployment fails with "Dirty Storage Domain: Cannot find master domain" over block storage
Product: [oVirt] ovirt-hosted-engine-setup Reporter: Elad <ebenahar>
Component: General Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED CURRENTRELEASE QA Contact: Elad <ebenahar>
Severity: urgent Docs Contact:
Priority: high    
Version: 1.3.0 CC: acanan, bugs, darryl.bond, ecohen, gklein, lsurette, miac.romanov, rbalakri, sbonazzo, stirabos, yeylon, ylavi
Target Milestone: ovirt-3.6.0-rc Keywords: Regression, TestBlocker
Target Release: 1.3.0 Flags: ylavi: ovirt-3.6.0+
ylavi: blocker+
sbonazzo: planning_ack+
sbonazzo: devel_ack+
acanan: testing_ack+
Hardware: x86_64   
OS: Unspecified   
Whiteboard: integration
Fixed In Version: ovirt-engine-3.6.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-27 07:49:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1258465    
Bug Blocks: 1036731, 1153278, 1169290    

Description Elad 2015-07-29 10:01:43 UTC
Description of problem:
I deployed hosted-engine over FC, then tried to add an additional host to the setup using hosted-engine --deploy, picking the LUN that the first host uses for the HE image. The deployment fails with a dirty storage domain error:

[ INFO  ] Stage: Setup validation
[ ERROR ] Failed to execute stage 'Setup validation': Dirty Storage Domain: Cannot find master domain: 'spUUID=00000000-0000-0000-0000-000000000000, msdUUID=2d33e6a2-8935-4fe0-9868-ff8f38c17ea0' Please clean the storage device and try again


Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150727115626.git3bf22dc.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted-engine over FC
2. Once the setup is running, deploy hosted-engine on a second host that is exposed to the same LUN where the hosted-engine VM image is located, and pick this LUN. Answer 'yes' to the "Is this an additional host setup?" prompt.


Actual results:
Deployment failed with dirty storage error. This is wrong because this is an additional host deployment for an existing setup. It should not try to deploy a new HE image over the LUN.

2015-07-29 12:46:18 DEBUG otopi.context context._executeMethod:155 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 145, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 319, in _validation
    self._storagePoolConnection()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 898, in _storagePoolConnection
    message=status['status']['message'],
RuntimeError: Dirty Storage Domain: Cannot find master domain: 'spUUID=00000000-0000-0000-0000-000000000000, msdUUID=2d33e6a2-8935-4fe0-9868-ff8f38c17ea0'
Please clean the storage device and try again
2015-07-29 12:46:18 ERROR otopi.context context._executeMethod:164 Failed to execute stage 'Setup validation': Dirty Storage Domain: Cannot find master domain: 'spUUID=00000000-0000-0000-0000-000000000000, msdUUID=2d33e6a2-8935-4fe0-9868-ff8f38c17ea0'
Please clean the storage device and try again
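
For triage, the two UUIDs in the error above can be pulled out of the message programmatically. The following is only a minimal sketch: the helper name parse_master_domain_error and the BLANK_UUID constant are hypothetical and not part of ovirt-hosted-engine-setup, and reading the all-zero spUUID as "no storage pool supplied" is an assumption.

```python
import re

# Assumption: an all-zero UUID is a "blank" placeholder, meaning that no
# storage pool ID was supplied with the request.
BLANK_UUID = "00000000-0000-0000-0000-000000000000"

_UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
_PATTERN = re.compile(
    r"spUUID=(?P<sp>" + _UUID + r"),\s*msdUUID\s*=\s*(?P<msd>" + _UUID + r")"
)


def parse_master_domain_error(message):
    """Return (spUUID, msdUUID) found in the error text, or None.

    Hypothetical helper written for this report only.
    """
    # Rejoin the lines first, because a log viewer may wrap the message
    # in the middle of a field.
    flat = "".join(message.splitlines())
    match = _PATTERN.search(flat)
    if match is None:
        return None
    return match.group("sp"), match.group("msd")


if __name__ == "__main__":
    error = (
        "Dirty Storage Domain: Cannot find master domain: "
        "'spUUID=00000000-0000-0000-0000-000000000000, "
        "msdUUID=2d33e6a2-8935-4fe0-9868-ff8f38c17ea0'"
    )
    sp_uuid, msd_uuid = parse_master_domain_error(error)
    print("pool is blank:", sp_uuid == BLANK_UUID)
    print("master domain:", msd_uuid)
```

Under that assumption, a blank spUUID combined with a concrete msdUUID suggests the second host found the existing domain on the LUN but was not connected to any storage pool.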



An attempt to add the same host to the HE setup, to the same cluster where the HE VM is running, only via the webadmin, succeeds.

Expected results:
Deployment of an additional host on the HE setup using hosted-engine --deploy should work.
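
The expected behavior can be sketched as a small decision table. This is only an illustration of the expectation stated in this report, not the actual validation code in storage.py and not the eventual fix; the function name, its parameters, and the returned strings are all hypothetical.

```python
def storage_validation_action(is_additional_host, domain_has_data):
    """Decide what setup validation should do with the selected LUN.

    Hypothetical helper; it only mirrors the behavior the reporter expects.
    """
    if is_additional_host:
        # An additional host joins an existing setup, so an existing
        # hosted-engine storage domain on the LUN is expected.
        if domain_has_data:
            return "attach"
        return "fail: no existing hosted-engine domain"
    # First host: a new domain is created, so the LUN must be clean.
    if domain_has_data:
        return "fail: dirty storage"
    return "create"


if __name__ == "__main__":
    # The scenario from this report: additional host, LUN already in use.
    print(storage_validation_action(True, True))
```

In the failing case the setup appears to take the first-host branch even though the additional host question was answered with 'yes'.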

Additional info:
sosreport:  http://file.tlv.redhat.com/ebenahar/sosreport-purple-vds2.qa.lab.tlv.redhat.com-20150729125935.tar.xz

Comment 1 Elad 2015-07-29 10:09:06 UTC
Correction: 
Please ignore:
"An attempt to add the same host to the HE setup, to the same cluster where the HE VM is running, only via the webadmin, succeeds." 

Adding an additional host via the webadmin does not make it a hosted-engine host. 

Therefore, I can't find a workaround.

Comment 2 Elad 2015-08-03 08:07:12 UTC
This also occurs when deploying over iSCSI. Therefore, marking as a regression.


Logs for iSCSI case: http://file.tlv.redhat.com/ebenahar/sosreport-camel-vdsb.qa.lab.tlv.redhat.com-20150803110508.tar.xz

Comment 3 Yaniv Lavi 2015-08-31 11:45:19 UTC
What is the status on this?

Comment 4 Simone Tiraboschi 2015-08-31 12:20:08 UTC
I was able to isolate the issue: with the latest VDSM, prepareImage behaves differently on iSCSI than on NFS.
Opening a bug on that and setting as a blocker.

Comment 5 Sandro Bonazzola 2015-09-08 11:16:46 UTC
Simone, can you check if 3.5.4 works as expected with regard to this BZ?

Comment 6 Simone Tiraboschi 2015-09-09 07:37:26 UTC
Yes, 3.5.4 is not affected.

Comment 7 Elad 2015-11-08 09:49:36 UTC
Additional host deployment over FC succeeded. Host was added successfully to the DC.

Verified using:
ovirt-hosted-engine-ha-1.3.2.1-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.0-1.el7ev.noarch
vdsm-4.17.10.1-0.el7ev.noarch

Comment 8 Sandro Bonazzola 2015-11-27 07:49:35 UTC
Since oVirt 3.6.0 has been released, moving from verified to closed current release.