Bug 1293892

Summary: [storage] hosted-engine-setup fails on additional hosts if the hosted-engine storage domain was already imported into the engine
Product: [oVirt] ovirt-hosted-engine-setup
Reporter: Simone Tiraboschi <stirabos>
Component: Plugins.General
Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED CURRENTRELEASE
QA Contact: Nikolai Sednev <nsednev>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 1.3.2.1
CC: bugs, didi, duckd, lveyde, mavital, rmartins, sbonazzo, stirabos, ylavi
Target Milestone: ovirt-3.6.1
Keywords: Triaged
Target Release: 1.3.1.4
Flags: rule-engine: ovirt-3.6.z+, ylavi: planning_ack+, sbonazzo: devel_ack+, mavital: testing_ack+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: With 3.6, the engine is able to auto-import the hosted-engine storage domain into the datacenter. At that point the hosted-engine storage domain is connected to another storage pool.
Consequence: hosted-engine-setup fails to connect the storage pool because it only has the UUID of the bootstrap storage pool, which is no longer current.
Fix: hosted-engine-setup no longer connects the storage pool at all on additional hosts.
Result: hosted-engine-setup is able to deploy an additional host.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 10:52:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1293928
Bug Blocks:

Description Simone Tiraboschi 2015-12-23 12:11:05 UTC
Description of problem:

Since 3.6.1, the hosted-engine storage domain gets imported into the engine and attached to the storage pool of the datacenter that contains the hosted-engine hosts.

hosted-engine-setup on additional hosts used to connect the bootstrap storage pool just to check its name. It can no longer do so, because the master storage domain of the datacenter storage pool may be a regular storage domain on a storage server that hosted-engine-setup never connected.

In that case the connectStoragePool call fails, because hosted-engine-setup assumes that the master storage domain of the pool can only be the hosted-engine storage domain. That is no longer always true once the hosted-engine storage domain has been attached to a real storage pool which also contains other storage domains.
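The failure mode described above can be illustrated with a small toy model. This is only a sketch: `StoragePool`, `connect_storage_pool`, `get_existing_domain` and the `skip_pool_on_additional` switch are hypothetical names invented for illustration and are not the vdsm or ovirt-hosted-engine-setup API. It assumes that connecting a pool succeeds only when the host already reaches the pool's master domain.

```python
# Toy model only: these classes and functions are hypothetical and do not
# reproduce vdsm or ovirt-hosted-engine-setup code.
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    uuid: str
    master_domain: str                        # UUID of the pool's master storage domain
    domains: set = field(default_factory=set)


def connect_storage_pool(pool, connected_domains):
    """Succeed only if the host already reaches the pool's master domain."""
    if pool.master_domain not in connected_domains:
        raise RuntimeError(
            "Wrong Master domain or its version: "
            f"'SD={pool.master_domain}, pool={pool.uuid}'"
        )
    return True


def get_existing_domain(pool, he_domain, additional_host,
                        skip_pool_on_additional=True):
    """Return the set of domains the setup has connected."""
    connected = {he_domain}   # setup connects only the hosted-engine domain
    if additional_host and skip_pool_on_additional:
        return connected      # behaviour after the fix: never touch the pool
    connect_storage_pool(pool, connected)
    return connected
```

With a pool whose master is a regular data domain, calling `get_existing_domain(..., additional_host=True, skip_pool_on_additional=False)` raises the error, while the default (skipping the pool connection on additional hosts) returns normally.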

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup  1.3.2-1

How reproducible:
100% under the right conditions

Steps to Reproduce:
1. Deploy hosted-engine on the first host
2. Add a regular storage domain to the engine
3. Wait for the engine to auto-import the hosted-engine storage domain
4. Run hosted-engine-setup on an additional host

Actual results:
2015-12-23 11:47:04 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:SEND                 Please specify the Host ID [Must be integer, default: 2]:
2015-12-23 11:47:08 DEBUG otopi.plugins.ovirt_hosted_engine_setup.storage.storage storage._storagePoolConnection:932 connectStoragePool
2015-12-23 11:47:08 DEBUG otopi.context context._executeMethod:156 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 1275, in _late_customization
    self._getExistingDomain()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 566, in _getExistingDomain
    self._storagePoolConnection()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 940, in _storagePoolConnection
    message=status['status']['message'],
RuntimeError: Dirty Storage Domain: Wrong Master domain or its version: 'SD=77d0d6b1-376f-4ee6-a8a3-c998a20e5d69, pool=00000001-0001-0001-0001-000000000374'
Please clean the storage device and try again
2015-12-23 11:47:08 ERROR otopi.context context._executeMethod:165 Failed to execute stage 'Environment customization': Dirty Storage Domain: Wrong Master domain or its version: 'SD=77d0d6b1-376f-4ee6-a8a3-c998a20e5d69, pool=00000001-0001-0001-0001-000000000374'
Please clean the storage device and try again
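When triaging setup logs for this failure, it can help to pull the storage domain and pool UUIDs out of the error line. The helper below is a minimal sketch; `parse_wrong_master` is a hypothetical name, and the pattern assumes the message keeps exactly the format shown in the log above.

```python
import re

# Assumption: the message format matches the log excerpt in this report.
_WRONG_MASTER = re.compile(
    r"Wrong Master domain or its version: "
    r"'SD=(?P<sd>[0-9a-f-]{36}), pool=(?P<pool>[0-9a-f-]{36})'"
)


def parse_wrong_master(line):
    """Return (storage_domain_uuid, pool_uuid), or None if the line does not match."""
    match = _WRONG_MASTER.search(line)
    if match is None:
        return None
    return match.group("sd"), match.group("pool")
```

If the pool UUID reported here differs from the bootstrap pool UUID that hosted-engine-setup has, the hosted-engine storage domain has most likely already been attached to the datacenter's storage pool.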


Expected results:
The additional host is deployed successfully.

Additional info:
Workaround: the engine auto-imports the hosted-engine storage domain only after the user manually adds the first regular storage domain.
Deploying all the hosts before adding the first regular storage domain is therefore enough to avoid the problem.

Comment 1 Red Hat Bugzilla Rules Engine 2015-12-23 16:59:15 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 2 Nikolai Sednev 2016-01-25 13:48:01 UTC
Works for me on these components:
ovirt-vmconsole-1.0.0-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.7-1.el7ev.noarch
mom-0.5.1-1.el7ev.noarch
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
ovirt-hosted-engine-setup-1.3.2.3-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.6.x86_64
ovirt-host-deploy-1.4.1-1.el7ev.noarch
rhevm-sdk-python-3.6.2.1-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.2.x86_64
ovirt-setup-lib-1.0.1-1.el7ev.noarch
Linux version 3.10.0-327.8.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon Jan 11 05:03:18 EST 2016


Engine:
rhevm-3.6.2.6-0.1.el6.noarch

To reproduce, I deployed HE on one host and added a data SD so that the HE-SD got auto-imported, then added a second hosted-engine host successfully.

Comment 3 Douglas Duckworth 2016-03-07 20:22:25 UTC
Workaround:

In RHEV-M 3.5.7-0.1.el6ev, on an additional RHEV Hypervisor - 7.2 - 20160219.0.el7ev, I was able to add an additional hosted-engine host by running:

hosted-engine --set-maintenance --mode=global
hosted-engine  --vm-stop
hosted-engine  --deploy

The Dirty Storage Domain message is gone.