Bug 1269768
Summary: Auto import hosted engine domain
Product: [oVirt] ovirt-engine
Reporter: Roy Golan <rgolan>
Component: BLL.HostedEngine
Assignee: Roy Golan <rgolan>
Status: CLOSED CURRENTRELEASE
QA Contact: Artyom <alukiano>
Severity: urgent
Priority: urgent
Docs Contact:
Version: 3.6.0
CC: amureini, bhughes, bobby.prins, bugs, cinglese, cpaquin, cshao, dfediuck, didi, ebenahar, fdeutsch, gklein, hawk, h.moeller, huzhao, jbenedic, jforeman, jimmy, joerg.br, lveyde, mavital, mbrancaleoni, miersma, n.pernon, nsednev, nsoffer, obockows, rgolan, rmartins, sbonazzo, s.danzi, stirabos, tlitovsk, vic.ad94, volga629, willarddennis
Target Milestone: ovirt-3.6.1
Keywords: TestBlocker, Triaged
Target Release: 3.6.1.3
Flags: rule-engine: ovirt-3.6.z+, rule-engine: blocker+, ylavi: planning_ack+, dfediuck: devel_ack+, gklein: testing_ack+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Edit VM and configuration-on-shared-storage RFEs require the hosted engine storage domain to be imported first. Once it is imported, the OVF_STORE disks are created to hold the engine VM's OVF file, and the engine VM can be added to the engine database, making it editable.
Consequence: This process is delicate and can be driven by the engine without any user action. While monitoring the hosts, the engine detects a running hosted engine VM, discovers its storage domain details, and imports that domain into the database. NOTE: for this process to start there *must* be an active master storage domain and the data center must be up.
Fix: Trigger an auto-import flow once the engine identifies that it does not yet have the hosted engine storage domain and VM. Currently only a domain named 'hosted_storage' is supported.
Result: After install/upgrade, once the data center is active, the engine automatically imports the hosted engine storage domain and the hosted engine VM.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-11 07:19:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1167074, 1273378, 1290518, 1291634, 1293928
Bug Blocks: 1099516, 1139492, 1139793, 1175354, 1244521, 1258386, 1261996, 1274294, 1274315, 1275210, 1281539, 1283458, 1285700, 1315808
Attachments:
Description: Roy Golan, 2015-10-08 07:53:44 UTC
*** Bug 1267337 has been marked as a duplicate of this bug. ***

*** Bug 1269765 has been marked as a duplicate of this bug. ***

As noted in duplicate bug 1267337, it is also currently impossible to manually import a hosted engine storage domain on fibre channel storage.

Postponing to 3.6.1 since this is not identified as a blocker.

*** Bug 1273524 has been marked as a duplicate of this bug. ***

*** Bug 1275210 has been marked as a duplicate of this bug. ***

In oVirt, testing is done on a single release by default, therefore I'm removing the 4.0 flag. If you think this bug must be tested in 4.0 as well, please re-add the flag. Please note we might not have the testing resources to handle the 4.0 clone.

The last thing holding this back is the fact that VDSM's attachSD removes the sanlock lockspace and thereby kills the hosted engine VM.

From the sanlock log_dump:

2015-11-07 09:18:33+0200 260811 [828]: cmd_rem_lockspace 4,16 398e3075-3777-4439-b077-7e406ee0b8b9 flags 0
2015-11-07 09:18:34+0200 260811 [813]: s8 set killing_pids check 0 remove 1
2015-11-07 09:18:34+0200 260811 [813]: s8:r23 client_using_space pid 10462
2015-11-07 09:18:34+0200 260811 [813]: s8 kill 10462 sig 9 count 1
2015-11-07 09:18:34+0200 260811 [813]: client_pid_dead 2,10,10462 cmd_active 0 suspend 0
2015-11-07 09:18:34+0200 260811 [813]: dead 10462 ci 2 count 1

10462 is the pid of the hosted engine's qemu VM.

I was able to avoid the lockspace removal in sp.py:

    def attachSD(self, sdUUID):
        ...
        finally:
            # dom.releaseHostId(self.id)
            pass

The question is: can we do that for the hosted engine flow? We would always have a running VM while attaching. Can we avoid removing the lockspace when there are running VMs on a detached SD?

It seems that we can keep the lockspace during attach, implemented in https://gerrit.ovirt.org/48217. Please check if it works for you.

*** Bug 1268075 has been marked as a duplicate of this bug. ***

*** Bug 1281291 has been marked as a duplicate of this bug. ***

*** Bug 1283823 has been marked as a duplicate of this bug. ***

Created attachment 1097652 [details]
vdsm and sanlock log dump before and after the attach SD
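For readers following the discussion above, here is a minimal sketch of the idea behind keeping the lockspace during attach. This is not the actual gerrit 48217 patch: the sdCache lookup and the attach step are stand-ins, and domain_has_running_vms() is a hypothetical placeholder for whatever check VDSM really performs; only acquireHostId/releaseHostId come from the snippets and traceback quoted in this bug.

# Sketch only, not VDSM code: keep the sanlock host id during attachSD when a
# VM (such as the hosted engine VM) is still using the domain, so sanlock does
# not kill it when the lockspace would otherwise be removed.

def attach_sd_keeping_lockspace(pool, sdUUID, sdCache, domain_has_running_vms):
    """pool/sdCache stand in for VDSM's StoragePool and domain cache;
    domain_has_running_vms is a hypothetical guard, not a real VDSM API."""
    dom = sdCache.produce(sdUUID)           # assumed domain lookup
    dom.acquireHostId(pool.id)              # call seen in the vdsm traceback later in this bug
    try:
        pool.attach(dom)                    # placeholder for the real attach steps
    finally:
        if domain_has_running_vms(sdUUID):  # e.g. the hosted engine VM
            pass                            # keep the lockspace alive
        else:
            dom.releaseHostId(pool.id)      # original behaviour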
This fix now allows me to import the hosted-engine's storage domain without the engine crashing, but I still cannot see the hosted-engine VM in the GUI afterwards. Should this be automatic? What would be the exact steps to get the hosted-engine VM to show up? This is on a fresh install and I am pretty new to oVirt, but I would be willing to help figure it out if you could point me to what you need ... Harry

This issue is still not fixed in 3.6.1; we might respin to fix it. Once it's fixed this will move to ON_QA and we would like you to help us test the fix.

Any potential workaround for this? Any pointers as to why this happens? I'm a developer, maybe I can dig in and help? As far as the needinfo flag, you'll let me know what to provide, yes? (Sorry, I'm new to Bugzilla, too.)

(In reply to Harry Moeller from comment #17)
> Any potential workaround for this? Any pointers as to why this happens? I'm
> a developer, maybe I can dig in and help? As far as the needinfo flag,
> you'll let me know what to provide, yes? (Sorry, I'm new to Bugzilla, too).

Harry, thanks for that. You can see the root cause in the comment on the proposed VDSM fix https://gerrit.ovirt.org/#/c/48217/1. The rest of the needed fixes for the engine are in the "External Tracker" section above.

Still missing backport to the 3.6.1 branch.

I figured that with the 3.6 nightly as of today plus manually adding IDs 49596 and 49597, I should be able to test this flow now, but somehow I can never get the Hosted Engine to show in the GUI. After a clean installation with "hosted-engine --deploy" on GlusterFS storage, manually applying the 49596/49597 modifications, and using the 3.6 appliance plus the 3.6 nightly upgrades (no mirrors) before running "engine-setup", everything goes through smoothly without errors.

Once it's up, the Default DC shows "Uninitialized", lacking any storage, and the Hosted Engine SD hasn't been imported. Then I create my initial Data SD, it gets selected as Master, the DC is initialized, and a few minutes later my DC switches to "Up" and the Data storage domain to Active. All seems good, but still no sight of the Hosted Engine's SD nor VM. One thing to note is that during the creation of the new data SD, I see the message "This DC compatibility version does not support importing a data domain with its entities (VMs and Templates). The imported domain will be imported without them." This is a little strange, as I added a new Data SD rather than importing anything.

So, I finally give up hope and manually import the Hosted Engine SD into my Default DC as a Data SD. Thanks to the 49596/49597 modifications, this goes through without the Hosted Engine crashing, and I end up with an extra data SD in Maintenance mode but still no sight whatsoever of the Hosted Engine VM. Yet again, I get "This DC compatibility version does not support importing a data domain with its entities (VMs and Templates). The imported domain will be imported without them.", and this time it really worries me, as the only reason I imported this SD was to get the Hosted Engine showing up. Maybe this is a hint of what's going on? The last step I do is activate that Hosted Engine SD in the Default DC, which works fine but does not help any further in finding/importing the Hosted Engine VM either.

Am I missing something here? At which point in this described flow are the Hosted Engine SD and Hosted Engine VM supposed to be auto-imported?
Thanks for some insight; by now I'm pretty sure I'm doing something wrong here, yes?

I am curious when this fix is going to be available to the community? We are using oVirt 3.6 and looking forward to this being available publicly via the ovirt repositories (http://resources.ovirt.org/pub/ovirt-3.6/).

(In reply to Charlie Inglese from comment #21)
> I am curious when this fix is going to be available to the community? We are
> using oVirt 3.6 and looking forward to this being available publicly via the
> ovirt repositories (http://resources.ovirt.org/pub/ovirt-3.6/).

Of course it will be. Can you show me a fix that was not available?

(In reply to Nir Soffer from comment #22)
> Of course it will be. Can you show me a fix that was not available?

Thanks Nir, however, my question was more related to time frame rather than availability.

The patches on the vdsm side should be merged on Monday; the engine-side patches are waiting only for the vdsm patches. I don't know when the next release is planned, but Sandro may add more info.

Thanks Nir, are you aware if the ovirt-engine-appliance is impacted by this patch?

(In reply to Charlie Inglese from comment #25)
> Thanks Nir, are you aware if the ovirt-engine-appliance is impacted by this
> patch?

Yes, we have to respin it to contain the latest engine build. Tolik, can you confirm?

Works for me on these components:

Hosts:
ovirt-vmconsole-host-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch
ovirt-release36-002-2.noarch
sanlock-3.2.4-1.el7.x86_64
ovirt-hosted-engine-ha-1.3.3.3-1.20151211131547.gitb84582e.el7.noarch
ovirt-setup-lib-1.0.1-0.0.master.20151126203321.git2da7763.el7.centos.noarch
ovirt-engine-sdk-python-3.6.1.1-0.1.20151127.git2400b22.el7.centos.noarch
ovirt-vmconsole-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch
ovirt-release36-snapshot-002-2.noarch
mom-0.5.1-2.el7.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64
ovirt-hosted-engine-setup-1.3.2-0.0.master.20151209094106.gitce16937.el7.centos.noarch
vdsm-4.17.13-1.git5bc7781.el7.centos.noarch
ovirt-host-deploy-1.4.2-0.0.master.20151122153544.gitfc808fc.el7.noarch
libvirt-client-1.2.17-13.el7_2.2.x86_64
Linux version 3.10.0-327.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Thu Oct 29 17:29:29 EDT 2015

Engine:
ovirt-host-deploy-java-1.4.1-1.el6ev.noarch
ovirt-vmconsole-1.0.0-1.el6ev.noarch
ovirt-host-deploy-1.4.1-1.el6ev.noarch
ovirt-vmconsole-proxy-1.0.0-1.el6ev.noarch
ovirt-engine-extension-aaa-jdbc-1.0.4-1.el6ev.noarch
rhevm-3.6.1.3-0.1.el6.noarch
Linux version 2.6.32-573.8.1.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #1 SMP Fri Sep 25 19:24:22 EDT 2015

I've upgraded the hosts and the engine from a minor version of 3.6 to the versions described above. After engine-setup finished and both hosts were updated as well, I saw the engine's VM via the web UI and also the HE SD in Maintenance mode, so I attached it to activate it and it worked just fine. The initial deployment of HE was over an NFS SD.

Waiting for 1273378 resolution.

Could someone please populate the Fixed in Version field?
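Several comments above and below boil down to the same question: did the auto-import actually happen? One quick check, sketched here with the 3.6-era Python SDK (ovirt-engine-sdk-python, API v3) that appears in the package lists in this bug, is to query the engine for the hosted_storage domain and the hosted engine VM. The URL, credentials and the 'HostedEngine' VM name are placeholders/assumptions, not values taken from this report.

# Hedged sketch using the oVirt 3.6-era Python SDK (ovirtsdk, API v3).
# Engine URL, credentials and the 'HostedEngine' VM name are placeholders.
from ovirtsdk.api import API

api = API(url='https://engine.example.com/ovirt-engine/api',
          username='admin@internal',
          password='secret',
          insecure=True)  # or ca_file='/etc/pki/ovirt-engine/ca.pem'

sd = api.storagedomains.get(name='hosted_storage')
vm = api.vms.get(name='HostedEngine')

print('hosted_storage imported:', sd is not None)
print('hosted engine VM imported:', vm is not None)
if vm is not None:
    # state is reported by the engine, e.g. 'up' once the VM is managed
    print('hosted engine VM state:', vm.status.state)

api.disconnect()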
I am not sure which of the components above did the trick, or if multiple are important. Per sbonazzo's email it should be:
RHEVM: org.ovirt.engine-root-3.6.1.2-1
VDSM: vdsm-4.17.13-1.el7ev
ovirt-hosted-engine-ha 1.3.3.1-1.el7ev.noarch

I tried it again on NFS with:
ovirt-hosted-engine-ha.noarch 1.3.3.3-1.el7.centos
ovirt-hosted-engine-setup.noarch 1.3.1.3-1.el7.centos
vdsm.noarch 4.17.13-0.el7.centos
on the host, and
ovirt-engine.noarch 3.6.1.3-1.el7.centos
ovirt-engine-backend.noarch 3.6.1.3-1.el7.centos
on the engine VM, and it failed with:

Dec 16, 2015 12:16:57 PM The Hosted Engine Storage Domain isn't Active.
Dec 16, 2015 12:16:57 PM Failed to import the Hosted Engine Storage Domain
Dec 16, 2015 12:16:57 PM Failed to attach Storage Domain hosted_storage to Data Center Default. (User: SYSTEM)
Dec 16, 2015 12:16:57 PM Failed to attach Storage Domains to Data Center Default. (User: SYSTEM)
Dec 16, 2015 12:16:57 PM VDSM hosted_engine_1 command failed: Cannot acquire host id: (u'97f5a165-4df5-4bce-99cc-8a634753bc54', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))

When it failed I got this under /var/log/messages:

Dec 16 12:16:56 c72he20151216h1 sanlock[12602]: 2015-12-16 12:16:56+0100 8448 [14210]: add_lockspace 97f5a165-4df5-4bce-99cc-8a634753bc54:250:/rhev/data-center/mnt/192.168.1.115:_Virtual_exthe7/97f5a165-4df5-4bce-99cc-8a634753bc54/dom_md/ids:0 conflicts with name of list1 s5 97f5a165-4df5-4bce-99cc-8a634753bc54:1:/rhev/data-center/mnt/192.168.1.115:_Virtual_exthe7/97f5a165-4df5-4bce-99cc-8a634753bc54/dom_md/ids:0

And in vdsm.log:

jsonrpc.Executor/7::INFO::2015-12-16 12:16:56,797::clusterlock::219::Storage.SANLock::(acquireHostId) Acquiring host id for domain 97f5a165-4df5-4bce-99cc-8a634753bc54 (id: 250)
jsonrpc.Executor/7::ERROR::2015-12-16 12:16:56,797::task::866::Storage.TaskManager.Task::(_setError) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 996, in createStoragePool
    leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 574, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 506, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 533, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 234, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'97f5a165-4df5-4bce-99cc-8a634753bc54', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::task::885::Storage.TaskManager.Task::(_run) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::Task._run: d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1 (None, u'00000001-0001-0001-0001-000000000212', u'Default', u'97f5a165-4df5-4bce-99cc-8a634753bc54', [u'97f5a165-4df5-4bce-99cc-8a634753bc54'], 1, None, 5, 60, 10, 3) {} failed - stopping task
jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::task::1246::Storage.TaskManager.Task::(stop) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::stopping in state preparing (force False)
jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::task::993::Storage.TaskManager.Task::(_decref) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::ref 1 aborting True
jsonrpc.Executor/7::INFO::2015-12-16 12:16:56,852::task::1171::Storage.TaskManager.Task::(prepare) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::aborting: Task is aborted: 'Cannot acquire host id' - code 661 jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::task::1176::Storage.TaskManager.Task::(prepare) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::Prepare: aborted: Cannot acquire host id jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::task::993::Storage.TaskManager.Task::(_decref) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::ref 0 aborting True jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::task::928::Storage.TaskManager.Task::(_doAbort) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::Task._doAbort: force False jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} jsonrpc.Executor/7::DEBUG::2015-12-16 12:16:56,852::task::595::Storage.TaskManager.Task::(_updateState) Task=`d1c6366c-3709-4fe9-a0a7-2c0339ecd8e1`::moving from state preparing -> state aborting Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release. Hello, I was able to get the hosted engine storage domain working properly. I noticed that the ISO_DOMAIN was down so I deleted it and tried again to attach the hosted engine storage domain to the default data center and it worked. The VM was also automatically detected and imported. I am using CentOS 7 1511, all updates and last ovirt 3.6 at 17th of December Hope this help. *** Bug 1274294 has been marked as a duplicate of this bug. *** Bug 1281291 is duplicate to this bug, Still encounter Bug 1281291 in build rhev-hypervisor-7-7.2-20151218.2.e17ev. Test version: rhev-hypervisor-7-7.2-20151218.2.e17ev vdsm-4.17.13-1.el7ev.noarch ovirt-hosted-engine-ha-1.3.3.5-1.el7ev.noarch 20151216.0-1.3.6.ova Steps to Reproduce: 1. TUI clean install rhevh 2. Login rhevh, setup network via dhcp 3. Deploy Hosted Engine 4. Login ovirt-engine/webadmin web, check Virtual Machines page Actual results: 1. After step4, Virtual Machines page is blank. Expected results: 1.After step4, there is engine vm information on Virtual Machines page. I think you don't have an active datacenter yet? Still encounter Bug 1281291 in build rhev-hypervisor-7-7.2-20151221.1.e17ev. Test version: rhev-hypervisor-7-7.2-20151221.1.e17ev vdsm-4.17.13-1.el7ev.noarch ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch 20151216.0-1.3.6.ova(3.6.1.3-0.1.el6) Steps to Reproduce: 1. TUI clean install rhevh 2. Login rhevh, setup network via dhcp 3. Deploy Hosted Engine 4. Login ovirt-engine/webadmin web, check Virtual Machines page Actual results: 1. After step4, Virtual Machines page is blank. Expected results: 1.After step4, there is engine vm information on Virtual Machines page. Roy, should we reopen Bug 1281291? An active DC is precondition. Otherwise we can't import a domain without an Active DC and we can't import a vm without a DC as well. 3.5 supported showing an external VM (the engine vm in that case) without the DC being active. This is no longer supported. (In reply to Roy Golan from comment #42) > An active DC is precondition. Otherwise we can't import a domain without an > Active DC and we can't import a vm without a DC as well. > > 3.5 supported showing an external VM (the engine vm in that case) without > the DC being active. This is no longer supported. 
It still failed to work with an active DC. Test version: rhev-hypervisor7-7.2-20151221.1 ovirt-node-3.6.0-0.24.20151209gitc0fa931.el7ev.noarch ovirt-node-plugin-hosted-engine-0.3.0-4.el7ev.noarch ovirt-hosted-engine-setup-1.3.1.3-1.el7ev.noarch ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch RHEVM-appliance-20151216.0-1.3.6.ova Test steps: 1. TUI clean install rhevh 2. Login rhevh, setup network via dhcp 3. Deploy Hosted Engine 4. After engine starts add a master nfs storage domain and wait for pool to be Active Test result: 1. Failed to activate Storage Domain hosted_storage. 2. Manually press activate on the domain still got failed. Seem we met Bug 1290518 - failed Activating hosted engine domain during auto-import on NFS (In reply to shaochen from comment #43) > (In reply to Roy Golan from comment #42) > > An active DC is precondition. Otherwise we can't import a domain without an > > Active DC and we can't import a vm without a DC as well. > > > > 3.5 supported showing an external VM (the engine vm in that case) without > > the DC being active. This is no longer supported. > > It still failed to work with an active DC. > > Test version: > rhev-hypervisor7-7.2-20151221.1 > ovirt-node-3.6.0-0.24.20151209gitc0fa931.el7ev.noarch > ovirt-node-plugin-hosted-engine-0.3.0-4.el7ev.noarch > ovirt-hosted-engine-setup-1.3.1.3-1.el7ev.noarch > ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch > RHEVM-appliance-20151216.0-1.3.6.ova > > Test steps: > 1. TUI clean install rhevh > 2. Login rhevh, setup network via dhcp > 3. Deploy Hosted Engine > 4. After engine starts add a master nfs storage domain and wait for pool to > be Active > > Test result: > 1. Failed to activate Storage Domain hosted_storage. > 2. Manually press activate on the domain still got failed. > > Seem we met Bug 1290518 - failed Activating hosted engine domain during > auto-import on NFS I just tested this on our environment using the following RPMs: ovirt-host-deploy-1.4.1-1.el7.centos.noarch ovirt-hosted-engine-setup-1.3.1.4-1.el7.centos.noarch libgovirt-0.3.3-1.el7.x86_64 ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch ovirt-engine-appliance-3.6-20151216.1.el7.centos.noarch ovirt-hosted-engine-ha-1.3.3.6-1.el7.centos.noarch ovirt-iso-uploader-3.6.0-1.el7.centos.noarch ovirt-vmconsole-1.0.0-1.el7.centos.noarch ovirt-setup-lib-1.0.0-1.el7.centos.noarch ovirt-engine-sdk-python-3.6.0.3-1.el7.centos.noarch Experienced same results as shaochen; Storage Domain hosted_storage doesn't exist. See message: "The Hosted Engine Storage Domain doesn't exist. It should be imported into the setup." (In reply to Charlie Inglese from comment #44) > (In reply to shaochen from comment #43) > > (In reply to Roy Golan from comment #42) > > > An active DC is precondition. Otherwise we can't import a domain without an > > > Active DC and we can't import a vm without a DC as well. > > > > > > 3.5 supported showing an external VM (the engine vm in that case) without > > > the DC being active. This is no longer supported. > > > > It still failed to work with an active DC. > > > > Test version: > > rhev-hypervisor7-7.2-20151221.1 > > ovirt-node-3.6.0-0.24.20151209gitc0fa931.el7ev.noarch > > ovirt-node-plugin-hosted-engine-0.3.0-4.el7ev.noarch > > ovirt-hosted-engine-setup-1.3.1.3-1.el7ev.noarch > > ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch > > RHEVM-appliance-20151216.0-1.3.6.ova > > > > Test steps: > > 1. TUI clean install rhevh > > 2. Login rhevh, setup network via dhcp > > 3. Deploy Hosted Engine > > 4. 
After engine starts add a master nfs storage domain and wait for pool to > > be Active > > > > Test result: > > 1. Failed to activate Storage Domain hosted_storage. > > 2. Manually press activate on the domain still got failed. > > > > Seem we met Bug 1290518 - failed Activating hosted engine domain during > > auto-import on NFS > > I just tested this on our environment using the following RPMs: > ovirt-host-deploy-1.4.1-1.el7.centos.noarch > ovirt-hosted-engine-setup-1.3.1.4-1.el7.centos.noarch > libgovirt-0.3.3-1.el7.x86_64 > ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch > ovirt-engine-appliance-3.6-20151216.1.el7.centos.noarch > ovirt-hosted-engine-ha-1.3.3.6-1.el7.centos.noarch > ovirt-iso-uploader-3.6.0-1.el7.centos.noarch > ovirt-vmconsole-1.0.0-1.el7.centos.noarch > ovirt-setup-lib-1.0.0-1.el7.centos.noarch > ovirt-engine-sdk-python-3.6.0.3-1.el7.centos.noarch > > Experienced same results as shaochen; Storage Domain hosted_storage doesn't > exist. See message: "The Hosted Engine Storage Domain doesn't exist. It > should be imported into the setup." Thanks. Can you also upload your engine.log and vdsm.log? (In reply to Roy Golan from comment #45) > (In reply to Charlie Inglese from comment #44) > > (In reply to shaochen from comment #43) > > > (In reply to Roy Golan from comment #42) > > > > An active DC is precondition. Otherwise we can't import a domain without an > > > > Active DC and we can't import a vm without a DC as well. > > > > > > > > 3.5 supported showing an external VM (the engine vm in that case) without > > > > the DC being active. This is no longer supported. > > > > > > It still failed to work with an active DC. > > > > > > Test version: > > > rhev-hypervisor7-7.2-20151221.1 > > > ovirt-node-3.6.0-0.24.20151209gitc0fa931.el7ev.noarch > > > ovirt-node-plugin-hosted-engine-0.3.0-4.el7ev.noarch > > > ovirt-hosted-engine-setup-1.3.1.3-1.el7ev.noarch > > > ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch > > > RHEVM-appliance-20151216.0-1.3.6.ova > > > > > > Test steps: > > > 1. TUI clean install rhevh > > > 2. Login rhevh, setup network via dhcp > > > 3. Deploy Hosted Engine > > > 4. After engine starts add a master nfs storage domain and wait for pool to > > > be Active > > > > > > Test result: > > > 1. Failed to activate Storage Domain hosted_storage. > > > 2. Manually press activate on the domain still got failed. > > > > > > Seem we met Bug 1290518 - failed Activating hosted engine domain during > > > auto-import on NFS > > > > I just tested this on our environment using the following RPMs: > > ovirt-host-deploy-1.4.1-1.el7.centos.noarch > > ovirt-hosted-engine-setup-1.3.1.4-1.el7.centos.noarch > > libgovirt-0.3.3-1.el7.x86_64 > > ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch > > ovirt-engine-appliance-3.6-20151216.1.el7.centos.noarch > > ovirt-hosted-engine-ha-1.3.3.6-1.el7.centos.noarch > > ovirt-iso-uploader-3.6.0-1.el7.centos.noarch > > ovirt-vmconsole-1.0.0-1.el7.centos.noarch > > ovirt-setup-lib-1.0.0-1.el7.centos.noarch > > ovirt-engine-sdk-python-3.6.0.3-1.el7.centos.noarch > > > > Experienced same results as shaochen; Storage Domain hosted_storage doesn't > > exist. See message: "The Hosted Engine Storage Domain doesn't exist. It > > should be imported into the setup." > > Thanks. Can you also upload your engine.log and vdsm.log? Hi Roy, You might find the data here https://bugzilla.redhat.com/show_bug.cgi?id=1294457, probably it's the same issue. I believe I am experiencing this bug as well... 
- Running oVirt 3.6.1.3-1.el7.centos on a 3-host cluster
- Installed HE on Gluster SD (Gluster 3.7.6, hyperconverged, volume name = engine)
  -- DC was in "Uninitialized" state; if I clicked "Guide Me" then I could see a task to be done of "Attach Storage", and if I clicked on that, I could see the "hosted_storage" domain was available to be attached
- Added new data SD using Gluster (hyperconverged, volume name = vmdata)
  -- when I did this, the DC went from "Uninitialized" to "Up"; now the "hosted_storage" domain is not showing up anywhere, either as an existing SD or an available-to-be-attached SD
  -- the hosted engine VM also is not showing up in the "VMs" node

Saw the following message recorded in /var/log/ovirt-engine/engine.log on the engine VM:

Message: This Data center compatibility version does not support importing a data domain with its entities (VMs and Templates). The imported domain will be imported without them.
(cf. http://paste.fedoraproject.org/305453/45131110/)

The DC compat version shows as "3.6"

*** Bug 1215158 has been marked as a duplicate of this bug. ***

Update: I *do* see the "hosted_storage" SD in the "System" node, "Storage" sub-tab. It shows in Cross Data Center Status as "Unattached". Additionally, if one selects the hosted_storage SD and tries to Edit it, an "Uncaught exception" popup shows with the following detail message:

Uncaught exception occurred. Please try reloading the page. Details: (TypeError) __gwt$exception: <skipped>: Cannot read property 'b' of null

Will attach relevant screenshots.

Created attachment 1112290 [details]
hosted_storage SD status in System > Storage tab
Created attachment 1112291 [details]
uncaught exception when trying to edit hosted_storage SD
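For anyone scripting the recovery instead of clicking through the webadmin UI, the following is a hedged sketch (Python SDK v3, with placeholder URL, credentials and names) of attaching an unattached hosted_storage domain to the Default data center and activating it. It is not a verified procedure from this bug; the manual destroy-and-reimport workaround in the next comment is what was actually confirmed to work.

# Hedged sketch (oVirt 3.6-era Python SDK, API v3): attach an unattached
# 'hosted_storage' domain to the 'Default' data center and activate it.
# URL, credentials and names are placeholders; error handling is minimal.
from ovirtsdk.api import API

api = API(url='https://engine.example.com/ovirt-engine/api',
          username='admin@internal', password='secret', insecure=True)

dc = api.datacenters.get(name='Default')
unattached = api.storagedomains.get(name='hosted_storage')

if dc.storagedomains.get(name='hosted_storage') is None:
    # attach the domain to the data center first
    dc.storagedomains.add(unattached)

attached = dc.storagedomains.get(name='hosted_storage')
if attached.status.state != 'active':
    attached.activate()  # equivalent to the "Activate" link in the UI

api.disconnect()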
Update: I was able to successfully re-import the hosted_storage SD and make the HostedEngine VM appear in the webadmin UI by doing the following: 1) In the System node, Storage sub-tab, alt-click on the Unassigned hosted_storage SD, and choose Destroy, confirm and OK - SD is removed 2) Within a minute, the hosted_storage SD reappears; it is in status “Maintenance” 3) Go to the Data Centers node, Storage sub-tab, click on the hosted_storage SD and click the Activate link 4) Within a few seconds, the hosted_storage SD goes into state Active 5) Within a minute or so, the HostedEngine VM appears in the VMs node, and is in status Up I got the idea to do this via comments #9 and #12 in https://bugzilla.redhat.com/show_bug.cgi?id=1290518 Successfully got HE-SD&HE-VM auto-imported on cleanly installed NFS deployment after NFS data SD was added. Engine was installed using PXE. Works for me on these components: Host: ovirt-vmconsole-1.0.0-1.el7ev.noarch ovirt-hosted-engine-ha-1.3.3.7-1.el7ev.noarch mom-0.5.1-1.el7ev.noarch qemu-kvm-rhev-2.3.0-31.el7_2.6.x86_64 ovirt-host-deploy-1.4.1-1.el7ev.noarch libvirt-client-1.2.17-13.el7_2.2.x86_64 ovirt-setup-lib-1.0.1-1.el7ev.noarch vdsm-4.17.18-0.el7ev.noarch ovirt-vmconsole-host-1.0.0-1.el7ev.noarch ovirt-hosted-engine-setup-1.3.2.3-1.el7ev.noarch sanlock-3.2.4-2.el7_2.x86_64 Linux version 3.10.0-327.8.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon Jan 11 05:03:18 EST 2016 Engine: ovirt-vmconsole-1.0.0-1.el6ev.noarch ovirt-host-deploy-1.4.1-1.el6ev.noarch ovirt-setup-lib-1.0.1-1.el6ev.noarch ovirt-vmconsole-proxy-1.0.0-1.el6ev.noarch ovirt-host-deploy-java-1.4.1-1.el6ev.noarch ovirt-engine-extension-aaa-jdbc-1.0.5-1.el6ev.noarch rhevm-3.6.2.6-0.1.el6.noarch rhevm-dwh-setup-3.6.2-1.el6ev.noarch rhevm-dwh-3.6.2-1.el6ev.noarch rhevm-reports-setup-3.6.2.4-1.el6ev.noarch rhevm-reports-3.6.2.4-1.el6ev.noarch rhevm-guest-agent-common-1.0.11-2.el6ev.noarch Linux version 2.6.32-573.8.1.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #1 SMP Fri Sep 25 19:24:22 EDT 2015 For FC: HE SD auto import finished successfully as part of deployment over FC. HE storage domain gets imported automatically with first DC initialization. 
Tested using: Host: ovirt-hosted-engine-ha-1.3.3.7-1.el7ev.noarch libgovirt-0.3.3-1.el7_2.1.x86_64 ovirt-vmconsole-1.0.0-1.el7ev.noarch ovirt-host-deploy-1.4.1-1.el7ev.noarch ovirt-setup-lib-1.0.1-1.el7ev.noarch ovirt-vmconsole-host-1.0.0-1.el7ev.noarch ovirt-hosted-engine-setup-1.3.2.3-1.el7ev.noarch vdsm-jsonrpc-4.17.18-0.el7ev.noarch vdsm-python-4.17.18-0.el7ev.noarch vdsm-hook-vmfex-dev-4.17.18-0.el7ev.noarch vdsm-cli-4.17.18-0.el7ev.noarch vdsm-yajsonrpc-4.17.18-0.el7ev.noarch vdsm-xmlrpc-4.17.18-0.el7ev.noarch vdsm-4.17.18-0.el7ev.noarch vdsm-infra-4.17.18-0.el7ev.noarch Engine: ovirt-engine-extension-aaa-jdbc-1.0.5-1.el6ev.noarch ovirt-host-deploy-1.4.1-1.el6ev.noarch ovirt-vmconsole-1.0.0-1.el6ev.noarch ovirt-host-deploy-java-1.4.1-1.el6ev.noarch rhevm-setup-plugin-ovirt-engine-common-3.6.2.6-0.1.el6.noarch ovirt-vmconsole-proxy-1.0.0-1.el6ev.noarch rhevm-setup-plugin-ovirt-engine-3.6.2.6-0.1.el6.noarch ovirt-setup-lib-1.0.1-1.el6ev.noarch rhevm-setup-plugin-websocket-proxy-3.6.2.6-0.1.el6.noarch rhevm-vmconsole-proxy-helper-3.6.2.6-0.1.el6.noarch rhevm-spice-client-x86-msi-3.6-6.el6.noarch rhevm-lib-3.6.2.6-0.1.el6.noarch rhevm-cli-3.6.0.0-1.el6ev.noarch rhevm-webadmin-portal-3.6.2.6-0.1.el6.noarch rhevm-tools-3.6.2.6-0.1.el6.noarch rhevm-iso-uploader-3.6.0-1.el6ev.noarch rhevm-doc-3.6.0-2.el6eng.noarch rhevm-backend-3.6.2.6-0.1.el6.noarch rhevm-setup-3.6.2.6-0.1.el6.noarch rhevm-spice-client-x64-cab-3.6-6.el6.noarch rhevm-userportal-3.6.2.6-0.1.el6.noarch rhevm-image-uploader-3.6.0-1.el6ev.noarch rhevm-branding-rhev-3.6.0-3.el6ev.noarch rhevm-sdk-python-3.6.2.1-1.el6ev.noarch rhevm-log-collector-3.6.0-1.el6ev.noarch rhevm-dependencies-3.6.0-1.el6ev.noarch rhevm-setup-plugin-ovirt-engine-common-3.6.2.6-0.1.el6.noarch rhevm-dbscripts-3.6.2.6-0.1.el6.noarch rhevm-setup-plugins-3.6.1-2.el6ev.noarch rhevm-spice-client-x64-msi-3.6-6.el6.noarch rhevm-restapi-3.6.2.6-0.1.el6.noarch rhevm-setup-plugin-ovirt-engine-3.6.2.6-0.1.el6.noarch rhevm-3.6.2.6-0.1.el6.noarch rhevm-setup-base-3.6.2.6-0.1.el6.noarch rhevm-extensions-api-impl-3.6.2.6-0.1.el6.noarch rhevm-websocket-proxy-3.6.2.6-0.1.el6.noarch rhevm-setup-plugin-vmconsole-proxy-helper-3.6.2.6-0.1.el6.noarch rhevm-spice-client-x86-cab-3.6-6.el6.noarch We just attempted to run this with the following RPMs and it did not work on our environment. The Hosted Engine domain was not imported neither was the VM. We are using a gluster deployment instead of NFS as noted in comment 53 so not sure if that is breaking us or if there is something else. ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch ovirt-hosted-engine-ha-1.3.3.7-1.el7.centos.noarch ovirt-setup-lib-1.0.1-1.el7.centos.noarch ovirt-hosted-engine-setup-1.3.2.3-1.el7.centos.noarch ovirt-engine-appliance-3.6-20160126.1.el7.centos.noarch ovirt-vmconsole-1.0.0-1.el7.centos.noarch ovirt-host-deploy-1.4.1-1.el7.centos.noarch In the engine.log we see this error: 2016-01-28 02:49:52,155 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-39) [18b0ac7b] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}' 2016-01-28 02:49:52,156 WARN [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-39) [18b0ac7b] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. 
Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_NOT_EXIST Around the same time in the vdsm logs we see the following messages: jsonrpc.Executor/5::DEBUG::2016-01-28 02:49:52,904::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {u'Storage.9495c726-12a1-4fa5-b943-a0dc0a91bd16': < ResourceRef 'Storage.9495c726-12a1-4fa5-b943-a0dc0a91bd16', isValid: 'True' obj: 'None'>} I can see the volume exists here: [root@pserver7 config]# vdsClient -s 0 getStorageDomainsList dae81b36-caf2-4b58-8040-ab39ec1a1105 9495c726-12a1-4fa5-b943-a0dc0a91bd16 7192e134-b33e-4ce9-8bf5-cc5d749c38c7 06675908-925d-467a-a635-36119710d641 29ddf8d4-9af2-4a00-9e90-575f2adaf4ce Also to add onto comment 56. We had 4 storage domains successfully created in Ovirt so the datacenter does become active. We were also able to create another VM with no issues. I do see this warnings in the engine.log but am not entirely sure they are related at the moment, still doing some digging but in case someone has seen them: 2016-01-28 02:34:03,696 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-1) [] Could not add brick 'pserver7:/gluster/sdb1/engine/brick' to volume '674edbae-b960-4e16-89db-49c297b96b4b' - server uuid 'edb0245b-883f-451d-b92d-9a91038de8f8' not found in cluster '00000002-0002-0002-0002-000000000295' Sorry to keep commenting but just wanted to add that we DO NOT see the hosted_engine storage domain in the System node, Storage sub-tab. (In reply to Bryan from comment #57) > Sorry to keep commenting but just wanted to add that we DO NOT see the > hosted_engine storage domain in the System node, Storage sub-tab. Possibly you're hitting this one https://bugzilla.redhat.com/show_bug.cgi?id=1300749 , look at the comment # 12. That makes sense but my install is solely on a single host. I am not trying to add a second host to the cluster. I am attempting to install a single hosted engine server and am unable to see the engine SD in the ovirt UI after installing the single host. (In reply to Bryan from comment #59) > That makes sense but my install is solely on a single host. I am not trying > to add a second host to the cluster. > > I am attempting to install a single hosted engine server and am unable to > see the engine SD in the ovirt UI after installing the single host. After the comment from Roy, we changed the SD name to be 'hosted_storage' and the import was successful through an automated install. Will there be a fix to allow for other names since it is a property in the answer file as well as a text field in the interactive install? It would be nice to be able to change the name which is supported in the install to allow for easier understanding when looking at the storage domains in the UI. Verified on ovirt-hosted-engine-ha-1.3.4.3-1.el7ev.noarch and rhevm-3.6.3.3-0.1.el6.noarch 1) Checked auto-import on clean install ISCSI and NFS 2) Checked auto-import on upgrade from 3.5 to 3.6 ISCSI and NFS. this bug is similar to my issue and i try to reinstall RHEV and follow any instructions for Self Hosted deployment as to set default storage domain name "hosted_storage"... SH deployment is OK but hosted_storage and SH VM aren't available on RHEVM. After do the trick given on comment 52: - add another SD (master) and make it UP with hosted_engine_1 as SPM. - detach "hosted_storage" SD from the storage tab into system node. 
the detached "hosted_storage" SD appear after a will into the Default storage tree. So i could see "hosted_storage" SD and VM of self hosted engine. Sorry to notice that auto import hosted engine domain isn't ON_QA->Verified for my environment: RHEVH 7.2 Beta RHEVM self-hosted "rhevm-appliance-20160120.0-1.x86_64.rhevm.ova" deployed with FC storage. After upgrading from 3.6 to 3.6.3 I still cannot see the hosted_engine vm. I'm using glusterfs with replica 3. looking into the logs, both of my 2 nodes cluster complain about having the hosted engine storage domain already mounted, with wrong path: Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt '9' Mar 11 10:01:04 pdp-cluster2 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Too many errors occurred, giving up. Please review the log and consider filing a bug. Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors occurred, giving up. Please review the log and consider filing a bug. Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down Mar 11 10:01:04 pdp-cluster2 systemd: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a Mar 11 10:01:04 pdp-cluster2 systemd: Unit ovirt-ha-agent.service entered failed state. Mar 11 10:01:04 pdp-cluster2 systemd: ovirt-ha-agent.service failed. Mar 11 10:01:04 pdp-cluster2 systemd: ovirt-ha-agent.service holdoff time over, scheduling restart. Mar 11 10:01:04 pdp-cluster2 systemd: Started oVirt Hosted Engine High Availability Monitoring Agent. Mar 11 10:01:04 pdp-cluster2 systemd: Starting oVirt Hosted Engine High Availability Monitoring Agent... Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.agent.Agent:ovirt-hosted-engine-ha agent 1.3.4.3 started Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Found certificate common name: pdp-cluster2 Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Initializing VDSM Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Connecting the storage Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Connecting storage server Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:VDSM domain monitor status: PENDING Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed to stop monitoring domain (sd_uuid=142838d0-eda2-4513-bc5e-50d482ba0685): Error 900 from stopMonitoringDomain: Storage domain is member of pool: 'domain=142838d0-eda2-4513-bc5e-50d482ba0685' Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.image.Image:Teardown images Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: WARNING:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Disconnecting the storage Mar 11 10:01:04 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Disconnecting storage server Mar 11 10:01:05 pdp-cluster2 kernel: device-mapper: table: 253:0: multipath: error getting device Mar 11 10:01:05 pdp-cluster2 kernel: device-mapper: ioctl: error adding target to table Mar 11 10:01:05 pdp-cluster2 multipathd: dm-0: remove map (uevent) Mar 11 10:01:05 pdp-cluster2 
multipathd: dm-0: remove map (uevent) Mar 11 10:01:05 pdp-cluster2 kernel: device-mapper: table: 253:0: multipath: error getting device Mar 11 10:01:05 pdp-cluster2 kernel: device-mapper: ioctl: error adding target to table Mar 11 10:01:05 pdp-cluster2 multipathd: dm-0: remove map (uevent) Mar 11 10:01:05 pdp-cluster2 multipathd: dm-0: remove map (uevent) Mar 11 10:01:05 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.upgrade.StorageServer:Fixing storage path in conf file Mar 11 10:01:05 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.upgrade.StorageServer:Reading conf file: hosted-engine.conf Mar 11 10:01:05 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.upgrade.StorageServer:Successfully fixed path in conf file Mar 11 10:01:05 pdp-cluster2 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Connecting storage server Mar 11 10:01:05 pdp-cluster2 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Error: 'The hosted-engine storage domain is already mounted on '/rhev/data-center/mnt/glusterSD/gluster2:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685' with a path that is not supported anymore: the right path should be '/rhev/data-center/mnt/gluster1:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685'.' - trying to restart agent Mar 11 10:01:05 pdp-cluster2 ovirt-ha-agent: ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Error: 'The hosted-engine storage domain is already mounted on '/rhev/data-center/mnt/glusterSD/gluster2:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685' with a path that is not supported anymore: the right path should be '/rhev/data-center/mnt/gluster1:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685'.' - trying to restart agent Mar 11 10:01:07 pdp-cluster2 sanlock[997]: 2016-03-11 10:01:07+0100 81894 [1002]: add_lockspace 142838d0-eda2-4513-bc5e-50d482ba0685:2:/rhev/data-center/mnt/glusterSD/gluster2:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685/dom_md/ids:0 conflicts with name of list1 s2373 142838d0-eda2-4513-bc5e-50d482ba0685:2:/rhev/data-center/mnt/glusterSD/gluster1:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685/dom_md/ids:0 and so on.... (In reply to Matteo Brancaleoni from comment #63) > After upgrading from 3.6 to 3.6.3 I still cannot see the hosted_engine vm. > > I'm using glusterfs with replica 3. > > looking into the logs, both of my 2 nodes cluster complain about having the > hosted engine storage domain already mounted, with wrong path: You bumped into another bug 1300749, which should be fixed in this version. Please add your report to that bug and reopen if if needed. Here the issue is about the gluster entry point: /rhev/data-center/mnt/gluster1:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685 /rhev/data-center/mnt/glusterSD/gluster2:_ovirt-storage/142838d0-eda2-4513-bc5e-50d482ba0685 Did you used the same gluster entry point (gluster1/... or gluster2/...) on both your hosts? The issue is this one https://bugzilla.redhat.com/show_bug.cgi?id=1299427 Yes, just doublechecked and it points to the same entry point (looked into hosted-engine.conf). Further investigation, led me to find an unattached storage which was a previous try into importing the hosted storage domain in the "hope" to solve error exposed in comment 53. So I destroyed it from the web gui, but still a reboot of the nodes was required. now, after reboot, I don't see anymore the error regarding the already mounted error, but still the hosted engine storage/vm does not appear between vm list. 
the only log that keeps being emitted is: Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Preparing images Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.lib.image.Image:Preparing images Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Reloading vm.conf from the shared storage domain Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config:Trying to get a fresher copy of vm configuration from the OVF_STORE Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: WARNING:ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore:Unable to find OVF_STORE Mar 11 12:34:26 pdp-cluster1 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config:Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:VDSM domain monitor status: NONE Mar 11 12:34:26 pdp-cluster1 ovirt-ha-agent: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Started VDSM domain monitor for 142838d0-eda2-4513-bc5e-50d482ba0685 and so no You have now to add your first regular storage domain Yes it (In reply to Simone Tiraboschi from comment #67) > You have now to add your first regular storage domain Yes it worked, thanks! I am running into this same issue on a fresh install of RHEV 3.6. The work around is not working for me. I have already added my regular storage domain. However, the hosted-engine vm does not appear in the UI. Regarding the hypervisor node where the hosted-engine is running. Thus far I am unable to put that host into maintenance mode. I am seeing this error in the engine.log. "Failed to find host on any provider by host name 'rhev.lab.localdomain' " This is the hostname of my hosted-engine. I see similar issue on 4.05 version. Impossible to import default iscsi storage VDSM hosted_engine_1 command failed: Cannot acquire host id: ('aaf27ad6-ef68-4fde-bd3a-a24ac3110b91', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument')) |
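The 'Sanlock lockspace add failure' in the last comment is the same symptom seen earlier in this bug: sanlock refuses to add a lockspace whose name is already registered with a different host id or path (for example, the hosted-engine agent already holding the domain's lockspace while the engine-driven attach tries to add it again). Below is a small, hypothetical diagnostic helper, not part of oVirt or VDSM, that scans a sanlock log for such conflicts; the default log path is an assumption.

# Hypothetical diagnostic helper (not part of oVirt/VDSM): scan a sanlock log
# for "conflicts with name of" lines, which indicate that a lockspace with the
# same name is already registered with a different host id or path.
import re
import sys

# Matches "add_lockspace NAME:HOSTID:PATH ... conflicts with name of <tag> <sN> NAME:HOSTID:PATH"
# as seen in the log excerpts quoted in this bug; PATH keeps its trailing ":offset".
CONFLICT = re.compile(
    r'add_lockspace\s+(\S+?):(\d+):(\S+)\s+conflicts with name of\s+\S+\s+\S+\s+(\S+?):(\d+):(\S+)')

def find_lockspace_conflicts(path='/var/log/sanlock.log'):
    conflicts = []
    with open(path) as log:
        for line in log:
            m = CONFLICT.search(line)
            if m:
                requested = {'lockspace': m.group(1), 'host_id': m.group(2), 'ids_path': m.group(3)}
                existing = {'lockspace': m.group(4), 'host_id': m.group(5), 'ids_path': m.group(6)}
                conflicts.append((requested, existing))
    return conflicts

if __name__ == '__main__':
    for requested, existing in find_lockspace_conflicts(*sys.argv[1:]):
        print('requested %(lockspace)s host_id=%(host_id)s path=%(ids_path)s' % requested)
        print('  already held as host_id=%(host_id)s path=%(ids_path)s' % existing)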