Created attachment 1079221 [details]
vdsm log showing the import exception

Description of problem:
After finishing a self hosted engine setup and logging into the web UI, importing the storage domain the engine VM is running on fails with 'Error while executing action Attach Storage Domain: AcquireHostIdFailure'.

Version-Release number of selected component (if applicable):
ovirt-release36-001-0.5.beta.noarch
vdsm-gluster-4.17.8-0.el7.centos.noarc

How reproducible:
Every time I tried.

Steps to Reproduce:
Setup for the host:
- CentOS 7.1 minimal installation
- Following the steps to set up oVirt in the 3.6 RC notes http://www.ovirt.org/OVirt_3.6_Release_Notes and http://www.ovirt.org/Hosted_Engine_Howto#Fresh_Install

Storage:
- CentOS 7.1 minimal installation
- GlusterFS 3.7
- replica 3 volume set up according to this http://www.ovirt.org/Features/Self_Hosted_Engine_Gluster_Support

1. Finish the self hosted engine setup and log in to the web UI
2. Select 'Import Pre-Configured Domain' from the Storage tab
3. Change storage type to glusterfs, fill out the name and export path information (same as during the hosted-engine --deploy)

Actual results:
'Error while executing action Attach Storage Domain: AcquireHostIdFailure'

Expected results:
Imported storage domain.
Additional info:
vdsm.log on the host shows this error when trying to import the storage domain:

Thread-1640::ERROR::2015-10-01 15:08:39,788::task::866::Storage.TaskManager.Task::(_setError) Task=`2d6cafb6-1192-4b68-9592-269aaffa7da9`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 998, in createStoragePool
    leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 574, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 506, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 532, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 234, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'df673b5c-3c20-4cf5-a0c2-f9b55559d917', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
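
For triage, the two useful facts in the final exception line are the storage domain UUID and the errno reported by sanlock. The following is a minimal Python sketch that extracts both and maps the errno to its symbolic name. It assumes the message format seen in the log excerpt above (a single sample, not a documented vdsm format), and the helper name `parse_acquire_host_id_failure` is hypothetical, not part of vdsm.

```python
import errno
import re

# Pattern for the message text seen in the vdsm log above. This is an
# assumption based on that single sample, not a documented vdsm format.
_PATTERN = re.compile(
    r"Cannot acquire host id: \(u?'(?P<sd_uuid>[0-9a-f-]{36})', "
    r"SanlockException\((?P<errno>\d+), '(?P<msg>[^']*)', '(?P<detail>[^']*)'\)\)"
)


def parse_acquire_host_id_failure(line):
    """Return (sd_uuid, errno_name, message), or None if the line does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group("errno"))
    # Fall back to the raw number if the platform has no name for this errno.
    name = errno.errorcode.get(code, str(code))
    return (match.group("sd_uuid"), name, match.group("msg"))


sample = ("AcquireHostIdFailure: Cannot acquire host id: "
          "(u'df673b5c-3c20-4cf5-a0c2-f9b55559d917', "
          "SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))")
print(parse_acquire_host_id_failure(sample))
```

Errno 22 is EINVAL, which is consistent with the 'Invalid argument' text in the exception.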
Created attachment 1079222 [details] sanlock log file
Created attachment 1079223 [details] ovirt hosted engine setup log
Created attachment 1079224 [details] ovirt hosted engine setup answer file
Importing a domain involves attaching it to a pool, which can't be done since the self hosted engine's domain is in its own "pool". Roy - weren't you guys working on a procedure that does this?
(In reply to Allon Mureinik from comment #4)
> Importing a domain involves attaching it to a pool, which can't be done
> since the self hosted engine's domain is in its own "pool".
>
> Roy - weren't you guys working on a procedure that does this?

In 3.6 we import the hosted engine domain into the engine.

But here it seems that the lock is held by someone who isn't the SPM. Did you do some manual recovery procedure or something else?

Maor/Allon - how is gluster related to this? With iSCSI and NFS this works (and we use sanlock there too, cmiiw).
(In reply to Roy Golan from comment #5)
> (In reply to Allon Mureinik from comment #4)
> > Importing a domain involves attaching it to a pool, which can't be done
> > since the self hosted engine's domain is in its own "pool".
> >
> > Roy - weren't you guys working on a procedure that does this?
>
> In 3.6 we import the hosted engine domain into the engine.
>
> But here it seems that the lock is held by someone who isn't the SPM. Did you
> do some manual recovery procedure or something else?
>
> Maor/Allon - how is gluster related to this? With iSCSI and NFS this works
> (and we use sanlock there too, cmiiw).

Basically, Gluster and NFS should follow similar logic. AcquireHostIdFailure usually occurs when a host still holds a sanlock lease on the storage domain. This can happen when the environment has been recovered and the hosts were not rebooted first.

Is this a recovered environment?
Can you please also attach the full engine log with the error from the import operation?
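
One way to check whether a host already holds a lockspace for the domain is to look at the `s` lines printed by `sanlock client status` on that host. The Python sketch below parses such output. The field layout (`s NAME:HOST_ID:PATH:OFFSET`) is an assumption about sanlock's output that should be verified against the installed version, the function name `parse_lockspaces` is hypothetical, and the sample text is invented for illustration rather than taken from the attached logs.

```python
def parse_lockspaces(status_text):
    """Return a list of (name, host_id, path, offset) tuples from 's' lines."""
    lockspaces = []
    for line in status_text.splitlines():
        if not line.startswith("s "):
            continue  # skip daemon, process ('p') and resource ('r') lines
        spec = line[2:].strip()
        # The path may itself contain ':' (for example server:_volume mounts),
        # so take name and host id from the left and the offset from the right.
        name, host_id, rest = spec.split(":", 2)
        path, offset = rest.rsplit(":", 1)
        lockspaces.append((name, int(host_id), path, int(offset)))
    return lockspaces


# Hypothetical sample output; daemon id, hostname, path and host id are invented.
sample_status = """\
daemon 00000000-0000-0000-0000-000000000000.example-host
p -1 helper
p -1 listener
s df673b5c-3c20-4cf5-a0c2-f9b55559d917:1:/rhev/data-center/mnt/glusterSD/server:_engine/df673b5c-3c20-4cf5-a0c2-f9b55559d917/dom_md/ids:0
"""

for name, host_id, path, offset in parse_lockspaces(sample_status):
    print(name, host_id, path, offset)
```

If the domain UUID from the AcquireHostIdFailure message shows up as a lockspace name here, the host already holds a host id on that domain.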
The engine.log on the machine I filed the bug report from was rotated and removed, so I wiped everything and reinstalled from scratch. Notably, after the installation finished and the engine was shut down, the HA agent did not start it again. But I guess this is another problem.

Since I'm quite new to oVirt I'm not sure what you mean by 'recovered environment'. I followed these steps:

yum -y install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
yum -y install ovirt-hosted-engine-setup screen vdsm-gluster
screen
hosted-engine --deploy

I install the engine from the local CentOS 7.1 ISO image. Normally I access the engine immediately after the installation is done and try to import the storage domain. In this instance I rebooted first. After the reboot the engine was started automatically.

As described in the initial report, I logged in. The Storage tab is empty. I tried to import the storage domain created during the deploy process. This failed again with 'AcquireHostIdFailure'. The vdsm log shows the same error as above. The engine log shows this error:

2015-10-14 11:18:06,864 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-10) [66bb3917] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM cube-one command failed: Cannot acquire host id: (u'2fe5d951-060c-455a-af2c-aeb77120b969', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))

If I try to import the domain again, the error changes to 'Storage connection already exists'.

I'm attaching three archives to this bug containing all logs of the engine and the host, a list of packages installed on those systems, and screenshots from the engine WebUI.
Created attachment 1082765 [details] Logs from the host system.
Created attachment 1082766 [details] Logs from the engine system.
Created attachment 1082768 [details] Screenshots of the WebUI during the import.
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
This is an automated message.
oVirt 3.6.0 RC3 has been released and GA is targeted to next week, Nov 4th 2015.
Please review this bug and if not a blocker, please postpone to a later release.
All bugs not postponed on GA release will be automatically re-targeted to

- 3.6.1 if severity >= high
- 4.0 if severity < high
Can you try with the latest RC? I think it is resolved. Note that it still doesn't auto import; this should be fixed as part of BZ #1269768.
I tried 3.6 RC 3 following the same installation steps as before. The manual import still fails with the same error. I'm attaching the latest log files from the host and the engine.
Created attachment 1087470 [details] Logs from the Engine RC3
Created attachment 1087471 [details] Logs from the Host RC3
Do you know about this issue? Should this be on SLA?
(In reply to Yaniv Dary from comment #17)
> Do you know about this issue? Should this be on SLA?

As stated before, this will be solved in bug 1269768.

*** This bug has been marked as a duplicate of bug 1269768 ***