Created attachment 1001003 [details]
host logs

In the hyper-converged configuration the gluster daemon is configured and running on the host. Hosted Engine Setup creates a local brick and uses it to create a master data storage domain where the Hosted Engine VM is deployed. Once the VM is deployed and the engine is running, the setup uses the SDK to add the host to the cluster. ovirt-host-deploy is then executed by the engine. At this point the mount point under /rhev used by the data storage domain disappears and the Hosted Engine VM enters the paused state.

In the mount logs:

[2015-02-12 15:33:45.223019] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-02-12 15:33:45.223040] I [fuse-bridge.c:5599:fini] 0-fuse: Unmounting '/rhev/data-center/mnt/glusterSD/mididell.home:_hosted__engine__glusterfs'.

It seems that something is terminating glusterd with SIGTERM. Restarting glusterd shouldn't cause this, and neither ovirt-host-deploy nor ovirt-hosted-engine-setup sends SIGTERM.
Created attachment 1001005 [details]
hosted engine vm logs
Host is RHEL 7.1.

Fabric script for installing the RPMs used for the test:

import fabric.api

def testing():
    fabric.api.run('yum -y install glusterfs-server python-magic')
    urls = [
        ('http://jenkins.ovirt.org/job/'
         'ovirt-hosted-engine-setup_any_create-rpms_manual/11/ARCH=x86_64,'
         'DISTRIBUTION=epel-7/artifact/exported-artifacts/'
         'ovirt-hosted-engine-setup-1.2.3-0.0.master.el7.centos.noarch.rpm')
    ]
    for package in (
        'vdsm-4.16.12-7.gita30da75.el7.centos.x86_64.rpm',
        'vdsm-cli-4.16.12-7.gita30da75.el7.centos.noarch.rpm',
        'vdsm-python-zombiereaper-4.16.12-7.gita30da75.el7.centos.noarch.rpm',
        'vdsm-jsonrpc-4.16.12-7.gita30da75.el7.centos.noarch.rpm',
        'vdsm-xmlrpc-4.16.12-7.gita30da75.el7.centos.noarch.rpm',
        'vdsm-yajsonrpc-4.16.12-7.gita30da75.el7.centos.noarch.rpm',
        'vdsm-python-4.16.12-7.gita30da75.el7.centos.noarch.rpm',
        'vdsm-gluster-4.16.12-7.gita30da75.el7.centos.noarch.rpm',
    ):
        urls.append(
            'http://jenkins.ovirt.org/job/manual-build-tarball/490/label=el7/'
            'artifact/exported-artifacts/%s' % package
        )
    fabric.api.run('rpm -Uvh --force ' + ' '.join(urls))

Gluster RPMs come from http://download.gluster.org/pub/gluster/glusterfs/3.6/3.6.2/; the other oVirt RPMs are from the nightly master snapshot.
Assigning to the VDSM component, since the removal of the storage seems to be caused by FUSE termination, while libgfapi is supposed to be enabled in the RPMs used.
I experienced a similar behaviour. While hosted-engine-setup was adding the host to the newly created cluster, VDSM crashed.

In the hosted-engine-setup logs we have:

[...]
2015-03-26 18:36:48 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._closeup:635 Adding the host to the cluster
2015-03-26 18:36:49 INFO otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:169 Waiting for the host to become operational in the engine. This may take several minutes...
2015-03-26 18:36:50 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:189 VDSM host in installing state
2015-03-26 18:36:54 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:189 VDSM host in installing state
2015-03-26 18:36:57 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:189 VDSM host in installing state
2015-03-26 18:54:43 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:183 Error fetching host state: [ERROR]::oVirt API connection failure, (7, 'I/O operation timed out')
2015-03-26 18:54:43 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:189 VDSM host in state
2015-03-26 18:56:51 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:183 Error fetching host state: [ERROR]::oVirt API connection failure, (7, 'Failed connect to m5.imatronix.com:443; Connection timed out')
2015-03-26 18:56:51 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:189 VDSM host in state
2015-03-26 18:59:00 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:183 Error fetching host state: [ERROR]::oVirt API connection failure, (7, 'Failed connect to m5.imatronix.com:443; Connection timed out')
[...]

In the VDSM logs, we have:

[...]
MainThread::DEBUG::2015-03-26 18:36:56,767::vdsm::66::vds::(sigtermHandler) Received signal 15
MainThread::DEBUG::2015-03-26 18:36:56,768::protocoldetector::158::vds.MultiProtocolAcceptor::(stop) Stopping Acceptor
MainThread::INFO::2015-03-26 18:36:56,768::__init__::565::jsonrpc.JsonRpcServer::(stop) Stopping JsonRPC Server
Detector thread::DEBUG::2015-03-26 18:36:56,768::protocoldetector::130::vds.MultiProtocolAcceptor::(_cleanup) Cleaning up
Thread-88::DEBUG::2015-03-26 18:36:56,769::fileSD::262::Storage.Misc.excCmd::(getReadDelay) FAILED: <err> = ''; <rc> = -15
MainThread::DEBUG::2015-03-26 18:36:56,769::vmchannels::192::vds::(stop) VM channels listener was stopped.
Thread-88::ERROR::2015-03-26 18:36:56,769::misc::208::Storage.Misc::(readspeed) Unable to read file '/rhev/data-center/mnt/glusterSD/h2.imatronix.com:engine/7ba46e75-51af-4648-becc-5a469cb8e9c2/dom_md/metadata'
Thread-88::ERROR::2015-03-26 18:36:56,769::monitor::250::Storage.Monitor::(_monitorDomain) Error monitoring domain 7ba46e75-51af-4648-becc-5a469cb8e9c2
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 247, in _monitorDomain
    self._checkReadDelay()
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 726, in wrapper
    value = meth(self, *a, **kw)
  File "/usr/share/vdsm/storage/monitor.py", line 319, in _checkReadDelay
    self.nextStatus.readDelay = self.domain.getReadDelay()
  File "/usr/share/vdsm/storage/fileSD.py", line 262, in getReadDelay
    stats = misc.readspeed(self.metafile, 4096)
  File "/usr/share/vdsm/storage/misc.py", line 209, in readspeed
    raise se.MiscFileReadException(path)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/h2.imatronix.com:engine/7ba46e75-51af-4648-becc-5a469cb8e9c2/dom_md/metadata',)
ioprocess communication (3617)::ERROR::2015-03-26 18:36:56,768::__init__::152::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 107, in _communicate
    raise Exception("FD closed")
Exception: FD closed
MainThread::INFO::2015-03-26 18:36:56,769::momIF::87::MOM::(stop) Shutting down MOM
Thread-88::INFO::2015-03-26 18:36:56,770::monitor::273::Storage.Monitor::(_notifyStatusChanges) Domain 7ba46e75-51af-4648-becc-5a469cb8e9c2 became INVALID
Thread-147::DEBUG::2015-03-26 18:36:56,771::misc::813::Storage.Event.Storage.DomainMonitor.onDomainStateChange::(_emit) Emitting event
MainThread::DEBUG::2015-03-26 18:36:56,771::task::592::Storage.TaskManager.Task::(_updateState) Task=`aac0fcbf-3a3c-48be-bd72-26e4bcc2baf6`::moving from state init -> state preparing
Thread-147::DEBUG::2015-03-26 18:36:56,771::misc::823::Storage.Event.Storage.DomainMonitor.onDomainStateChange::(_emit) Calling registered method `contEIOVms`
ioprocess communication (3617)::DEBUG::2015-03-26 18:36:56,771::__init__::298::IOProcessClient::(_run) Starting IOProcess...
MainThread::INFO::2015-03-26 18:36:56,771::logUtils::48::dispatcher::(wrapper) Run and protect: prepareForShutdown(options=None)
Thread-147::DEBUG::2015-03-26 18:36:56,771::misc::833::Storage.Event.Storage.DomainMonitor.onDomainStateChange::(_emit) Event emitted
Thread-11::DEBUG::2015-03-26 18:36:56,772::storageServer::725::Storage.ConnectionMonitor::(_monitorConnections) Monitoring stopped
MainThread::INFO::2015-03-26 18:36:56,782::monitor::134::Storage.Monitor::(close) Stop monitoring all domains
MainThread::INFO::2015-03-26 18:36:56,782::monitor::147::Storage.Monitor::(_stopMonitors) Stop monitoring 7ba46e75-51af-4648-becc-5a469cb8e9c2
Thread-88::DEBUG::2015-03-26 18:36:56,782::monitor::209::Storage.Monitor::(_run) Domain monitor for 7ba46e75-51af-4648-becc-5a469cb8e9c2 stopped
MainThread::DEBUG::2015-03-26 18:36:56,782::monitor::153::Storage.Monitor::(_stopMonitors) Waiting for monitor 7ba46e75-51af-4648-becc-5a469cb8e9c2
Thread-88::INFO::2015-03-26 18:36:56,782::clusterlock::245::Storage.SANLock::(releaseHostId) Releasing host id for domain 7ba46e75-51af-4648-becc-5a469cb8e9c2 (id: 1)
Thread-88::ERROR::2015-03-26 18:36:56,782::monitor::366::Storage.Monitor::(_releaseHostId) Error releasing host id 1 for domain 7ba46e75-51af-4648-becc-5a469cb8e9c2
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 363, in _releaseHostId
    self.domain.releaseHostId(self.hostId, unused=True)
  File "/usr/share/vdsm/storage/sd.py", line 480, in releaseHostId
    self._clusterLock.releaseHostId(hostId, async, unused)
  File "/usr/share/vdsm/storage/clusterlock.py", line 252, in releaseHostId
    raise se.ReleaseHostIdFailure(self._sdUUID, e)
ReleaseHostIdFailure: Cannot release host id: (u'7ba46e75-51af-4648-becc-5a469cb8e9c2', SanlockException(16, 'Sanlock lockspace remove failure', 'Device or resource busy'))
MainThread::DEBUG::2015-03-26 18:36:56,783::taskManager::90::Storage.TaskManager::(prepareForShutdown) Request to stop all tasks
MainThread::INFO::2015-03-26 18:36:56,783::taskManager::96::Storage.TaskManager::(prepareForShutdown) cdb4f72b-1789-4a9f-adc5-69f7488f5992
MainThread::INFO::2015-03-26 18:36:56,783::logUtils::51::dispatcher::(wrapper) Run and protect: prepareForShutdown, Return response: None
MainThread::DEBUG::2015-03-26 18:36:56,783::task::1188::Storage.TaskManager.Task::(prepare) Task=`aac0fcbf-3a3c-48be-bd72-26e4bcc2baf6`::finished: None
MainThread::DEBUG::2015-03-26 18:36:56,783::task::592::Storage.TaskManager.Task::(_updateState) Task=`aac0fcbf-3a3c-48be-bd72-26e4bcc2baf6`::moving from state preparing -> state finished
[...]
When trying to restart VDSM, we get:

Thread-46::ERROR::2015-03-26 19:24:31,843::vm::1237::vm.Vm::(_startUnderlyingVm) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 1185, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 2253, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3427, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Failed to acquire lock: No space left on device
Thread-46::INFO::2015-03-26 19:24:31,844::vm::1709::vm.Vm::(setDownStatus) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::Changed state to Down: Failed to acquire lock: No space left on device (code=1)
Thread-46::DEBUG::2015-03-26 19:24:31,844::vmchannels::214::vds::(unregister) Delete fileno 60 from listener.
VM Channels Listener::DEBUG::2015-03-26 19:24:32,346::vmchannels::121::vds::(_do_del_channels) fileno 60 was removed from listener.

In sanlock.log we have:

2015-03-26 19:24:30+0000 7589 [752]: cmd 9 target pid 9559 not found
2015-03-26 19:24:31+0000 7589 [764]: r7 cmd_acquire 2,8,9559 invalid lockspace found -1 failed 935819904 name 7ba46e75-51af-4648-becc-5a469cb8e9c2

All 3 lease files are present.
Facts:

1) The engine storage (/rhev/data-center/mnt/...) was unmounted during this process.
2) Another mount of the same volume (I mounted it before the setup) was still mounted after the VDSM crash, so maybe the problem is not related to gluster.
3) After reconnecting the storage with "hosted-engine --connect-storage", the volume is mounted again.

Now, when trying to restart VDSM, we get an "invalid lockspace": the same libvirtError ("Failed to acquire lock: No space left on device") traceback and sanlock.log entries as in the previous comment. (All 3 lease files are present.)
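For reference, the recovery sequence from fact 3, as a minimal sketch (it assumes the hosted-engine tooling used in this report and a standard systemd host; the grep pattern is only illustrative):

# Re-create the hosted-engine storage connection and confirm the volume is mounted again
hosted-engine --connect-storage
mount | grep glusterSD

# Only then restart VDSM; if sanlock still reports the lockspace as busy or invalid,
# the old host-id lease may need to expire before the VM can be started again.
systemctl restart vdsmd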
About the hosted-engine VM not being resumed after restarting VDSM: I confirmed that QEMU is not reopening the file descriptors when resuming a paused VM.

See also:
- https://bugzilla.redhat.com/show_bug.cgi?id=1172905
- https://bugzilla.redhat.com/show_bug.cgi?id=1058300
I found out that the gluster engine storage disappears when VDSM is restarted during the activation process, because it is a child process of vdsmd.service:

CGroup: /system.slice/vdsmd.service
        ├─14895 /usr/bin/python /usr/share/vdsm/vdsm
        ├─14964 /usr/libexec/ioprocess --read-pipe-fd 43 --write-pipe-fd 42 --max-threads 10 --ma
        ├─15836 /usr/libexec/ioprocess --read-pipe-fd 49 --write-pipe-fd 48 --max-threads 10 --ma
        ├─15911 /usr/sbin/glusterfs --volfile-server=h2.imatronix.com --volfile-id=engine /rhev/d
        └─15922 /usr/libexec/ioprocess --read-pipe-fd 61 --write-pipe-fd 60 --max-threads 10 --ma

A workaround is to mount the engine storage manually, so that when vdsmd is restarted on the host, it doesn't kill the storage and the engine VM.
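A quick way to confirm this observation on an affected host (a sketch; the pgrep pattern is an assumption based on the process listing above):

# Every process systemd tracks under vdsmd's cgroup; the glusterfs mount
# helper appearing here is what makes it a casualty of a vdsmd restart.
systemd-cgls /system.slice/vdsmd.service

# Same check for the mount helper alone (first match only):
cat /proc/$(pgrep -f 'glusterfs --volfile-server' | head -n1)/cgroup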
(In reply to Christopher Pereira from comment #7)
> I found out that the gluster engine storage disappears when VDSM is
> restarted during the activation process, because it is a child process of
> vdsmd.service:
>
> CGroup: /system.slice/vdsmd.service
>         ├─14895 /usr/bin/python /usr/share/vdsm/vdsm
>         ├─14964 /usr/libexec/ioprocess --read-pipe-fd 43 --write-pipe-fd 42 --max-threads 10 --ma
>         ├─15836 /usr/libexec/ioprocess --read-pipe-fd 49 --write-pipe-fd 48 --max-threads 10 --ma
>         ├─15911 /usr/sbin/glusterfs --volfile-server=h2.imatronix.com --volfile-id=engine /rhev/d
>         └─15922 /usr/libexec/ioprocess --read-pipe-fd 61 --write-pipe-fd 60 --max-threads 10 --ma
>
> A workaround is to mount the engine storage manually, so that when vdsmd is
> restarted on the host, it doesn't kill the storage and the engine VM.

Thanks Christopher. The one thing that bothers me is that this issue only happens when we're working with Gluster storage. Any other storage type works fine with the same flow. Any idea why?
Doron, Gluster FDs get invalidated because the glusterfs process is killed together with vdsmd. Maybe with NFS the process hosting those FDs is not inside the same cgroup and is not killed during vdsmd restart.
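That difference can be seen directly with standard tools (a sketch; the filesystem types are the ones util-linux reports, everything else is illustrative):

# A FUSE-based gluster mount is backed by a user-space helper process...
findmnt -t fuse.glusterfs -o TARGET,SOURCE
pgrep -af glusterfs

# ...while a kernel NFS mount has no per-mount helper that could sit in
# (and be killed together with) the vdsmd cgroup.
findmnt -t nfs,nfs4 -o TARGET,SOURCE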
Guys, an easy workaround to keep the hosted-engine VM alive during the setup is to move the glusterfs process back to glusterd.service's cgroup with this command:

echo $glusterfspid > /sys/fs/cgroup/systemd/system.slice/glusterd.service/tasks
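Expanded into a small sketch (the pgrep pattern is an assumption taken from the cgroup listing in comment 7; the cgroup path is the one from the command above and may differ between systemd versions):

# Find the glusterfs client backing the engine mount (first match only)
# and move it into glusterd.service's cgroup, so a vdsmd restart no longer
# takes the mount (and the engine VM) down with it.
glusterfspid=$(pgrep -f 'glusterfs --volfile-server' | head -n1)
echo "$glusterfspid" > /sys/fs/cgroup/systemd/system.slice/glusterd.service/tasks

# Verify: the glusterfs process should no longer be listed here
systemd-cgls /system.slice/vdsmd.service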
Hi Christopher,

Will restarting glusterd kill the glusterfs process when this is done?

Pranith
No. Actually, even after stopping glusterd, I still see glusterfs and glusterfsd processes running. I'm not sure if this is normal, but it happens with 3.7dev.

But if you kill those processes, QEMU's file descriptors will be invalidated and the VM won't start anymore. You will see a "block I/O error in device 'drive-virtio-disk0': Transport endpoint is not connected (107)".

But AFAIR there was a trick: virsh suspend + resume. This probably works because QEMU reopens the files (this is possible because libvirt passes file paths, not file descriptors).

In the future, when we pass file descriptors to QEMU instead of file paths, we will need to reinitialize them in libvirt or wherever they are received.
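The suspend/resume trick, sketched with virsh (the domain name is whatever `virsh list` shows for the hosted-engine VM, typically HostedEngine; on a VDSM-managed host virsh may also ask for libvirt credentials):

# Find the VM (a read-only connection works without credentials)
virsh -r list --all

# Suspend and resume; on resume QEMU reportedly reopens the image paths,
# which refreshes the file descriptors invalidated by the lost mount.
virsh suspend HostedEngine
virsh resume HostedEngine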
(In reply to Christopher Pereira from comment #9)
> Doron,
>
> Gluster FDs get invalidated because the glusterfs process is killed together
> with vdsmd.
> Maybe with NFS the process hosting those FDs is not inside the same cgroup
> and is not killed during vdsmd restart.

This could be because the GlusterFS mount process is a user-space process, unlike NFS mounts, which are handled by the kernel.

(In reply to Christopher Pereira from comment #12)
> No.
> Actually even after stopping glusterd, I still see glusterfs and glusterfsd
> processes running. I'm not sure if this is normal, but it happens with
> 3.7dev.

This is actually surprising to me. I always thought that systemd would keep track of processes under a service's cgroup and terminate all of them when the service was stopped. Did you stop glusterd manually or via systemctl?
The best solution here would be to have QEMU use gfapi to access the disk image. This would make QEMU independent of the VDSM mount, and VDSM would be free to do as it pleases with the mount.
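For illustration, gfapi access bypasses the FUSE mount entirely; a sketch using qemu-img (the volume and host names are the ones from this report, the image path is hypothetical, and it assumes a QEMU build with the gluster block driver):

# Read image metadata directly over libgfapi, without touching /rhev
qemu-img info gluster://h2.imatronix.com/engine/images/<image-uuid>/<volume-uuid>

In libvirt terms this corresponds to a <disk type='network'> source with protocol='gluster' instead of a file path on the FUSE mount.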
(In reply to Kaushal from comment #13)
> (In reply to Christopher Pereira from comment #12)
> > No.
> > Actually even after stopping glusterd, I still see glusterfs and glusterfsd
> > processes running. I'm not sure if this is normal, but it happens with
> > 3.7dev.
>
> This is actually surprising to me. I always thought that systemd would keep
> track of processes under a service's cgroup and terminate all of them when
> the service was stopped.
> Did you stop glusterd manually or via systemctl?

I checked the glusterd.service unit file. It has KillMode=process, which instructs systemd to kill only the main glusterd process and not its child processes. This is why the mount wasn't being killed when you stopped GlusterD.
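This is easy to double-check on any host (a read-only sketch):

# glusterd ships KillMode=process, so only the main daemon is signalled on
# stop/restart; the systemd default is control-group, which kills every
# process in the unit's cgroup.
systemctl show -p KillMode glusterd.service
systemctl show -p KillMode vdsmd.service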
Why is libgfapi not being used here?

Would it be possible to patch VDSM so that the glusterfs process is not executed inside the VDSM cgroup?

As a workaround, to keep the gluster volume alive I'm currently stopping VDSM, mounting the /rhev share manually, and then restarting VDSM. I check that the gluster process stays outside the VDSM cgroup, i.e. "systemctl status vdsmd" must not show the glusterfs process under the vdsmd cgroup.
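Spelled out, the workaround looks roughly like this (a sketch; server, volume, and mountpoint are the ones from this report, and the exact mountpoint VDSM expects should be taken from an earlier `mount` listing or from vdsm.log):

systemctl stop vdsmd

# Mount the engine volume from an interactive shell, so the glusterfs
# helper process is created outside vdsmd.service's cgroup
mkdir -p '/rhev/data-center/mnt/glusterSD/h2.imatronix.com:engine'
mount -t glusterfs h2.imatronix.com:/engine '/rhev/data-center/mnt/glusterSD/h2.imatronix.com:engine'

systemctl start vdsmd

# The glusterfs process must NOT show up under the vdsmd cgroup
systemctl status vdsmd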
Please change status to fixed. Thanks
(In reply to Christopher Pereira from comment #17)
> Please change status to fixed.

Hi Christopher, the patches fixing this have not been merged upstream yet, so we cannot change the status to fixed yet.
*** Bug 1220310 has been marked as a duplicate of this bug. ***
Merged upstream: https://github.com/oVirt/vdsm/commit/48841c0956ca91af5cd6aeb66782c8d2115f1d1e
Working fine in 3.6 Alpha (tested on CentOS 7). Can be closed.
Hi Sandro,

Could you change the target release from 3.6.0 to 4.0, please?
Dropping Bug #1083025 from the blocked bugs; this bug shouldn't affect external glusterfs support.
Sandro, when will this bug move to ON_QA?
I'm going to move this to ON_QA when we release a 4.0 milestone build. We may have a "checkpoint" build after 3.6.0 is out if we don't want to wait until the first alpha.
Sandro, can this be closed?
(In reply to Sahina Bose from comment #26)
> Sandro, can this be closed?

Closed, no; moved to QA, yes.
Isn't this fix available in 3.6.x?
(In reply to Sahina Bose from comment #28)
> Isn't this fix available in 3.6.x?

As far as I can tell, no, it's not in 3.6 (change-id: I79714a9a6ecbde17c396eca6b17b9e3605fad688).
(In reply to Sandro Bonazzola from comment #29)
> (In reply to Sahina Bose from comment #28)
> > Isn't this fix available in 3.6.x?
>
> As far as I can tell, no, it's not in 3.6 (change-id:
> I79714a9a6ecbde17c396eca6b17b9e3605fad688)

This patch was backported to the 3.6 branch - see commit hash 48841c0956ca91af5cd6aeb66782c8d2115f1d1e. It's included in 4.17.0.4.
Tested with gluster 3.7.9 and RHV 4.0.2-7.

While deploying the hosted engine, a glusterfs storage domain backed by a replica 3 volume was used. No problems were found.

The glusterfs mount process is under system.slice:

└─system.slice
  ├─run-18300.scope
  │ └─18363 /usr/sbin/glusterfs --volfile-server=10.70.37.157 --volfile-server=10.70.37.162 --volfile-server=10.70.37.187 --volfile-id=/rep2vol /rhev/data-center/mnt/glusterSD/10.70.37.157:_rep2vol
Also found that the gluster FUSE mount persisted even after restarting glusterd.
(In reply to SATHEESARAN from comment #33)
> Also found that the gluster FUSE mount persisted even after restarting
> glusterd.

And also after restarting vdsmd.