Description of problem:
VDSM fails to mount the hosted-engine gluster replica-3 volume (default name "hosted_storage"). Detection of backup-volfile-servers in "/usr/share/vdsm/storage/storageServer.py" adds the mount option "-o backup-volfile-servers=server1:server2:server3". The mount is executed without "-t glusterfs", because vfsType=None is specified in storageServer.py class GlusterFSConnection(MountConnection). The mount is therefore attempted as NFS, which fails due to the unrecognized option.

Version-Release number of selected component (if applicable):
Tested on 4.17.28 and 4.18.4.1.

How reproducible:
Install hosted engine on a glusterfs replica-3 volume, then reboot any host in the cluster once hosted_storage has been defined after the initial setup.

Actual results:
Mount of hosted_storage fails.

jsonrpc.Executor/3::DEBUG::2016-06-25 19:40:02,520::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-7 /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -o backup-volfile-servers=microcloud21.rxmgmt.databay.de:microcloud24.rxmgmt.databay.de:microcloud27.rxmgmt.databay.de glusterfs.rxmgmt.databay.de:/engine /rhev/data-center/mnt/glusterSD/glusterfs.rxmgmt.databay.de:_engine (cwd None)
jsonrpc.Executor/3::ERROR::2016-06-25 19:40:02,540::hsm::2473::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 237, in connect
    six.reraise(t, v, tb)
  File "/usr/share/vdsm/storage/storageServer.py", line 229, in connect
    self._mount.mount(self.options, self._vfsType, cgroup=self.CGROUP)
  File "/usr/share/vdsm/storage/mount.py", line 225, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 241, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';Running scope as unit run-13461.scope.\nmount.nfs: an incorrect mount option was specified\n')

Expected results:
Mount succeeds.
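To make the failure mode easier to follow, here is a minimal Python sketch of how the mount argument list ends up without "-t glusterfs" when no vfsType is handed down. This is not the real vdsm code; the function and variable names are invented for illustration only:

# Hypothetical sketch, loosely modelled on mount.py/storageServer.py.
def build_mount_cmd(spec, mountpoint, vfstype=None, options=""):
    """Build the /usr/bin/mount argument list roughly the way vdsm does."""
    cmd = ["/usr/bin/mount"]
    if vfstype is not None:
        # "-t" is only added when a filesystem type is known.
        cmd += ["-t", vfstype]
    if options:
        cmd += ["-o", options]
    cmd += [spec, mountpoint]
    return cmd

# With vfsType=None the gluster-only option reaches mount.nfs and is rejected:
print(build_mount_cmd(
    "glusterfs.rxmgmt.databay.de:/engine",
    "/rhev/data-center/mnt/glusterSD/glusterfs.rxmgmt.databay.de:_engine",
    vfstype=None,
    options="backup-volfile-servers=server1:server2:server3"))

Without "-t glusterfs", mount treats "host:/path" as an NFS source, and mount.nfs fails on the backup-volfile-servers option exactly as in the log above.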
Additional info:
A workaround for me is:

[root@microcloud28 storage]# diff -u storageServer.py.orig storageServer.py
--- storageServer.py.orig       2016-06-25 20:20:32.372965968 +0200
+++ storageServer.py    2016-06-25 20:20:44.490640046 +0200
@@ -308,7 +308,7 @@
     def __init__(self,
                  spec,
-                 vfsType=None,
+                 vfsType="glusterfs",
                  options="",
                  mountClass=mount.Mount):

which leads to a successful mount:

jsonrpc.Executor/4::DEBUG::2016-06-25 20:22:16,804::fileUtils::143::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/glusterSD/glusterfs.rxmgmt.databay.de:_engine mode: None
jsonrpc.Executor/4::DEBUG::2016-06-25 20:22:16,804::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option) Using bricks: ['microcloud21.rxmgmt.databay.de', 'microcloud24.rxmgmt.databay.de', 'microcloud27.rxmgmt.databay.de']
jsonrpc.Executor/4::WARNING::2016-06-25 20:22:16,804::storageServer::370::Storage.StorageServer.MountConnection::(_get_backup_servers_option) gluster server u'glusterfs.rxmgmt.databay.de' is not in bricks ['microcloud21.rxmgmt.databay.de', 'microcloud24.rxmgmt.databay.de', 'microcloud27.rxmgmt.databay.de'], possibly mounting duplicate servers
jsonrpc.Executor/4::DEBUG::2016-06-25 20:22:16,804::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-7 /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o backup-volfile-servers=microcloud21.rxmgmt.databay.de:microcloud24.rxmgmt.databay.de:microcloud27.rxmgmt.databay.de glusterfs.rxmgmt.databay.de:/engine /rhev/data-center/mnt/glusterSD/glusterfs.rxmgmt.databay.de:_engine (cwd None)
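Note on the workaround: changing the default of MountConnection.__init__ forces "glusterfs" on every mount type, not just gluster ones. A narrower variant would be to pin the type only in the gluster subclass. The sketch below is illustrative only and does not reflect the actual vdsm class layout; names and signatures are simplified assumptions:

class MountConnection(object):
    def __init__(self, spec, vfsType=None, options=""):
        self._spec = spec
        self._vfsType = vfsType
        self._options = options


class GlusterFSConnection(MountConnection):
    def __init__(self, spec, vfsType=None, options=""):
        # Force the filesystem type for gluster connections so the mount is
        # always executed with "-t glusterfs", regardless of what the caller
        # (engine / hosted-engine.conf) handed down.
        super(GlusterFSConnection, self).__init__(
            spec, vfsType="glusterfs", options=options)

Either way, as noted below, the real fix belongs in the engine-side connection data rather than in storageServer.py.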
Wouldn't the vfsType be correctly picked as "glusterfs" from the hosted-engine.conf file?
Ralf, can you please attach here /etc/ovirt-hosted-engine/hosted-engine.conf from one of your hosts and /var/log/vdsm/vdsm.log?

Sahina, ovirt-ha-agent will initially mount based on hosted-engine.conf; then, once we have an engine, the engine will try to remount it as it would for a regular host, based on how it auto-imported the hosted-engine storage domain. In the past we had a bug there: https://bugzilla.redhat.com/1317699

Ralf, which engine version did you initially install?
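For comparing the two sides mentioned above (agent-side hosted-engine.conf vs. the engine's connection data), here is a small Python sketch that dumps the relevant key=value pairs from hosted-engine.conf. The key names "storage", "domainType" and "mnt_options" are assumptions based on typical hosted-engine deployments; verify them against the attached file:

# Sketch only: read key=value pairs from hosted-engine.conf and print the
# storage-related ones. Key names are assumptions, not guaranteed.
def read_he_conf(path="/etc/ovirt-hosted-engine/hosted-engine.conf"):
    conf = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            conf[key.strip()] = value.strip()
    return conf


if __name__ == "__main__":
    conf = read_he_conf()
    print("storage     =", conf.get("storage"))
    print("domainType  =", conf.get("domainType"))
    print("mnt_options =", conf.get("mnt_options"))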
Initially I installed 3.6.5. I sometimes (e.g. after a host reboot/upgrade) had the problem that my gluster volume hosted_storage was not mounted by VDSM; I got around it by mounting the gluster volume manually.

I investigated the problem more deeply and patched my storageServer.py after upgrading to 3.6.6 (I think this is when the "-o backup-volfile-servers=server1,server2,server3" option was introduced). Now I have upgraded to 4.0.0 and the problem still exists because my patched storageServer.py was overwritten.

My /etc/ovirt-hosted-engine/hosted-engine.conf is attached. I have to search for a piece of vdsm.log where it happened, because I had also patched storageServer.py under 3.6.6.
Created attachment 1174459 [details] hosted-engine.conf for HA-Engine on Gluster
Something I should note: I'm not sure whether the above problem also occurred on a node that was installed as a hosted-engine host. It definitely occurred on a node of the cluster that is not installed as a hosted-engine host, so I am attaching the vdsm.log of such a host.
(In reply to Ralf Schenk from comment #3)
> Initially I installed 3.6.5,

OK, so it's just a duplicate of bug 1317699, since the fix got in only with 3.6.7 and it only covers new deployments. To get it fixed automatically you need to destroy (without deleting its content!!!) the hosted-engine storage domain to trigger the auto-import procedure again.

If you are brave enough you can quickly tweak the missing value in the DB. On the engine VM:

sudo -u postgres psql
\c engine
select * from storage_server_connections;
# and identify the ID of your affected connection
update storage_server_connections set vfs_type = 'glusterfs' where id = 'THE_ID_YOU_FOUND_IN_THE_OUTPUT_ABOVE_FOR_THE_HOSTED_ENGINE_CONNECTION';
commit;
\q

> I sometimes (i.e: after Host-Reboot/Upgrade) had the problem that my gluster
> volume hosted_storage was not mounted by VDSM. I got around by mounting the
> gluster Volume manually.
>
> I investigated the problem deeper and patched my storageServer.py after
> upgrading to 3.6.6. (I think this is when the -o
> backup-volfile-servers=server1,server2,server3) was introduced.

Please avoid touching it since the issue is not there.

> Now I upgraded to 4.0.0 and the problem still exists because my patched
> storageServer.py was overwritten.
>
> My /etc/ovirt-hosted-engine/hosted-engine.conf is attached.
> I have to search for a piece of vdsm.log where it happened because I patched
> the storageServer.py also with 3.6.6

*** This bug has been marked as a duplicate of bug 1317699 ***
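For anyone who prefers scripting the DB tweak above rather than typing it into psql, here is a minimal sketch using psycopg2. It assumes it is run as the postgres user on the engine VM and that the database is named "engine" (adjust to your setup); only the id, connection and vfs_type columns quoted earlier in this comment are relied on. Take a DB backup before changing anything.

# Sketch only, under the assumptions stated above.
import psycopg2

conn = psycopg2.connect(dbname="engine", user="postgres")
cur = conn.cursor()

# List the connections so you can identify the hosted-engine one.
cur.execute("select id, connection, vfs_type from storage_server_connections")
for row_id, connection, vfs_type in cur.fetchall():
    print(row_id, connection, vfs_type)

# Fill in the ID of the hosted-engine connection printed above:
he_connection_id = "THE_ID_YOU_FOUND_IN_THE_OUTPUT_ABOVE"

cur.execute(
    "update storage_server_connections set vfs_type = %s where id = %s",
    ("glusterfs", he_connection_id))
conn.commit()

cur.close()
conn.close()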