Bug 1351203 - VDSM fails to mount hosted-engine gluster volume
Summary: VDSM fails to mount hosted-engine gluster volume
Keywords:
Status: CLOSED DUPLICATE of bug 1317699
Alias: None
Product: vdsm
Classification: oVirt
Component: Gluster
Version: 4.18.15
Hardware: Unspecified
OS: All
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: sankarshan
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-06-29 13:04 UTC by Ralf Schenk
Modified: 2016-06-30 12:21 UTC
CC: 4 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-06-30 12:21:02 UTC
oVirt Team: Storage
Embargoed:
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
hosted-engine.conf for HA-Engine on Gluster (1.02 KB, text/plain)
2016-06-30 10:12 UTC, Ralf Schenk


Links
Red Hat Bugzilla 1317699 (high, CLOSED): Hosted engine on Gluster prevents additional non-ha hosts being added (last updated 2021-02-22 00:41:40 UTC)

Internal Links: 1317699

Description Ralf Schenk 2016-06-29 13:04:07 UTC
Description of problem:

VDSM fails to mount the hosted-engine gluster replica 3 volume (default name "hosted_storage").

Detection of backup-volfile-servers in "/usr/share/vdsm/storage/storageServer.py" leads to the mount option "-o backup-volfile-servers=server1,server2,server3".

The mount is executed without "-t glusterfs", since vfsType=None is specified in
storageServer.py, class GlusterFSConnection(MountConnection). So the mount is attempted as NFS, which fails due to the unsupported option.
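
For illustration, here is a minimal sketch of how the command line ends up without a filesystem type. This is simplified pseudocode of the observed behaviour, not the actual vdsm code; hostnames and paths below are placeholders:

def build_mount_cmd(spec, target, vfstype=None, options=""):
    """Illustration only: roughly how the mount command line is assembled."""
    cmd = ["/usr/bin/mount"]
    if vfstype is not None:
        cmd += ["-t", vfstype]   # omitted when vfsType=None, so NFS is assumed
    if options:
        cmd += ["-o", options]   # e.g. "backup-volfile-servers=..."
    cmd += [spec, target]
    return cmd

# Failing case: GlusterFSConnection passes vfsType=None, so no "-t glusterfs"
# is emitted and mount.nfs rejects the gluster-only option.
print(" ".join(build_mount_cmd(
    "server.example.com:/engine",
    "/rhev/data-center/mnt/glusterSD/server.example.com:_engine",
    vfstype=None,
    options="backup-volfile-servers=srv1:srv2:srv3")))

Passing vfsType="glusterfs" (as in the workaround under Additional info below) restores the "-t glusterfs" argument.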

Version-Release number of selected component (if applicable): tested on 4.17.28 and 4.18.4.1.

How reproducible:
Install hosted engine on a glusterfs replica 3 volume, then reboot any host in the cluster once hosted_storage has been defined after the initial setup.

Actual results:
Mount of hosted_storage fails.

jsonrpc.Executor/3::DEBUG::2016-06-25 19:40:02,520::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-7 /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -o backup-volfile-servers=microcloud21.rxmgmt.databay.de:microcloud24.rxmgmt.databay.de:microcloud27.rxmgmt.databay.de glusterfs.rxmgmt.databay.de:/engine /rhev/data-center/mnt/glusterSD/glusterfs.rxmgmt.databay.de:_engine (cwd None)
jsonrpc.Executor/3::ERROR::2016-06-25 19:40:02,540::hsm::2473::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 237, in connect
    six.reraise(t, v, tb)
  File "/usr/share/vdsm/storage/storageServer.py", line 229, in connect
    self._mount.mount(self.options, self._vfsType, cgroup=self.CGROUP)
  File "/usr/share/vdsm/storage/mount.py", line 225, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 241, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';Running scope as unit run-13461.scope.\nmount.nfs: an incorrect mount option was specified\n')


Expected results:
Mount succeeds.

Additional info:
A workaround for me is:
[root@microcloud28 storage]# diff -u storageServer.py.orig storageServer.py
--- storageServer.py.orig       2016-06-25 20:20:32.372965968 +0200
+++ storageServer.py    2016-06-25 20:20:44.490640046 +0200
@@ -308,7 +308,7 @@

     def __init__(self,
                  spec,
-                 vfsType=None,
+                 vfsType="glusterfs",
                  options="",
                  mountClass=mount.Mount):

which leads to a successful mount:
jsonrpc.Executor/4::DEBUG::2016-06-25 20:22:16,804::fileUtils::143::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/glusterSD/glusterfs.rxmgmt.databay.de:_engine mode: None
jsonrpc.Executor/4::DEBUG::2016-06-25 20:22:16,804::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option) Using bricks: ['microcloud21.rxmgmt.databay.de', 'microcloud24.rxmgmt.databay.de', 'microcloud27.rxmgmt.databay.de']
jsonrpc.Executor/4::WARNING::2016-06-25 20:22:16,804::storageServer::370::Storage.StorageServer.MountConnection::(_get_backup_servers_option) gluster server u'glusterfs.rxmgmt.databay.de' is not in bricks ['microcloud21.rxmgmt.databay.de', 'microcloud24.rxmgmt.databay.de', 'microcloud27.rxmgmt.databay.de'], possibly mounting duplicate servers
jsonrpc.Executor/4::DEBUG::2016-06-25 20:22:16,804::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-7 /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o backup-volfile-servers=microcloud21.rxmgmt.databay.de:microcloud24.rxmgmt.databay.de:microcloud27.rxmgmt.databay.de glusterfs.rxmgmt.databay.de:/engine /rhev/data-center/mnt/glusterSD/glusterfs.rxmgmt.databay.de:_engine (cwd None)

Comment 1 Sahina Bose 2016-06-30 07:18:48 UTC
Wouldn't the vfsType be correctly picked up as "glusterfs" from the hosted-engine.conf file?

Comment 2 Simone Tiraboschi 2016-06-30 07:51:45 UTC
Ralf, can you please attach here /etc/ovirt-hosted-engine/hosted-engine.conf from one of your hosts and /var/log/vdsm/vdsm.log?

Sahina, ovirt-ha-agent will initially mount based on hosted-engine.conf; then, once we have an engine, the engine will try to remount it as for a regular host, based on how it auto-imported the hosted-engine storage domain.
In the past we had a bug there:
https://bugzilla.redhat.com/1317699
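
For reference, the storage-related part of /etc/ovirt-hosted-engine/hosted-engine.conf usually looks roughly like this (placeholder values from memory, not your actual file; the attachment is authoritative):

storage=glusterfs.example.com:/engine
domainType=glusterfs
mnt_options=backup-volfile-servers=srv1.example.com:srv2.example.com
sdUUID=<storage domain UUID>
connectionUUID=<connection UUID>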

Ralf, which engine version did you initially install?

Comment 3 Ralf Schenk 2016-06-30 10:09:31 UTC
Initially I installed 3.6.5.

I sometimes (i.e. after a host reboot/upgrade) had the problem that my gluster volume hosted_storage was not mounted by VDSM. I worked around it by mounting the gluster volume manually.

I investigated the problem further and patched my storageServer.py after upgrading to 3.6.6 (I think this is when the "-o backup-volfile-servers=server1,server2,server3" option was introduced).

Now I upgraded to 4.0.0 and the problem still exists because my patched storageServer.py was overwritten.

My /etc/ovirt-hosted-engine/hosted-engine.conf is attached.
I have to search for a section of vdsm.log where it happened, because I also patched storageServer.py under 3.6.6.

Comment 4 Ralf Schenk 2016-06-30 10:12:32 UTC
Created attachment 1174459 [details]
hosted-engine.conf for HA-Engine on Gluster

Comment 5 Ralf Schenk 2016-06-30 10:20:50 UTC
Something I have to note:
I'm not sure whether the above problem also occurred on a node that was installed as a hosted-engine host. It definitely occurred on a node of the cluster that is not installed as a hosted-engine host, so I am attaching the vdsm.log of such a host.

Comment 6 Simone Tiraboschi 2016-06-30 12:21:02 UTC
(In reply to Ralf Schenk from comment #3)
> Initially I installed 3.6.5.

OK, so it's just a duplicate of 1317699, since the fix only went in with 3.6.7 and it only applies to new deployments.

To get it fixed automatically, you need to destroy (without deleting its content!!!) the hosted-engine storage domain to trigger the auto-import procedure again.

If you are brave enough, you can quickly tweak the missing value directly in the DB; on the engine VM:

sudo -u postgres psql
\c engine
select * from storage_server_connections;
-- identify the ID of your affected connection
update storage_server_connections set vfs_type = 'glusterfs' where
id = 'THE_ID_YOU_FOUND_IN_THE_OUTPUT_ABOVE_FOR_THE_HOSTED_ENGINE_CONNECTION';
commit;
\q
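
To double-check afterwards, something like the following should now show 'glusterfs' for the hosted-engine connection (column names as I recall them from the 3.6/4.0 schema; verify with \d storage_server_connections):

select id, connection, vfs_type from storage_server_connections;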

> I sometimes (i.e. after a host reboot/upgrade) had the problem that my
> gluster volume hosted_storage was not mounted by VDSM. I worked around it
> by mounting the gluster volume manually.
> 
> I investigated the problem further and patched my storageServer.py after
> upgrading to 3.6.6 (I think this is when the "-o
> backup-volfile-servers=server1,server2,server3" option was introduced).

Please avoid touching it since the issue is not there.
 
> Now I upgraded to 4.0.0 and the problem still exists because my patched
> storageServer.py was overwritten.
> 
> My /etc/ovirt-hosted-engine/hosted-engine.conf is attached.
> I have to search for a section of vdsm.log where it happened, because I
> also patched storageServer.py under 3.6.6.

*** This bug has been marked as a duplicate of bug 1317699 ***

