Bug 1324075

Summary: Cannot access storage domain hosted_storage
Product: [oVirt] ovirt-engine
Component: BLL.HostedEngine
Version: 3.6.4
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
Whiteboard: sla
oVirt Team: SLA
Reporter: Richard Neuboeck <hawk>
Assignee: Roy Golan <rgolan>
QA Contact: meital avital <mavital>
CC: amureini, bugs, rgolan, stirabos, tnisan
Flags: rule-engine: planning_ack?, devel_ack?, testing_ack?
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-04-08 06:43:21 UTC

Description Richard Neuboeck 2016-04-05 13:08:39 UTC
Description of problem:

While adding another host through the WebUI, the new host fails to mount the engine storage. The host enters 'Non Operational' status, and every attempt to activate it fails with the same error message:

'Host cube-two cannot access the Storage Domain(s) hosted_storage attached to the Data Center Default. Setting Host state to Non-Operational.'

Other storage domains are mounted automatically without problems.

vdsm.log on the newly added host shows that the information retrieved for the hosted_storage storage domain lacks the vfs_type parameter:

jsonrpc.Executor/4::DEBUG::2016-04-05 14:37:34,740::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.connectStorageServer' in bridge with {u'connectionParams': [{u'id': u'f707b926-10db-46bc-8873-0d990c5b73b1', u'connection': u'borg-sphere-one:/plexus', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}, {u'id': u'874705cb-44c1-4aab-8844-82f50d559684', u'connection': u'borg-sphere-one:/engine', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'password': '********', u'port': u''}], u'storagepoolID': u'00000001-0001-0001-0001-0000000003e1', u'domainType': 7}

The subsequent mount command therefore does not include '-t glusterfs' and fails:

jsonrpc.Executor/4::DEBUG::2016-04-05 14:37:35,077::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-39 /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -o backup-volfile-servers=borg-sphere-two:borg-sphere-three borg-sphere-one:/engine /rhev/data-center/mnt/glusterSD/borg-sphere-one:_engine (cwd None)
jsonrpc.Executor/4::ERROR::2016-04-05 14:37:35,284::hsm::2473::Storage.HSM::(connectStorageServer) Could not connect to storageServer
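
To illustrate the mechanism, here is a minimal Python sketch (hypothetical, not VDSM's actual code; the function name is made up): mount(8) treats a 'host:/path' source as an NFS export unless a filesystem type is given explicitly, so a connection entry without vfs_type produces a mount command without '-t glusterfs':

# Hypothetical sketch of building a mount command from one of the
# connectionParams dicts above; not VDSM's real implementation.
def build_mount_command(conn, mountpoint):
    cmd = ["/usr/bin/mount"]
    vfs_type = conn.get(u"vfs_type")   # present for 'plexus', missing for 'engine'
    if vfs_type:
        cmd += ["-t", vfs_type]        # would add '-t glusterfs'
    cmd += [conn[u"connection"], mountpoint]
    return cmd

# Without vfs_type, mount(8) falls back to NFS for borg-sphere-one:/engine:
print(build_mount_command({u"connection": u"borg-sphere-one:/engine"},
                          "/rhev/data-center/mnt/glusterSD/borg-sphere-one:_engine"))
# -> ['/usr/bin/mount', 'borg-sphere-one:/engine',
#     '/rhev/data-center/mnt/glusterSD/borg-sphere-one:_engine']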

Version-Release number of selected component (if applicable):

vdsm-4.17.23.2-0.el7.centos.noarch


How reproducible:

Always.


Steps to Reproduce:
1. Install a hosted engine setup of the current oVirt 3.6.
2. In the WebUI, on the Hosts tab, press New and enter the appropriate information.
3. The package installation runs without problems.
4. As soon as mounting of the shared storage domains starts, the hosted_storage mount fails with the above error.


Actual results:

A mount failure that prevents the host from functioning correctly.


Expected results:

The engine storage domain should be mounted with 'mount -t glusterfs' instead of being assumed to be an NFS mount.
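
For reference, the expected invocation, reconstructed from the failing command in the log above with the filesystem type added (assuming the same mount options), would be:

/usr/bin/mount -t glusterfs -o backup-volfile-servers=borg-sphere-two:borg-sphere-three borg-sphere-one:/engine /rhev/data-center/mnt/glusterSD/borg-sphere-one:_engine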


Additional info:

Both glusterfs volumes (called engine and plexus) are running as expected in replica 3 mode.

Comment 1 Tal Nisan 2016-04-06 07:44:20 UTC
Roy, this seems like more of a hosted engine issue; can someone from your team have a look?

Comment 2 Allon Mureinik 2016-04-07 05:10:55 UTC
Pretty sure this is a dup of a bug we've already seen, but Tal seems to be correct - the hosted_engine domain is not registered properly.

Comment 3 Richard Neuboeck 2016-04-07 05:44:43 UTC
This problem must have been introduced after March 22; at that point, installing the hosted-engine setup and adding additional hosts still worked.

Comment 4 Simone Tiraboschi 2016-04-07 15:32:16 UTC
Yes, it seems that the auto-import procedure in the engine registered that storage domain as an NFS one.

Comment 5 Richard Neuboeck 2016-04-08 06:42:55 UTC
The temporary workaround seems to be to manually update the storage connection in the database:

On the Engine VM, the information needed to access the DB can be found in /etc/ovirt-engine/engine.conf.d/10-setup-database.conf.

Then access the engine database and update the vfs_type field of the engine storage volume's entry in the storage_server_connections table:

psql -U engine -W -h localhost
select * from storage_server_connections;
update storage_server_connections set vfs_type = 'glusterfs' where id = 'THE_ID_YOU_FOUND_IN_THE_OUTPUT_ABOVE_FOR_THE_ENGINE_VOLUME';
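
The same workaround as a short Python sketch (illustrative only; it assumes psycopg2 is installed and that the conf file uses the usual ENGINE_DB_* keys such as ENGINE_DB_PASSWORD; verify against your copy of the file):

import re
import psycopg2

# Parse KEY="value" lines from the engine setup conf file.
conf = {}
with open("/etc/ovirt-engine/engine.conf.d/10-setup-database.conf") as f:
    for line in f:
        m = re.match(r'(\w+)="?(.*?)"?\s*$', line)
        if m:
            conf[m.group(1)] = m.group(2)

db = psycopg2.connect(host=conf.get("ENGINE_DB_HOST", "localhost"),
                      dbname=conf.get("ENGINE_DB_DATABASE", "engine"),
                      user=conf.get("ENGINE_DB_USER", "engine"),
                      password=conf["ENGINE_DB_PASSWORD"])
with db, db.cursor() as cur:
    # List all connections to find the id of the engine volume entry.
    cur.execute("SELECT id, connection, vfs_type FROM storage_server_connections")
    for row in cur.fetchall():
        print(row)
    # Set vfs_type on that entry (replace the placeholder id first).
    cur.execute("UPDATE storage_server_connections SET vfs_type = 'glusterfs' "
                "WHERE id = %s",
                ("THE_ID_YOU_FOUND_IN_THE_OUTPUT_ABOVE_FOR_THE_ENGINE_VOLUME",))
db.close()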

Comment 6 Richard Neuboeck 2016-04-08 06:43:21 UTC

*** This bug has been marked as a duplicate of bug 1317699 ***