Bug 1243323 - "Internal engine error" when creating a single node gluster domain
Summary: "Internal engine error" when creating a single node gluster domain
Keywords:
Status: CLOSED DUPLICATE of bug 1238093
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-engine-core
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: m1
Target Release: 3.6.0
Assignee: Ala Hino
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2015-07-15 08:26 UTC by Amit Aviram
Modified: 2016-03-10 06:17 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-03 14:51:39 UTC
oVirt Team: Storage
Embargoed:


Attachments
Engine logs (1.69 MB, application/zip)
2015-07-15 08:26 UTC, Amit Aviram

Description Amit Aviram 2015-07-15 08:26:59 UTC
Created attachment 1052258 [details]
Engine logs

This bug was brought up on the users mailing list by "Konstantinos Christidis" <kochrist>


************ Description of problem:
>>> I created (through oVirt web) a GlusterFS distributed volume with four 
>>> bricks.  When I try to add a New Domain - GlusterFS Data I am getting:
>>> 
>>> "Error while executing action Add Storage Connection: Internal Engine Error"
>>> 
>>> and
>>> "Error validating master storage domain: ('MD read error',)"
>>> 
>>> ps: My installation seems to work only with replica-3 oVirt Optimized 
>>> volumes. Every other combination fails with the error above.


************ Version-Release number of selected component (if applicable):
CentOS 7, oVirt 3.6
Full list of installed packages:

host
# rpm -qa | grep -Ei 'ovirt|vdsm'
vdsm-infra-4.17.0-1124.git0b2fc17.el7.noarch
ovirt-release-master-001-0.10.master.noarch
vdsm-yajsonrpc-4.17.0-1124.git0b2fc17.el7.noarch
ovirt-vmconsole-1.0.0-0.0.master.20150616120945.gitc1fb2bd.el7.noarch
vdsm-xmlrpc-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-gluster-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-python-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-jsonrpc-4.17.0-1124.git0b2fc17.el7.noarch
ovirt-vmconsole-host-1.0.0-0.0.master.20150616120945.gitc1fb2bd.el7.noarch
vdsm-cli-4.17.0-1124.git0b2fc17.el7.noarch


engine 
# rpm -qa | grep -Ei 'ovirt|vdsm'
ovirt-engine-jboss-as-7.1.1-1.el7.x86_64
ovirt-engine-tools-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-userportal-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-lib-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-setup-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-iso-uploader-3.6.0-0.0.master.20150618074838.gitea4158a.el7.noarch
ebay-cors-filter-1.0.1-0.1.ovirt.el7.noarch
ovirt-host-deploy-1.4.0-0.0.master.20150703061610.gitda2ec90.el7.noarch
ovirt-engine-wildfly-overlay-001-2.el7.noarch
ovirt-engine-backend-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-setup-base-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-image-uploader-3.6.0-0.0.master.20150128151259.git3f60704.el7.noarch
ovirt-host-deploy-java-1.4.0-0.0.master.20150703061610.gitda2ec90.el7.noarch
ovirt-engine-cli-3.6.0.0-0.3.20150709.git53408f5.el7.centos.noarch
ovirt-engine-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-sdk-python-3.6.0.0-0.15.20150625.gitfc90daf.el7.centos.noarch
vdsm-jsonrpc-java-1.1.3-0.0.master.20150701140902.giteb3f88c.el7.noarch
ovirt-engine-wildfly-8.2.0-1.el7.x86_64
ovirt-engine-restapi-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-dbscripts-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.0-0.0.master.20150712201948.git84298fe.el7.noarch
ovirt-release-master-001-0.10.master.noarch

************ Steps to Reproduce:
1. Vanilla installation.
2. I successfully created a GlusterFS-enabled cluster and added 4 hosts and a couple of VM networks.
3. I successfully created a distributed volume with 4 bricks (no replica).
4. I started the volume and then ran the "Optimize for oVirt" option (a CLI equivalent of steps 3-4 is sketched below).
5. I then navigated to the Storage menu and tried to add a GlusterFS Data storage domain with a name and volume (e.g. ov_gluster and hv00.ekt.gr:/volumename).
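
For completeness, steps 3 and 4 have a straightforward Gluster CLI equivalent. The sketch below is illustrative only: the host names and brick paths are hypothetical placeholders, and the "Optimize for oVirt" step (applied from the web UI) is not reproduced here.

#!/usr/bin/env python
# Illustrative sketch: create and start a 4-brick distributed (no-replica) Gluster
# volume like the one described in steps 3-4. Host names and brick paths are
# hypothetical placeholders, not values taken from this report.
import subprocess

VOLUME = "ditributedvol"
BRICKS = [
    "hv00.example.com:/BRICKS/lv00/brick",
    "hv01.example.com:/BRICKS/lv00/brick",
    "hv02.example.com:/BRICKS/lv00/brick",
    "hv03.example.com:/BRICKS/lv00/brick",
]

def run(args):
    # Run a gluster CLI command and raise if it exits non-zero.
    print("+ " + " ".join(args))
    subprocess.check_call(args)

# Omitting a replica count gives a plain distributed volume (the failing case).
run(["gluster", "volume", "create", VOLUME] + BRICKS)
run(["gluster", "volume", "start", VOLUME])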

************ Actual results:
The "Internal Engine Error" appears on screen every time, and the error/exception is logged in engine.log and vdsm.log.
A storage entry is created but is not attached/connected to the Data Center, so I have to delete it and try again.

************ Expected results:
I expected the GlusterFS storage domain to be activated and connected to the Data Center.

************ Additional info:
Logs are attached.

Quoted relevant logs:

VDSM:
Thread-446::DEBUG::2015-07-13 14:35:00,035::task::595::Storage.TaskManager.Task::(_updateState) Task=`021b72e6-fc2c-40f6-a223-db965ac4cebe`::moving from state preparing -> state finished
Thread-446::DEBUG::2015-07-13 14:35:00,035::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-446::DEBUG::2015-07-13 14:35:00,035::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-446::DEBUG::2015-07-13 14:35:00,035::task::993::Storage.TaskManager.Task::(_decref) Task=`021b72e6-fc2c-40f6-a223-db965ac4cebe`::ref 0 aborting False
Thread-446::DEBUG::2015-07-13 14:35:00,035::__init__::527::jsonrpc.JsonRpcServer::(_serveRequest) Return 'StoragePool.connectStorageServer' in bridge with [{'status': 0, 'id': u'e1c2b44e-ccbd-48ef-a597-c8889d0ebffd'}]
Thread-446::DEBUG::2015-07-13 14:35:00,035::stompreactor::304::yajsonrpc.StompServer::(send) Sending response
Reactor thread::INFO::2015-07-13 14:35:02,393::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:36736
Reactor thread::DEBUG::2015-07-13 14:35:02,399::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2015-07-13 14:35:02,399::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:36736
Reactor thread::DEBUG::2015-07-13 14:35:02,399::bindingxmlrpc::1286::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 36736)
JsonRpc (StompReactor)::DEBUG::2015-07-13 14:35:09,517::stompreactor::235::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command=u'SEND'>
JsonRpcServer::DEBUG::2015-07-13 14:35:09,518::__init__::533::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-449::DEBUG::2015-07-13 14:35:09,518::__init__::496::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.connectStorageServer' in bridge with {u'connectionParams': [{u'id': u'00000000-0000-0000-0000-000000000000', u'connection': u'hv00.mytld:/BRICKS/lv00/ditributedvol', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 7}
Thread-449::DEBUG::2015-07-13 14:35:09,519::task::595::Storage.TaskManager.Task::(_updateState) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::moving from state init -> state preparing
Thread-449::INFO::2015-07-13 14:35:09,519::logUtils::48::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=7, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'00000000-0000-0000-0000-000000000000', u'connection': u'hv00.mytld:/BRICKS/lv00/ditributedvol', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}], options=None)
Thread-449::ERROR::2015-07-13 14:35:09,605::hsm::2464::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2461, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 213, in connect
    self.validate()
  File "/usr/share/vdsm/storage/storageServer.py", line 315, in validate
    replicaCount = self.volinfo['replicaCount']
  File "/usr/share/vdsm/storage/storageServer.py", line 311, in volinfo
    self._volinfo = self._get_gluster_volinfo()
  File "/usr/share/vdsm/storage/storageServer.py", line 330, in _get_gluster_volinfo
    return volinfo[self._volname]
KeyError: u'BRICKS/lv00/ditributedvol'
Thread-449::DEBUG::2015-07-13 14:35:09,605::hsm::2483::Storage.HSM::(connectStorageServer) knownSDs: {c36b5dab-6dd6-483a-a76f-4e1933226d05: storage.nfsSD.findDomain}
Thread-449::INFO::2015-07-13 14:35:09,605::logUtils::51::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 100, 'id': u'00000000-0000-0000-0000-000000000000'}]}
Thread-449::DEBUG::2015-07-13 14:35:09,606::task::1191::Storage.TaskManager.Task::(prepare) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::finished: {'statuslist': [{'status': 100, 'id': u'00000000-0000-0000-0000-000000000000'}]}
Thread-449::DEBUG::2015-07-13 14:35:09,606::task::595::Storage.TaskManager.Task::(_updateState) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::moving from state preparing -> state finished
Thread-449::DEBUG::2015-07-13 14:35:09,606::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-449::DEBUG::2015-07-13 14:35:09,606::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-449::DEBUG::2015-07-13 14:35:09,606::task::993::Storage.TaskManager.Task::(_decref) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::ref 0 aborting False
Thread-449::DEBUG::2015-07-13 14:35:09,606::__init__::527::jsonrpc.JsonRpcServer::(_serveRequest) Return 'StoragePool.connectStorageServer' in bridge with [{'status': 100, 'id': u'00000000-0000-0000-0000-000000000000'}]
Thread-449::DEBUG::2015-07-13 14:35:09,606::stompreactor::304::yajsonrpc.StompServer::(send) Sending response
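
The traceback above is the immediate failure: _get_gluster_volinfo() looks the requested volume name up in the dictionary returned by the Gluster volume-info query, the key u'BRICKS/lv00/ditributedvol' is not present, so validate() never reaches its replicaCount check and the whole call is reported as a generic failure (status 100). A minimal sketch of that lookup pattern follows; the volinfo contents are hypothetical and only illustrate how a path-style name misses a dict keyed by bare volume names.

# Minimal sketch of the failing lookup. It assumes, as the traceback suggests, that
# the volume-info result is keyed by bare volume names while the connection string
# contributed the full path after the colon. The volinfo contents are hypothetical.
volinfo = {
    u"ditributedvol": {u"replicaCount": u"1", u"volumeStatus": u"ONLINE"},
}
volname = u"BRICKS/lv00/ditributedvol"  # from 'hv00.mytld:/BRICKS/lv00/ditributedvol'

try:
    replica_count = volinfo[volname][u"replicaCount"]  # mirrors storageServer.py lines 330 and 315
except KeyError as exc:
    # This is the KeyError from the log; VDSM wraps it as a "General Exception"
    # (status 100), which the engine then surfaces as "Internal Engine Error".
    print("volume %r not found in volume info: %s" % (volname, exc))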



Engine:
2015-07-13 14:35:09,496 INFO  [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Lock Acquired to object 'EngineLock:{exclusiveLocks='[hv00.mytld:/BRICKS/lv00/ditributedvol=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2015-07-13 14:35:09,506 INFO  [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Running command: AddStorageServerConnectionCommand internal: false. Entities affected :  ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group CREATE_STORAGE_DOMAIN with role type ADMIN
2015-07-13 14:35:09,507 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (default task-51) [5fd6fd3e] START, ConnectStorageServerVDSCommand(HostName = hv00.mytld, StorageServerConnectionManagementVDSParameters:{runAsync='true', hostId='d9781856-26f3-47c6-97e2-7752b45f17ab', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='GLUSTERFS', connectionList='[StorageServerConnections:{id='null', connection='hv00.mytld:/BRICKS/lv00/ditributedvol', iqn='null', vfsType='glusterfs', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]'}), log id: 35f9e6d9
2015-07-13 14:35:09,601 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (default task-51) [5fd6fd3e] FINISH, ConnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=100}, log id: 35f9e6d9
2015-07-13 14:35:09,602 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-51) [5fd6fd3e] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: The error message for connection hv00.mytld:/BRICKS/lv00/ditributedvol returned by VDSM was: General Exception
2015-07-13 14:35:09,603 ERROR [org.ovirt.engine.core.bll.storage.BaseFsStorageHelper] (default task-51) [5fd6fd3e] The connection with details 'hv00.mytld:/BRICKS/lv00/ditributedvol' failed because of error code '100' and error message is: general exception
2015-07-13 14:35:09,603 ERROR [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Command 'org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand' failed: EngineException: GeneralException (Failed with error GeneralException and code 100)
2015-07-13 14:35:09,604 ERROR [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand'.
2015-07-13 14:35:09,609 INFO  [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Lock freed to object 'EngineLock:{exclusiveLocks='[hv00.mytld:/BRICKS/lv00/ditributedvol=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2015-07-13 14:35:09,775 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-80) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM hv03.mytld command failed: Error validating master storage domain: ('MD read error',)
2015-07-13 14:35:09,775 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-80) [] Command 'SpmStatusVDSCommand(HostName = hv03.mytld, SpmStatusVDSCommandParameters:{runAsync='true', hostId='f938dbd4-0cb3-4db7-a3d4-e7379152d434', storagePoolId='00000001-0001-0001-0001-000000000208'})' execution failed: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Error validating master storage domain: ('MD read error',)
2015-07-13 14:35:09,775 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (DefaultQuartzScheduler_Worker-80) [] hostFromVds::selectedVds - 'hv03.mytld', spmStatus returned null!
2015-07-13 14:35:09,775 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-80) [] IrsBroker::Failed::GetStoragePoolInfoVDS: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Error validating master storage domain: ('MD read error',)
2015-07-13 14:35:09,780 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (DefaultQuartzScheduler_Worker-80) [] IRS failover failed - cant allocate vds server
2015-07-13 14:35:11,019 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-24) [] START, GlusterVolumesListVDSCommand(HostName = hv03.mytld, GlusterVolumesListVDSParameters:{runAsync='true', hostId='f938dbd4-0cb3-4db7-a3d4-e7379152d434'}), log id: 7ebe24ae
2015-07-13 14:35:11,115 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-24) [] FINISH, GlusterVolumesListVDSCommand, return: {e0a085c8-9621-4e97-8ff2-4256023006ae=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7ce52de9, 31646ac8-f690-4ea0-bb0d-89bc5d44b1a2=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@8f4f3695}, log id: 7ebe24ae
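
On the engine side the same failure arrives as a bare status code: ConnectStorageServerVDSCommand gets back 100 for the connection (compare the successful status 0 return earlier in the VDSM log), maps it to GeneralException and rolls the command back, which the UI reports as "Internal Engine Error". The sketch below only illustrates checking such a per-connection status list; the response dict is hypothetical, shaped after the 'statuslist' seen in the VDSM log.

# Hypothetical response shaped after the 'statuslist' returned by
# StoragePool.connectStorageServer in the VDSM log (0 = OK, 100 = general failure).
response = {
    "statuslist": [
        {"status": 100, "id": "00000000-0000-0000-0000-000000000000"},
    ]
}

GENERAL_EXCEPTION = 100  # the code the engine logs as "GeneralException"

for entry in response["statuslist"]:
    if entry["status"] == 0:
        print("connection %s established" % entry["id"])
    elif entry["status"] == GENERAL_EXCEPTION:
        # The case hit in this bug: the caller only learns "general exception",
        # not the underlying KeyError, hence the unhelpful UI message.
        print("connection %s failed with the generic error code 100" % entry["id"])
    else:
        print("connection %s failed with code %s" % (entry["id"], entry["status"]))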

Comment 2 Amit Aviram 2015-08-02 08:24:56 UTC
Still, a proper error message should be returned; we need to check why this is the message we are getting.
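
One way to get a proper message (a sketch only, not the fix that actually landed via bug 1238093) is to catch the KeyError at the lookup site and raise a descriptive storage error instead of letting a bare KeyError bubble up as a generic exception:

# Illustrative sketch only: turn the KeyError from the volume-info lookup into a
# descriptive error. The exception class and wording are hypothetical, not the
# actual change shipped for bug 1238093.
class GlusterVolumeNotFoundError(Exception):
    pass

def get_gluster_volinfo(volinfo, volname):
    try:
        return volinfo[volname]
    except KeyError:
        raise GlusterVolumeNotFoundError(
            "Gluster volume %r was not reported by the server; check that the "
            "connection uses the bare volume name (host:/volname), not a brick path"
            % volname)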

Comment 4 Yaniv Lavi 2015-08-03 14:51:39 UTC

*** This bug has been marked as a duplicate of bug 1238093 ***

Comment 5 Amit Aviram 2015-08-03 15:31:17 UTC
Bug 1238093 states that replica-1 Gluster is not supported; this bug states that we are getting the wrong message while trying to set it up.

Are you sure you want to mark it as a duplicate?

Comment 6 Yaniv Lavi 2015-08-09 15:47:11 UTC
(In reply to Amit Aviram from comment #5)
> Bug 1238093 states that replica-1 Gluster is not supported; this bug states
> that we are getting the wrong message while trying to set it up.
> 
> Are you sure you want to mark it as a duplicate?

It is supported and should work with replica 1 from the next build.

