Created attachment 1052258 [details]
Engine logs

This bug was brought up from the users list by "Konstantinos Christidis" <kochrist>

************
Description of problem:

>>> I created (through oVirt web) a GlusterFS distributed volume with four
>>> bricks. When I try to add a New Domain - GlusterFS Data I am getting:
>>>
>>> "Error while executing action Add Storage Connection: Internal Engine Error"
>>>
>>> and
>>> "Error validating master storage domain: ('MD read error',)"
>>>
>>> ps: My installation seems to work only with replica-3 oVirt Optimized
>>> volumes. Every other combination fails with the error above.

************
Version-Release number of selected component (if applicable):

CentOS 7, oVirt 3.6

Full installed list:

host # rpm -qa | grep -Ei 'ovirt|vdsm'
vdsm-infra-4.17.0-1124.git0b2fc17.el7.noarch
ovirt-release-master-001-0.10.master.noarch
vdsm-yajsonrpc-4.17.0-1124.git0b2fc17.el7.noarch
ovirt-vmconsole-1.0.0-0.0.master.20150616120945.gitc1fb2bd.el7.noarch
vdsm-xmlrpc-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-gluster-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-python-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-4.17.0-1124.git0b2fc17.el7.noarch
vdsm-jsonrpc-4.17.0-1124.git0b2fc17.el7.noarch
ovirt-vmconsole-host-1.0.0-0.0.master.20150616120945.gitc1fb2bd.el7.noarch
vdsm-cli-4.17.0-1124.git0b2fc17.el7.noarch

engine # rpm -qa | grep -Ei 'ovirt|vdsm'
ovirt-engine-jboss-as-7.1.1-1.el7.x86_64
ovirt-engine-tools-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-userportal-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-lib-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-setup-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-iso-uploader-3.6.0-0.0.master.20150618074838.gitea4158a.el7.noarch
ebay-cors-filter-1.0.1-0.1.ovirt.el7.noarch
ovirt-host-deploy-1.4.0-0.0.master.20150703061610.gitda2ec90.el7.noarch
ovirt-engine-wildfly-overlay-001-2.el7.noarch
ovirt-engine-backend-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-setup-base-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-image-uploader-3.6.0-0.0.master.20150128151259.git3f60704.el7.noarch
ovirt-host-deploy-java-1.4.0-0.0.master.20150703061610.gitda2ec90.el7.noarch
ovirt-engine-cli-3.6.0.0-0.3.20150709.git53408f5.el7.centos.noarch
ovirt-engine-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-sdk-python-3.6.0.0-0.15.20150625.gitfc90daf.el7.centos.noarch
vdsm-jsonrpc-java-1.1.3-0.0.master.20150701140902.giteb3f88c.el7.noarch
ovirt-engine-wildfly-8.2.0-1.el7.x86_64
ovirt-engine-restapi-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-dbscripts-3.6.0-0.0.master.20150709173330.git9b5919d.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.0-0.0.master.20150712172339.git22cb5aa.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.0-0.0.master.20150712201948.git84298fe.el7.noarch
ovirt-release-master-001-0.10.master.noarch

************
Steps to Reproduce:

1. Vanilla installation.
2. I successfully created a GlusterFS-enabled cluster and added 4 hosts and a couple of VM networks.
3. I successfully created a distributed volume with 4 bricks (no replica).
4. I started the volume and then ran the "Optimize for oVirt" option.
5. Then I navigated to the Storage menu and tried to add a GlusterFS/Data storage domain with a name and volume (e.g. ov_gluster and hv00.ekt.gr:/volumename).

************
Actual results:

The "Internal Engine Error" appears on screen (every time). The error/exception is logged (engine.log/vdsm.log). A storage entry is created but is not attached/connected to the Data Center, so I have to delete it and try again.

************
Expected results:

I expected to see a GlusterFS storage domain activated and connected to the Data Center.

************
Additional info:

Logs are attached. Relevant excerpts quoted below.

VDSM:

Thread-446::DEBUG::2015-07-13 14:35:00,035::task::595::Storage.TaskManager.Task::(_updateState) Task=`021b72e6-fc2c-40f6-a223-db965ac4cebe`::moving from state preparing -> state finished
Thread-446::DEBUG::2015-07-13 14:35:00,035::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-446::DEBUG::2015-07-13 14:35:00,035::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-446::DEBUG::2015-07-13 14:35:00,035::task::993::Storage.TaskManager.Task::(_decref) Task=`021b72e6-fc2c-40f6-a223-db965ac4cebe`::ref 0 aborting False
Thread-446::DEBUG::2015-07-13 14:35:00,035::__init__::527::jsonrpc.JsonRpcServer::(_serveRequest) Return 'StoragePool.connectStorageServer' in bridge with [{'status': 0, 'id': u'e1c2b44e-ccbd-48ef-a597-c8889d0ebffd'}]
Thread-446::DEBUG::2015-07-13 14:35:00,035::stompreactor::304::yajsonrpc.StompServer::(send) Sending response
Reactor thread::INFO::2015-07-13 14:35:02,393::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:36736
Reactor thread::DEBUG::2015-07-13 14:35:02,399::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2015-07-13 14:35:02,399::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:36736
Reactor thread::DEBUG::2015-07-13 14:35:02,399::bindingxmlrpc::1286::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 36736)
JsonRpc (StompReactor)::DEBUG::2015-07-13 14:35:09,517::stompreactor::235::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command=u'SEND'>
JsonRpcServer::DEBUG::2015-07-13 14:35:09,518::__init__::533::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-449::DEBUG::2015-07-13 14:35:09,518::__init__::496::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.connectStorageServer' in bridge with {u'connectionParams': [{u'id': u'00000000-0000-0000-0000-000000000000', u'connection': u'hv00.mytld:/BRICKS/lv00/ditributedvol', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 7}
Thread-449::DEBUG::2015-07-13 14:35:09,519::task::595::Storage.TaskManager.Task::(_updateState) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::moving from state init -> state preparing
Thread-449::INFO::2015-07-13 14:35:09,519::logUtils::48::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=7, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'00000000-0000-0000-0000-000000000000', u'connection': u'hv00.mytld:/BRICKS/lv00/ditributedvol', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}], options=None)
Thread-449::ERROR::2015-07-13 14:35:09,605::hsm::2464::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2461, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 213, in connect
    self.validate()
  File "/usr/share/vdsm/storage/storageServer.py", line 315, in validate
    replicaCount = self.volinfo['replicaCount']
  File "/usr/share/vdsm/storage/storageServer.py", line 311, in volinfo
    self._volinfo = self._get_gluster_volinfo()
  File "/usr/share/vdsm/storage/storageServer.py", line 330, in _get_gluster_volinfo
    return volinfo[self._volname]
KeyError: u'BRICKS/lv00/ditributedvol'
Thread-449::DEBUG::2015-07-13 14:35:09,605::hsm::2483::Storage.HSM::(connectStorageServer) knownSDs: {c36b5dab-6dd6-483a-a76f-4e1933226d05: storage.nfsSD.findDomain}
Thread-449::INFO::2015-07-13 14:35:09,605::logUtils::51::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 100, 'id': u'00000000-0000-0000-0000-000000000000'}]}
Thread-449::DEBUG::2015-07-13 14:35:09,606::task::1191::Storage.TaskManager.Task::(prepare) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::finished: {'statuslist': [{'status': 100, 'id': u'00000000-0000-0000-0000-000000000000'}]}
Thread-449::DEBUG::2015-07-13 14:35:09,606::task::595::Storage.TaskManager.Task::(_updateState) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::moving from state preparing -> state finished
Thread-449::DEBUG::2015-07-13 14:35:09,606::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-449::DEBUG::2015-07-13 14:35:09,606::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-449::DEBUG::2015-07-13 14:35:09,606::task::993::Storage.TaskManager.Task::(_decref) Task=`f8fa5304-239c-484a-ba75-f9c3c59541e9`::ref 0 aborting False
Thread-449::DEBUG::2015-07-13 14:35:09,606::__init__::527::jsonrpc.JsonRpcServer::(_serveRequest) Return 'StoragePool.connectStorageServer' in bridge with [{'status': 100, 'id': u'00000000-0000-0000-0000-000000000000'}]
Thread-449::DEBUG::2015-07-13 14:35:09,606::stompreactor::304::yajsonrpc.StompServer::(send) Sending response

Engine:

2015-07-13 14:35:09,496 INFO  [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Lock Acquired to object 'EngineLock:{exclusiveLocks='[hv00.mytld:/BRICKS/lv00/ditributedvol=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2015-07-13 14:35:09,506 INFO  [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Running command: AddStorageServerConnectionCommand internal: false. Entities affected : ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group CREATE_STORAGE_DOMAIN with role type ADMIN
2015-07-13 14:35:09,507 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (default task-51) [5fd6fd3e] START, ConnectStorageServerVDSCommand(HostName = hv00.mytld, StorageServerConnectionManagementVDSParameters:{runAsync='true', hostId='d9781856-26f3-47c6-97e2-7752b45f17ab', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='GLUSTERFS', connectionList='[StorageServerConnections:{id='null', connection='hv00.mytld:/BRICKS/lv00/ditributedvol', iqn='null', vfsType='glusterfs', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]'}), log id: 35f9e6d9
2015-07-13 14:35:09,601 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (default task-51) [5fd6fd3e] FINISH, ConnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=100}, log id: 35f9e6d9
2015-07-13 14:35:09,602 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-51) [5fd6fd3e] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: The error message for connection hv00.mytld:/BRICKS/lv00/ditributedvol returned by VDSM was: General Exception
2015-07-13 14:35:09,603 ERROR [org.ovirt.engine.core.bll.storage.BaseFsStorageHelper] (default task-51) [5fd6fd3e] The connection with details 'hv00.mytld:/BRICKS/lv00/ditributedvol' failed because of error code '100' and error message is: general exception
2015-07-13 14:35:09,603 ERROR [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Command 'org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand' failed: EngineException: GeneralException (Failed with error GeneralException and code 100)
2015-07-13 14:35:09,604 ERROR [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand'.
2015-07-13 14:35:09,609 INFO  [org.ovirt.engine.core.bll.storage.AddStorageServerConnectionCommand] (default task-51) [5fd6fd3e] Lock freed to object 'EngineLock:{exclusiveLocks='[hv00.mytld:/BRICKS/lv00/ditributedvol=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2015-07-13 14:35:09,775 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-80) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM hv03.mytld command failed: Error validating master storage domain: ('MD read error',)
2015-07-13 14:35:09,775 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-80) [] Command 'SpmStatusVDSCommand(HostName = hv03.mytld, SpmStatusVDSCommandParameters:{runAsync='true', hostId='f938dbd4-0cb3-4db7-a3d4-e7379152d434', storagePoolId='00000001-0001-0001-0001-000000000208'})' execution failed: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Error validating master storage domain: ('MD read error',)
2015-07-13 14:35:09,775 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (DefaultQuartzScheduler_Worker-80) [] hostFromVds::selectedVds - 'hv03.mytld', spmStatus returned null!
2015-07-13 14:35:09,775 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-80) [] IrsBroker::Failed::GetStoragePoolInfoVDS: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Error validating master storage domain: ('MD read error',)
2015-07-13 14:35:09,780 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (DefaultQuartzScheduler_Worker-80) [] IRS failover failed - cant allocate vds server
2015-07-13 14:35:11,019 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-24) [] START, GlusterVolumesListVDSCommand(HostName = hv03.mytld, GlusterVolumesListVDSParameters:{runAsync='true', hostId='f938dbd4-0cb3-4db7-a3d4-e7379152d434'}), log id: 7ebe24ae
2015-07-13 14:35:11,115 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-24) [] FINISH, GlusterVolumesListVDSCommand, return: {e0a085c8-9621-4e97-8ff2-4256023006ae=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7ce52de9, 31646ac8-f690-4ea0-bb0d-89bc5d44b1a2=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@8f4f3695}, log id: 7ebe24ae
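For context, the KeyError at the bottom of the VDSM traceback above reduces to an unguarded dict lookup: `_get_gluster_volinfo()` returns `volinfo[self._volname]`, and the key it looked up (`u'BRICKS/lv00/ditributedvol'`) was not present in the volume-info dict, likely because that dict is keyed by bare volume name while the connection string carried a path-like value. A minimal, self-contained sketch of the failure mode (all helper names here are hypothetical, not the actual vdsm source):

```python
# Minimal sketch of the failure mode seen in the VDSM traceback above.
# `volinfo_by_name` stands in for the dict _get_gluster_volinfo() queries;
# names and structure are illustrative only.

def get_gluster_volinfo(volinfo_by_name, volname):
    # Unguarded access, as at storageServer.py line 330: raises a bare
    # KeyError when volname is not a key of the dict.
    return volinfo_by_name[volname]

def get_gluster_volinfo_checked(volinfo_by_name, volname):
    # Defensive variant: turn the miss into an explicit error instead of
    # a KeyError that surfaces as "General Exception" (code 100).
    if volname not in volinfo_by_name:
        raise ValueError(
            "Gluster volume %r not found; known volumes: %s"
            % (volname, ", ".join(sorted(volinfo_by_name)) or "<none>"))
    return volinfo_by_name[volname]

# Reproduce the mismatch: info dict keyed by bare volume name, lookup
# performed with the path-like value from the connection string.
volumes = {"ditributedvol": {"replicaCount": "1"}}
try:
    get_gluster_volinfo(volumes, "BRICKS/lv00/ditributedvol")
except KeyError as err:
    print("KeyError:", err)  # the raw error recorded in vdsm.log
```

The checked variant is only a sketch of how the raw KeyError could be converted into an actionable message before it crosses the HSM boundary.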
Still, a proper message should be returned; we need to check why this is the error message we are getting.
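For illustration, one shape a "proper message" could take at the connect-response level, sketched with a made-up status code and message (this does not reproduce vdsm's real error table):

```python
# Hypothetical sketch: return a specific connect status instead of the
# generic code 100 ("General Exception") seen in the engine log.
# The non-zero code number and the message text are illustrative only.

GENERAL_EXCEPTION = 100       # the generic code observed in the logs
UNKNOWN_GLUSTER_VOLUME = 4149  # made-up specific code for this sketch

def connect_status(volinfo_by_name, volname):
    """Return a (status, message) pair for a connect attempt."""
    if volname not in volinfo_by_name:
        return (UNKNOWN_GLUSTER_VOLUME,
                "Gluster volume %r does not exist on the storage server"
                % volname)
    return (0, "OK")

status, message = connect_status({"ditributedvol": {}},
                                 "BRICKS/lv00/ditributedvol")
print(status, message)
```

With something like this, the engine would receive a distinguishable status and could show the user which volume name failed, rather than "Internal Engine Error".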
*** This bug has been marked as a duplicate of bug 1238093 ***
Bug 1238093 states that replica-1 Gluster is not supported; this bug states that we are getting the wrong message while trying to set it up. Are you sure you want to mark it as a duplicate?
(In reply to Amit Aviram from comment #5)
> bug 1238093 states that 1 rep gluster is not supported, this bug states that
> we are getting the wrong message while trying to set it up.
>
> are you sure you want to mark it as duplication?

It is supported and should work with replica 1 from the next build.