Created attachment 1675826 [details]
vdsm.log

Description of problem:
VDSM errors out while creating a new Gluster storage domain.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. From ovirt-ansible-hosted-engine-setup, create a new Gluster storage domain.
2.
3.

Actual results:
VDSM errors out and the storage domain is not created.

ERROR (jsonrpc/3) [storage.Dispatcher] FINISH createStorageDomain error=[Errno 2] No such file or directory (dispatcher:87)

Expected results:
The Gluster storage domain should be created with no VDSM errors.

Additional info:
No directory is created at /rhev/data-center/mnt/glusterSD, so the gluster volume is not actually mounted, although the log suggests otherwise.

--- Additional comment from Nir Soffer on 2020-04-02 17:00:32 UTC ---

Can you describe how ovirt-ansible-hosted-engine-setup uses engine or vdsm?
Does it use the vdsm API directly, or via engine?

If via engine, what is the minimal call sequence that reproduces this issue?

Is this reproducible via the engine UI? If it is, what are the steps to reproduce?

If you cannot answer these questions, what are the steps to reproduce using the mentioned ansible script? We need something that a developer can run to reproduce the issue locally.

--- Additional comment from Kaustav Majumder on 2020-04-02 17:13:33 UTC ---

The error is encountered while deploying ovirt + gluster via the cockpit UI, specifically during hosted engine setup. This runs an ansible playbook, https://github.com/gluster/gluster-ansible/tree/master/playbooks/hc-ansible-deployment, which internally calls the ovirt-ansible-hosted-engine-setup role. The playbook errors at https://github.com/oVirt/ovirt-ansible-hosted-engine-setup/blob/master/tasks/create_storage_domain.yml#L56

The role calls ovirt engine, which is deployed as a temporary VM, HostedEngineLocal. The engine logs point to a failure to create the storage domain -> https://pastebin.com/35JQyuGE

[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [6ac6e81f] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM tendrl25.lab.eng.blr.redhat.com command CreateStorageDomainVDS failed: Error creating a storage domain: ('storageType=7, sdUUID=508909e2-8a8d-4e3f-a1ea-f3b0c8dcc4f8, domainName=hosted_storage, domClass=1, typeSpecificArg=tendrl25.lab.eng.blr.redhat.com:/engine domVersion=5block_size=0, max_hosts=250',)

The corresponding VDSM verb errors out at:

  "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2644, in createStorageDomain
    max_hosts=max_hosts)

I have not tested it via the engine UI. Will update the bug if it is reproducible there.
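For context, createStorageDomain fails here because it expects the gluster volume to already be mounted under /rhev/data-center/mnt/glusterSD (per the original report, that directory is never created). A minimal sketch of that expectation; the host:_volume naming rule below is assumed from the observed directory layout, not copied from the actual hsm.py source:

# Hypothetical sketch: compute the mount point VDSM would use for the
# gluster domain and check it, mirroring the "[Errno 2] No such file or
# directory" failure above. Python 3.6 compatible.
import os

GLUSTER_MNT_ROOT = "/rhev/data-center/mnt/glusterSD"

def expected_mount_point(type_specific_arg):
    # type_specific_arg as in the engine log, e.g.
    # "tendrl25.lab.eng.blr.redhat.com:/engine"
    host, _, volume = type_specific_arg.partition(":")
    # assumed naming rule: '/' in the volume path becomes '_'
    return os.path.join(GLUSTER_MNT_ROOT,
                        "{}:{}".format(host, volume.replace("/", "_")))

mnt = expected_mount_point("tendrl25.lab.eng.blr.redhat.com:/engine")
if not os.path.ismount(mnt):
    # this is the state the failed deployment ends up in
    print("not mounted: {} ([Errno 2] No such file or directory)".format(mnt))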
--- Additional comment from Kaustav Majumder on 2020-04-02 17:23:04 UTC ---

(In reply to Kaustav Majumder from comment #2)
> I have not tested it via the engine UI. Will update the bug if it is
> reproducible there.

Gluster storage domain can be created via the engine UI with no errors.

--- Additional comment from Yaniv Kaul on 2020-04-02 18:58:39 UTC ---

Gobinda, are you familiar with such a failure? How's our CI doing?

--- Additional comment from Gobinda Das on 2020-04-03 06:55:56 UTC ---

Hi Yaniv,
We found this issue during ovirt 4.4.0 beta testing. We are still struggling to get the OST hc master suite to pass because of various issues (infra problems, the CentOS 7 to CentOS 8 move, etc.). OST is not green yet for 4.4.0, but we are working hard to get it passing.

--- Additional comment from Michal Skrivanek on 2020-04-03 12:47:02 UTC ---

I'm not aware of any issue with regular HE deployment (in this area, at least), and since it works later on... could it be that not all of gluster is set up/running correctly at that point?
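One way to test that hypothesis on the host before deployment is a quick sketch like the following; the volume name "engine" is taken from the log above, and the check relies on the "Status: Started" line that gluster volume info prints for a running volume:

# Sanity-check sketch for the hypothesis above: verify the gluster volume
# is created and started before hosted-engine setup tries to mount it.
import subprocess

def volume_started(volume="engine"):
    # "gluster volume info <vol>" exits non-zero if the volume does not
    # exist and prints a "Status: Started" line when it is running.
    try:
        out = subprocess.check_output(
            ["gluster", "volume", "info", volume],
            stderr=subprocess.STDOUT, universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        return False
    return "Status: Started" in out

print("engine volume started:", volume_started())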
This issue prevents completing RHHI-V deployment and is therefore marked as a TESTBLOCKER.
Tested with the following components:

python3-ioprocess-1.4.1-1.el8ev.x86_64
ioprocess-1.4.1-1.el8ev.x86_64
vdsm-client-4.40.13-1.el8ev.noarch
vdsm-common-4.40.13-1.el8ev.noarch
vdsm-hook-fcoe-4.40.13-1.el8ev.noarch
vdsm-api-4.40.13-1.el8ev.noarch
vdsm-hook-openstacknet-4.40.13-1.el8ev.noarch
vdsm-network-4.40.13-1.el8ev.x86_64
vdsm-jsonrpc-4.40.13-1.el8ev.noarch
vdsm-hook-vmfex-dev-4.40.13-1.el8ev.noarch
vdsm-yajsonrpc-4.40.13-1.el8ev.noarch
vdsm-python-4.40.13-1.el8ev.noarch
vdsm-4.40.13-1.el8ev.x86_64
vdsm-hook-vhostmd-4.40.13-1.el8ev.noarch
vdsm-hook-ethtool-options-4.40.13-1.el8ev.noarch
vdsm-http-4.40.13-1.el8ev.noarch
vdsm-gluster-4.40.13-1.el8ev.x86_64

The glusterfs storage domain can now be mounted successfully during HE deployment. There is still another issue, where the Hosted Engine VM is unable to boot; I will raise a separate bug for that.
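For completeness, a hedged verification sketch for the fixed build: the host:_volume mount name follows the same assumed convention as above, the <sdUUID>/dom_md/metadata layout is the usual VDSM file-domain structure, and the UUID is the one from the engine log above; adjust all three for your environment.

# Verify the gluster volume is mounted and the storage domain directory
# was created under it. Python 3.6 compatible.
import os

mnt = "/rhev/data-center/mnt/glusterSD/tendrl25.lab.eng.blr.redhat.com:_engine"
sd_uuid = "508909e2-8a8d-4e3f-a1ea-f3b0c8dcc4f8"

checks = [
    ("mount point exists", os.path.ismount(mnt)),
    ("domain dir exists", os.path.isdir(os.path.join(mnt, sd_uuid))),
    ("metadata present",
     os.path.isfile(os.path.join(mnt, sd_uuid, "dom_md", "metadata"))),
]
for name, ok in checks:
    print("{:<20} {}".format(name, "OK" if ok else "MISSING"))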
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHHI for Virtualization 1.8 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3314