Bug 1821288

Summary: Error creating storage domain via vdsm
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: SATHEESARAN <sasundar>
Component: rhhi Assignee: Kaustav Majumder <kmajumde>
Status: CLOSED ERRATA QA Contact: SATHEESARAN <sasundar>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.5 CC: bugs, godas, kmajumde, michal.skrivanek, nsoffer, rhs-bugs
Target Milestone: --- Keywords: Regression, TestBlocker
Target Release: RHHI-V 1.8   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ioprocess-1.4.1-1.el8ev.x86_64, vdsm-4.40.13-1.el8ev.x86_64 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1820283 Environment:
Last Closed: 2020-08-04 14:52:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1820283    
Bug Blocks: 1779977    

Description SATHEESARAN 2020-04-06 13:52:29 UTC
Created attachment 1675826: vdsm.log

Description of problem:
VDSM errors out while creating a new Gluster Storage Domain.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. From ovirt-ansible-hosted-engine-setup, create a new Gluster Storage Domain.

Actual results:
VDSM errors out and the storage domain is not created.
ERROR (jsonrpc/3) [storage.Dispatcher] FINISH createStorageDomain error=[Errno 2] No such file or directory (dispatcher:87)

Expected results:
The Gluster Storage Domain should be created with no VDSM errors.

Additional info:

No directory is created at /rhev/data-center/mnt/glusterSD, and hence the gluster volume is not mounted, although the log shows otherwise.
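
A quick way to confirm this on the host is to compare the expected mount directory with the kernel's mount table. The following is a minimal sketch, not VDSM code: the helper names are hypothetical, and the mapping from `server:/volume` to a directory name (replacing `/` with `_`) is an assumption about how the mount point under /rhev/data-center/mnt/glusterSD is named.

```python
import os

GLUSTER_SD_ROOT = "/rhev/data-center/mnt/glusterSD"


def expected_mountpoint(spec, root=GLUSTER_SD_ROOT):
    """Map 'server:/volume' to a directory under root.

    Assumption: '/' in the volume spec is replaced with '_'.
    """
    return os.path.join(root, spec.replace("/", "_"))


def parse_mounts(text):
    """Return {mountpoint: (source, fstype)} from /proc/mounts-style text."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts[fields[1]] = (fields[0], fields[2])
    return mounts


def check_gluster_mount(spec, mounts_text, root=GLUSTER_SD_ROOT,
                        isdir=os.path.isdir):
    """Report whether the expected directory exists and is a mount point."""
    path = expected_mountpoint(spec, root)
    return {
        "path": path,
        "dir_exists": isdir(path),
        "mounted": path in parse_mounts(mounts_text),
    }
```

On a host, this could be called as `check_gluster_mount("tendrl25.lab.eng.blr.redhat.com:/engine", open("/proc/mounts").read())`. A result with `dir_exists` and `mounted` both false would match the behavior described above.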

--- Additional comment from Nir Soffer on 2020-04-02 17:00:32 UTC ---

Can you describe how ovirt-ansible-hosted-engine-setup uses engine 
or vdsm?

Does it use the vdsm API directly or via engine? If via engine, what is
the minimal call sequence that reproduces this issue?

Is this reproducible via the engine UI? If it is, what are the steps 
to reproduce?

If you cannot answer these questions, what are the steps to reproduce
using the mentioned ansible script? We need something that a developer
can run to reproduce the issue locally.

--- Additional comment from Kaustav Majumder on 2020-04-02 17:13:33 UTC ---

The error occurs while deploying oVirt + Gluster via the Cockpit UI, specifically during hosted engine setup.
This runs an Ansible playbook, https://github.com/gluster/gluster-ansible/tree/master/playbooks/hc-ansible-deployment,
which internally calls the ovirt-ansible-hosted-engine-setup role.
The playbook errors at https://github.com/oVirt/ovirt-ansible-hosted-engine-setup/blob/master/tasks/create_storage_domain.yml#L56

The role calls the oVirt engine, which is deployed as a temporary VM, HostedEngineLocal.
The engine logs point to a failure creating the storage domain: https://pastebin.com/35JQyuGE
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [6ac6e81f] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM tendrl25.lab.eng.blr.redhat.com command CreateStorageDomainVDS failed: Error creating a storage domain: ('storageType=7, sdUUID=508909e2-8a8d-4e3f-a1ea-f3b0c8dcc4f8, domainName=hosted_storage, domClass=1, typeSpecificArg=tendrl25.lab.eng.blr.redhat.com:/engine domVersion=5block_size=0, max_hosts=250',)

The corresponding VDSM verb errors out at "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2644, in createStorageDomain
    max_hosts=max_hosts)
I have not tested it via the engine UI. I will update the bug if it is reproducible there.
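
To read the parameters in the engine message more easily, the argument string can be split into key/value pairs. This is a sketch only: the function name is hypothetical, the key list is taken from the message above, and splitting on known key names is just one way to cope with the missing separator before block_size.

```python
import re

# Keys as they appear in the CreateStorageDomainVDS failure message above.
KEYS = ("storageType", "sdUUID", "domainName", "domClass",
        "typeSpecificArg", "domVersion", "block_size", "max_hosts")

# Match an optional comma, optional whitespace, then a known key and '='.
_KEY_RE = re.compile(r",?\s*(" + "|".join(KEYS) + r")=")


def parse_create_sd_args(message):
    """Split the argument string into a dict keyed by parameter name."""
    parts = _KEY_RE.split(message)
    # parts is [prefix, key1, value1, key2, value2, ...]
    return {key: value.strip()
            for key, value in zip(parts[1::2], parts[2::2])}


message = ("storageType=7, sdUUID=508909e2-8a8d-4e3f-a1ea-f3b0c8dcc4f8, "
           "domainName=hosted_storage, domClass=1, "
           "typeSpecificArg=tendrl25.lab.eng.blr.redhat.com:/engine "
           "domVersion=5block_size=0, max_hosts=250")
args = parse_create_sd_args(message)
```

With this input, `args["typeSpecificArg"]` holds the gluster volume spec and `args["domVersion"]` and `args["block_size"]` come out as separate entries even though the message fuses them.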

--- Additional comment from Kaustav Majumder on 2020-04-02 17:23:04 UTC ---

(In reply to Kaustav Majumder from comment #2)
The Gluster storage domain can be created via the engine UI with no errors.

--- Additional comment from Yaniv Kaul on 2020-04-02 18:58:39 UTC ---

Gobinda, are you familiar with this kind of failure? How is our CI doing?

--- Additional comment from Gobinda Das on 2020-04-03 06:55:56 UTC ---

Hi Yaniv,
We found this issue during oVirt 4.4.0 Beta testing. We are still struggling to get the OST hc master suite to pass because of various issues, such as infra and the move from CentOS 7 to CentOS 8. OST is not green yet for 4.4.0, but we are working hard to get it passing.

--- Additional comment from Michal Skrivanek on 2020-04-03 12:47:02 UTC ---

I'm not aware of any issue with regular HE deployment (in this area at least), and since it works later on... could it be that not all of gluster is set up/running correctly at that point?

Comment 1 SATHEESARAN 2020-04-08 02:53:12 UTC
This issue prevents RHHI-V deployment from completing, so it is marked as TestBlocker.

Comment 4 SATHEESARAN 2020-04-13 16:14:47 UTC
Tested with the following components:
python3-ioprocess-1.4.1-1.el8ev.x86_64
ioprocess-1.4.1-1.el8ev.x86_64

vdsm-client-4.40.13-1.el8ev.noarch
vdsm-common-4.40.13-1.el8ev.noarch
vdsm-hook-fcoe-4.40.13-1.el8ev.noarch
vdsm-api-4.40.13-1.el8ev.noarch
vdsm-hook-openstacknet-4.40.13-1.el8ev.noarch
vdsm-network-4.40.13-1.el8ev.x86_64
vdsm-jsonrpc-4.40.13-1.el8ev.noarch
vdsm-hook-vmfex-dev-4.40.13-1.el8ev.noarch
vdsm-yajsonrpc-4.40.13-1.el8ev.noarch
vdsm-python-4.40.13-1.el8ev.noarch
vdsm-4.40.13-1.el8ev.x86_64

vdsm-hook-vhostmd-4.40.13-1.el8ev.noarch
vdsm-hook-ethtool-options-4.40.13-1.el8ev.noarch
vdsm-http-4.40.13-1.el8ev.noarch
vdsm-gluster-4.40.13-1.el8ev.x86_64

The glusterfs storage domain can now be mounted successfully during HE deployment.

There is still another issue, where the Hosted Engine VM is unable to boot, for which
I will raise a separate bug.

Comment 6 errata-xmlrpc 2020-08-04 14:52:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHHI for Virtualization 1.8 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3314