Bug 1721457

Summary:	[Dalton] Optimize for virt store fails with distribute volume type
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	SATHEESARAN <sasundar>
Component:	rhhi	Assignee:	Sahina Bose <sabose>
Status:	CLOSED CURRENTRELEASE	QA Contact:	SATHEESARAN <sasundar>
Severity:	medium	Docs Contact:
Priority:	high
Version:	unspecified	CC:	amukherj, bugs, godas, kdhananj, rhs-bugs, sabose, seamurph
Target Milestone:	---	Keywords:	ZStream
Target Release:	RHHI-V 1.6.z Async Update
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1638674	Environment:
Last Closed:	2019-09-06 05:23:07 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1638674
Bug Blocks:

Description SATHEESARAN 2019-06-18 10:38:52 UTC

Description of problem:

Setting the group virt option fails on a distribute volume because some of the options in the group are specific to replicated volumes.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Create a distribute volume and select optimize for virt store

Actual results:

--- Additional comment from Sahina Bose on 2018-11-21 10:26:59 UTC ---

Do we need to create a separate virt store profile for distribute volumes?

--- Additional comment from Krutika Dhananjay on 2018-11-21 12:12:22 UTC ---

(In reply to Sahina Bose from comment #1)
> Do we need to create a separate profile for distribute and virt store.

That is one option.

But I just tried setting group virt on a plain distribute volume and group virt has the following options that are afr-specific:

cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=10000
cluster.choose-local=off

It turns out the error is thrown only while setting cluster.shd-max-threads; the other afr-specific options are accepted.
So in that sense, glusterd's behavior is quite inconsistent.

We could create a separate profile for distribute-only volumes, but in that case care must be taken to set the actual group virt options whenever the volume is converted to a replicated configuration.

-Krutika
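
A minimal sketch of how a separate distribute-only profile could be derived, assuming the profile is available as plain key=value lines like the list above. The afr-specific set is taken from that list; the helper names and the `example.other-option` key are hypothetical and only serve the illustration. This is not how the shipped profile was built.

```python
# AFR-specific options, exactly as listed in the comment above.
AFR_SPECIFIC = {
    "cluster.eager-lock",
    "cluster.quorum-type",
    "cluster.server-quorum-type",
    "cluster.data-self-heal-algorithm",
    "cluster.locking-scheme",
    "cluster.shd-max-threads",
    "cluster.shd-wait-qlength",
    "cluster.choose-local",
}


def parse_profile(text):
    """Parse 'key=value' lines into a dict, skipping blanks and '#' comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


def split_profile(options):
    """Return (afr_only, remaining) dicts for a parsed profile."""
    afr_only = {k: v for k, v in options.items() if k in AFR_SPECIFIC}
    remaining = {k: v for k, v in options.items() if k not in AFR_SPECIFIC}
    return afr_only, remaining


# Illustrative input: two options from the list above plus one placeholder.
# "example.other-option" is hypothetical, not a real gluster option.
sample = """
cluster.eager-lock=enable
cluster.shd-max-threads=8
example.other-option=on
"""
afr_only, remaining = split_profile(parse_profile(sample))
print(sorted(afr_only))
print(remaining)
```

The `afr_only` part is what would have to be applied later if the volume is converted to a replicated configuration.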

--- Additional comment from SATHEESARAN on 2018-11-27 08:37:56 UTC ---

(In reply to Krutika Dhananjay from comment #2)
> (In reply to Sahina Bose from comment #1)
> > Do we need to create a separate profile for distribute and virt store.
> 
> That is one option.
> 
> But I just tried setting group virt on a plain distribute volume and group
> virt has the following options that are afr-specific:
> 
> cluster.eager-lock=enable
> cluster.quorum-type=auto
> cluster.server-quorum-type=server
> cluster.data-self-heal-algorithm=full
> cluster.locking-scheme=granular
> cluster.shd-max-threads=8
> cluster.shd-wait-qlength=10000
> cluster.choose-local=off
> 
> Turns out this error is being thrown only while setting
> cluster.shd-max-threads.
> So in that sense, glusterd behavior is quite inconsistent.
> 
> We could create a separate profile for distribute-only volume but in that
> case care must be taken to set the actual group virt options whenever the
> volume is converted to replicated configuration.
> 
> -Krutika

Yes, that could be documented in the volume conversion procedure.

--- Additional comment from SATHEESARAN on 2018-11-27 08:58:16 UTC ---

Sahina,

So if there are multiple virt profiles, say virt profile1 for replicate volumes and virt profile2 for distribute volumes, then the engine may also need a code change to decide which virt profile to invoke based on the volume type.

There are a couple of places where the RHV Manager UI applies the virt profile:
1. The volume creation dialog, which has an 'Optimize for Virt Store' check box
2. Selecting an existing volume and choosing 'Optimize for Virt Store'

Have you also thought about these changes?

--- Additional comment from Sahina Bose on 2018-11-27 10:20:01 UTC ---

(In reply to SATHEESARAN from comment #4)
> Sahina,
> 
> So if there are multiple virt profiles, say virt profile1 for replicate
> volumes and virt profile2 for distribute volumes - then engine may also need
> code change to understand which virt profile needs to be invoked based on
> the volume type.
> 
> There are couple of places where RHV Manager UI calls virt profile:
> 1. Volume creation dialog, has a check box 'Optimize for Virt Store'
> 2. Selecting the volume and do 'Optimize for Virt Store'
> 
> Have you also thought about these changes ?

Yes - the engine code will need to be changed as well if we agree to create a separate profile. The other option is to ensure the option does not error out when set on a distribute volume.

--- Additional comment from Krutika Dhananjay on 2018-11-27 12:12:36 UTC ---

(In reply to Sahina Bose from comment #5)
> (In reply to SATHEESARAN from comment #4)
> > Sahina,
> > 
> > So if there are multiple virt profiles, say virt profile1 for replicate
> > volumes and virt profile2 for distribute volumes - then engine may also need
> > code change to understand which virt profile needs to be invoked based on
> > the volume type.
> > 
> > There are couple of places where RHV Manager UI calls virt profile:
> > 1. Volume creation dialog, has a check box 'Optimize for Virt Store'
> > 2. Selecting the volume and do 'Optimize for Virt Store'
> > 
> > Have you also thought about these changes ?
> 
> Yes - the engine code will need to be changed as well if we have agreed to
> create a separate profile. The other option is to ensure the option does not
> error out when set on a distribute volume.

Setting needinfo on Atin.
Atin,

The behavior when executing volume-set for an option whose translator is not in the graph is inconsistent. For example, setting some of the afr-specific options on a plain distribute volume succeeds, whereas one such option fails if the volume is not replicated.
What's the expected behavior? If there is no harm in letting such a volume-set operation succeed, then maybe we can ask the afr guys to fix the issue with the lone option, cluster.shd-max-threads, which currently fails.

--- Additional comment from Atin Mukherjee on 2018-11-28 03:40:32 UTC ---

I don't think we can afford to ignore volume set failures when an option is set on the wrong volume type. The cluster.shd-max-threads option fails here because of an added validation that checks whether the volume is of replica type.

    {.key = "cluster.shd-max-threads",
     .voltype = "cluster/replicate",
     .op_version = GD_OP_VERSION_3_7_12,
     .flags = VOLOPT_FLAG_CLIENT_OPT,
     .validate_fn = validate_replica},

The other replica options are missing that validation, which allows them to go through. I believe the additional validation was added to address bugs raised by QE/GSS asking to block such operations on incompatible volume types, so we can't afford to revert it.
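
The inconsistency described above can be modelled in a few lines. This is a simplified sketch, not glusterd code: the table, the volume dict layout, and the error text are assumptions made for illustration. Only the fact that cluster.shd-max-threads carries a replica check while other replica options do not comes from this comment.

```python
def validate_replica(volume, key):
    """Reject the option unless the volume is replicated (assumed check)."""
    if volume["replica_count"] <= 1:
        raise ValueError(f"{key}: volume {volume['name']} is not replicated")


# Hypothetical stand-in for the option table: option name -> validator.
OPTION_TABLE = {
    "cluster.shd-max-threads": validate_replica,  # has a validate_fn
    "cluster.eager-lock": None,                   # no validation, goes through
}


def volume_set(volume, key, value):
    """Run the option's validator, if any, then store the value."""
    validate = OPTION_TABLE[key]
    if validate is not None:
        validate(volume, key)
    volume["options"][key] = value


# A plain distribute volume (replica count 1); the name is made up.
dist_vol = {"name": "vmstore", "replica_count": 1, "options": {}}

volume_set(dist_vol, "cluster.eager-lock", "enable")  # accepted
try:
    volume_set(dist_vol, "cluster.shd-max-threads", "8")
    rejected = False
except ValueError as err:
    rejected = True
    print(err)

print(dist_vol["options"], rejected)
```

In this model the group fails only at the one validated option, which matches the behavior reported earlier in the thread.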

IMO, having a separate group profile is the way forward to avoid more complications.

--- Additional comment from Krutika Dhananjay on 2018-11-28 07:42:01 UTC ---

Also we need to make sure granular-entry-heal - which is set during cockpit-based installation - is not set on Dalton volumes.

--- Additional comment from SATHEESARAN on 2018-11-28 11:10:57 UTC ---

(In reply to Krutika Dhananjay from comment #8)
> Also we need to make sure granular-entry-heal - which is set during
> cockpit-based installation - is not set on Dalton volumes.

We can make sure that the distribute volume will not have granular-entry-heal turned on.

I have raised a bug to support single brick creation with gluster-ansible (BZ https://bugzilla.redhat.com/show_bug.cgi?id=1653575). This will also make sure that the granular-entry-heal option is not set on the distribute volume.

--- Additional comment from SATHEESARAN on 2019-01-09 06:41:48 UTC ---

Additional information: the new virt profile for distribute volumes is now available under the name 'distributed-virt'.
A distribute volume can be optimized for the virt store use case as follows:
# gluster volume set <vol> group distributed-virt

This is fixed in RHGS 3.4 update3 (Downstream)
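
A rough sketch of the profile selection a caller such as the engine would need, assuming the volume type is available as a plain string. The type labels "distribute" and "replicate" and both function names are hypothetical. The profile names 'virt' and 'distributed-virt' and the command shape come from this bug.

```python
def virt_profile_for(volume_type):
    """Pick the group profile name for a volume type.

    Assumption: a pure distribute volume is labelled "distribute";
    every other type keeps the existing 'virt' profile.
    """
    if volume_type == "distribute":
        return "distributed-virt"
    return "virt"


def optimize_command(volume_name, volume_type):
    """Build the argument list for 'gluster volume set <vol> group <profile>'."""
    profile = virt_profile_for(volume_type)
    return ["gluster", "volume", "set", volume_name, "group", profile]


# The volume names below are made up for the example.
print(" ".join(optimize_command("vmstore", "distribute")))
print(" ".join(optimize_command("data", "replicate")))
```

The command is only built here, not executed, so the sketch can be checked without a gluster installation.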

Comment 1 SATHEESARAN 2019-07-17 13:43:47 UTC

Verified with RHV 4.3.5.3

1. Created the distribute volume from RHV Manager UI
2. Optimized this volume for virt store

All options are set on the volume as expected.

Note: An error is seen while enabling 'granular-entry-heal' on this volume; it is tracked as part of bug https://bugzilla.redhat.com/show_bug.cgi?id=1673277