Description of problem:
To fix the locking issue, RHV monitoring of Gluster will need to change to use get-state and aggregate the information collected from each node.

Version-Release number of selected component (if applicable):
NA

Additional info:
Currently, vdsm triggers many CLI calls per status check, which results in failed locking inside Gluster. This prevents users from running any manual gluster commands, since the locks are already held.
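As a rough illustration of the aggregation approach described above: `gluster get-state` dumps node-local state in an ini-like "[Section]" / "key: value" layout that each host could parse and report back for aggregation. A minimal sketch follows; the sample text and field values are illustrative, not a verbatim get-state capture.

```python
# Sketch: parse ini-like "[Section]" / "key: value" output such as that
# produced by `gluster get-state`, yielding {section: {key: value}}.
def parse_get_state(text):
    state, section = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('[') and line.endswith(']'):
            section = line[1:-1]
            state[section] = {}
        elif ':' in line and section is not None:
            key, _, value = line.partition(':')
            state[section][key.strip()] = value.strip()
    return state

# Illustrative sample, not a real capture.
sample = """\
[Global]
MYUUID: aaaa-bbbb
op-version: 31000
"""

state = parse_get_state(sample)
print(state['Global']['op-version'])  # prints "31000"
```

Each node would parse its own get-state dump like this, and the monitoring side would merge the per-node dictionaries instead of issuing repeated locking CLI queries.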
Gluster eventing has been introduced, so we can change the monitoring interval to a less frequent value.
We have integrated with gluster eventing, so can we ensure that we do not poll as frequently, to avoid the locking issues?
What should the refresh rate be? Currently I have observed it to be 15 sec.
(In reply to kmajumde from comment #7)
> What should be the refresh rate? Currently I have checked it to be 15 sec.

Volume status is queried every 5 mins... it's part of RefreshRateHeavy, I think - can you check? So this can be changed to 15 mins, but also check whether there's any other polling within this same method that needs to continue at 5 mins.
(In reply to Sahina Bose from comment #8)
> (In reply to kmajumde from comment #7)
> > What should be the refresh rate? Currently I have checked it to be 15 sec.
>
> Volume status are every 5 mins...it's part of the RefreshRateHeavy, I think
> - can you check.
> So this can be changed to 15mins. but also check if there's any other
> polling within this same method that needs to continue at 5 mins?

Brick details, volume capacity, volume advanced details, and volume online status are under this polling method. I feel none of them needs to continue at 5 mins.
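The split discussed above (heavy gluster checks moving from a 5-minute to a 15-minute interval, with anything that must stay fresh kept at 5 minutes) can be sketched as two independently scheduled jobs. This is a hypothetical illustration; the function names and the use of Python's sched module are assumptions, not actual engine/vdsm identifiers.

```python
import sched
import time

LIGHT_INTERVAL = 5 * 60    # seconds: polling that stays at 5 minutes
HEAVY_INTERVAL = 15 * 60   # seconds: heavy checks moved to 15 minutes

def poll_light():
    # Placeholder for any check kept at the 5-minute rate.
    pass

def poll_heavy():
    # Placeholder for brick details, volume capacity, advanced
    # details, and online status (the RefreshRateHeavy group).
    pass

def schedule(scheduler, interval, job):
    """Run the job now, then re-arm it at the given interval."""
    job()
    scheduler.enter(interval, 1, schedule, (scheduler, interval, job))

s = sched.scheduler(time.time, time.sleep)
schedule(s, LIGHT_INTERVAL, poll_light)
schedule(s, HEAVY_INTERVAL, poll_heavy)
# s.run()  # would block, firing each job at its own interval
```

The point of the sketch is only that the two rates are decoupled, so slowing the heavy group does not affect any polling that has to remain frequent.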
This bug has not been marked as blocker for oVirt 4.3.0. Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.
Kaustav, can you backport to 4.3?
Kaustav, in which version is this fix included? I see only one patch attached, and it's on the engine side, merged in ovirt-engine-4.3.3. This bug is filed against vdsm; can you please update this bug to match the current real state?
Done. The bug was wrongly tagged to vdsm.
Tested with RHV 4.3.3. 'gluster volume status detail' is queried every 15 mins:

[2019-07-26 15:00:38.425519] : system:: uuid get : SUCCESS
[2019-07-26 15:00:54.035630] : system:: uuid get : SUCCESS
[2019-07-26 15:01:09.674883] : system:: uuid get : SUCCESS
[2019-07-26 15:01:12.398454] : volume status vmstore : SUCCESS
[2019-07-26 15:01:12.586130] : volume status vmstore detail : SUCCESS
[2019-07-26 15:01:16.200499] : volume status engine : SUCCESS
[2019-07-26 15:01:16.378735] : volume status engine detail : SUCCESS
[2019-07-26 15:01:19.968367] : volume status non_vdo : SUCCESS
[2019-07-26 15:01:20.152647] : volume status non_vdo detail : SUCCESS
[2019-07-26 15:01:23.795427] : volume status data : SUCCESS
[2019-07-26 15:01:23.984186] : volume status data detail : SUCCESS
<lines-snipped>
[2019-07-26 15:16:27.651132] : volume status vmstore : SUCCESS
[2019-07-26 15:16:27.835725] : volume status vmstore detail : SUCCESS
[2019-07-26 15:16:31.528667] : volume status engine : SUCCESS
[2019-07-26 15:16:31.708591] : volume status engine detail : SUCCESS
[2019-07-26 15:16:35.358410] : volume status non_vdo : SUCCESS
[2019-07-26 15:16:35.538591] : volume status non_vdo detail : SUCCESS
[2019-07-26 15:16:36.997835] : system:: uuid get : SUCCESS
[2019-07-26 15:16:39.815920] : volume status data : SUCCESS
[2019-07-26 15:16:40.006440] : volume status data detail : SUCCESS
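As a quick arithmetic check on the verification log above, the gap between two successive "volume status vmstore detail" entries can be computed from their timestamps (copied from the log in this comment, truncated to whole seconds):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
# Successive "volume status vmstore detail" entries from the log above.
t1 = datetime.strptime("2019-07-26 15:01:12", FMT)
t2 = datetime.strptime("2019-07-26 15:16:27", FMT)

delta_minutes = (t2 - t1).total_seconds() / 60
print(f"{delta_minutes:.2f} minutes")  # prints "15.25 minutes"
```

Roughly 15 minutes between detail queries, consistent with the new RefreshRateHeavy interval.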
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2431