Bug 1475721

Summary:	'gluster volume profile' shouldn't cause negative impact on volume performance
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	Amar Tumballi <atumball>
Component:	core	Assignee:	Krutika Dhananjay <kdhananj>
Status:	CLOSED ERRATA	QA Contact:	Karan Sandha <ksandha>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	unspecified	CC:	amukherj, atumball, khartsoe, nchilaka, rcyriac, rhs-bugs, storage-qa-internal
Target Milestone:	---	Keywords:	ZStream
Target Release:	RHGS 3.3.1
Hardware:	Unspecified
OS:	Linux
Whiteboard:
Fixed In Version:	glusterfs-3.8.4-45	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-11-29 03:29:14 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1475687

Description Amar Tumballi 2017-07-27 08:36:52 UTC

Description of problem:
Until now, the `gluster volume profile` command has been recommended for use on a production system only when there are issues and the I/O pattern needs to be understood. The main reason is the impact that enabling the option has on volume performance.


Version-Release number of selected component (if applicable):
3.3.0

How reproducible:
- this is a hypothesis -


Actual results:
* Recommended only in a few cases, and not in actual production setups.

Expected results:
* Always have the option enabled, and use the metrics from the filesystem to show the performance pattern over time.

Additional info:

Gluster volume profile can provide a lot of internal information on gluster's performance, and this can be used to plot metrics pretty well.

Comment 4 Krutika Dhananjay 2017-08-03 11:12:21 UTC

I don't have context on this issue except for knowing that we plan to enable profile by default and analyse the perf data. Do we know or have evidence to support the claim that enabling volume profiling does have a negative impact on performance?

-Krutika

Comment 5 Amar Tumballi 2017-08-03 11:37:09 UTC

>  we plan to enable profile by default and analyse the perf data.

Yes, the above is the context for this bug.

>  Do we know or have evidence to support the claim that enabling volume profiling does have a negative impact on performance?

Nothing that has been 'analyzed' before. It is mostly hypothesis:

1. We don't turn it on by default, but ask customers to provide the output when they run into perf issues.
2. Inside the code, if the 'measure_latency' flag is set, there is a gettimeofday() syscall for every translator layer on every fop().
3. Earlier, the 'debug/io-stats' translator used xlator->lock to update all the counters, which surely caused lock contention at io-stats.
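To make the cost in item 2 concrete, here is a rough Python model (not gluster code) of a translator stack. It assumes each layer takes one timestamp on the way down and one on the way back up when latency measurement is on. The `Layer` class, `build_stack`, `count_clock_calls`, and the two-timestamps-per-layer pattern are all hypothetical and used only for illustration.

```python
import time


class Layer:
    """One hypothetical translator in a stack; forwards each fop to its child."""

    def __init__(self, name, child=None):
        self.name = name
        self.child = child
        self.latency_total = 0.0  # seconds spent in this layer and everything below it

    def fop(self, ctx):
        if not ctx["measure_latency"]:
            # Profiling off: no clock reads at all on this path.
            return self.child.fop(ctx) if self.child else "ok"
        begin = ctx["clock"]()  # assumed timestamp on the way down
        result = self.child.fop(ctx) if self.child else "ok"
        end = ctx["clock"]()    # assumed timestamp on the way back up
        self.latency_total += end - begin
        return result


def build_stack(depth):
    """Chain `depth` layers together and return the top one."""
    top = None
    for i in range(depth):
        top = Layer(f"layer-{i}", child=top)
    return top


def count_clock_calls(depth, fops, measure_latency):
    """Run `fops` operations through the stack and count clock reads."""
    calls = 0

    def clock():
        nonlocal calls
        calls += 1
        return time.perf_counter()

    stack = build_stack(depth)
    ctx = {"measure_latency": measure_latency, "clock": clock}
    for _ in range(fops):
        stack.fop(ctx)
    return calls
```

Under these assumptions the number of clock reads grows as 2 x layers x fops, which is why a deep translator graph could make the flag expensive. Whether that shows up as a measurable regression is exactly what the requested performance run would have to establish.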

Hence, we want a performance run with profiling on, so we can see how much we regress and whether there is anything more to improve.
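For the lock contention concern in item 3, one common way to avoid a single hot lock is to give each thread its own counter and sum the counters only when the value is read. The sketch below contrasts the two structures in Python. It is not how io-stats is implemented; `LockedCounter`, `ShardedCounter`, and `run` are hypothetical names. Because of Python's global interpreter lock, it illustrates the structure rather than the actual speedup.

```python
import threading


class LockedCounter:
    """Every increment takes the same lock (the contended pattern)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def incr(self):
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


class ShardedCounter:
    """Each thread increments its own shard; the lock is taken only to register or read."""

    def __init__(self):
        self._local = threading.local()
        self._shards = []
        self._shards_lock = threading.Lock()

    def incr(self):
        shard = getattr(self._local, "shard", None)
        if shard is None:
            # First increment from this thread: register a private shard.
            shard = [0]
            self._local.shard = shard
            with self._shards_lock:
                self._shards.append(shard)
        shard[0] += 1  # only the owning thread ever writes this shard

    def value(self):
        with self._shards_lock:
            return sum(s[0] for s in self._shards)


def run(counter, threads=4, per_thread=10000):
    """Increment `counter` from several threads and return the final total."""

    def worker():
        for _ in range(per_thread):
            counter.incr()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counter.value()
```

Both counters report the same total. The difference is that the sharded version keeps the shared lock off the per-fop path, at the cost of a slightly more expensive read.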

Comment 7 Atin Mukherjee 2017-09-04 09:54:49 UTC

Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/117185

Comment 11 errata-xmlrpc 2017-11-29 03:29:14 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3276