Red Hat Bugzilla – Bug 1475721
'gluster volume profile' shouldn't cause negative impact on volume performance
Last modified: 2017-10-12 07:18:17 EDT
Description of problem:
Until now, the `gluster volume profile` command has been recommended on a production system only when there are issues and the I/O pattern needs to be understood. The main reason is the impact that enabling the option has on volume performance.
Version-Release number of selected component (if applicable):
- this is a hypothesis -
* Recommended only in a few cases, and not in an actual production setup.
* Always have the option enabled, and use the metrics from the filesystem to show the performance pattern over time.
Gluster volume profile can provide a lot of internal information on gluster's performance, and this can be used to plot metrics pretty well.
I don't have context on this issue except for knowing that we plan to enable profile by default and analyse the perf data. Do we know or have evidence to support the claim that enabling volume profiling does have a negative impact on performance?
> we plan to enable profile by default and analyse the perf data.
Yes, above is the context for this bug to exist.
> Do we know or have evidence to support the claim that enabling volume profiling does have a negative impact on performance?
Nothing that was actually 'analyzed' before. It is mostly hypothesis:
1. We don't turn it on by default, but ask customers to give the output when they get into perf issues.
2. In the code, if the 'measure_latency' flag is set, there is a gettimeofday() syscall at every translator layer for every fop().
3. Earlier, the 'debug/io-stats' translator used xlator->lock to update all the counters, which surely caused lock contention in io-stats.
Hence, we want a performance run with profiling on, so we can see how much we regress and whether there is anything more to improve.
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/117185