Red Hat Bugzilla – Bug 1261700
RFE : Feature: Periodic FOP statistics dumps for v3.6.x/v3.7.x
Last modified: 2016-08-23 08:34:51 EDT
Created attachment 1071977 [details]
Patch for stats dump code.
Description of problem:
Patch to add periodic JSON dumps of FOP latency & hit rate statistics from the io-stats translator. Dumps are controlled by the diagnostics.stats-dump-interval <dump interval sec> option and stored in /var/lib/glusterd/stats under their respective FUSE, gNFSd or brick instance.
This is immensely useful for reliably extracting diagnostics & performance metrics from GlusterFS and injecting them into a robust analytics backend for later analysis or alarming. It is in heavy use here at Facebook.
The patch applies cleanly to the release-3.6 and release-3.7 branches as of this bug's creation.
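For anyone wiring these dumps into an analytics backend, here is a minimal sketch of a collector. It is not part of the patch. It assumes each dump is a flat JSON object mapping metric names to numeric (or numeric-string) values, and that the files sit somewhere under /var/lib/glusterd/stats; the actual file names and key layout should be checked against the attached example output. The function names (load_dump, numeric_metrics, collect) are hypothetical.

```python
import json
from pathlib import Path

# Directory named in the description; the layout beneath it is an assumption.
STATS_DIR = Path("/var/lib/glusterd/stats")


def load_dump(path):
    """Parse one dump file; return {} if it is unreadable or half-written."""
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        # A dump may be mid-write when we read it; skip it this round.
        return {}
    return data if isinstance(data, dict) else {}


def numeric_metrics(dump):
    """Keep only entries whose value parses as a number (assumed flat layout)."""
    out = {}
    for key, value in dump.items():
        if isinstance(value, bool):
            continue
        try:
            out[key] = float(value)
        except (TypeError, ValueError):
            continue
    return out


def collect(stats_dir=STATS_DIR):
    """Map each dump file (path relative to stats_dir) to its numeric metrics."""
    root = Path(stats_dir)
    if not root.is_dir():
        return {}
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            metrics = numeric_metrics(load_dump(path))
            if metrics:
                result[str(path.relative_to(root))] = metrics
    return result


if __name__ == "__main__":
    # Print one "file metric value" line per sample for a downstream shipper.
    for name, metrics in collect().items():
        for key, value in sorted(metrics.items()):
            print(f"{name} {key} {value}")
```

Run periodically (for example at the same interval as diagnostics.stats-dump-interval), this prints one line per metric that can be piped to whatever shipper the backend expects. Files that fail to parse are skipped rather than treated as errors, since a dump could be read while it is still being written.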
Version-Release number of selected component (if applicable):
v3.6.x or v3.7.x, should be trivial to port to master.
This is an extremely good idea. I have had to parse gluster volume profile output, and it is extremely hard to do; JSON would make it much easier. Also, the io-stats translator can run client-side, so you get client-side latency, not server-side. It would be great if /usr/sbin/gluster could initiate the profiling so we didn't have to edit a volfile.
Can you provide an attachment with JSON output from the patch so that lazy folks like me can see what it looks like?
-Ben England, Perf. Engr., Red Hat
Created attachment 1076043 [details]
Example output for the dumps (nfsd)
Added example output. Also, this is automatically engaged when either of these options is enabled:
...is set to something non-zero.
We run with these enabled 24x7 on all clusters at all times, and as you have noted it's extremely powerful to be able to look at performance from all layers of the stack (FUSE client, gNFSd and bricks). And with lockless counters (also in this patch) we haven't observed any perf hit.
Cloning this bug to master.
This bug is being closed as GlusterFS-3.6 is nearing its End-Of-Life and only important security bugs will be fixed. This bug has been fixed in more recent GlusterFS releases. If you still face this bug with the newer GlusterFS versions, please open a new bug.