Bug 1590693

Summary: Document what data are available only when profiling is enabled
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Martin Bukatovic <mbukatov>
Component: doc-RHGS_Web_Administration Assignee: Rakesh <rghatvis>
Status: CLOSED CURRENTRELEASE QA Contact: Elena Bondarenko <ebondare>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rhgs-3.4 CC: apaladug, asriram, ebondare, gshanmug, nthomas, rghatvis, rhs-bugs, sankarshan
Target Milestone: ---   
Target Release: RHGS 3.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-09-10 15:46:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1503142    

Description Martin Bukatovic 2018-06-13 08:35:35 UTC
User Story: As a storage admin, I want to understand what data will be reported
by RHGS WA only when profiling is enabled.

Example of such data: IOPS (see BZ 1587804).

Comment 1 Martin Bukatovic 2018-06-13 08:40:44 UTC
Nishanth, we need a list of metrics/data which require profiling to be
enabled. I guess that we have such a list somewhere already, but it would
be very helpful if you could find it, check if it's still up to date
wrt upcoming RHGS WA 3.4, update it if needed and link it here.

Comment 2 Nishanth Thomas 2018-06-14 07:32:13 UTC
Please provide the information requested at https://bugzilla.redhat.com/show_bug.cgi?id=1590693#c1

Comment 3 Martin Bukatovic 2018-06-26 14:35:26 UTC
I have been extracting articles from enwiki-latest-pages-articles.xml.bz2 as
individual files into an arbiter volume for about 20 hours, and I disabled
profiling partway through. Here are the charts for which I no longer see data
after the profiling feature was disabled:

* Cluster Dashboard: At a glance: IOPS
* Volume Dashboard: Performance: IOPS
* Volume Dashboard: Profiling Information: everything there
* Host Dashboard: At a glance: IOPS
* Brick Dashboard: At a glance: IOPS

Comment 4 gowtham 2018-07-25 09:48:19 UTC
* Brick-wise IOPS details (calculated by aggregating the reads and writes across all blocks and then dividing the total by the `duration` value). This is pushed to the brick details under the volume as well as the brick details under nodes.

* Top File Operations (under volume dashboard): The Top File Operations panel displays the top 5 FOPs (file operations) with the highest % latency, wherein the % latency is the fraction of the FOP response time that is consumed by the FOP. (This is calculated using the HITS value.)

* File Operations for Read/Write (under volume dashboard): The File Operations for Read/Write panel displays the average latency, maximum latency, and call rate for each FOP for Read/Write Operations over a period of time. (Sum of hits from all bricks in a volume that have READ and WRITE operations.)

* File Operations for Locks (under volume dashboard): The File Operations for Locks panel displays the average latency, maximum latency, and call rate for each FOP for Locks over a period of time. (Sum of hits from all bricks in a volume that have LOCK operations.)

* File Operations for Inode Operations (under volume dashboard): The File Operations for Inode Operations panel displays the average latency, maximum latency, and call rate for each FOP for Inode Operations over a period of time. (Sum of hits from all bricks in a volume that have INODE operations.)

* File Operations for Entry Operations (under volume dashboard): The File Operations for Entry Operations panel displays the average latency, maximum latency, and call rate for each FOP for Entry Operations over a period of time. (Sum of hits from all bricks in a volume that have ENTRY operations.)


The READ and WRITE, ENTRY, INODE, and LOCK operations are listed at: https://github.com/Tendrl/node-agent/blob/master/tendrl/node_agent/monitoring/collectd/collectors/gluster/heavy_weight/tendrl_glusterfs_profile_info.py#L20
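
To make the two calculations described in this comment concrete, here is a rough Python sketch. It is not the Tendrl collector code: the dictionary layout, the field names (`blocks`, `reads`, `writes`, `fop_hits`) and the small `FOP_CATEGORY` mapping are assumptions made for illustration only. The authoritative FOP lists are the ones in the linked tendrl_glusterfs_profile_info.py file.

```python
from collections import defaultdict

# Hypothetical sample of per-brick profiling data. The field names are
# assumptions for this sketch, not the actual gluster or Tendrl schema.
brick_profile = {
    "duration": 10,  # seconds covered by the profiling interval
    "blocks": [
        {"size": 4096, "reads": 120, "writes": 80},
        {"size": 131072, "reads": 30, "writes": 70},
    ],
    "fop_hits": {"READ": 150, "WRITE": 150, "STAT": 40, "INODELK": 10},
}

# Illustrative grouping only; see the linked collector source for the
# real READ/WRITE, ENTRY, INODE and LOCK operation lists.
FOP_CATEGORY = {
    "READ": "read_write",
    "WRITE": "read_write",
    "STAT": "inode",
    "INODELK": "lock",
}


def brick_iops(profile):
    """Aggregate reads and writes across all blocks, then divide by duration."""
    total_ops = sum(b["reads"] + b["writes"] for b in profile["blocks"])
    duration = profile["duration"]
    # Guard against a zero-length interval instead of raising ZeroDivisionError.
    return total_ops / duration if duration > 0 else 0.0


def hits_by_category(profiles):
    """Sum FOP hits from all bricks of a volume, grouped by operation category."""
    totals = defaultdict(int)
    for profile in profiles:
        for fop, hits in profile["fop_hits"].items():
            category = FOP_CATEGORY.get(fop)
            if category is not None:  # skip FOPs outside the sketch's mapping
                totals[category] += hits
    return dict(totals)
```

For example, `brick_iops(brick_profile)` divides the 300 aggregated operations by the 10 second duration, and `hits_by_category([brick_profile, brick_profile])` adds up the hits as if the volume had two identical bricks.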

Comment 5 gowtham 2018-07-25 09:53:47 UTC
Other than this, we are collecting "latencyAvg", "latencyMin", and "latencyMax" from the file operations of bricks, but we are not using them in Grafana. They are just pushed to Graphite. They might be useful in the future, so we are keeping them as well.

Comment 6 Martin Bukatovic 2018-08-07 07:22:20 UTC
QE will check the eng comment and the QE dashboard overview test results as a
basis for what needs to be documented here.