Description of problem
======================

The IOPS chart in the At Glance section of the Host Dashboard reports different
values than all other IOPS charts (it reports no data). It is the only IOPS
chart that separates read and write operations. When data are reported, the
read and write values seem to be swapped with each other.

Version-Release number of selected component
============================================

tendrl-monitoring-integration-1.6.3-5.el7rhgs.noarch

[root@mbukatov-usm1-server ~]# rpm -qa | grep tendrl | sort
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-7.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-5.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-5.el7rhgs.noarch
tendrl-node-agent-1.6.3-7.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-4.el7rhgs.noarch

[root@mbukatov-usm1-gl1 ~]# rpm -qa | grep tendrl | sort
tendrl-collectd-selinux-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.6.3-7.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-5.el7rhgs.noarch
tendrl-node-agent-1.6.3-7.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch

How reproducible
================

100%

Steps to Reproduce
==================

1. Prepare a gluster trusted storage pool with at least one volume.
2. Install WA using tendrl-ansible.
3. Mount the volume on a dedicated client machine.
4. On the client, copy a large tarball into the volume while observing the
   IOPS chart in the At Glance section of the Host Dashboard (on a host with
   bricks hosting the file being copied into the volume).

Note:

* use a large file, which would use most of the brick space
* this will generate write-only IO (the number of read operations should be
  much smaller than the number of writes during a copy like this)

Actual results
==============

In the first few attempts, this chart reported no data at all, while I could
see IOPS on the other IOPS charts.

In some cases, I did see data (I'm not sure what made it appear compared to the
previous attempts), but the write IOs were reported as reads.

Expected results
================

IOPS are reported in a similar way to other IOPS charts, so that:

* when I see IOPS at the cluster level, the IOPS of the affected host should be
  visible in the IOPS chart in the At Glance section of the Host Dashboard as
  well
* write IO should be reported as write, and read IO as read, not vice versa

Additional info
===============

I'm not sure why, in this case, reads and writes are reported as 2 separate
data lines while all the other IOPS charts report total IOPS only.

See screenshot #1, where:

* the 1st copying of the file is visible in the Brick Capacity utilization
  chart, but not in the IOPS chart
* the 2nd copying of the file is visible somehow, but reported as read IO (the
  green line)
Reported during testing of BZ 1581736.
Created attachment 1453602 [details] screenshot 1
Created attachment 1453604 [details] screenshot 2 (the problem highlighted, comparing IOPS and Disk Load chart)

Adding a clearer screenshot of the problem, comparing the IOPS and Disk Load charts.
Another use case, during testing of BZ 1581736: I extracted 10000 files with names based on the sha1 of their content, so that when uploaded into an arbiter 2 plus 1x2 volume, every brick would host some files. But I see IOPS reported for only one of the 6 storage machines in the IOPS charts of the At a glance section of the host dashboard.
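For anyone who wants a similar data set, here is a minimal sketch of how files named after the sha1 of their content could be generated. This is not the tarball used in the test above: the function name `make_sha1_named_files`, the random content, and the default file size are all assumptions made for illustration.

```python
import hashlib
import os


def make_sha1_named_files(directory, count, size=1024):
    """Create `count` files of random content, each named by its sha1 hex digest.

    Hypothetical helper for illustration only; the original test extracted
    pre-built files from a tarball instead.
    """
    os.makedirs(directory, exist_ok=True)
    names = []
    for _ in range(count):
        data = os.urandom(size)                # random content, so names spread evenly
        name = hashlib.sha1(data).hexdigest()  # 40 hex characters
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
        names.append(name)
    return names
```

Because the names are effectively random, gluster's hash-based file placement should put some of the files on every brick, which is what makes the missing IOPS on the other machines noticeable.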
There was a small issue where write values were being added only to the reads. That explains the second part of the issue, where it looks like reads and writes are swapped. Sent a PR https://github.com/Tendrl/node-agent/pull/834 for the same.
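To illustrate the class of bug described above, here is a minimal sketch. It is not the code from the PR: `DiskSample`, `sum_iops_buggy` and `sum_iops_fixed` are hypothetical names, and the real node-agent data structures may look quite different.

```python
from collections import namedtuple

# Hypothetical per-disk counters; not the actual node-agent data structure.
DiskSample = namedtuple("DiskSample", ["reads", "writes"])


def sum_iops_buggy(samples):
    """Accumulate counters the wrong way: writes are folded into reads."""
    total_reads = 0
    total_writes = 0
    for s in samples:
        total_reads += s.reads
        total_reads += s.writes  # bug: should add to total_writes
    return total_reads, total_writes


def sum_iops_fixed(samples):
    """Accumulate reads and writes into their own totals."""
    total_reads = 0
    total_writes = 0
    for s in samples:
        total_reads += s.reads
        total_writes += s.writes
    return total_reads, total_writes


# A write-heavy workload, similar to copying a large file into the volume.
samples = [DiskSample(reads=1, writes=100), DiskSample(reads=2, writes=200)]
print(sum_iops_buggy(samples))  # (303, 0): all IO shows up on the read line
print(sum_iops_fixed(samples))  # (3, 300)
```

With a mistake of this kind, a write-only workload shows up entirely on the read line, which matches what the second screenshot shows.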
QE will verify that the problem is fully addressed by making sure WA behaves as described in Expected Results section of this BZ.
Testing with
============

[root@mbukatov-usm1-server ~]# rpm -qa | grep tendrl | sort
tendrl-ansible-1.6.3-6.el7rhgs.noarch
tendrl-api-1.6.3-5.el7rhgs.noarch
tendrl-api-httpd-1.6.3-5.el7rhgs.noarch
tendrl-commons-1.6.3-12.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-10.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-10.el7rhgs.noarch
tendrl-node-agent-1.6.3-10.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-10.el7rhgs.noarch

[root@mbukatov-usm1-gl1 ~]# rpm -qa | grep tendrl | sort
tendrl-collectd-selinux-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.6.3-12.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-9.el7rhgs.noarch
tendrl-node-agent-1.6.3-10.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch

Results
=======

When I perform the steps to reproduce, IOPS chart from At Glance section of
Host Dashboard:

* was renamed to Brick IOPS (related to other BZ 1595013)
* reports single value (combining both read and write operations) as other
  IOPS charts do
* reports data immediately (data starts at the same time as on other charts)
* data are reported on charts for all 3 machines which are part of replica set
  hosting the data
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2616
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days