Bug 1507930

Summary: volume utilization info is not correct
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Martin Kudlej <mkudlej>
Component: web-admin-tendrl-gluster-integration
Assignee: Shubhendu Tripathi <shtripat>
Status: CLOSED ERRATA
QA Contact: Martin Kudlej <mkudlej>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.3
CC: gshanmug, mkudlej, nthomas, sankarshan
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-18 04:39:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1511973
Bug Blocks:

Description Martin Kudlej 2017-10-31 13:30:21 UTC
Description of problem:
This bug is related to https://bugzilla.redhat.com/show_bug.cgi?id=1501398#c3 and upstream https://github.com/Tendrl/gluster-integration/issues/446
I also see this issue downstream.

Version-Release number of selected component (if applicable):
etcd-3.2.7-1.el7.x86_64
glusterfs-3.8.4-18.4.el7.x86_64
glusterfs-3.8.4-50.el7rhgs.x86_64
glusterfs-api-3.8.4-50.el7rhgs.x86_64
glusterfs-cli-3.8.4-50.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-18.4.el7.x86_64
glusterfs-client-xlators-3.8.4-50.el7rhgs.x86_64
glusterfs-events-3.8.4-50.el7rhgs.x86_64
glusterfs-fuse-3.8.4-18.4.el7.x86_64
glusterfs-fuse-3.8.4-50.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-50.el7rhgs.x86_64
glusterfs-libs-3.8.4-18.4.el7.x86_64
glusterfs-libs-3.8.4-50.el7rhgs.x86_64
glusterfs-server-3.8.4-50.el7rhgs.x86_64
python-etcd-0.4.5-1.noarch
rubygem-etcd-0.3.0-1.el7.noarch
tendrl-ansible-1.5.3-2.el7rhgs.noarch
tendrl-api-1.5.3-2.el7rhgs.noarch
tendrl-api-httpd-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.3-1.el7rhgs.noarch
tendrl-gluster-integration-1.5.3-2.el7rhgs.noarch
tendrl-grafana-plugins-1.5.3-2.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.3-2.el7rhgs.noarch
tendrl-node-agent-1.5.3-3.el7rhgs.noarch
tendrl-notifier-1.5.3-1.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-ui-1.5.3-2.el7rhgs.noarch


How reproducible:
100%

Steps to Reproduce:
1. Install Gluster with an arbiter or disperse volume (or any other type)
2. Copy some data to the volume
3. Compare the volume utilization shown in the web admin with the output of the `df` command on the mounted volume (see the sketch below)
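
For step 3, the numbers `df` and `df -i` report can be reproduced programmatically. A minimal sketch using Python's os.statvfs on the FUSE mount point; the mount path is the one that appears later in this report and is otherwise a placeholder:

    # Sketch: reproduce the `df` / `df -i` numbers for the mounted volume
    # via statfs, for comparison against the web admin charts.
    import os

    st = os.statvfs('/mnt/volume_beta_arbiter_2_plus_1x2')  # placeholder mount

    total_bytes = st.f_blocks * st.f_frsize                 # `df` "Size"
    used_bytes = (st.f_blocks - st.f_bfree) * st.f_frsize   # `df` "Used"
    avail_bytes = st.f_bavail * st.f_frsize                 # `df` "Avail" (non-root)
    print('size: %d, used: %d, avail: %d bytes' % (total_bytes, used_bytes, avail_bytes))
    print('inodes: %d total, %d free' % (st.f_files, st.f_ffree))  # `df -i`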

Actual results:
Volume utilization is not correct; in other words, it does not match the statfs output on the mounted volume.

Expected results:
Volume utilization (including inode stats) is correct; in other words, it matches the statfs output on the mounted volume.

Comment 1 Martin Kudlej 2017-11-08 12:57:54 UTC
It doesn't work. There is now no info about volume utilization. I see this error in the logs:
/var/log/tendrl/node-agent/node-agent.log:Nov  8 07:33:56 localhost tendrl-node-agent: 2017-11-08 12:33:56.273891+00:00 - monitoring_integration - /usr/lib/python2.7/site-packages/tendrl/monitoring_integration/graphite/__init__.py:407 - set_volume_level_brick_count - ERROR - Failed to set volume level brick count'volume_beta_arbiter_2_plus_1x2'

--> Assigned

Comment 2 Shubhendu Tripathi 2017-11-08 13:40:58 UTC
This seems to be a different issue. The actual fix was to use gfapi rather than tendrl-specific logic for calculating volume utilization.

@Gowtham, we need to check why monitoring-integration fails.
@Martin, can you share the setup details to debug this?
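
For reference, the gfapi-based calculation boils down to a statvfs call on the volume root. A minimal sketch, assuming the libgfapi-python bindings are installed; the host and volume names below are placeholders taken from this report, not actual configuration:

    # Sketch of the gfapi approach: glfs_statvfs() returns the same
    # counters that statfs/df sees on a mounted volume.
    from gluster import gfapi

    vol = gfapi.Volume('_host_', 'volume_beta_arbiter_2_plus_1x2')
    vol.mount()
    st = vol.statvfs('/')
    print('capacity: %d bytes' % (st.f_blocks * st.f_frsize))
    print('available: %d bytes' % (st.f_bavail * st.f_frsize))
    print('inodes free: %d of %d' % (st.f_ffree, st.f_files))
    vol.umount()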

Comment 3 Shubhendu Tripathi 2017-11-09 05:40:24 UTC
@Martin, while debugging I see that tendrl-gluster-integration was not running on any of the storage nodes. This service is responsible for syncing all the cluster details into the tendrl central store. I started the service on all the nodes, and now I can see volume details getting populated properly in the dashboard and Grafana.

Kindly check and verify.

Comment 4 Martin Kudlej 2017-11-13 14:45:27 UTC
I still see differences between the info in volume-related charts and the output of *df*:
1) The *Capacity Available* value is the same as in the libgfapi output BUT is not the same as in the *df* output. -> https://bugzilla.redhat.com/show_bug.cgi?id=1511973
2) There is still a difference between the inode info in the *Inode Available* chart and the *df -i* command:
$ df -i .
Filesystem                              Inodes IUsed    IFree IUse% Mounted on
_host_:volume_beta_arbiter_2_plus_1x2 62570496   202 62570294    1% /mnt/volume_beta_arbiter_2_plus_1x2

This value is the same as in libgfapi.

But WA reports:
Inode Available
160674336


Tested with:
etcd-3.2.7-1.el7.x86_64
glusterfs-3.8.4-52.el7_4.x86_64
glusterfs-3.8.4-52.el7rhgs.x86_64
glusterfs-api-3.8.4-52.el7rhgs.x86_64
glusterfs-cli-3.8.4-52.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-52.el7_4.x86_64
glusterfs-client-xlators-3.8.4-52.el7rhgs.x86_64
glusterfs-events-3.8.4-52.el7rhgs.x86_64
glusterfs-fuse-3.8.4-52.el7_4.x86_64
glusterfs-fuse-3.8.4-52.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-52.el7rhgs.x86_64
glusterfs-libs-3.8.4-52.el7_4.x86_64
glusterfs-libs-3.8.4-52.el7rhgs.x86_64
glusterfs-rdma-3.8.4-52.el7rhgs.x86_64
glusterfs-server-3.8.4-52.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.3.x86_64
python-etcd-0.4.5-1.noarch
python-gluster-3.8.4-52.el7rhgs.noarch
rubygem-etcd-0.3.0-1.el7.noarch
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-collectd-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-2.el7rhgs.noarch
tendrl-gluster-integration-1.5.4-2.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-3.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-3.el7rhgs.noarch
tendrl-node-agent-1.5.4-2.el7rhgs.noarch
tendrl-notifier-1.5.4-2.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-ui-1.5.4-2.el7rhgs.noarch
vdsm-gluster-4.17.33-1.2.el7rhgs.noarch

--> Assigned, Fixed In Version changed

Comment 5 Nishanth Thomas 2017-11-13 15:34:38 UTC
What is the relationship between volume utilization and inodes available?
If you have issues with inode counts etc., please raise a separate bug.

Comment 6 Martin Kudlej 2017-11-14 06:23:18 UTC
Inode utilization of a volume is related to data volume utilization because the data source for both is the same: libgfapi and df.
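
For illustration (not a claim about the tendrl code), both panels can be derived from one statvfs result. Note that GNU `df` computes Use% from f_bavail (space available to non-root users), which is a common source of small discrepancies between tools:

    import os

    st = os.statvfs('/mnt/volume_beta_arbiter_2_plus_1x2')  # placeholder mount

    # Block (capacity) utilization, the way GNU `df` computes Use%:
    used = st.f_blocks - st.f_bfree
    pct = 100.0 * used / (used + st.f_bavail)

    # Inode utilization from the very same structure (`df -i` IUse%):
    iused = st.f_files - st.f_ffree
    ipct = 100.0 * iused / st.f_files

    print('blocks used: %.1f%%, inodes used: %.1f%%' % (pct, ipct))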

Comment 7 Nishanth Thomas 2017-11-14 09:17:48 UTC
As far as Tendrl is concerned, those are two different panels on the dashboard, so please raise a separate bug for inode utilization.

Comment 8 Martin Kudlej 2017-11-15 11:34:45 UTC
I've moved the incorrect inode info to new bug 1513416.

Comment 9 Martin Kudlej 2017-11-15 11:36:49 UTC
Nishanth, are you going to move this BZ to ON_QA based on comment 1507930#c8?

Comment 10 Martin Kudlej 2017-11-15 13:41:14 UTC
Because of comment 1507930#c8, I am moving this bug to VERIFIED.

Comment 13 errata-xmlrpc 2017-12-18 04:39:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3478