Bug 1507930

Summary: volume utilization info is not correct
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Martin Kudlej <mkudlej>
Component: web-admin-tendrl-gluster-integration
Assignee: Shubhendu Tripathi <shtripat>
Status: CLOSED ERRATA
QA Contact: Martin Kudlej <mkudlej>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.3
CC: gshanmug, mkudlej, nthomas, sankarshan
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-18 04:39:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1511973
Bug Blocks:

Description Martin Kudlej 2017-10-31 13:30:21 UTC
Description of problem:
This bug is related to https://bugzilla.redhat.com/show_bug.cgi?id=1501398#c3 and upstream https://github.com/Tendrl/gluster-integration/issues/446
I also see this issue downstream.

Version-Release number of selected component (if applicable):
etcd-3.2.7-1.el7.x86_64
glusterfs-3.8.4-18.4.el7.x86_64
glusterfs-3.8.4-50.el7rhgs.x86_64
glusterfs-api-3.8.4-50.el7rhgs.x86_64
glusterfs-cli-3.8.4-50.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-18.4.el7.x86_64
glusterfs-client-xlators-3.8.4-50.el7rhgs.x86_64
glusterfs-events-3.8.4-50.el7rhgs.x86_64
glusterfs-fuse-3.8.4-18.4.el7.x86_64
glusterfs-fuse-3.8.4-50.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-50.el7rhgs.x86_64
glusterfs-libs-3.8.4-18.4.el7.x86_64
glusterfs-libs-3.8.4-50.el7rhgs.x86_64
glusterfs-server-3.8.4-50.el7rhgs.x86_64
python-etcd-0.4.5-1.noarch
rubygem-etcd-0.3.0-1.el7.noarch
tendrl-ansible-1.5.3-2.el7rhgs.noarch
tendrl-api-1.5.3-2.el7rhgs.noarch
tendrl-api-httpd-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.3-1.el7rhgs.noarch
tendrl-gluster-integration-1.5.3-2.el7rhgs.noarch
tendrl-grafana-plugins-1.5.3-2.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.3-2.el7rhgs.noarch
tendrl-node-agent-1.5.3-3.el7rhgs.noarch
tendrl-notifier-1.5.3-1.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-ui-1.5.3-2.el7rhgs.noarch


How reproducible:
100%

Steps to Reproduce:
1. Install Gluster with an arbiter or disperse volume (or any other type)
2. Copy some data to the volume
3. Compare the volume utilization shown in the web admin with the output of the `df` command on the mounted volume (see the sketch below)
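
For step 3, the numbers `df` and `df -i` report can be reproduced programmatically. A minimal sketch using Python's os.statvfs on the FUSE mount point; the mount path is the one that appears later in this report and is otherwise a placeholder:

    # Sketch: reproduce the `df` / `df -i` numbers for the mounted volume
    # via statfs, for comparison against the web admin charts.
    import os

    st = os.statvfs('/mnt/volume_beta_arbiter_2_plus_1x2')  # placeholder mount

    total_bytes = st.f_blocks * st.f_frsize                 # `df` "Size"
    used_bytes = (st.f_blocks - st.f_bfree) * st.f_frsize   # `df` "Used"
    avail_bytes = st.f_bavail * st.f_frsize                 # `df` "Avail" (non-root)
    print('size: %d, used: %d, avail: %d bytes' % (total_bytes, used_bytes, avail_bytes))
    print('inodes: %d total, %d free' % (st.f_files, st.f_ffree))  # `df -i`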

Actual results:
Volume utilization is not correct; in other words, it does not match the statfs output on the mounted volume.

Expected results:
Volume utilization (including inode stats) is correct; in other words, it matches the statfs output on the mounted volume.

Comment 1 Martin Kudlej 2017-11-08 12:57:54 UTC
It doesn't work. There is now no info about volume utilization. I see this error in the logs:
/var/log/tendrl/node-agent/node-agent.log:Nov  8 07:33:56 localhost tendrl-node-agent: 2017-11-08 12:33:56.273891+00:00 - monitoring_integration - /usr/lib/python2.7/site-packages/tendrl/monitoring_integration/graphite/__init__.py:407 - set_volume_level_brick_count - ERROR - Failed to set volume level brick count'volume_beta_arbiter_2_plus_1x2'

--> Assigned

Comment 2 Shubhendu Tripathi 2017-11-08 13:40:58 UTC
This seems to be a different issue. The actual fix was to use gfapi rather than tendrl-specific logic for calculating volume utilization.

@Gowtham, we need to check why monitoring-integration fails.
@Martin, can you share the setup details to debug this?
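
For reference, the gfapi-based calculation boils down to a statvfs call on the volume root. A minimal sketch, assuming the libgfapi-python bindings are installed; the host and volume names below are placeholders taken from this report, not actual configuration:

    # Sketch of the gfapi approach: glfs_statvfs() returns the same
    # counters that statfs/df sees on a mounted volume.
    from gluster import gfapi

    vol = gfapi.Volume('_host_', 'volume_beta_arbiter_2_plus_1x2')
    vol.mount()
    st = vol.statvfs('/')
    print('capacity: %d bytes' % (st.f_blocks * st.f_frsize))
    print('available: %d bytes' % (st.f_bavail * st.f_frsize))
    print('inodes free: %d of %d' % (st.f_ffree, st.f_files))
    vol.umount()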

Comment 3 Shubhendu Tripathi 2017-11-09 05:40:24 UTC
@Martin, while debugging I see that tendrl-gluster-integration was not running on any of the storage nodes. This service is responsible for syncing all the cluster details into the tendrl central store. I started the service on all the nodes, and now I can see volume details getting populated properly in the dashboard and Grafana.

Kindly check and verify.

Comment 4 Martin Kudlej 2017-11-13 14:45:27 UTC
I still see differences between the info in volume-related charts and the output of *df*:
1) The *Capacity Available* value is the same as in the libgfapi output BUT is not the same as in the *df* output. -> https://bugzilla.redhat.com/show_bug.cgi?id=1511973
2) There is still a difference between the inode info in the *Inode Available* chart and the *df -i* command:
$ df -i .
Filesystem                              Inodes IUsed    IFree IUse% Mounted on
_host_:volume_beta_arbiter_2_plus_1x2 62570496   202 62570294    1% /mnt/volume_beta_arbiter_2_plus_1x2

This value is the same as in libgfapi.

But WA reports:
Inode Available
160674336


Tested with:
etcd-3.2.7-1.el7.x86_64
glusterfs-3.8.4-52.el7_4.x86_64
glusterfs-3.8.4-52.el7rhgs.x86_64
glusterfs-api-3.8.4-52.el7rhgs.x86_64
glusterfs-cli-3.8.4-52.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-52.el7_4.x86_64
glusterfs-client-xlators-3.8.4-52.el7rhgs.x86_64
glusterfs-events-3.8.4-52.el7rhgs.x86_64
glusterfs-fuse-3.8.4-52.el7_4.x86_64
glusterfs-fuse-3.8.4-52.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-52.el7rhgs.x86_64
glusterfs-libs-3.8.4-52.el7_4.x86_64
glusterfs-libs-3.8.4-52.el7rhgs.x86_64
glusterfs-rdma-3.8.4-52.el7rhgs.x86_64
glusterfs-server-3.8.4-52.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.3.x86_64
python-etcd-0.4.5-1.noarch
python-gluster-3.8.4-52.el7rhgs.noarch
rubygem-etcd-0.3.0-1.el7.noarch
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-collectd-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-2.el7rhgs.noarch
tendrl-gluster-integration-1.5.4-2.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-3.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-3.el7rhgs.noarch
tendrl-node-agent-1.5.4-2.el7rhgs.noarch
tendrl-notifier-1.5.4-2.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-ui-1.5.4-2.el7rhgs.noarch
vdsm-gluster-4.17.33-1.2.el7rhgs.noarch

--> Assigned, Fixed In Version changed

Comment 5 Nishanth Thomas 2017-11-13 15:34:38 UTC
What is the relationship between volume utilization and inodes available?
If you have issues with inode counts etc., please raise a separate bug.

Comment 6 Martin Kudlej 2017-11-14 06:23:18 UTC
Inode utilization of a volume is related to data volume utilization because the data source for both is the same: libgfapi and df.
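
For illustration (not a claim about the tendrl code), both panels can be derived from one statvfs result. Note that GNU `df` computes Use% from f_bavail (space available to non-root users), which is a common source of small discrepancies between tools:

    import os

    st = os.statvfs('/mnt/volume_beta_arbiter_2_plus_1x2')  # placeholder mount

    # Block (capacity) utilization, the way GNU `df` computes Use%:
    used = st.f_blocks - st.f_bfree
    pct = 100.0 * used / (used + st.f_bavail)

    # Inode utilization from the very same structure (`df -i` IUse%):
    iused = st.f_files - st.f_ffree
    ipct = 100.0 * iused / st.f_files

    print('blocks used: %.1f%%, inodes used: %.1f%%' % (pct, ipct))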

Comment 7 Nishanth Thomas 2017-11-14 09:17:48 UTC
As far as Tendrl is concerned, those are two different panels on the dashboard, so please raise a separate bug for inode utilization.

Comment 8 Martin Kudlej 2017-11-15 11:34:45 UTC
I've moved the incorrect inode info to new bug 1513416.

Comment 9 Martin Kudlej 2017-11-15 11:36:49 UTC
Nishanth, are you going to move this BZ to ON_QA based on comment 1507930#c8?

Comment 10 Martin Kudlej 2017-11-15 13:41:14 UTC
Because of comment 1507930#c8, I am moving this bug to VERIFIED.

Comment 13 errata-xmlrpc 2017-12-18 04:39:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3478