Bug 1561392

Summary: RHGSWA generates a large amount of gluster logs
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Lubos Trilety <ltrilety>
Component: web-admin-tendrl-gluster-integration
Assignee: Shubhendu Tripathi <shtripat>
Status: CLOSED ERRATA
QA Contact: Filip Balák <fbalak>
Severity: unspecified
Priority: unspecified
Version: rhgs-3.4
CC: amukherj, avishwan, dahorak, fbalak, mbukatov, nthomas, rhs-bugs, shtripat
Keywords: ZStream
Target Release: RHGS 3.4.z Batch Update 4
Hardware: Unspecified
OS: Unspecified
Fixed In Version: tendrl-node-agent-1.6.3-17.el7rhgs
Type: Bug
Last Closed: 2019-03-27 03:49:38 UTC
Bug Depends On: 1579293
Attachments: gluster logs example

Description Lubos Trilety 2018-03-28 09:25:29 UTC
Created attachment 1414104 [details]
gluster logs example

Description of problem:
RHGSWA causes Gluster to generate a large amount of logs, including glfsheal-volume ones. In other words, connecting RHGSWA to a Gluster trusted storage pool causes a heal to be run for all volumes on the Gluster servers. Such an action obviously consumes a lot of resources, even when the heal itself accomplishes essentially nothing. RHGS WA must not cause Gluster to run anything beyond its usual behaviour.
However, even the amount of logs could be an issue in itself. The amount is related to the number of bricks present in Gluster. For our small clusters (24 bricks) it generates approximately a few GB per week, which is quite a lot given that by default the logs are rotated and 52 rotations are kept.
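For reference, a weekly rotation that keeps 52 old copies corresponds to a logrotate stanza along the following lines. This is a hypothetical excerpt for illustration only; the actual policy shipped by the glusterfs package may differ in details.

```text
# Hypothetical logrotate stanza illustrating the default described above:
# weekly rotation with 52 old copies kept, so high-volume logs accumulate
# for a full year before being discarded.
/var/log/glusterfs/*.log {
    weekly
    rotate 52
    compress
    missingok
    notifempty
}
```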

Version-Release number of selected component (if applicable):
tendrl-gluster-integration-1.6.1-1.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. import cluster
2. check gluster logs on any gluster server machine
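Step 2 can be checked with the same `du`/`sort` pattern used in the measurements later in this bug. In the sketch below, a throwaway directory with dummy files stands in for /var/log/glusterfs so the commands are safe to run anywhere; on a real gluster node, point LOG_DIR at /var/log/glusterfs instead (the file names here are only illustrative).

```shell
# A throwaway directory with dummy files stands in for /var/log/glusterfs;
# on a real gluster node set LOG_DIR=/var/log/glusterfs instead.
LOG_DIR="$(mktemp -d)"
dd if=/dev/zero of="$LOG_DIR/glfsheal-volume_demo.log" bs=1024 count=512 2>/dev/null
dd if=/dev/zero of="$LOG_DIR/glusterd.log" bs=1024 count=64 2>/dev/null

# Total size of the log directory.
du -hs "$LOG_DIR"

# Per-file sizes, smallest to largest; glfsheal-* files dominating the
# end of this listing is the symptom described in this bug.
du -hs "$LOG_DIR"/* | sort -h

rm -rf "$LOG_DIR"
```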

Actual results:
Gluster starts to generate a lot of logs, including those related to volume healing.

Expected results:
If possible, RHGSWA should not cause Gluster to generate such a large amount of logs.
RHGSWA should not cause Gluster to run volume heal processes.

Additional info:
An example of the gluster logs is attached.

Comment 1 Martin Bukatovic 2018-04-05 15:54:33 UTC
This is a serious problem.

Comment 3 Rohan Kanade 2018-05-08 11:25:11 UTC
Thoughts, pointers?

Comment 5 Rohan Kanade 2018-05-09 10:26:06 UTC
The reason for such log patterns generated by the glusterd process needs to be understood while the Tendrl side of things is being investigated.

Since Atin is on leave,

Aravinda, can you please take a look at these glusterd logs or needinfo correct person on your team.

Comment 7 Nishanth Thomas 2018-05-28 10:15:12 UTC
Please take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1579293#c7

As discussed during the triage, this is not critical enough to address in 3.4.0. Hence moving this out of 3.4.0

Comment 9 Daniel Horák 2018-05-28 11:00:02 UTC
Based on discussion in Bug 1579293 and agreement during triage, removing qe_ack+.

Comment 11 Aravinda VK 2018-11-19 04:02:38 UTC
If any of the gluster components are producing a lot of log messages, please open a bug against gluster with more information about the log messages and the component.

Removing the needinfo flag.

Comment 12 Daniel Horák 2018-12-03 14:31:05 UTC
I've tried to do a comparison between two similar Gluster clusters, each
with 2 volumes and with 5 bricks per storage server. The only difference was
that one cluster was imported into RHGS WA.

After 3 days, the results are quite significant - gluster logs from the
cluster without RHGS WA occupy ~300KB, while on the similar cluster imported
into RHGS WA they occupy more than 300MB.

Gluster Cluster without RHGS WA:
# du -hs /var/log/glusterfs/
  292K	/var/log/glusterfs/
# du -hs /var/log/glusterfs/* | sort -h
  0	/var/log/glusterfs/geo-replication
  0	/var/log/glusterfs/geo-replication-slaves
  0	/var/log/glusterfs/snaps
  4.0K	/var/log/glusterfs/cmd_history.log
  4.0K	/var/log/glusterfs/glustershd.log
  12K	/var/log/glusterfs/cli.log
  36K	/var/log/glusterfs/glusterd.log
  112K	/var/log/glusterfs/bricks
  124K	/var/log/glusterfs/glustershd.log-20181202

Gluster Cluster imported into RHGS WA:
# du -hs /var/log/glusterfs/
  336M	/var/log/glusterfs/
# du -hs /var/log/glusterfs/* | sort -h
  0	/var/log/glusterfs/events.log
  0	/var/log/glusterfs/geo-replication
  0	/var/log/glusterfs/geo-replication-slaves
  0	/var/log/glusterfs/snaps
  4.0K	/var/log/glusterfs/glustershd.log
  116K	/var/log/glusterfs/glustershd.log-20181202
  892K	/var/log/glusterfs/glfsheal-volume_gama_disperse_4_plus_2x2.log-20181201.gz
  1.2M	/var/log/glusterfs/cmd_history.log
  1.2M	/var/log/glusterfs/cmd_history.log-20181202
  1.6M	/var/log/glusterfs/glfsheal-volume_gama_disperse_4_plus_2x2.log-20181202.gz
  5.0M	/var/log/glusterfs/glfsheal-volume_beta_arbiter_2_plus_1x2.log-20181201.gz
  7.4M	/var/log/glusterfs/cli.log
  7.8M	/var/log/glusterfs/cli.log-20181202
  8.8M	/var/log/glusterfs/glfsheal-volume_beta_arbiter_2_plus_1x2.log-20181202.gz
  12M	/var/log/glusterfs/glusterd.log
  12M	/var/log/glusterfs/glusterd.log-20181202
  15M	/var/log/glusterfs/glfsheal-volume_gama_disperse_4_plus_2x2.log
  32M	/var/log/glusterfs/glfsheal-volume_gama_disperse_4_plus_2x2.log-20181203
  39M	/var/log/glusterfs/events.log-20181202
  47M	/var/log/glusterfs/glfsheal-volume_beta_arbiter_2_plus_1x2.log
  50M	/var/log/glusterfs/bricks
  98M	/var/log/glusterfs/glfsheal-volume_beta_arbiter_2_plus_1x2.log-20181203

I also have one older cluster that has been running for two and a half
months, where the gluster logs occupy nearly 3GB.
# du -hs /var/log/glusterfs/
  2.9G	/var/log/glusterfs/

So RHGS WA definitely has a significant impact on the amount of logs
generated by gluster.

Comment 20 Filip Balák 2019-02-21 15:12:17 UTC
The problem with logs related to volume healing seems to be fixed. --> VERIFIED
Another bz related to logs created by gluster with tendrl installed: bz 1679616

Tested with:
glusterfs-3.12.2-43.el7rhgs.x86_64
tendrl-gluster-integration-1.6.3-14.el7rhgs.noarch
tendrl-node-agent-1.6.3-17.el7rhgs.noarch

Comment 22 errata-xmlrpc 2019-03-27 03:49:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0660