Bug 1679616

Summary: Gluster generates a large amount of logs when imported in RHGSWA
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Filip Balák <fbalak>
Component: glusterd
Assignee: Atin Mukherjee <amukherj>
Status: CLOSED NOTABUG
QA Contact: Bala Konda Reddy M <bmekala>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.4
CC: fbalak, gshanmug, ksubrahm, rhs-bugs, sankarshan, shtripat, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-02-25 03:07:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description | Flags
/var/log/glusterfs/glusterd.log excerpt | none
/var/log/glusterfs/bricks/mnt-brick_beta_arbiter_1-1.log excerpt | none

Description Filip Balák 2019-02-21 15:07:42 UTC
Created attachment 1537106 [details]
/var/log/glusterfs/glusterd.log excerpt

Description of problem:
Gluster starts writing log entries every minute to /var/log/glusterfs/bricks/ and /var/log/glusterfs/glusterd.log when a cluster containing some volumes is imported into RHGSWA.

Version-Release number of selected component (if applicable):
glusterfs-3.12.2-43.el7rhgs.x86_64
glusterfs-api-3.12.2-43.el7rhgs.x86_64
glusterfs-cli-3.12.2-43.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-43.el7rhgs.x86_64
glusterfs-events-3.12.2-43.el7rhgs.x86_64
glusterfs-fuse-3.12.2-43.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-43.el7rhgs.x86_64
glusterfs-libs-3.12.2-43.el7rhgs.x86_64
glusterfs-rdma-3.12.2-43.el7rhgs.x86_64
glusterfs-server-3.12.2-43.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-14.el7rhgs.noarch
tendrl-node-agent-1.6.3-17.el7rhgs.noarch

How reproducible:
90%

Steps to Reproduce:
1. Create a Gluster cluster with 6 nodes and some volumes (I used volume_beta_arbiter_2_plus_1x2 and volume_gama_disperse_4_plus_2x2).
2. Install Tendrl.
3. Import the cluster into Tendrl.
4. Monitor /var/log/glusterfs/glusterd.log and the logs in /var/log/glusterfs/bricks/.
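
To put a number on the monitoring in step 4, a minimal Python sketch like the following could count log entries per minute and per severity letter. It is not part of the original report. It assumes each Gluster log line starts with a bracketed timestamp followed by a one-letter level, for example "[2019-02-21 15:07:42.123456] E ..."; the names count_per_minute and summarize are hypothetical.

```python
import re
from collections import Counter

# Assumed line prefix: "[YYYY-MM-DD HH:MM:SS.ffffff] <LEVEL> ...".
# Group 1 captures the timestamp truncated to the minute, group 2 the level letter.
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}(?:\.\d+)?\]\s+([A-Z])\s"
)


def count_per_minute(lines):
    """Return a Counter mapping (minute, level) to the number of log entries."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines without the assumed prefix are ignored
            counts[(match.group(1), match.group(2))] += 1
    return counts


def summarize(path):
    """Print one row per (minute, level) pair found in the log file at path."""
    with open(path, errors="replace") as handle:
        counts = count_per_minute(handle)
    for (minute, level), total in sorted(counts.items()):
        print(f"{minute} {level} {total}")
```

Calling summarize("/var/log/glusterfs/glusterd.log") before and after the import would show whether the per-minute entry count actually changes.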

Actual results:
Log entries are written to /var/log/glusterfs/bricks/ and /var/log/glusterfs/glusterd.log every minute. Excerpts of these entries are attached to this BZ.

Expected results:
There shouldn't be any errors or redundant log entries when a cluster is imported into RHGSWA.

Additional info:
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1561392#c19

Comment 2 Filip Balák 2019-02-21 15:08:25 UTC
Created attachment 1537107 [details]
/var/log/glusterfs/bricks/mnt-brick_beta_arbiter_1-1.log excerpt

Comment 3 Atin Mukherjee 2019-02-22 08:02:18 UTC
"Gluster starts pushing logs every minute into /var/log/glusterfs/bricks/ and /var/log/glusterfs/glusterd.log when cluster containing some volumes is imported into RHGSWA."

What log entries are we referring to here as noisy? Is it glusterd_get_value_for_vme_entry reporting some of the unknown options, or something else? Please note that we need to get into the habit of reporting exact details in one go so that first-pass triaging goes smoothly. Unfortunately, the commentary in the originating BZ doesn't capture these details either. Irrespective of that, I had a conversation with WA team members some time back about disabling execution of the get-state command with volumeoptions, which is a no-op from the WA side as of now and is what causes the unrecognized-option failures from the vme entry. Why has that not been taken care of yet?

Comment 4 Atin Mukherjee 2019-02-22 08:12:44 UTC
OTOH, the attached brick log excerpt indicates that the log entries are due to clients connecting and disconnecting. Heal-related commands are probably triggering these logs. Can you confirm this, Karthik? If this is true, I don't consider this part of the problem to be a bug.

Comment 5 Atin Mukherjee 2019-02-22 08:15:35 UTC
Coming back to the glusterd_get_value_for_vme_entry failures: they are genuine, as not all options are xlator specific. Given that this API is a core one used by many other interfaces, hiding such logs would hinder debuggability. So this is not a bug to me.

Comment 6 Shubhendu Tripathi 2019-02-22 08:59:08 UTC
Regarding disabling get-state of volume options: yes, that could be removed. Gowtham, can you do an impact analysis of disabling this?

Comment 7 Karthik U S 2019-02-22 09:03:26 UTC
(In reply to Atin Mukherjee from comment #4)
> OTOH, the brick log report attached indicates the log entries are due to
> connect/disconnect of the clients? Probably heal related commands are
> triggering these logs. Can you confirm this, Karthik? And if this is true, I
> don't consider this part of the problem to be a bug.

These messages are logged whenever a client connects or disconnects. Since heal is also a client process, heal-related commands also generate these log messages. IMO this cannot be considered a bug.