Bug 1651915

Summary: Running "gluster volume status <vol_name> detail" continuously in the background leads to a glusterd memory leak
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterd
Version: rhgs-3.4
Status: CLOSED WONTFIX
Severity: low
Priority: low
Reporter: Upasana <ubansal>
Assignee: Sanju <srakonde>
QA Contact: Bala Konda Reddy M <bmekala>
CC: amukherj, nchilaka, puebele, rhs-bugs, sanandpa, sheggodu, srakonde, storage-qa-internal, ubansal, vbellur
Type: Bug
Last Closed: 2020-01-07 13:39:07 UTC

Description Upasana 2018-11-21 07:55:21 UTC
Description of problem:
=========================
While trying to verify Bug 1635100 (a glusterd memory leak when "gluster volume status volume_name --detail" is run continuously from the CLI), I see that the glusterd memory leak is still very high.

Version-Release number of selected component (if applicable):
glusterfs-server-3.12.2-27.el7rhgs.x86_64

How reproducible:
================
Always

Steps to Reproduce:
===================
1. Enable brick multiplexing and create 2 volumes.
2. Check the memory usage of glusterd with the top command: top -p `pidof glusterd`
3. Run "gluster volume status <volname> detail" in a loop (see the sketch below).
4. Check the memory usage again; there should not be a considerable memory leak.
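
For reference, a minimal sketch of steps 2-3 (the volume name "testvol" is a placeholder, not from the original report):

    # terminal 1: watch glusterd's resident memory
    top -p `pidof glusterd`

    # terminal 2: run the status command back to back
    while true; do
        gluster volume status testvol detail > /dev/null
    done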


Actual results:
================
When I started the test yesterday, the memory usage was:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                              
26061 root      20   0  607112  24960   2196 S   3.7  0.2  11:38.59 glusterd

Now it is:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                             
26061 root      20   0  738184 120504   2320 S   3.7  0.8  63:44.43 glusterd                            




Expected results:
=================
There should not be a considerable memory leak.

Additional info:
=================
I have tried this on a 3.4.1 setup as well and am seeing the same issue.

Initially it was:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                             
22294 root      20   0  608796   9484   4260 S   0.0  0.1   0:00.88 glusterd    

In around 2 hours it grew to:

KiB Swap:  8257532 total,  8257532 free,        0 used.  7227104 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                             
22294 root      20   0  674336  20372   4316 S   6.2  0.3   4:40.54 glusterd

Comment 14 Sanju 2018-12-19 06:32:41 UTC
Based on comment 10, we see a much smaller memory leak if we run "gluster v status detail" with a sleep of 15 seconds between runs. In my opinion it is a minor leak with no real impact, and I'm inclined towards closing this bug. Sweta/Upasana, please let me know your thoughts.

Comment 15 Atin Mukherjee 2018-12-21 07:58:41 UTC
Sanju - Can we please take periodic statedumps while running this command on an interval, to see which structure's memory is increasing? Have we done that? I agree with you that there's not much impact here, but it might be worth fixing this upstream anyway. Did we see this happening on upstream master?
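
For reference, a sketch of one way to take the periodic statedumps requested here. Gluster processes write a statedump when they receive SIGUSR1; the 60-second interval is an assumption, and the dump directory defaults to /var/run/gluster:

    # signal glusterd every 60 seconds; each SIGUSR1 produces a
    # glusterdump.<pid>.dump.<timestamp> file with memory accounting
    while true; do
        kill -USR1 `pidof glusterd`
        sleep 60
    done

Comparing the allocation counts across successive dumps should show which structure keeps growing.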

Comment 16 Sanju 2018-12-26 06:54:12 UTC
(In reply to Atin Mukherjee from comment #15)
> Sanju - Can we please take periodic statedumps while running this command
> on an interval, to see which structure's memory is increasing? Have we done
> that? I agree with you that there's not much impact here, but it might be
> worth fixing this upstream anyway. Did we see this happening on upstream
> master?

Atin, we haven't tried to look at which structure's memory is growing.

In a 3-node cluster running upstream master, I ran "gluster v status <volname> detail" 100,000 times in a loop; over those 100,000 runs, glusterd's memory increased by 124 MB. In the same cluster I ran the same command 100,000 times in a loop with a sleep of 15 seconds between runs, and I did not observe any increase in glusterd's memory (see the sketch below). I believe some of the Coverity fixes may have fixed this leak upstream.
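
For reference, a sketch of the two test loops described above (VOLNAME is a placeholder for the actual volume name):

    # run 1: 100000 back-to-back invocations
    for i in `seq 1 100000`; do
        gluster v status $VOLNAME detail > /dev/null
    done

    # run 2: same count, but with a 15-second gap between invocations
    for i in `seq 1 100000`; do
        gluster v status $VOLNAME detail > /dev/null
        sleep 15
    done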

Given that this leak doesn't exist upstream and we have only a minor leak downstream, I don't think it's worth spending time on this. That said, I'm always ready to send out a downstream fix if we want one.

Thanks,
Sanju