Bug 1144413

Summary: High memory usage by rebalance process
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Krutika Dhananjay <kdhananj>
Component: distribute
Assignee: Krutika Dhananjay <kdhananj>
Status: CLOSED ERRATA
QA Contact: shylesh <shmohan>
Severity: urgent
Docs Contact:
Priority: high
Version: 2.1
CC: nbalacha, nsathyan, ssamanta, surs, vagarwal
Target Milestone: ---
Keywords: Patch, ZStream
Target Release: RHGS 2.1.5
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.4.0.69rhs
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1147427
Environment:
Last Closed: 2014-11-13 12:23:15 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 1147095    

Description Krutika Dhananjay 2014-09-19 10:37:08 UTC
Description of problem:

The rebalance process's code path leaks two dict_t objects for every file that is migrated successfully.

In other words, the amount of memory leaked equals 2 * sizeof(dict_t) * (number of files successfully migrated).
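
As a rough illustration of that formula, the C sketch below computes the leaked byte count for a given number of migrated files. The helper name estimate_leaked_bytes and the dict_size parameter are hypothetical; the real sizeof(dict_t) depends on the glusterfs build and is not stated in this report, so the caller has to supply an assumed value.

```c
#include <stdint.h>

/* Hypothetical helper (not part of glusterfs): estimates the bytes leaked
 * by the rebalance process using the formula from the description,
 * 2 * sizeof(dict_t) * files_migrated.
 * dict_size is an assumed value supplied by the caller, because the real
 * sizeof(dict_t) is not given in this report. */
static uint64_t
estimate_leaked_bytes (uint64_t files_migrated, uint64_t dict_size)
{
        const uint64_t leaks_per_file = 2; /* two dict_t leaks per migrated file */

        return leaks_per_file * dict_size * files_migrated;
}
```

For example, with an assumed dict_size of 100 bytes (an illustrative figure only), one million migrated files would give 200,000,000 leaked bytes under this formula.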

A community user reported that the rebalance process was OOM-killed while migrating data on the order of a few TB. That bug report can be found at
https://bugzilla.redhat.com/show_bug.cgi?id=1142052.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run add-brick followed by rebalance on a volume with a large amount of data to be migrated.

Actual results:


Expected results:


Additional info:

Comment 1 Krutika Dhananjay 2014-09-19 10:39:21 UTC
I ran this test locally with about 5 GB of data on the mount, consisting of 10 untarred Linux kernel trees, and took a statedump of the rebalance daemons once every 30 seconds. At the end of migration, about 3 lakh (300,000) dict_t objects had been allocated and not freed.
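
The kind of per-file leak observed here can be illustrated with a toy reference-counted object. This is only a sketch and not glusterfs code: toy_dict_t, toy_dict_new, toy_dict_unref, migrate_file, and migrate_files are hypothetical stand-ins, and the live_dicts counter only loosely mimics the count of allocated objects that one would read from a statedump.

```c
#include <stdlib.h>

/* Toy stand-in for a reference-counted dictionary (not the real dict_t). */
typedef struct toy_dict {
        int refcount;
} toy_dict_t;

/* Number of toy dicts currently allocated and not yet freed. */
static long live_dicts = 0;

static toy_dict_t *
toy_dict_new (void)
{
        toy_dict_t *d = calloc (1, sizeof (*d));

        if (!d)
                return NULL;
        d->refcount = 1;
        live_dicts++;
        return d;
}

static void
toy_dict_unref (toy_dict_t *d)
{
        /* Free the object once the last reference is dropped. */
        if (d && --d->refcount == 0) {
                free (d);
                live_dicts--;
        }
}

/* Simulates migrating one file with two temporary dicts. When release is 0
 * the unref calls are skipped, so both objects stay allocated: two leaked
 * dicts per migrated file. */
static void
migrate_file (int release)
{
        toy_dict_t *a = toy_dict_new ();
        toy_dict_t *b = toy_dict_new ();

        if (release) {
                toy_dict_unref (a);
                toy_dict_unref (b);
        }
}

/* Migrates count files and returns the number of dicts still allocated. */
static long
migrate_files (long count, int release)
{
        for (long i = 0; i < count; i++)
                migrate_file (release);
        return live_dicts;
}
```

In this toy model the live count stays at zero when every dict is unreferenced, and grows by two per file when the unref calls are missing, which matches the linear growth described in this bug.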

Comment 6 shylesh 2014-10-27 09:46:43 UTC
Verified on 3.4.0.69rhs-1.el6rhs.x86_64.

I ran rebalance with 800k files of 64k each on a 50-node setup. The cold count for dict_t was never exhausted, and memory usage was not high. Hence marking as verified.

Comment 9 errata-xmlrpc 2014-11-13 12:23:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-1853.html