Bug 1540403 - High memory usage on gluster volume / bricks.
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: fuse
Version: rhgs-3.3
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Csaba Henk
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks: RHGS34MemoryLeak
 
Reported: 2018-01-31 00:53 UTC by Ben Turner
Modified: 2021-09-09 13:02 UTC
CC: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-26 03:49:26 UTC
Embargoed:



Description Ben Turner 2018-01-31 00:53:47 UTC
Description of problem:

Gluster is using a significant amount of memory. Upon investigation, we found the following related bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1501146
https://bugzilla.redhat.com/show_bug.cgi?id=1455223

[pousley@collab-shell rf3pvxap600n1]$ cat installed-rpms | grep gluster
glusterfs-3.8.4-18.el6rhs.x86_64
RHGS 3.2 on RHEL 6 Async 	2017-06-08	glusterfs-3.8.4-18.4.el6rhs

=-=-==-=-=-=-=-=
[mallinfo]
mallinfo_arena=749,568       /* Non-mmapped space allocated (bytes) */
mallinfo_ordblks=5           /* Number of free chunks */
mallinfo_smblks=0            /* Number of free fastbin blocks */
mallinfo_hblks=78            /* Number of mmapped regions */
mallinfo_hblkhd=103,141,376  /* Space allocated in mmapped regions (bytes) */
mallinfo_usmblks=0           /* Maximum total allocated space (unused in glibc) */
mallinfo_fsmblks=0           /* Space in freed fastbin blocks (bytes) */
mallinfo_uordblks=611,728    /* Total allocated space (bytes) */
mallinfo_fordblks=137,840    /* Total free space (bytes) */
mallinfo_keepcost=133,024    /* Top-most, releasable space (bytes) */
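A quick read of the mallinfo figures above shows that almost all of the malloc-managed memory sits in mmapped regions (hblkhd) rather than the main arena. A minimal sketch of that arithmetic, using the two values quoted above:

```shell
# Share of malloc-managed memory held in mmapped regions, from the
# mallinfo_arena and mallinfo_hblkhd values in the statedump above.
awk -v a=749568 -v h=103141376 \
    'BEGIN { printf "mmapped share: %.1f%%\n", 100 * h / (a + h) }'
```

This prints a share above 99%, which is why the per-translator memusage sections below (large, long-lived allocations) are the interesting part of the dump.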

----------------
$  grep -w num_allocs glusterdump.3315.dump.1515113162. | sed -e "s/num_allocs=//" | sort -n
.....
.....
6327160
6327174
6347799
6347813
6852044
6852060
7182563
7182573
7182579
7182589
7183612
7184273
7184273
7184273
7184289
7184289
7184289
7184289
20613390
---------
[mount/fuse.fuse - usage-type gf_common_mt_mem_pool memusage]
size=2,525,771,284         # num_allocs * sizeof(data-type)
num_allocs=20,613,390      # Allocations of this data type active at the time of the statedump
max_size=2,556,785,772     # max_num_allocs * sizeof(data-type)
max_num_allocs=20,867,596  # Maximum number of active allocations at any point in the life of the process
total_allocs=194,805,642   # Number of times this data type was allocated over the life of the process

[mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
size=1,739,835,464
num_allocs=7,184,289
max_size=1,812,641,624
max_num_allocs=7,270,963
total_allocs=59,799,663

[performance/md-cache.images1-md-cache - usage-type gf_mdc_mt_md_cache_t memusage]
size=1,034,537,616
num_allocs=7,184,289
max_size=1,047,018,528
max_num_allocs=7,270,962
total_allocs=15,526,083

[performance/io-cache.images1-io-cache - usage-type gf_ioc_mt_ioc_inode_t memusage]
size=1,206,960,552
num_allocs=7,184,289
max_size=1,221,521,616
max_num_allocs=7,270,962
total_allocs=15,526,083

[cluster/distribute.images1-dht - usage-type gf_dht_mt_inode_ctx_t memusage]
size=287,371,560
num_allocs=7,184,289
max_size=290,838,480
max_num_allocs=7,270,962
total_allocs=15,526,083
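The memusage sections above can be ranked directly from the dump by pairing each `size=` line with the `[section]` header that precedes it. A sketch, with a small inline sample standing in for the real dump file so it runs anywhere (the actual file here was glusterdump.3315.dump.1515113162.):

```shell
# Build a tiny sample dump mirroring the sections quoted above.
cat > /tmp/sample.dump <<'EOF'
[mount/fuse.fuse - usage-type gf_common_mt_mem_pool memusage]
size=2525771284
[mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
size=1739835464
[performance/md-cache.images1-md-cache - usage-type gf_mdc_mt_md_cache_t memusage]
size=1034537616
EOF

# Remember the last [section] header seen; print each size with its section,
# then sort numerically so the biggest consumers come out on top.
awk '/^\[/ { sec = $0 } /^size=/ { sub(/^size=/, ""); print $0, sec }' \
    /tmp/sample.dump | sort -rn | head -3
```

Run against the real statedump, this attributes the top `size=` values to gf_common_mt_mem_pool, gf_common_mt_inode_ctx, and the caching translators, matching the sections quoted above.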

-------------------
pool-name=glusterfs:data_t
hot-count=16384         # Mempool elements in active use; for this pool, the number of data_t's currently in use
cold-count=0            # Mempool elements not in use; new allocations are served from here until the pool is exhausted (cold-count reaches 0)
padded_sizeof=92        # Element size after padding with a doubly-linked list, mempool pointer, and in-use flag
alloc-count=5037173938  # Times this type was allocated over the life of the process, including pool misses
max-alloc=16384         # Maximum elements from the pool in active use at any point; does *not* include pool misses
pool-misses=5019282721  # Times an element had to be allocated from the heap because the whole pool was in active use
cur-stdalloc=6311017    # Heap allocations made after cold-count reached 0 that have not yet been released via mem_put()
max-stdalloc=6396965    # Maximum heap allocations in active use at any point in the life of the process


-----=-----
pool-name=glusterfs:dict_t
hot-count=4096
cold-count=0
padded_sizeof=172
alloc-count=1534198868
max-alloc=4096
pool-misses=1528426715
cur-stdalloc=6347813
max-stdalloc=6433931


-----=-----
pool-name=glusterfs:data_pair_t
hot-count=198
cold-count=16186
padded_sizeof=68
alloc-count=5950695670
max-alloc=525
pool-misses=0
cur-stdalloc=0
max-stdalloc=0

-----=-----
pool-name=glusterfs:call_frame_t
hot-count=1
cold-count=4095
padded_sizeof=212
alloc-count=2551228022
max-alloc=140
pool-misses=0
cur-stdalloc=0
max-stdalloc=0

-----=-----
pool-name=fuse:dentry_t
hot-count=32768
cold-count=0
padded_sizeof=84
alloc-count=15526082
max-alloc=32768
pool-misses=15427794
cur-stdalloc=7151520
max-stdalloc=7238193
-----=-----
pool-name=fuse:inode_t
hot-count=32768
cold-count=0
padded_sizeof=188
alloc-count=59799663
max-alloc=32768
pool-misses=59417032
cur-stdalloc=7151521
max-stdalloc=7238195
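The cur-stdalloc figures above translate into a rough heap cost: each overflow element is padded_sizeof bytes, allocated from the heap and still awaiting mem_put(). A back-of-the-envelope sketch using the fuse:inode_t pool values quoted above:

```shell
# Heap held by pool-overflow allocations: cur-stdalloc elements of
# padded_sizeof bytes each. Values are from the fuse:inode_t pool above.
padded_sizeof=188
cur_stdalloc=7151521
echo "$(( padded_sizeof * cur_stdalloc )) bytes still allocated from heap"
```

That is roughly 1.3 GB for this one pool alone, consistent with the multi-GB totals in the memusage sections.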


Version-Release number of selected component (if applicable):

glusterfs-3.8.4-18.el6rhs.x86_64

How reproducible:

Memory seems to build over time.

Steps to Reproduce:
1.  Normal operation
2.  Monitor
3.  Check gluster process memory usage
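The monitoring in steps 2 and 3 can be sketched as a periodic RSS snapshot. This assumes a Linux host with procps `ps`; the process names glusterfs/glusterfsd match the client and brick daemons:

```shell
# One timestamped snapshot of gluster process memory (RSS, in KiB).
# Run periodically (cron/watch) to see whether usage builds over time.
ps -C glusterfs,glusterfsd -o pid=,rss=,comm= 2>/dev/null | \
    awk -v t="$(date -u +%FT%TZ)" '{ print t "," $1 "," $2 "," $3 }'
```

Appending the output to a CSV over a few days makes the steady growth described below easy to see.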

Actual results:

High memory

Expected results:

Normal operation

Additional info:

Oonkwee Lim had a look at the statedump; he thought the data indicated a problem and asked me to open this bug.

Comment 3 Oonkwee Lim 2018-02-02 00:59:23 UTC
Hello,

I noticed that some of the cur-stdalloc values in the pool statistics quoted in the description are quite high: glusterfs:data_t (cur-stdalloc=6311017), glusterfs:dict_t (6347813), fuse:dentry_t (7151520), and fuse:inode_t (7151521), while glusterfs:data_pair_t and glusterfs:call_frame_t show cur-stdalloc=0.

Comment 11 Raghavendra G 2018-12-15 01:17:24 UTC
This bug does indeed look like a problem of high memory allocated to inode contexts (inode-ctx), as the earlier analysis already pointed out.

