Description of problem:

Gluster is using a significant amount of memory. Upon investigation we found:
https://bugzilla.redhat.com/show_bug.cgi?id=1501146
https://bugzilla.redhat.com/show_bug.cgi?id=1455223

[pousley@collab-shell rf3pvxap600n1]$ cat installed-rpms | grep gluster
glusterfs-3.8.4-18.el6rhs.x86_64    RHGS 3.2 on RHEL 6 Async 2017-06-08 (glusterfs-3.8.4-18.4.el6rhs)

=-=-==-=-=-=-=-=
[mallinfo]
mallinfo_arena=749,568        /* Non-mmapped space allocated (bytes) */
mallinfo_ordblks=5
mallinfo_smblks=0
mallinfo_hblks=78
mallinfo_hblkhd=103,141,376   /* Space allocated in mmapped regions (bytes) */
mallinfo_usmblks=0
mallinfo_fsmblks=0
mallinfo_uordblks=611,728     /* Total allocated space (bytes) */
mallinfo_fordblks=137,840     /* Total free space (bytes) */
mallinfo_keepcost=133,024     /* Top-most, releasable space (bytes) */

----------------
$ grep -w num_allocs glusterdump.3315.dump.1515113162. | sed -e "s/num_allocs=//" | sort -n
.....
.....
6327160
6327174
6347799
6347813
6852044
6852060
7182563
7182573
7182579
7182589
7183612
7184273
7184273
7184273
7184289
7184289
7184289
7184289
20613390

---------
[mount/fuse.fuse - usage-type gf_common_mt_mem_pool memusage]
size=2,525,771,284        # num_allocs * sizeof(data-type)
num_allocs=20,613,390     # number of allocations of this data-type active at the time the statedump was taken
max_size=2,556,785,772    # max_num_allocs * sizeof(data-type)
max_num_allocs=20,867,596 # maximum number of active allocations at any point in the life of the process
total_allocs=194,805,642  # number of times this data-type was allocated over the life of the process
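Ranking the memusage sections by current size makes the biggest consumers obvious without eyeballing the dump. A minimal sketch, assuming the statedump format shown above (the inline sample file and its path are illustrative, cut down from the figures quoted in this report):

```shell
# Pair each "usage-type ... memusage" header with the "size=" line that
# follows it, then rank the sections by size, largest first.
cat > /tmp/dump.sample <<'EOF'
[mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
size=1739835464
num_allocs=7184289
[performance/md-cache.images1-md-cache - usage-type gf_mdc_mt_md_cache_t memusage]
size=1034537616
num_allocs=7184289
EOF
awk '/usage-type .* memusage\]/ { hdr = $0; next }
     /^size=/ && hdr != "" { sub(/^size=/, ""); printf "%12d  %s\n", $0, hdr; hdr = "" }' \
    /tmp/dump.sample | sort -rn
```

Run against the real dump this puts gf_common_mt_mem_pool and gf_common_mt_inode_ctx at the top, matching the sections quoted above.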
[mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
size=1,739,835,464
num_allocs=7,184,289
max_size=1,812,641,624
max_num_allocs=7,270,963
total_allocs=59,799,663

[performance/md-cache.images1-md-cache - usage-type gf_mdc_mt_md_cache_t memusage]
size=1,034,537,616
num_allocs=7,184,289
max_size=1,047,018,528
max_num_allocs=7,270,962
total_allocs=15,526,083

[performance/io-cache.images1-io-cache - usage-type gf_ioc_mt_ioc_inode_t memusage]
size=1,206,960,552
num_allocs=7,184,289
max_size=1,221,521,616
max_num_allocs=7,270,962
total_allocs=15,526,083

[cluster/distribute.images1-dht - usage-type gf_dht_mt_inode_ctx_t memusage]
size=287,371,560
num_allocs=7,184,289
max_size=290,838,480
max_num_allocs=7,270,962
total_allocs=15,526,083

-------------------
pool-name=glusterfs:data_t
hot-count=16384          # number of mempool elements in active use; for this pool, the number of 'data_t's in active use
cold-count=0             # number of mempool elements not in use; new allocations are served from here until all elements are in use, i.e. until cold-count reaches 0
padded_sizeof=92         # each mempool element is padded with a doubly-linked list, a pointer to the mempool, and is-in-use info; this is the element size after padding
alloc-count=5037173938   # number of times this data-type was allocated over the life of the process, including pool-misses
max-alloc=16384          # maximum number of pool elements in active use at any point in the life of the process; does *not* include pool-misses
pool-misses=5019282721   # number of times an element had to be allocated from the heap because all pool elements were in active use
cur-stdalloc=6311017     # number of heap allocations made after cold-count reached 0 that are yet to be released via mem_put()
max-stdalloc=6396965     # maximum number of heap allocations in active use at any point in the life of the process

-----=-----
pool-name=glusterfs:dict_t
hot-count=4096
cold-count=0
padded_sizeof=172
alloc-count=1534198868
max-alloc=4096
pool-misses=1528426715
cur-stdalloc=6347813
max-stdalloc=6433931

-----=-----
pool-name=glusterfs:data_pair_t
hot-count=198
cold-count=16186
padded_sizeof=68
alloc-count=5950695670
max-alloc=525
pool-misses=0
cur-stdalloc=0
max-stdalloc=0

-----=-----
pool-name=glusterfs:call_frame_t
hot-count=1
cold-count=4095
padded_sizeof=212
alloc-count=2551228022
max-alloc=140
pool-misses=0
cur-stdalloc=0
max-stdalloc=0

-----=-----
pool-name=fuse:dentry_t
hot-count=32768
cold-count=0
padded_sizeof=84
alloc-count=15526082
max-alloc=32768
pool-misses=15427794
cur-stdalloc=7151520
max-stdalloc=7238193

-----=-----
pool-name=fuse:inode_t
hot-count=32768
cold-count=0
padded_sizeof=188
alloc-count=59799663
max-alloc=32768
pool-misses=59417032
cur-stdalloc=7151521
max-stdalloc=7238195

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-18.el6rhs.x86_64

How reproducible:
Memory usage builds up over time during normal operation.

Steps to Reproduce:
1. Run the cluster under normal operation.
2. Monitor the gluster processes.
3. Check gluster process memory usage.

Actual results:
High memory usage.

Expected results:
Memory usage stays bounded under normal operation.

Additional info:
Oon Kwee looked at the statedump, thought the data indicated a problem, and asked me to open this bug.
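The pool counters above can be turned into approximate byte figures: once cold-count hits 0, every further allocation is a heap allocation, so cur-stdalloc * padded_sizeof is a lower bound on the heap memory a pool is currently holding. A minimal sketch using the figures quoted above (the `estimate` helper is illustrative, not a gluster tool):

```shell
# Approximate heap-resident bytes per pool as cur-stdalloc * padded_sizeof.
# Figures are the ones quoted in this report; padding is included but
# malloc bookkeeping overhead is not, so this is an underestimate.
estimate() {
    # args: pool-name padded_sizeof cur-stdalloc
    printf "%-22s %12d bytes\n" "$1" $(( $2 * $3 ))
}
estimate glusterfs:data_t  92  6311017
estimate glusterfs:dict_t 172  6347813
estimate fuse:dentry_t     84  7151520
estimate fuse:inode_t     188  7151521
```

Summing just these four pools gives roughly 3.6 GB of heap attributable to allocations not yet released via mem_put(), consistent with the multi-gigabyte memusage totals above.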
Hello,

I noticed that some of the cur-stdalloc values are quite high:

-------------------
pool-name=glusterfs:data_t
hot-count=16384          # number of mempool elements in active use; for this pool, the number of 'data_t's in active use
cold-count=0             # number of mempool elements not in use; new allocations are served from here until all elements are in use, i.e. until cold-count reaches 0
padded_sizeof=92         # each mempool element is padded with a doubly-linked list, a pointer to the mempool, and is-in-use info; this is the element size after padding
alloc-count=5037173938   # number of times this data-type was allocated over the life of the process, including pool-misses
max-alloc=16384          # maximum number of pool elements in active use at any point in the life of the process; does *not* include pool-misses
pool-misses=5019282721   # number of times an element had to be allocated from the heap because all pool elements were in active use
cur-stdalloc=6311017     # number of heap allocations made after cold-count reached 0 that are yet to be released via mem_put()
max-stdalloc=6396965     # maximum number of heap allocations in active use at any point in the life of the process
-----=-----
pool-name=glusterfs:dict_t
hot-count=4096
cold-count=0
padded_sizeof=172
alloc-count=1534198868
max-alloc=4096
pool-misses=1528426715
cur-stdalloc=6347813
max-stdalloc=6433931

-----=-----
pool-name=glusterfs:data_pair_t
hot-count=198
cold-count=16186
padded_sizeof=68
alloc-count=5950695670
max-alloc=525
pool-misses=0
cur-stdalloc=0
max-stdalloc=0

-----=-----
pool-name=glusterfs:call_frame_t
hot-count=1
cold-count=4095
padded_sizeof=212
alloc-count=2551228022
max-alloc=140
pool-misses=0
cur-stdalloc=0
max-stdalloc=0

-----=-----
pool-name=fuse:dentry_t
hot-count=32768
cold-count=0
padded_sizeof=84
alloc-count=15526082
max-alloc=32768
pool-misses=15427794
cur-stdalloc=7151520
max-stdalloc=7238193

-----=-----
pool-name=fuse:inode_t
hot-count=32768
cold-count=0
padded_sizeof=188
alloc-count=59799663
max-alloc=32768
pool-misses=59417032
cur-stdalloc=7151521
max-stdalloc=7238195
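Spotting the high cur-stdalloc pools can be done mechanically rather than by reading the whole dump. A minimal sketch, assuming the pool-name/cur-stdalloc layout shown above (the inline sample file and its path are illustrative, cut down from the pools quoted in this comment):

```shell
# List pools whose cur-stdalloc (heap allocations not yet released via
# mem_put()) is non-zero -- these are the leak candidates.
cat > /tmp/pools.sample <<'EOF'
pool-name=glusterfs:data_t
cur-stdalloc=6311017
pool-name=glusterfs:data_pair_t
cur-stdalloc=0
pool-name=fuse:inode_t
cur-stdalloc=7151521
EOF
awk -F= '$1 == "pool-name"              { pool = $2 }
         $1 == "cur-stdalloc" && $2 > 0 { printf "%-22s %d\n", pool, $2 }' \
    /tmp/pools.sample
```

On the full dump this flags data_t, dict_t, dentry_t, and inode_t while leaving data_pair_t and call_frame_t out, matching the observation above.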
This bug does indeed look like a problem of high memory allocated to inode contexts (as the analysis above already points out).