Bug 1647277 (RHGS34MemoryLeak)
Summary: | [Tracker]: Memory leak bugs | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Amar Tumballi <atumball> |
Component: | core | Assignee: | Sunny Kumar <sunkumar> |
Status: | CLOSED NOTABUG | QA Contact: | Rahul Hinduja <rhinduja> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | rhgs-3.4 | CC: | abhishku, amarts, aspandey, bkunal, jthottan, khiremat, pasik, ravishankar, rhs-bugs, sheggodu, storage-qa-internal, sunkumar |
Target Milestone: | --- | Keywords: | Tracking |
Target Release: | --- | Flags: | khiremat: needinfo-, sunkumar: needinfo- |
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-02-06 07:23:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1400067, 1408104, 1511779, 1530133, 1540403, 1576193, 1579151, 1626574, 1632465, 1637254, 1637574, 1642868, 1644934, 1648893 | ||
Bug Blocks: | 1386658, 1529501, 1653205, 1655352, 1658979, 1677145 |
Description by Amar Tumballi, 2018-11-07 02:55:33 UTC
Following is a summary of my observations so far on high memory usage driven by a large number of inodes (the list is not comprehensive; it does not cover all leaks observed so far):

* A large number of inodes looked up by the kernel drives client memory usage high. These inodes sit in the lru list of the itable; this is a well-known problem and solutions are WIP - bz 1511779.
* A large number of "active" inodes (with refcount > 0) that were not looked up by the kernel. These inodes are likely leaks, or could be cached by readdir-ahead (especially if rda-cache-limit is set higher) - bz 1644934.
* A large number of inodes in the lru list on bricks. This is due to the high network.inode-lru-limit (usually 50000) set by the "group metadata-cache" tuning - https://bugzilla.redhat.com/show_bug.cgi?id=1637393#c78.

[Bug 1648893] statedump doesn't contain information about newer mempools

Amar, Sunny and others,

Another area of memory accumulation is graph switches. Note that caches/inode-ctxs of inodes in older/unused graphs are not freed up. Do you want to work on that? I am asking this question because a graph switch is not a common operation; on the other hand, I see customers/GSS experimenting with turning various translators on and off, which results in graph switches. If there is a consensus that graph switches are fairly common, we need to clean up old graphs, as they can amount to significant memory consumption, and we should file a bug to track that. Leaving needinfo on Bipin, Amar and Sunny to drive the discussion on this point.

regards,
Raghavendra

> If there is a consensus on graph switch being a fairly common operation, we need to cleanup old graphs as they can amount to significant memory consumption and we should file a bug and track that.
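As an aside, the lru-list behaviour described in the first observation above (unreferenced inodes are cached rather than freed, bounded by an lru limit) can be modeled in a few lines of C. This is a hypothetical sketch: `my_itable_t`, `itable_lookup_new`, and the list layout are illustrative only, not the real libglusterfs API, which is hash-based and lock-protected.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified model: inodes with refcount > 0 are "active"; when the last
 * ref is dropped the inode moves to an lru list, which is pruned down to
 * lru_limit. All names here are made up for illustration. */

typedef struct my_inode {
    int refcount;
    struct my_inode *next;   /* link in the lru list */
} my_inode_t;

typedef struct {
    size_t lru_limit;        /* cap on unreferenced, cached inodes */
    size_t lru_size;
    my_inode_t *lru_head;
    size_t active;           /* inodes with refcount > 0 */
} my_itable_t;

static my_inode_t *itable_lookup_new(my_itable_t *t) {
    /* A fresh lookup hands back a referenced (active) inode. */
    my_inode_t *in = calloc(1, sizeof(*in));
    in->refcount = 1;
    t->active++;
    return in;
}

static void itable_prune(my_itable_t *t) {
    /* Free cached inodes until back under lru_limit (eviction order is
     * simplified here; a real table prunes least-recently-used first). */
    while (t->lru_size > t->lru_limit) {
        my_inode_t *victim = t->lru_head;
        t->lru_head = victim->next;
        t->lru_size--;
        free(victim);
    }
}

static void inode_unref(my_itable_t *t, my_inode_t *in) {
    if (--in->refcount == 0) {
        /* Last reference dropped: cache on the lru list, don't free. */
        t->active--;
        in->next = t->lru_head;
        t->lru_head = in;
        t->lru_size++;
        itable_prune(t);
    }
}
```

The point of the sketch is only that steady-state memory is bounded by lru_limit once references are dropped, which is why a high network.inode-lru-limit on bricks translates directly into higher resident memory.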
Yes, this is very important, and we should focus on that. I see that Mohit is already working a lot on server-side graph cleanups.
(In reply to Amar Tumballi from comment #5)

> > If there is a consensus on graph switch being a fairly common operation, we need to cleanup old graphs as they can amount to significant memory consumption and we should file a bug and track that.
>
> Yes, this is very important, and we should focus on that. I see that Mohit is already working a lot on server-side graph cleanups.

This problem is present on the clients too, especially fuse mounts (gfapi had a cleanup drive long back).

> This problem is present on the clients too, especially fuse mounts (gfapi had a cleanup drive long back).
For clients, my thinking is to rely more strongly on profiles, so that a user/admin doesn't keep changing the volume setup frequently and, ideally, doesn't change the volume settings at all.
While working on bz 1657405, I found that many xlators like afr, EC, bit-rot and trash create their own inode tables. But it might be the case that the contents of all these itables are not dumped to statedumps. If not, it's good to dump these itables.

    [rgowdapp@rgowdapp rhs-glusterfs]$ git grep inode_table_new
    api/src/glfs-master.c: itable = inode_table_new (131072, new_subvol);
    doc/developer-guide/datastructure-inode.md:inode_table_new (size_t lru_limit, xlator_t *xl)
    libglusterfs/src/inode.c:inode_table_new (size_t lru_limit, xlator_t *xl)
    libglusterfs/src/inode.h:inode_table_new (size_t lru_limit, xlator_t *xl);
    xlators/cluster/afr/src/afr-self-heald.c: this->itable = inode_table_new (SHD_INODE_LRU_LIMIT, this);
    xlators/cluster/dht/src/dht-rebalance.c: itable = inode_table_new (0, this);
    xlators/cluster/ec/src/ec.c: this->itable = inode_table_new (EC_SHD_INODE_LRU_LIMIT, this);
    xlators/features/bit-rot/src/bitd/bit-rot.c: child->table = inode_table_new (4096, subvol);
    xlators/features/quota/src/quotad-helpers.c: active_subvol->itable = inode_table_new (4096, active_subvol);
    xlators/features/trash/src/trash.c: priv->trash_itable = inode_table_new (0, this);
    xlators/mount/fuse/src/fuse-bridge.c: itable = inode_table_new (0, graph->top);
    xlators/nfs/server/src/nfs.c: xl->itable = inode_table_new (lrusize, xl);
    xlators/protocol/server/src/server-handshake.c: inode_table_new (conf->inode_lru_limit,

Amar/Sunil,

What is the target release for this tracker bug?

-Bipin Kunal
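Since each of the xlators in the grep output above creates its own inode table via inode_table_new(), dumping them would mean emitting one statedump section per table. A hedged sketch of what such a per-table section could look like; itable_summary_t and itable_dump are made-up names for illustration (the real dumping logic lives in libglusterfs/src/inode.c), and the field names mirror the lru_limit/active/lru counters discussed in this bug:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical summary of one xlator's private inode table, as it might
 * appear in a statedump. Names are illustrative, not the real API. */
typedef struct {
    const char *owner;   /* xlator that called inode_table_new() */
    size_t lru_limit;    /* cap passed to inode_table_new() */
    size_t active_size;  /* inodes with refcount > 0 */
    size_t lru_size;     /* cached, unreferenced inodes */
} itable_summary_t;

/* Format one statedump-style section for a table into buf; returns the
 * number of characters written (snprintf semantics). */
static int itable_dump(const itable_summary_t *t, char *buf, size_t len) {
    return snprintf(buf, len,
                    "[%s.itable]\n"
                    "lru_limit=%zu\n"
                    "active_size=%zu\n"
                    "lru_size=%zu\n",
                    t->owner, t->lru_limit, t->active_size, t->lru_size);
}
```

Walking every xlator's itable through a dumper like this would make leaks such as the "active inodes never looked up by kernel" case (bz 1644934) visible per-translator in the statedump rather than only for the fuse/server tables.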