Bug 1426291
Summary: possible memory leak in glusterfsd with multiplexing

| Field | Value | Field | Value |
|---|---|---|---|
| Product | [Community] GlusterFS | Reporter | krishnaram Karthick <kramdoss> |
| Component | core | Assignee | Jeff Darcy <jeff> |
| Status | CLOSED EOL | QA Contact | |
| Severity | high | Docs Contact | |
| Priority | high | | |
| Version | 3.10 | CC | amukherj, bugs, jeff, kramdoss, nchilaka, rcyriac, shberry |
| Target Milestone | --- | Keywords | Triaged |
| Target Release | --- | | |
| Hardware | All | | |
| OS | All | | |
| Whiteboard | | | |
| Fixed In Version | | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | | |
| | 1457936 1467986 (view as bug list) | Environment | |
| Last Closed | 2018-06-20 18:27:19 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | | | |
| Bug Blocks | 1457936, 1467986 | | |
Description
krishnaram Karthick
2017-02-23 15:51:12 UTC
The memleak issue seems to be a legitimate one. When I/O was started and run for a while, memory consumption increased and stayed at the same level even after I/O was stopped. I've taken a statedump for one of the volumes once again after running I/O and attached it.

Looking at the differences between the statedumps, these two entries stand out:

protocol/server.vol1-server gf_common_mt_inode_ctx: 4000 -> 54000
protocol/server.vol1-server gf_common_mt_strdup: 16007 -> 66007

So, exactly 50K of each, both from protocol/server. This is consistent with a memory leak when clients reconnect, if they do so many times, which raises two questions. (1) Where *exactly* is the leak (or possibly two leaks)? (2) Why do clients keep reconnecting? The answer to the second question, unfortunately, might be that our network layer simply isn't capable of handling that many connections, creating queueing effects that cause clients to time out. Can you check for that in the client logs, or perhaps for a consistent interval between disconnect/reconnect cycles? Also, have you checked whether this happens *without* multiplexing, given the same rate of reconnections? I have a strong suspicion that it would, and that the leak had been latent for a long time until multiplexing made it visible.

Hi Jeff, do you think one way to mitigate problem 2 mentioned in comment 5 could be to implement https://github.com/gluster/glusterfs/issues/151 ?

This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained. As a result, this bug is being closed. If the bug persists on a maintained version of Gluster or against the mainline Gluster repository, request that it be reopened and that the Version field be marked appropriately.

Clearing stale needinfos: the needinfo request[s] on this closed bug have been removed, as they have been unresolved for 1000 days.
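The statedump comparison above (gf_common_mt_inode_ctx jumping from 4000 to 54000, gf_common_mt_strdup from 16007 to 66007) can be automated by diffing the allocation counters of two dumps. The following is a minimal sketch, not tooling from this bug; the section-header and `num_allocs` layout are assumptions based on the typical glusterfs statedump format, so the regexes may need adjusting against real dumps.

```python
import re

def alloc_counts(dump_text):
    """Map 'xlator type' -> num_allocs from a statedump's memusage sections."""
    counts = {}
    current = None
    for line in dump_text.splitlines():
        # Assumed section header, e.g.:
        # [protocol/server.vol1-server - usage-type gf_common_mt_inode_ctx memusage]
        m = re.match(r'\[(\S+) - usage-type (\S+) memusage\]', line)
        if m:
            current = f"{m.group(1)} {m.group(2)}"
            continue
        m = re.match(r'num_allocs=(\d+)', line)
        if m and current:
            counts[current] = int(m.group(1))
            current = None
    return counts

def growth(before, after):
    """Counters present in both dumps that grew, largest delta first."""
    deltas = {k: after[k] - before[k]
              for k in after if k in before and after[k] > before[k]}
    return sorted(deltas.items(), key=lambda kv: -kv[1])

# Synthetic dumps reproducing the counters quoted in the bug.
before = alloc_counts("""[protocol/server.vol1-server - usage-type gf_common_mt_inode_ctx memusage]
num_allocs=4000
[protocol/server.vol1-server - usage-type gf_common_mt_strdup memusage]
num_allocs=16007
""")
after = alloc_counts("""[protocol/server.vol1-server - usage-type gf_common_mt_inode_ctx memusage]
num_allocs=54000
[protocol/server.vol1-server - usage-type gf_common_mt_strdup memusage]
num_allocs=66007
""")
for name, delta in growth(before, after):
    print(name, delta)  # both counters grew by exactly 50000
```

Running the same diff over a longer sequence of dumps would also show whether the growth is monotonic (a leak) or plateaus (a cache filling up).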
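One way to act on the suggestion to look for "a consistent interval between disconnect/reconnect cycles" is to pull the timestamps of disconnect messages out of a client log and check whether the gaps between them are roughly constant, which would point at timeouts rather than random network faults. This is a hypothetical sketch: the bracketed timestamp format and the "disconnected" marker are assumptions about the client log layout, not details taken from this bug.

```python
from datetime import datetime
import statistics

def disconnect_times(log_lines):
    """Extract timestamps of assumed '... disconnected ...' log lines."""
    times = []
    for line in log_lines:
        if "disconnected" in line:
            # Assumed prefix: [YYYY-MM-DD HH:MM:SS.ffffff] ...
            stamp = line.split("]", 1)[0].lstrip("[")
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
    return times

def looks_periodic(times, tolerance=0.1):
    """True if successive disconnects are spaced within `tolerance`
    (as a fraction) of the mean gap, i.e. suspiciously regular."""
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    if len(gaps) < 2:
        return False
    mean = statistics.mean(gaps)
    return all(abs(g - mean) <= tolerance * mean for g in gaps)

# Synthetic example: disconnects exactly 42 seconds apart.
sample = [
    "[2017-02-23 15:00:00.000000] I 0-vol1-client-0: disconnected from brick",
    "[2017-02-23 15:00:42.000000] I 0-vol1-client-0: disconnected from brick",
    "[2017-02-23 15:01:24.000000] I 0-vol1-client-0: disconnected from brick",
]
print(looks_periodic(disconnect_times(sample)))  # True
```

A regular interval close to a known timeout value (e.g. a ping or reconnect timer) would support the queueing-effect hypothesis; irregular gaps would suggest genuine network faults.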