| Summary: | Possible memory leak, memory consumption is not reduced even after rm -rf | ||
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> |
| Component: | core | Assignee: | Mohit Agrawal <moagrawa> |
| Status: | CLOSED NOTABUG | QA Contact: | Prasanth <pprakash> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | ||
| Version: | rhgs-3.2 | CC: | moagrawa, pprakash, rhs-bugs, sasundar, sheggodu |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-06-30 10:48:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | 1647277 | ||
| Bug Blocks: | |||
Description
Nag Pavan Chilakam
2016-10-19 12:03:09 UTC
In this way the complete memory can be consumed, eventually leading to a crash of the mount process. When I started the test, the resident memory (RES) was at 51604 KB, and on the first lookup (after creating 1 lakh, i.e. 100,000, files) it went to 190932 KB. Even after the delete it finally stood at 293652 KB. Also note that when I re-issue lookups after larger time gaps, say about 15 minutes, the resident memory shoots up again; this too could be a problem that needs addressing.

From my analysis, I could see the memory usage increase as the files get created, but when the files are removed, md-cache cleans up its cache. However, the memory usage (as shown in top) does not reduce greatly. This is the case with md-cache enabled or disabled. There surely is some leak, but it is not caused by md-cache. We need to debug it further to identify which component is consuming the remaining memory; from a first look I could not find it from the statedump.

As mentioned in Comment #3, I could reproduce the leak, but it is seen without md-cache as well. Could you please confirm?

I agree this can be seen even with md-cache. However, shouldn't we be clearing the cache at least with md-cache enabled when upcalls are triggered? Can't we leverage that intelligence?

(In reply to nchilaka from comment #5)
> I agree this can be seen even with md-cache
> However, shouldn't we be clearing the cache atleast with md-cache enabled
> when upcalls are triggered. Can't we leverage that intelligence?

md-cache already clears the cache that it allocated, as part of unlink. We do not require upcall to clear the cache on unlink in any component, as it is on the same mount. I guess this is a trivial leak; not sure in which component.

Changing the summary, as it may not have to do with md-cache, based on the above comments.

Requires re-testing with the latest release, as lots of memory leak fixes have gone in from 3.2 to now.

As mentioned in the previous comments, it is not related to md-cache, hence changing the component.

(In reply to Poornima G from comment #13)
> Requires re-testing with the latest release, as lots of memory leaks have
> gone in from 3.2 to now.

Retested on the 3.4.2 (3.12.2-29) build; the problem still exists.

    [root@dhcp35-64 ~]# cat test.log
    below was taken while writes were going on
    Fri Nov 23 20:22:35 IST 2018
    13456 root  20  0  642756 182528  4140 S  0.0  4.7   5:52.45 glusterfs
    Below was taken after doing a find * and ls -lRt
    Sat Nov 24 21:29:52 IST 2018
    13456 root  20  0  839364 390532  4156 S  0.0 10.1  11:05.12 glusterfs
    now going to do rm -rf
    Sat Nov 24 21:41:23 IST 2018
    rm -rf complete and filesystem empty
    Sat Nov 24 21:41:23 IST 2018
    13456 root  20  0  810692 365248  4204 S  0.0  9.4  12:55.24 glusterfs
    #### rechecking after about 15min
    Sat Nov 24 21:58:16 IST 2018
    13456 root  20  0  810692 365248  4204 S  0.0  9.4  12:55.28 glusterfs
    #### rechecking after about 15min
    Sat Nov 24 21:58:19 IST 2018
    13456 root  20  0  810692 365248  4204 S  0.0  9.4  12:55.28 glusterfs

Need a test with the 3.4.4 release / 3.5.0 builds, mainly because we now have the FUSE inode garbage-collection feature.

sosreports and client statedumps @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1386658/reproducer-on-rhgs350-comment19/client/dhcp47-147.lab.eng.blr.redhat.com/

Hi Nag,

Are we still seeing the issue in 3.5.1?

Thanks,
Mohit Agrawal

(In reply to Mohit Agrawal from comment #22)
> Hi Nag,
>
> Are we still seeing the issue in 3.5.1?
>
> Thanks,
> Mohit Agrawal

Hi Mohit, yes, I saw it in 3.5.1 too.
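The exchange above turns on whether md-cache should rely on upcall notifications to invalidate its entries, and the leak was compared with md-cache enabled and disabled. For reference, a minimal sketch of the volume options commonly used to toggle that behaviour; the volume name testvol and the timeout values are assumptions for illustration, not taken from this bug.

```bash
# Sketch: md-cache / upcall invalidation tuning (volume name "testvol" is an assumption)

# Disable the md-cache translator entirely (stat-prefetch is the option that controls it):
gluster volume set testvol performance.stat-prefetch off

# Or keep md-cache enabled and let upcall notifications invalidate cached metadata:
gluster volume set testvol features.cache-invalidation on
gluster volume set testvol features.cache-invalidation-timeout 600
gluster volume set testvol performance.cache-invalidation on
gluster volume set testvol performance.md-cache-timeout 600
```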
This issue is not reproducible with RHGS 3.5.4 on RHEL 7. Validation was also done on RHEL 8 based RHGS 3.5.4.

Based on these facts, closing this bug.
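A minimal sketch of the kind of measurement loop behind the test.log output quoted above: it records the resident set size (RSS) of the FUSE client at each phase (create, lookup, rm -rf, idle) and requests a client statedump with SIGUSR1. The mount point, file count, and file names are assumptions for illustration, not the reporter's exact reproducer.

```bash
#!/bin/bash
# Sketch only: mount point, file count and names are assumptions, not the original test.
MNT=/mnt/testvol                                   # FUSE mount point (assumption)
COUNT=100000                                       # roughly the "1 lakh" files from the description
PID=$(pgrep -f "glusterfs.*${MNT}" | head -n1)     # PID of the client mount process

rss() {   # print a timestamp and the client's resident set size in kB
    echo "$(date)  RSS(kB)=$(awk '/VmRSS/ {print $2}' /proc/"$PID"/status)"
}
dump() {  # request a client statedump (written under /var/run/gluster by default)
    kill -USR1 "$PID"
}

rss                                                # baseline
for i in $(seq 1 "$COUNT"); do : > "$MNT/f$i"; done
rss; dump                                          # after creating the files
find "$MNT" >/dev/null; ls -lRt "$MNT" >/dev/null
rss; dump                                          # after lookups
rm -rf "${MNT:?}"/*
rss; dump                                          # after rm -rf
sleep 900
rss                                                # after ~15 minutes idle
```

If RSS returns close to the baseline after the rm -rf and the idle period, the leak is not reproducible on that build; the statedumps taken at each phase can then be compared for per-translator memory accounting.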