Bug 2265322
| Field | Value |
| --- | --- |
| Summary | [NFS-Ganesha][v3] On scale cluster, ganesha memory not getting freed up (~9.4 GB) post completing test and cleanup |
| Product | [Red Hat Storage] Red Hat Ceph Storage |
| Reporter | Manisha Saini <msaini> |
| Component | NFS-Ganesha |
| Assignee | Frank Filz <ffilz> |
| Status | CLOSED ERRATA |
| QA Contact | Manisha Saini <msaini> |
| Severity | urgent |
| Priority | unspecified |
| Version | 7.1 |
| Target Release | 7.1 |
| CC | akraj, cephqe-warriors, ffilz, gouthamr, kkeithle, mbenjamin, sostapov, tserlin, vdas |
| Keywords | Automation |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | nfs-ganesha-5.7-4.el9cp |
| Doc Type | Bug Fix |
| Clones | 2280364 |
| Bug Blocks | 2280364 |
| Type | Bug |
| Last Closed | 2024-06-13 14:27:14 UTC |

Doc Text:

.All memory consumed by the configuration reload process is now released
Previously, reloading exports would not release all of the memory consumed by the configuration reload process, causing the memory footprint to increase.
With this fix, all memory consumed by the configuration reload process is released, resulting in a reduced memory footprint.
Description
Manisha Saini, 2024-02-21 13:19:43 UTC
---

I just did some runs using FSAL_VFS and note the same behavior. It's actually pretty much the same with an NFSv4 mount. I did confirm that the file descriptors used for the files get closed when the unexport happens (we expect them not to be closed when the NFSv3 unmount occurs, and in fact, if the files are deleted locally, not via NFS, they don't get closed either). I did see a small decrease in memory use after the unexport. I also ran with the latest V6-dev.6 code plus a patch that fixes a DRC memory leak and saw no improvement. I also added some logging and verified that the MDCACHE entries are released for each file (I ran with one client and one export, creating 10,000 files). So more investigation is needed into what is occupying the memory.

---

I have done some debugging using valgrind on my FSAL_VFS setup. I see no radical memory leaks (I actually DID find a couple of small memory leaks; fixes posted). Valgrind massif shows no significant memory use (other than the 50 MB of hash tables which we always have) once the exports are removed. I think this is a situation where, due to the way memory is utilized, we simply cannot reduce the memory footprint even with malloc trim. I'm at a loss for what we could do.
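For reference, a minimal sketch of the kind of profiling described above. The binary path, config path, and standalone foreground invocation are assumptions here; on a cephadm-deployed cluster, ganesha.nfsd runs inside a container, so the daemon would need to be started manually under valgrind on a test box.

```sh
# Heap profiling with valgrind massif (assumes a test instance of ganesha.nfsd
# started in the foreground: -F prevents daemonizing, -f points at a test config).
valgrind --tool=massif --massif-out-file=ganesha.massif \
    /usr/bin/ganesha.nfsd -F -f /etc/ganesha/ganesha.conf

# After removing the exports and stopping the run, inspect the heap snapshots.
ms_print ganesha.massif | less

# Leak checking with memcheck (the kind of run that showed "no significant leaks" above).
valgrind --tool=memcheck --leak-check=full \
    /usr/bin/ganesha.nfsd -F -f /etc/ganesha/ganesha.conf

# Ask glibc to return free heap pages to the kernel on a running daemon, to see
# whether RSS drops (i.e. whether growth is fragmentation rather than a live leak).
gdb -p "$(pgrep -x ganesha.nfsd)" -batch -ex 'call (int) malloc_trim(0)' -ex detach
```

If RSS drops noticeably after the malloc_trim call, that would point toward allocator caching or heap fragmentation rather than unreleased allocations.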
---

Hi Frank,

Do we have an RCA for the memory issue? We've noticed that the memory usage is significantly higher on the v3 mount compared to the v4 mount.

---

Isn't the memory usage high on the idle setup (post completing tests and deleting the files on mounts) for v3?

---

Please try with nfs-ganesha-5.7-2.el9cp. There is a fix:

a8d097edd210b7be14eb813e3eaf8fb503a6f708 FSAL's state_free function called by free_state doesn't actually free

That very likely is the (or at least a major) cause of the memory growth. There are some other fixes that may have an impact as well. But note also that while I was able to replicate the issue, valgrind memcheck showed no significant memory leaks.

---

(In reply to Frank Filz from comment #4)
> Please try with nfs-ganesha-5.7-2.el9cp
>
> There is a fix:
>
> a8d097edd210b7be14eb813e3eaf8fb503a6f708 FSAL's state_free function called
> by free_state doesn't actually free
>
> That very likely is the (or at least a major) cause of memory growth.
>
> There are some other fixes that may impact also.
>
> But note also that while I was able to replicate the issue, valgrind
> memcheck showed no significant memory leaks.

Hi Frank,

With the latest build, nfs-ganesha-5.7-2, I reran the test with 1000 exports and 100 clients with v3 mounts using FIO. I am observing the same high memory usage of the NFS daemon after performing cleanup: 12.5g.

Disk usage post completing IOs -> 43 TiB used, 25 TiB / 69 TiB avail

    # ceph -s
      cluster:
        id:     4e687a60-638e-11ee-8772-b49691cee574
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum cali013 (age 2d)
        mgr: cali013.qakwdk(active, since 2d), standbys: cali016.rhribl, cali015.hvvbwh
        mds: 1/1 daemons up, 1 standby
        osd: 35 osds: 28 up (since 2d), 28 in (since 4d)
        rgw: 2 daemons active (2 hosts, 1 zones)

      data:
        volumes: 1/1 healthy
        pools:   9 pools, 1233 pgs
        objects: 3.78M objects, 14 TiB
        usage:   43 TiB used, 25 TiB / 69 TiB avail
        pgs:     1233 active+clean

      io:
        client:   170 B/s rd, 62 MiB/s wr, 0 op/s rd, 92 op/s wr

Memory usage post completing IOs and before cleanup:

Node 1:

    MiB Swap:   4096.0 total,   4096.0 free,      0.0 used. 103858.8 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   19.8g  13.0g  23996 S  17.0  10.4 416:26.67 ganesha.nfsd

Node 2:

    MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  74121.4 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    50503 root      20   0 2962844 218312  23724 S   0.0   0.2   8:50.37 ganesha.nfsd

Installed versions:

    [ceph: root@cali013 /]# rpm -qa | grep nfs
    libnfsidmap-2.5.4-20.el9.x86_64
    nfs-utils-2.5.4-20.el9.x86_64
    nfs-ganesha-selinux-5.7-2.el9cp.noarch
    nfs-ganesha-5.7-2.el9cp.x86_64
    nfs-ganesha-rgw-5.7-2.el9cp.x86_64
    nfs-ganesha-ceph-5.7-2.el9cp.x86_64
    nfs-ganesha-rados-grace-5.7-2.el9cp.x86_64
    nfs-ganesha-rados-urls-5.7-2.el9cp

    [ceph: root@cali013 /]# ceph --version
    ceph version 18.2.1-89.el9cp (926619fe7135cbd6d305b46782ee7ecc7be199a3) reef (stable)

Post cleanup (deleting everything on the exports and deleting all 1000 exports):

Memory usage

Node 1:

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   19.5g  12.5g  23996 S   0.0  10.0 596:45.88 ganesha.nfsd

Node 2:

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    50503 root      20   0 2962844 219304  23736 S   0.0   0.2  10:18.34 ganesha.nfsd

Disk usage post deleting exports:

    # ceph -s
      cluster:
        id:     4e687a60-638e-11ee-8772-b49691cee574
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum cali013 (age 3d)
        mgr: cali013.qakwdk(active, since 3d), standbys: cali016.rhribl, cali015.hvvbwh
        mds: 1/1 daemons up, 1 standby
        osd: 35 osds: 28 up (since 3d), 28 in (since 5d)
        rgw: 2 daemons active (2 hosts, 1 zones)

      data:
        volumes: 1/1 healthy
        pools:   9 pools, 1233 pgs
        objects: 316.89k objects, 1.2 TiB
        usage:   3.6 TiB used, 65 TiB / 69 TiB avail
        pgs:     1233 active+clean

      io:
        client:   170 B/s rd, 0 op/s rd, 0 op/s wr

NFS cluster and export state after cleanup:

    [ceph: root@cali013 /]# ceph nfs cluster ls
    [ "cephfs-nfs" ]

    [ceph: root@cali013 /]# ceph nfs export ls cephfs-nfs
    []

    # ceph nfs cluster info
    {
      "cephfs-nfs": {
        "backend": [
          {
            "hostname": "cali015",
            "ip": "10.8.130.15",
            "port": 12049
          },
          {
            "hostname": "cali016",
            "ip": "10.8.130.16",
            "port": 12049
          }
        ],
        "monitor_port": 9049,
        "port": 2049,
        "virtual_ip": "10.8.130.236"
      }
    }

Logs:

FIO instances running on all 1000 exports in parallel:
http://magna002.ceph.redhat.com/ceph-qe-logs/msaini/Automation/scale_linux_v3_1000exports_100clients/fio_logs/Test_nfs_scale_with_fio_0.log

Cleanup on all exports and deletion of the 1000 exports post test completion:
http://magna002.ceph.redhat.com/ceph-qe-logs/msaini/Automation/scale_linux_v3_1000exports_100clients/delete_exports_and_cleanup/Test_nfs_scale_with_fio_0.log

---

Could you run the test for several cycles and report on the memory use during and after each cycle?
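A minimal sketch of how such per-cycle sampling might be scripted. The process name ganesha.nfsd comes from the transcripts above; the one-minute interval and the output file name are arbitrary assumptions.

```sh
# Sample the resident set size (RSS, in KiB) of ganesha.nfsd once a minute and
# append it, timestamped, to a CSV so memory can be compared during IO, after IO,
# and after cleanup for each test cycle.
while true; do
    rss_kb=$(ps -C ganesha.nfsd -o rss= | head -n1)
    echo "$(date -u +%FT%TZ),${rss_kb}" >> ganesha-rss.csv
    sleep 60
done
```

Running this on each backend node for the duration of the test would give a per-node memory curve rather than point-in-time `top` snapshots.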
---

Test is in progress. Will update the results here once completed.

---

Hi Frank,

The test was executed 3 times consecutively, and with each iteration the memory usage increased by approximately 1 GB. After completing the 3 iterations and returning to an idle state, the NFS daemon consumed 14.6 GB of memory, which is considered high.

Memory utilisation details for each run, with IOs and post cleanup:

Iteration 1

Node 1, after running IOs (13.0 GB):

    MiB Swap:   4096.0 total,   4096.0 free,      0.0 used. 103858.8 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   19.8g  13.0g  23996 S  17.0  10.4 416:26.67 ganesha.nfsd

Node 1, after cleanup (12.5 GB):

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   19.5g  12.5g  23996 S   0.0  10.0 596:45.88 ganesha.nfsd

Iteration 2

Node 1, after running IOs (15.7 GB):

    MiB Swap:   4096.0 total,   4096.0 free,      0.0 used. 100842.1 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   22.0g  15.7g  23996 S  25.0  12.6 935:58.53 ganesha.nfsd

Node 1, after cleanup (13.9 GB):

    MiB Swap:   4096.0 total,   4096.0 free,      0.0 used. 102730.5 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   20.7g  13.9g  23996 S   0.0  11.1   1095:12 ganesha.nfsd

Iteration 3

Node 1, after running IOs (16.4 GB):

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   22.4g  16.4g  23996 S   0.0  13.1   1793:14 ganesha.nfsd

Node 1, after cleanup (14.6 GB):

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    92243 root      20   0   21.4g  14.6g  23996 S   0.3  11.7   1793:46 ganesha.nfsd

---

I think we may leak some small bits for each export during each export re-load. The way exports are configured and removed, we reload the exports 500 times during export setup and 500 times during unload, and each reload processes all of the exports configured at that point. This means we see some 250k structures lost each test cycle (N * (N + 1) / 2 for setup and again for teardown, roughly 125k each, with N = 500). I did fix one set of leaks in V5.7-2 (which is being used per the bug details above), but valgrind showed some possibly lost memory that I didn't get a chance to chase down. If there really is a leak there, 250k structures per cycle does add up to something, though a gigabyte would mean 4k of memory per such structure... hmm... that's a page... Heap fragmentation might be a culprit here...

---

As suggested, I ran the same test for NFSv4.1 (2 iterations) with the 7.1 build, and I am observing the same memory growth (as we saw for NFSv3) with the last run. Below are the stats for the 2 NFSv4.1 iterations:

| Iteration | Stage | Node 1 | Node 2 |
| --- | --- | --- | --- |
| 1 | Post running IOs | 10.8g | 13.0g |
| 1 | Post cleanup | 9.7g | 11.1g |
| 2 | Post running IOs | 13.1g | 14.5g |
| 2 | Post cleanup | 11.7g | 13.4g |

It appears that the problem is not limited to NFSv3; it is also observed with NFSv4.

---

So as I read this, Frank has made real progress in fixing some memory leaks here and has confirmed this with valgrind. Manisha has confirmed that the remaining leakage affects both NFSv3 and NFSv4. Might I suggest we accept the forward progress we have made in 7.1, mark this specific BZ as fixed (since it addressed the original problems, but not ALL problems), note in the errata that it is better than ever but still a work in progress, and then clone this issue for 7.1 z1 so Frank can continue his work addressing memory leakage as we constantly move the quality forward?

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925
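As a quick verification sketch: per the Fixed In Version field above, a deployment carrying this fix should report nfs-ganesha-5.7-4.el9cp or later. The check below simply reuses the rpm query from the transcripts in this bug and is run from a cephadm shell on a node hosting an NFS daemon.

```sh
# List the installed nfs-ganesha packages from the cephadm shell, as in the
# transcripts above; expect 5.7-4.el9cp or later for the export-reload memory fix.
rpm -qa | grep nfs-ganesha
```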