Description of problem:
========================
On a scale cluster, create 2000 NFS exports. Mount an export on a client using NFS version 3. During the first export mount, memory usage grew unexpectedly, leading to an NFS service crash and a failed mount with vers=3.

# ls -lart
total 16388
drwxr-xr-x. 7 root root    4096 Jul 17  2023 ..
-rw-r-----. 1 root root 7272427 Oct 21 15:39 'core.ganesha\x2enfsd.0.92b21226dd874ad1bcbf401f0dd547d2.37084.1729525023000000.zst'
-rw-r-----. 1 root root 9486612 Oct 21 15:56 'core.ganesha\x2enfsd.0.92b21226dd874ad1bcbf401f0dd547d2.126145.1729526076000000.zst'
------
warning: Error reading shared library list entry at 0x696e6e75723a5652
warning: Error reading shared library list entry at 0x69626f6e6e613a56
Failed to read a valid object file image from memory.
--Type <RET> for more, q to quit, c to continue without paging--
Core was generated by `/usr/bin/ganesha.nfsd -F -L STDERR -N NIV_EVENT'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fafa282794c in ?? ()
[Current thread is 1 (LWP 67)]
(gdb) bt
#0  0x00007fafa282794c in ?? ()
Backtrace stopped: Cannot access memory at address 0x7faf33ffdf80
(gdb)
============================

Output from the Ganesha process after the first mount failure shows that memory usage kept growing on an otherwise idle setup:

[root@cali015 coredump]# top -p 136052
top - 16:21:25 up 34 days, 19:36,  1 user,  load average: 6.62, 2.00, 3.85
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.6 us,  1.7 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 127826.4 total,  32772.4 free,  94554.5 used,   3422.0 buff/cache
MiB Swap:   4096.0 total,   3083.7 free,   1012.2 used.  33271.9 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 136052 root      20   0 2034.5g  75.1g  23136 D 128.6  60.1   1:55.43 ganesha.nfsd
=========================

# ceph nfs cluster info nfsganesha
{
  "nfsganesha": {
    "backend": [
      {
        "hostname": "cali015",
        "ip": "10.8.130.15",
        "port": 12049
      },
      {
        "hostname": "cali016",
        "ip": "10.8.130.16",
        "port": 12049
      }
    ],
    "monitor_port": 9049,
    "port": 2049,
    "virtual_ip": "10.8.130.236"
  }
}

Version-Release number of selected component (if applicable):

How reproducible:
================
2/2

Steps to Reproduce:
===================
1. Create a Ganesha NFS cluster.
2. Create 2000 NFS exports.
3. Mount the first export on one client via vers=3 (see the reproduction sketch under Additional info).

Actual results:
===============
The v3 mount failed, and resident memory of ganesha.nfsd grew to about 75 GB.

Expected results:
=================
The v3 mount should succeed, and no memory leak should occur.

Additional info:
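A minimal reproduction sketch in shell, assuming a CephFS filesystem named "cephfs". The cluster name, hosts, and virtual IP are taken from the cluster info output above; the /24 netmask, the pseudo-path naming, and the exact "ceph nfs export create" flag spellings are assumptions and vary between Ceph releases:

# 1. Create the Ganesha cluster with an ingress virtual IP (netmask is illustrative)
ceph nfs cluster create nfsganesha "cali015,cali016" --ingress --virtual_ip 10.8.130.236/24

# 2. Create 2000 exports against the existing CephFS filesystem
for i in $(seq 1 2000); do
    ceph nfs export create cephfs --cluster-id nfsganesha --pseudo-path /export_${i} --fsname cephfs
done

# 3. On the client, mount the first export over NFSv3 through the virtual IP
#    (explicit port/mountport options may be needed depending on rpcbind/mountd registration)
mkdir -p /mnt/export_1
mount -t nfs -o vers=3 10.8.130.236:/export_1 /mnt/export_1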
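The backtrace above is unusable because gdb could not read the shared library list from the core. A sketch for re-examining the compressed core with matching symbols follows; the debuginfo package name is an assumption and may differ for the downstream build, and if ganesha runs containerized (cephadm), gdb should be run inside the matching container image so library paths resolve:

# decompress the systemd-coredump capture
zstd -d 'core.ganesha\x2enfsd.0.92b21226dd874ad1bcbf401f0dd547d2.37084.1729525023000000.zst'

# install debug symbols for the ganesha binary (package name assumed)
dnf debuginfo-install nfs-ganesha

# open the core against the same binary and dump backtraces from all threads
gdb /usr/bin/ganesha.nfsd 'core.ganesha\x2enfsd.0.92b21226dd874ad1bcbf401f0dd547d2.37084.1729525023000000'
(gdb) set pagination off
(gdb) thread apply all bt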
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216