valgrind reported:

==17821== 129,024 bytes in 63 blocks are indirectly lost in loss record 74 of 77
==17821==    at 0x4C277CC: calloc (vg_replace_malloc.c:467)
==17821==    by 0x4E5A2D1: __gf_fd_fdtable_get_all_fds (fd.c:153)
==17821==    by 0x4E5A9AC: gf_fd_fdtable_get_all_fds (fd.c:168)
==17821==    by 0x6A60160: server_connection_cleanup (server-helpers.c:670)
==17821==    by 0x6A4D839: notify (server-protocol.c:6762)
==17821==    by 0x4E41DC2: xlator_notify (xlator.c:923)
==17821==    by 0x746C17E: socket_event_poll_err (socket.c:435)
==17821==    by 0x746E0E7: socket_event_handler (socket.c:833)
==17821==    by 0x4E5C31C: event_dispatch_epoll (event.c:804)
==17821==    by 0x4044F1: main (glusterfsd.c:1413)
==17821==
==17821== 129,024 bytes in 63 blocks are definitely lost in loss record 75 of 77
==17821==    at 0x4C277CC: calloc (vg_replace_malloc.c:467)
==17821==    by 0x4E5AA28: gf_fd_fdtable_expand (fd.c:102)
==17821==    by 0x4E5ADD4: gf_fd_fdtable_alloc (fd.c:136)
==17821==    by 0x6A5E4DA: server_connection_get (server-helpers.c:874)
==17821==    by 0x6A565D2: mop_setvolume (server-protocol.c:5701)
==17821==    by 0x6A4D769: protocol_server_pollin (server-protocol.c:6687)
==17821==    by 0x6A4D7F2: notify (server-protocol.c:6743)
==17821==    by 0x4E41DC2: xlator_notify (xlator.c:923)
==17821==    by 0x746E099: socket_event_handler (socket.c:829)
==17821==    by 0x4E5C31C: event_dispatch_epoll (event.c:804)
==17821==    by 0x4044F1: main (glusterfsd.c:1413)
==17821==
==17821== 133,056 (4,032 direct, 129,024 indirect) bytes in 63 blocks are definitely lost in loss record 76 of 77
==17821==    at 0x4C277CC: calloc (vg_replace_malloc.c:467)
==17821==    by 0x4E5ADAC: gf_fd_fdtable_alloc (fd.c:128)
==17821==    by 0x6A5E4DA: server_connection_get (server-helpers.c:874)
==17821==    by 0x6A565D2: mop_setvolume (server-protocol.c:5701)
==17821==    by 0x6A4D769: protocol_server_pollin (server-protocol.c:6687)
==17821==    by 0x6A4D7F2: notify (server-protocol.c:6743)
==17821==    by 0x4E41DC2: xlator_notify (xlator.c:923)
==17821==    by 0x746E099: socket_event_handler (socket.c:829)
==17821==    by 0x4E5C31C: event_dispatch_epoll (event.c:804)
==17821==    by 0x4044F1: main (glusterfsd.c:1413)
We have a volume that is open only to the IP addresses listed in the option "option auth.addr.brick00.allow 10.10.10.10 10.10.10.11". Many other clients that are not on the list keep trying to connect to the volume. The glusterfsd process then grows to as much as 7 GB of memory, and we have to kill it.

The problem is easy to reproduce: set the option and try connecting from clients that are not allowed, the more the better.

In server_connection_cleanup, some memory is freed only if conn->bound_xl is assigned. But in mop_setvolume, conn->bound_xl is assigned only when gf_authenticate returns AUTH_ACCEPT. For a rejected client, the server prints "Cannot authenticate client from %s" and leaves conn->bound_xl NULL. As a result, server_connection_cleanup never calls do_connection_cleanup, and fdentries is never freed.
Also, in server_connection_destroy, ltable and fdentries are not freed if bound_xl is NULL :)
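The pattern described in the comments above can be illustrated with a minimal, self-contained sketch. This is not the GlusterFS source: conn_t, conn_init, conn_cleanup_leaky and conn_cleanup_fixed are hypothetical stand-ins, and only the field names fdentries and bound_xl are borrowed from the report. It assumes the table is allocated when the connection is set up and that bound_xl is set only on successful authentication. The "fixed" variant shows one possible shape of a fix; the actual patch may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a server connection; not the real GlusterFS type. */
typedef struct {
    void *fdentries; /* allocated when the connection is set up */
    void *bound_xl;  /* non-NULL only after authentication succeeds */
} conn_t;

/* Set up a connection. Returns 0 on success, -1 if allocation fails. */
int conn_init(conn_t *conn, int authenticated)
{
    assert(conn != NULL);
    conn->fdentries = calloc(64, sizeof(void *));
    if (conn->fdentries == NULL)
        return -1;
    /* Any non-NULL marker will do for this sketch. */
    conn->bound_xl = authenticated ? (void *)conn : NULL;
    return 0;
}

/* Leaky variant: the free is gated on bound_xl, so a rejected
 * client (bound_xl == NULL) keeps its fdentries allocated. */
void conn_cleanup_leaky(conn_t *conn)
{
    assert(conn != NULL);
    if (conn->bound_xl != NULL) {
        free(conn->fdentries);
        conn->fdentries = NULL;
    }
}

/* Possible fix: release fdentries whether or not the client
 * was ever authenticated. free(NULL) is a no-op, so this is
 * safe to call more than once. */
void conn_cleanup_fixed(conn_t *conn)
{
    assert(conn != NULL);
    free(conn->fdentries);
    conn->fdentries = NULL;
    conn->bound_xl = NULL;
}
```

Calling conn_init(&c, 0) followed by conn_cleanup_leaky(&c) leaves c.fdentries non-NULL, which corresponds to one leaked table per rejected client; conn_cleanup_fixed(&c) releases it in both the accepted and the rejected case.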
Created attachment 246 [details] patch to fix typo and add s390 architecture
PATCH: http://patches.gluster.com/patch/3573 in release-3.0 (protocol/server: Fix memory leak when server authentication fails.)