Bug 1654917 - cleanup resources in server_init in case of failure
Summary: cleanup resources in server_init in case of failure
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Mohit Agrawal
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-11-30 03:14 UTC by Mohit Agrawal
Modified: 2019-03-25 16:32 UTC
CC: 2 users

Fixed In Version: glusterfs-6.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-25 16:32:19 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:




Links
Gluster.org Gerrit 21750 (Merged): server: Resolve memory leak path in server_init (last updated 2018-12-03 11:35:16 UTC)
Gluster.org Gerrit 21787 (Open): rpc: check if fini is there before calling it (last updated 2018-12-03 17:19:22 UTC)
Gluster.org Gerrit 21790 (Merged): rpc: check if fini is there before calling it (last updated 2018-12-04 03:19:57 UTC)

Description Mohit Agrawal 2018-11-30 03:14:28 UTC
Description of problem:

Resolve memory leaks in server_init when server_init fails.
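
For context, the general shape of such a fix is a cleanup-on-failure path in the init routine. The sketch below is illustrative only; the struct fields and function name are made up and not taken from the Gerrit 21750 change. It shows the pattern of releasing everything allocated before the failing step so a failed init does not leak:

#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the server's private state. */
struct server_state {
    char *conf_dir;
    int  *conn_table;
};

/* Cleanup-on-failure pattern: every allocation made before the
 * failing step is freed on the error path. */
struct server_state *
server_init_sketch(const char *conf_dir, size_t table_size)
{
    struct server_state *state = calloc(1, sizeof(*state));
    if (!state)
        return NULL;

    state->conf_dir = strdup(conf_dir);
    if (!state->conf_dir)
        goto err;

    state->conn_table = calloc(table_size, sizeof(*state->conn_table));
    if (!state->conn_table)
        goto err;

    return state;

err:
    /* Failure path: free whatever was allocated so far. */
    free(state->conf_dir);
    free(state);
    return NULL;
}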

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Worker Ant 2018-11-30 03:17:39 UTC
REVIEW: https://review.gluster.org/21750 (server: Resolve memory leak path in server_init) posted (#2) for review on master by MOHIT AGRAWAL

Comment 2 Worker Ant 2018-12-03 11:35:14 UTC
REVIEW: https://review.gluster.org/21750 (server: Resolve memory leak path in server_init) posted (#13) for review on master by Atin Mukherjee

Comment 3 Raghavendra Bhat 2018-12-03 17:14:08 UTC

Glusterd crashed in init.

During glusterd init, it tries to initialize the rdma transport (because it is mentioned in the volfile). If the rdma libraries are not present on the machine, initializing the rdma transport fails in rpc_transport_load, specifically at this dlopen call:

    handle = dlopen(name, RTLD_NOW);
    if (handle == NULL) {
        gf_log("rpc-transport", GF_LOG_ERROR, "%s", dlerror());
        gf_log("rpc-transport", GF_LOG_WARNING,
               "volume '%s': transport-type '%s' is not valid or "
               "not found on this machine",
               trans_name, type);
        goto fail;
    }


[2018-12-03 16:48:04.557840] I [socket.c:931:__socket_server_bind] 0-socket.management: process started listening on port (24007)
[2018-12-03 16:48:04.558021] E [rpc-transport.c:295:rpc_transport_load] 0-rpc-transport: /usr/local/lib/glusterfs/6dev/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2018-12-03 16:48:04.558046] W [rpc-transport.c:299:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine


As part of this failure handling, rpc_transport_cleanup is called, and there trans->fini is called unconditionally.

But between the allocation of the transport structure and the loading of the fini symbol from the shared object, there are several other places where rpc_transport_load can fail and take the failure path (including a failure to load the fini symbol itself). In those cases trans->fini is still NULL, and calling it crashes the process, as the backtrace and gdb session below show (a sketch of the guarded call follows them).


pending frames:
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2018-12-03 16:48:04
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6dev
/usr/local/lib/libglusterfs.so.0(+0x2d9bb)[0x7fe10646d9bb]
/usr/local/lib/libglusterfs.so.0(gf_print_trace+0x259)[0x7fe106476ef3]
glusterd(glusterfsd_print_trace+0x1f)[0x40b15e]
/lib64/libc.so.6(+0x385c0)[0x7fe105d395c0]


This is the backtrace of the core:

Core was generated by `glusterd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000000000 in ?? ()
[Current thread is 1 (Thread 0x7f3bc6182880 (LWP 21751))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.28-20.fc29.x86_64 keyutils-libs-1.5.10-8.fc29.x86_64 krb5-libs-1.16.1-21.fc29.x86_64 libcom_err-1.44.3-1.fc29.x86_64 libselinux-2.8-4.fc29.x86_64 libtirpc-1.1.4-2.rc2.fc29.x86_64 libxml2-2.9.8-4.fc29.x86_64 openssl-libs-1.1.1-3.fc29.x86_64 pcre2-10.32-4.fc29.x86_64 userspace-rcu-0.10.1-4.fc29.x86_64 xz-libs-5.2.4-3.fc29.x86_64 zlib-1.2.11-14.fc29.x86_64
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f3bc6adb4b7 in rpc_transport_cleanup (trans=0x182a680) at ../../../../rpc/rpc-lib/src/rpc-transport.c:168
#2  0x00007f3bc6adbfb5 in rpc_transport_load (ctx=0x1791280, options=0x17de4b8, trans_name=0x18095b0 "rdma.management") at ../../../../rpc/rpc-lib/src/rpc-transport.c:375
#3  0x00007f3bc6ad8456 in rpcsvc_create_listener (svc=0x17d9300, options=0x17de4b8, name=0x18095b0 "rdma.management") at ../../../../rpc/rpc-lib/src/rpcsvc.c:1991
#4  0x00007f3bc6ad87b8 in rpcsvc_create_listeners (svc=0x17d9300, options=0x17de4b8, name=0x17d1b00 "management") at ../../../../rpc/rpc-lib/src/rpcsvc.c:2083
#5  0x00007f3bb5a91338 in init (this=0x17dd480) at ../../../../../xlators/mgmt/glusterd/src/glusterd.c:1774
#6  0x00007f3bc6b3d179 in __xlator_init (xl=0x17dd480) at ../../../libglusterfs/src/xlator.c:718
#7  0x00007f3bc6b3d2c3 in xlator_init (xl=0x17dd480) at ../../../libglusterfs/src/xlator.c:745
#8  0x00007f3bc6b8dce8 in glusterfs_graph_init (graph=0x17d16c0) at ../../../libglusterfs/src/graph.c:359
#9  0x00007f3bc6b8e8f8 in glusterfs_graph_activate (graph=0x17d16c0, ctx=0x1791280) at ../../../libglusterfs/src/graph.c:722
#10 0x000000000040b8fe in glusterfs_process_volfp (ctx=0x1791280, fp=0x17d09c0) at ../../../glusterfsd/src/glusterfsd.c:2597
#11 0x000000000040bace in glusterfs_volumes_init (ctx=0x1791280) at ../../../glusterfsd/src/glusterfsd.c:2670
#12 0x000000000040bf9e in main (argc=1, argv=0x7ffc12766108) at ../../../glusterfsd/src/glusterfsd.c:2823
(gdb) frame 1
#1  0x00007f3bc6adb4b7 in rpc_transport_cleanup (trans=0x182a680) at ../../../../rpc/rpc-lib/src/rpc-transport.c:168
168	    trans->fini(trans);
(gdb) l
163	rpc_transport_cleanup(rpc_transport_t *trans)
164	{
165	    if (!trans)
166	        return;
167	
168	    trans->fini(trans);
169	    GF_FREE(trans->name);
170	
171	    if (trans->xl)
172	        pthread_mutex_destroy(&trans->lock);
(gdb) p trans->fini
$1 = (void (*)(rpc_transport_t *)) 0x0
(gdb) frame 2
#2  0x00007f3bc6adbfb5 in rpc_transport_load (ctx=0x1791280, options=0x17de4b8, trans_name=0x18095b0 "rdma.management") at ../../../../rpc/rpc-lib/src/rpc-transport.c:375
375	        rpc_transport_cleanup(trans);
(gdb) l
370	
371	    success = _gf_true;
372	
373	fail:
374	    if (!success) {
375	        rpc_transport_cleanup(trans);
376	        GF_FREE(name);
377	
378	        return_trans = NULL;
379	    }
(gdb) p success
$2 = false
(gdb) p trans->fini
$3 = (void (*)(rpc_transport_t *)) 0x0
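
The follow-up patches ("rpc: check if fini is there before calling it", Gerrit 21787/21790) guard that indirect call. A minimal, self-contained sketch of the idea follows; the transport_t stand-in and transport_cleanup_sketch are simplified illustrations, not the GlusterFS types or the merged diff:

#include <stdio.h>
#include <stdlib.h>

/* Trimmed-down stand-in for the transport object: only the fields
 * relevant to the crash are modelled here. */
typedef struct transport {
    char *name;
    void (*fini)(struct transport *trans);
} transport_t;

/* Sketch of the fix: call fini only if loading the shared object got
 * far enough to install one. */
static void
transport_cleanup_sketch(transport_t *trans)
{
    if (!trans)
        return;

    if (trans->fini)   /* fini stays NULL when dlopen()/dlsym() failed */
        trans->fini(trans);

    free(trans->name);
    free(trans);
}

int
main(void)
{
    /* Simulate the failure path from this bug: the rdma shared object
     * could not be opened, so fini was never filled in. */
    transport_t *trans = calloc(1, sizeof(*trans));
    if (!trans)
        return 1;

    transport_cleanup_sketch(trans);   /* no NULL function-pointer call */
    printf("cleanup handled a transport with fini == NULL\n");
    return 0;
}

Applying the same NULL check to trans->fini in rpc_transport_cleanup avoids the jump to address 0x0 seen in frame #0 above.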

Comment 4 Worker Ant 2018-12-03 17:19:20 UTC
REVIEW: https://review.gluster.org/21787 (rpc: check if fini is there before calling it) posted (#1) for review on master by Raghavendra Bhat

Comment 5 Worker Ant 2018-12-03 22:25:21 UTC
REVIEW: https://review.gluster.org/21790 (rpc: check if fini is there before calling it) posted (#1) for review on master by Raghavendra Bhat

Comment 6 Worker Ant 2018-12-04 03:19:56 UTC
REVIEW: https://review.gluster.org/21790 (rpc: check if fini is there before calling it) posted (#2) for review on master by MOHIT AGRAWAL

Comment 7 Shyamsundar 2019-03-25 16:32:19 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/

