Bug 1752245

Summary: Crash in glusterd when running test script bug-1699339.t
Product: [Community] GlusterFS Reporter: Mohit Agrawal <moagrawa>
Component: glusterd Assignee: Mohit Agrawal <moagrawa>
Status: CLOSED NEXTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: 7 CC: bmekala, bugs, jahernan, rhs-bugs, sankarshan, storage-qa-internal, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Clone Of: 1723890 Environment:
Last Closed: 2019-09-15 13:07:03 UTC Type: ---
Bug Depends On: 1723890    
Bug Blocks: 1723889    

Description Mohit Agrawal 2019-09-15 06:30:55 UTC
+++ This bug was initially created as a clone of Bug #1723890 +++

+++ This bug was initially created as a clone of Bug #1723889 +++

Description of problem:

Running the test script bug-1699339.t causes glusterd to crash when it is restarted.

Backtrace of the crash:

#0  _gf_log (domain=domain@entry=0x7ffff7e54bcd "THIS", file=file@entry=0x7ffff7e54bc2 "rpc-clnt.c", 
    function=function@entry=0x7ffff7e55000 <__FUNCTION__.18290> "__saved_frame_get", line=line@entry=314, 
    level=level@entry=GF_LOG_ERROR, fmt=fmt@entry=0x7ffff7e54d0d "%p") at logging.c:2046
        basename = 0x0
        new_logfile = 0x0
        ap = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7fffe58d3d48, reg_save_area = 0x7fffe58d3c50}}
        timestr = '\000' <repeats 255 times>
        tv = {tv_sec = 0, tv_usec = 0}
        logline = 0x0
        msg = 0x0
        ret = 0
        fd = -1
        this = 0x48eb2f
        ctx = 0x43e28000
        level_strings = {0x7ffff7f51790 "", 0x7ffff7f5673d "M", 0x7ffff7f51791 "A", 0x7ffff7f5676a "C", 0x7ffff7f55911 "E", 
          0x7ffff7f5aac2 "W", 0x7ffff7f566af "N", 0x7ffff7f51793 "I", 0x7ffff7f566b4 "D", 0x7ffff7f52879 "T", 0x7ffff7f51790 ""}
        __PRETTY_FUNCTION__ = "_gf_log"
        __FUNCTION__ = "_gf_log"
#1  0x00007ffff7e4bc86 in __saved_frame_get (frames=<optimized out>, callid=callid@entry=5) at rpc-clnt.c:314
        saved_frame = <optimized out>
        tmp = 0x7fffe0005b48
        __FUNCTION__ = "__saved_frame_get"
#2  0x00007ffff7e4d7d1 in lookup_frame (callid=5, conn=0x765a10) at rpc-clnt.c:570
        frame = 0x0
        frame = <optimized out>
        __FUNCTION__ = "lookup_frame"
#3  rpc_clnt_handle_reply (clnt=clnt@entry=0x7659e0, pollin=pollin@entry=0x7fffe000b7f0) at rpc-clnt.c:754
        conn = 0x765a10
        saved_frame = 0x0
        ret = -1
        req = 0x0
        xid = 5
        __FUNCTION__ = "rpc_clnt_handle_reply"
#4  0x00007ffff7e4dcd2 in rpc_clnt_notify (trans=0x774b70, mydata=0x765a10, event=<optimized out>, data=0x7fffe000b7f0) at rpc-clnt.c:946
        conn = 0x765a10
        clnt = 0x7659e0
        ret = -1
        req_info = 0x0
        pollin = 0x7fffe000b7f0
        clnt_mydata = 0x0
        old_THIS = 0x48eb30
        __FUNCTION__ = "rpc_clnt_notify"
#5  0x00007ffff7e4a976 in rpc_transport_notify (this=this@entry=0x774b70, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x7fffe000b7f0) at rpc-transport.c:545
        ret = -1
        __FUNCTION__ = "rpc_transport_notify"
#6  0x00007fffe64b0d38 in socket_event_poll_in_async (xl=<optimized out>, async=0x7fffe000b918) at socket.c:2562
        pollin = 0x7fffe000b7f0
        this = 0x774b70
        priv = 0x7750d0
#7  0x00007fffe64b808c in gf_async (cbk=0x7fffe64b0d10 <socket_event_poll_in_async>, xl=<optimized out>, async=0x7fffe000b918)
    at ../../../../libglusterfs/src/glusterfs/async.h:189
        __FUNCTION__ = "gf_async"
#8  socket_event_poll_in (notify_handled=true, this=0x774b70) at socket.c:2603
        ret = <optimized out>
        pollin = 0x7fffe000b7f0
        priv = 0x7750d0
        ctx = <optimized out>
        ret = <optimized out>
        pollin = <optimized out>
        priv = <optimized out>
        ctx = <optimized out>
#9  socket_event_handler (event_thread_died=0 '\000', poll_err=0, poll_out=<optimized out>, poll_in=<optimized out>, data=0x774b70, 
    gen=1, idx=2, fd=0) at socket.c:2994
        this = <optimized out>
        ret = <optimized out>
        ctx = <optimized out>
        notify_handled = <optimized out>
        priv = 0x7750d0
        socket_closed = <optimized out>
        this = <optimized out>
        priv = <optimized out>
        ret = <optimized out>
        ctx = <optimized out>
        socket_closed = <optimized out>
        notify_handled = <optimized out>
        __FUNCTION__ = "socket_event_handler"
        sock_type = <optimized out>
        sa = <optimized out>
#10 socket_event_handler (fd=fd@entry=43, idx=idx@entry=2, gen=gen@entry=1, data=data@entry=0x774b70, poll_in=<optimized out>,
    poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2921
        this = 0x774b70
        __FUNCTION__ = "socket_event_handler"
        sock_type = <optimized out>
        sa = <optimized out>
#11 0x00007ffff7f053a3 in event_dispatch_epoll_handler (event=0x7fffe58d4024, event_pool=0x4750b0) at event-epoll.c:642
        handler = 0x7fffe64b6520 <socket_event_handler>
        gen = 1
        slot = 0x4c8890
        data = 0x774b70
        ret = 0
        fd = 43
        ev_data = 0x7fffe58d4028
        idx = 2
        handled_error_previously = <optimized out>
        ev_data = <optimized out>
        slot = <optimized out>
        handler = <optimized out>
        data = <optimized out>
        idx = <optimized out>
        gen = <optimized out>
        ret = <optimized out>
        fd = <optimized out>
        handled_error_previously = <optimized out>
        __FUNCTION__ = "event_dispatch_epoll_handler"
#12 event_dispatch_epoll_worker (data=0x488ab0) at event-epoll.c:755
        event = {events = 1, data = {ptr = 0x100000002, fd = 2, u32 = 2, u64 = 4294967298}}
        ret = <optimized out>
        ev_data = 0x488ab0
        event_pool = 0x4750b0
        myindex = 1
        timetodie = 0
        gen = 0
        poller_death_notify = {next = 0x0, prev = 0x0}
        slot = 0x0
        tmp = 0x0
        __FUNCTION__ = "event_dispatch_epoll_worker"
#13 0x00007ffff7c595a2 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#14 0x00007ffff78a5023 in clone () from /lib64/libc.so.6
No symbol table info available.

Version-Release number of selected component (if applicable):


How reproducible:

Always

Steps to Reproduce:
1. Run the test script bug-1699339.t.

Actual results:

glusterd crashes partway through execution of the script.

Expected results:

The script should finish successfully.

Additional info:

--- Additional comment from RHEL Product and Program Management on 2019-06-25 17:54:19 CEST ---

This bug is automatically being proposed for the next minor release of Red Hat Gluster Storage by setting the release flag 'rhgs-3.5.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Xavi Hernandez on 2019-06-25 17:56:58 CEST ---

The root cause of the bug is an extra dict_unref() call, which leads to a use-after-free.
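
To illustrate the class of bug described above, here is a minimal sketch of how one unbalanced unref invalidates an object that its owner still expects to use. Everything in it is hypothetical: toy_dict_t, toy_dict_ref(), toy_dict_unref() and the helper functions are stand-ins written for this example, not the GlusterFS dict_t API and not the code changed by the fix. To keep the example well-defined, the object is only marked as destroyed at the point where a real implementation would free it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted dictionary. This is NOT
 * the real GlusterFS dict_t; it only illustrates the counting rule. */
typedef struct toy_dict {
    int refcount;   /* number of holders that still own a reference */
    bool destroyed; /* set where a real implementation would call free() */
} toy_dict_t;

static toy_dict_t *toy_dict_new(void)
{
    toy_dict_t *d = calloc(1, sizeof(*d));
    if (d != NULL)
        d->refcount = 1; /* the creator owns the first reference */
    return d;
}

static toy_dict_t *toy_dict_ref(toy_dict_t *d)
{
    assert(!d->destroyed); /* taking a ref on a destroyed object is a bug */
    d->refcount++;
    return d;
}

static void toy_dict_unref(toy_dict_t *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0)
        d->destroyed = true; /* real code frees here; we only mark it */
}

/* Balanced callee: takes its own reference and drops only that one. */
static void use_dict_balanced(toy_dict_t *d)
{
    toy_dict_ref(d);
    /* ... work with d ... */
    toy_dict_unref(d);
}

/* Buggy callee: drops a reference it never took (the "extra unref"). */
static void use_dict_extra_unref(toy_dict_t *d)
{
    /* ... work with d ... */
    toy_dict_unref(d);
}

/* Returns true if the owner's dict is still valid after the callee ran. */
static bool owner_dict_survives(void (*callee)(toy_dict_t *))
{
    toy_dict_t *d = toy_dict_new();
    bool alive;

    if (d == NULL)
        return false;

    callee(d);
    alive = !d->destroyed; /* false means the owner now holds a dangling ref */

    if (alive)
        toy_dict_unref(d); /* owner drops its own reference */
    free(d);               /* release the demo storage in both cases */
    return alive;
}
```

With these definitions, owner_dict_survives(use_dict_balanced) returns true and owner_dict_survives(use_dict_extra_unref) returns false. In the second case the owner's pointer refers to an object that has already been destroyed, so with a real free() any later access through it would read reclaimed memory.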

--- Additional comment from Worker Ant on 2019-06-25 16:03:48 UTC ---

REVIEW: https://review.gluster.org/22943 (glusterd: fix use-after-free of a dict_t) posted (#1) for review on master by Xavi Hernandez

--- Additional comment from Worker Ant on 2019-06-26 03:09:14 UTC ---

REVIEW: https://review.gluster.org/22943 (glusterd: fix use-after-free of a dict_t) merged (#2) on master by Atin Mukherjee

Comment 1 Worker Ant 2019-09-15 06:34:48 UTC
REVIEW: https://review.gluster.org/23422 (glusterd: fix use-after-free of a dict_t) posted (#1) for review on release-7 by MOHIT AGRAWAL

Comment 2 Worker Ant 2019-09-15 13:07:03 UTC
REVIEW: https://review.gluster.org/23422 (glusterd: fix use-after-free of a dict_t) merged (#1) on release-7 by MOHIT AGRAWAL