1665656 – test case glusterd/add-brick-and-validate-replicated-volume-options.t crashes when brick_mux is enabled

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	core
Sub Component:
Version:	mainline
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Mohit Agrawal
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-01-12 05:41 UTC by Mohit Agrawal
Modified:	2019-03-25 16:33 UTC
CC List:	1 user
Fixed In Version:	glusterfs-6.0
Clone Of:
Environment:
Last Closed:	2019-01-14 12:35:25 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Gluster.org Gerrit	22015	0	None	Merged	core: glusterd/add-brick-and-validate-replicated-volume-options.t is crash	2019-01-14 12:35:23 UTC

Description Mohit Agrawal 2019-01-12 05:41:55 UTC

Description of problem:
The test case glusterd/add-brick-and-validate-replicated-volume-options.t crashes when brick_mux is enabled.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Enable brick_mux in add-brick-and-validate-replicated-volume-options.t
2. Run the .t in a loop; after some attempts the test crashes.

Actual results:
The test case crashes.

Expected results:
The test case should not crash.

Additional info:

Comment 1 Mohit Agrawal 2019-01-12 05:46:10 UTC

Hi,

The test case generates the crash below right after kill_brick is called.

#0  0x0000560df20e821b in STACK_DESTROY (stack=0x3) at ../../libglusterfs/src/glusterfs/stack.h:182
182	    LOCK(&stack->pool->lock);
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.168-8.el7.x86_64 elfutils-libs-0.168-8.el7.x86_64 glibc-2.17-196.el7_4.2.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libacl-2.2.51-12.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libgcc-4.8.5-16.el7_4.2.x86_64 libselinux-2.5-11.el7.x86_64 libuuid-2.23.2-43.el7_4.2.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 pcre-8.32-17.el7.x86_64 systemd-libs-219-42.el7_4.10.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x0000560df20e821b in STACK_DESTROY (stack=0x3) at ../../libglusterfs/src/glusterfs/stack.h:182
#1  mgmt_pmap_signin_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7fda6802ddb8)
    at glusterfsd-mgmt.c:2824
#2  0x00007fda86559161 in rpc_clnt_handle_reply (clnt=clnt@entry=0x560df2d53b30, pollin=pollin@entry=0x560df2ea98f0)
    at rpc-clnt.c:755
#3  0x00007fda865594c7 in rpc_clnt_notify (trans=0x560df2d53e50, mydata=0x560df2d53b60, event=<optimized out>, 
    data=0x560df2ea98f0) at rpc-clnt.c:922
#4  0x00007fda86555b33 in rpc_transport_notify (this=this@entry=0x560df2d53e50, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x560df2ea98f0) at rpc-transport.c:541
#5  0x00007fda7ab7f95d in socket_event_poll_in (notify_handled=true, this=0x560df2d53e50) at socket.c:2516
#6  socket_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, data=0x560df2d53e50, 
    poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2918
#7  0x00007fda86814e15 in event_dispatch_epoll_handler (event=0x7fda34ff8e70, event_pool=0x560df2d03560) at event-epoll.c:642
#8  event_dispatch_epoll_worker (data=0x7fda40054740) at event-epoll.c:756
#9  0x00007fda855eee25 in start_thread () from /usr/lib64/libpthread.so.0
#10 0x00007fda84ebb34d in clone () from /usr/lib64/libc.so.6
(gdb) f 1
#1  mgmt_pmap_signin_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7fda6802ddb8)
    at glusterfsd-mgmt.c:2824
2824	    STACK_DESTROY(frame->root);
(gdb) p frame
$1 = (call_frame_t *) 0x7fda6802ddb8
(gdb) p *frame
$2 = {root = 0x4, parent = 0x400000001, frames = {next = 0xffffffffffffffff, prev = 0x7fda6802de18}, local = 0x7fda68059478, 
  this = 0x0, ret = 0x0, ref_count = 0, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x7fda68059478}}, 
      __size = '\000' <repeats 32 times>, "x\224\005h\332\177\000", __align = 0}}, cookie = 0x0, complete = 232, op = 32730, 
  begin = {tv_sec = 0, tv_nsec = 140576024633944}, end = {tv_sec = 140576024654704, tv_nsec = 1125216510}, 
  wind_from = 0x1 <Address 0x1 out of bounds>, wind_to = 0x0, unwind_from = 0x0, unwind_to = 0x0}
(gdb) p frame->root
$3 = (call_stack_t *) 0x4
(gdb) 

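After checking the code, I found that the current glusterfs_mgmt_pmap_signin code
does not send signin requests correctly: it uses the same frame to send multiple requests.
Every reply runs mgmt_pmap_signin_cbk, which ends in STACK_DESTROY(frame->root), so any
reply after the first operates on a frame that has already been destroyed. This is
consistent with the garbage frame->root value (0x4) in the gdb output above.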

>>>>>>>
.......
.......

    if (ctx->active) {
        top = ctx->active->first;
        for (trav_p = &top->children; *trav_p; trav_p = &(*trav_p)->next) {
            req.brick = (*trav_p)->xlator->name;
            ret = mgmt_submit_request(&req, frame, ctx, &clnt_pmap_prog,
                                      GF_PMAP_SIGNIN, mgmt_pmap_signin_cbk,
                                      (xdrproc_t)xdr_pmap_signin_req);
            if (ret < 0) {
                gf_log(THIS->name, GF_LOG_WARNING,
                       "failed to send sign in request; brick = %s", req.brick);
            }
            count++;
        }
    } else {
        ret = mgmt_submit_request(&req, frame, ctx, &clnt_pmap_prog,
                                  GF_PMAP_SIGNIN, mgmt_pmap_signin_cbk,
                                  (xdrproc_t)xdr_pmap_signin_req);
    }

>>>>>>>>>>>>>
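
The shared-frame problem in the loop above can be illustrated with a small standalone sketch. It does not use the GlusterFS API: demo_frame_t, demo_signin_cbk, signin_shared_frame, signin_frame_per_request and MAX_BRICKS are all hypothetical names, and an "alive" flag stands in for STACK_DESTROY so that the stale access can be counted instead of crashing. The per-request variant shows one plausible remedy under these assumptions; the patch merged in review 22015 may differ.

```c
#include <assert.h>

#define MAX_BRICKS 8 /* hypothetical upper bound, only for this sketch */

/* Hypothetical stand-in for call_frame_t; not the GlusterFS type. */
typedef struct demo_frame {
    int alive; /* 1 until the reply callback tears the frame down */
} demo_frame_t;

/*
 * Stand-in for a reply callback that destroys its frame, as
 * mgmt_pmap_signin_cbk does via STACK_DESTROY(frame->root).
 * Returns -1 when it is handed a frame that was already destroyed;
 * the real code would dereference freed memory at that point.
 */
static int
demo_signin_cbk(demo_frame_t *frame)
{
    if (!frame->alive)
        return -1;
    frame->alive = 0; /* stands in for STACK_DESTROY */
    return 0;
}

/*
 * Problematic pattern: one frame is reused for every brick's request,
 * so every reply after the first sees a destroyed frame.
 * Returns the number of callbacks that hit a dead frame.
 */
static int
signin_shared_frame(int nbricks)
{
    demo_frame_t frame = {.alive = 1};
    int stale = 0;

    for (int i = 0; i < nbricks; i++) {
        if (demo_signin_cbk(&frame) < 0)
            stale++;
    }
    return stale;
}

/*
 * Possible remedy: give each request its own frame, so each callback
 * destroys only the frame that belongs to its request.
 */
static int
signin_frame_per_request(int nbricks)
{
    demo_frame_t frames[MAX_BRICKS];
    int stale = 0;

    assert(nbricks >= 0 && nbricks <= MAX_BRICKS);
    for (int i = 0; i < nbricks; i++)
        frames[i].alive = 1;

    for (int i = 0; i < nbricks; i++) {
        if (demo_signin_cbk(&frames[i]) < 0)
            stale++;
    }
    return stale;
}
```

With three multiplexed bricks, signin_shared_frame(3) reports two stale callbacks, while signin_frame_per_request(3) reports none. With a single brick the shared frame is harmless, which would fit the crash appearing only when brick_mux is enabled.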

Thanks,
Mohit Agrawal

Comment 2 Worker Ant 2019-01-12 05:52:58 UTC

REVIEW: https://review.gluster.org/22015 (core: glusterd/add-brick-and-validate-replicated-volume-options.t is crash) posted (#1) for review on master by MOHIT AGRAWAL

Comment 3 Worker Ant 2019-01-14 12:35:25 UTC

REVIEW: https://review.gluster.org/22015 (core: glusterd/add-brick-and-validate-replicated-volume-options.t is crash) merged (#3) on master by Amar Tumballi

Comment 4 Shyamsundar 2019-03-25 16:33:07 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/
