Bug 1398226 - With compound fops on, client process crashes when a replica is brought down while IO is in progress
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Krutika Dhananjay
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1398331 1398333 1398499
 
Reported: 2016-11-24 10:13 UTC by Krutika Dhananjay
Modified: 2017-03-06 17:36 UTC (History)

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1398331 1398333 1398499
Environment:
Last Closed: 2017-03-06 17:36:17 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Krutika Dhananjay 2016-11-24 10:13:46 UTC
Description of problem:


(gdb) bt
#0  0x00007f976ed9169d in afr_pre_op_writev_cbk (frame=0x7f97601255dc, cookie=0x0, this=0x7f976800f860, op_ret=-1, op_errno=107, data=0x0, xdata=0x0)
    at afr-transaction.c:1252
#1  0x00007f976f022d04 in client3_3_compound_cbk (req=0x7f976017bc9c, iov=0x7f976fa53770, count=1, myframe=0x7f97600c11ec) at client-rpc-fops.c:3213
#2  0x00007f977c3764cb in saved_frames_unwind (saved_frames=0x7f9764000bd0) at rpc-clnt.c:369
#3  0x00007f977c376563 in saved_frames_destroy (frames=0x7f9764000bd0) at rpc-clnt.c:386
#4  0x00007f977c376a7c in rpc_clnt_connection_cleanup (conn=0x7f9768060790) at rpc-clnt.c:555
#5  0x00007f977c377523 in rpc_clnt_notify (trans=0x7f9768060bc0, mydata=0x7f9768060790, event=RPC_TRANSPORT_DISCONNECT, data=0x7f9768060bc0) at rpc-clnt.c:901
#6  0x00007f977c373a27 in rpc_transport_notify (this=0x7f9768060bc0, event=RPC_TRANSPORT_DISCONNECT, data=0x7f9768060bc0) at rpc-transport.c:537
#7  0x00007f977151788d in socket_event_poll_err (this=0x7f9768060bc0) at socket.c:1177
#8  0x00007f977151c23d in socket_event_handler (fd=14, idx=3, data=0x7f9768060bc0, poll_in=1, poll_out=0, poll_err=24) at socket.c:2402
#9  0x00007f977c61c323 in event_dispatch_epoll_handler (event_pool=0x1488010, event=0x7f976fa53f20) at event-epoll.c:571
#10 0x00007f977c61c702 in event_dispatch_epoll_worker (data=0x14cb7c0) at event-epoll.c:674
#11 0x00007f977b6025ca in start_thread () from /lib64/libpthread.so.0
#12 0x00007f977aedc0ed in clone () from /lib64/libc.so.6
(gdb) f 1
#1  0x00007f976f022d04 in client3_3_compound_cbk (req=0x7f976017bc9c, iov=0x7f976fa53770, count=1, myframe=0x7f97600c11ec) at client-rpc-fops.c:3213
3213	        CLIENT_STACK_UNWIND (compound, frame, rsp.op_ret,
(gdb) p req->rpc_status
$3 = -1
(gdb) f 0
#0  0x00007f976ed9169d in afr_pre_op_writev_cbk (frame=0x7f97601255dc, cookie=0x0, this=0x7f976800f860, op_ret=-1, op_errno=107, data=0x0, xdata=0x0)
    at afr-transaction.c:1252
1252	        write_args_cbk = &args_cbk->rsp_list[1];
(gdb) p args_cbk
$4 = (compound_args_cbk_t *) 0x0
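The trace above shows the callback being unwound with op_ret=-1 and op_errno=107 (ENOTCONN on Linux) and a NULL data pointer, which is then indexed at afr-transaction.c:1252. Below is a minimal sketch of the kind of guard that avoids the dereference. It is not the committed patch: the `_sketch` types and the function are hypothetical, simplified stand-ins for `compound_args_cbk_t` and `afr_pre_op_writev_cbk`, and the `fop_length` field is assumed here only to bound the index.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one per-fop reply. */
typedef struct {
        int op_ret;
        int op_errno;
} default_args_cbk_sketch_t;

/* Hypothetical, simplified stand-in for compound_args_cbk_t. */
typedef struct {
        int                        fop_length; /* assumed: number of replies */
        default_args_cbk_sketch_t *rsp_list;
} compound_args_cbk_sketch_t;

/* Returns 0 when the write reply is usable, a negative errno otherwise. */
static int
pre_op_writev_cbk_sketch (int op_ret, int op_errno, void *data)
{
        compound_args_cbk_sketch_t *args_cbk       = data;
        default_args_cbk_sketch_t  *write_args_cbk = NULL;

        /* The compound fop itself failed (e.g. the brick disconnected):
         * no reply list exists, so bail out before touching data. */
        if (op_ret < 0)
                return (op_errno > 0) ? -op_errno : -EIO;

        /* Defensive checks against a missing or short reply list. */
        if (args_cbk == NULL || args_cbk->rsp_list == NULL ||
            args_cbk->fop_length < 2)
                return -EINVAL;

        /* Only now is it safe to take the second reply (the write). */
        write_args_cbk = &args_cbk->rsp_list[1];
        if (write_args_cbk->op_ret < 0)
                return -write_args_cbk->op_errno;

        return 0;
}
```

With the arguments from frame 0 (op_ret=-1, op_errno=107, data=0x0), this sketch returns -ENOTCONN instead of crashing.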




Comment 1 Worker Ant 2016-11-24 13:07:09 UTC
REVIEW: http://review.gluster.org/15924 (cluster/afr: Handle rpc errors, xdr failures etc with proper NULL checks) posted (#1) for review on master by Krutika Dhananjay (kdhananj)

Comment 2 Worker Ant 2016-11-25 03:03:24 UTC
COMMIT: http://review.gluster.org/15924 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 3a5169907b44d79e207c35941b1973b1f60d2079
Author: Krutika Dhananjay <kdhananj>
Date:   Thu Nov 24 18:36:28 2016 +0530

    cluster/afr: Handle rpc errors, xdr failures etc with proper NULL checks
    
    Change-Id: Id8ba76ba116d056bc7299dc5ce0980680a5a23f8
    BUG: 1398226
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/15924
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 3 Shyamsundar 2017-03-06 17:36:17 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

