Bug 844689

Summary: Afr crashes if the last response fails for create, mknod, mkdir, link, symlink
Product: [Community] GlusterFS
Reporter: Pranith Kumar K <pkarampu>
Component: replicate
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE
QA Contact: Rahul Hinduja <rhinduja>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.3.0
CC: gluster-bugs, jdarcy, sdharane
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 17:30:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Pranith Kumar K 2012-07-31 12:06:20 UTC
Description of problem:
The crashes share a pattern similar to the following backtrace:
(gdb) bt
#0  0x00000035d7235965 in raise () from /lib64/libc.so.6
#1  0x00000035d7237118 in abort () from /lib64/libc.so.6
#2  0x00000035d722e6e2 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000035d722e792 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f094c2d26e0 in afr_inode_set_ctx (this=0x8b8530, inode=0x0, params=0x7fff40e41320)
    at afr-common.c:399
#5  0x00007f094c2d2aac in afr_inode_set_read_ctx (this=0x8b8530, inode=0x0, read_child=0, 
    fresh_children=0x7f09400017e0) at afr-common.c:491
#6  0x00007f094c2d2f68 in afr_set_read_ctx_from_policy (this=0x8b8530, inode=0x0, 
    fresh_children=0x7f09400017e0, prev_read_child=0, config_read_child=0, 
    gfid=0x7f0946ce268c "\252U\260N\241{OnjG؊\224\236\242\264\002\b") at afr-common.c:661
#7  0x00007f094c281bee in afr_create_wind_cbk (frame=0x7f094f1587e4, cookie=0x1, this=0x8b8530, op_ret=-1, 
    op_errno=12, fd=0x0, inode=0x0, buf=0x0, preparent=0x0, postparent=0x0, xdata=0x0) at afr-dir-write.c:208
#8  0x00007f094c51dbd4 in client3_3_create_cbk (req=0x7f0946bc4a60, iov=0x7f0946bc4aa0, count=1, 
    myframe=0x7f094f36158c) at client-rpc-fops.c:2054
#9  0x00007f095031ba44 in rpc_clnt_handle_reply (clnt=0x8e5380, pollin=0x9192f0) at rpc-clnt.c:784
#10 0x00007f095031bdb9 in rpc_clnt_notify (trans=0x8f4ea0, mydata=0x8e53b0, event=RPC_TRANSPORT_MSG_RECEIVED, 
    data=0x9192f0) at rpc-clnt.c:903
#11 0x00007f0950317f4b in rpc_transport_notify (this=0x8f4ea0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x9192f0)
    at rpc-transport.c:495
#12 0x00007f094d177e21 in socket_event_poll_in (this=0x8f4ea0) at socket.c:1964
#13 0x00007f094d178391 in socket_event_handler (fd=10, idx=2, data=0x8f4ea0, poll_in=1, poll_out=0, poll_err=0)


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Jeff Darcy 2012-10-31 20:59:02 UTC
http://review.gluster.org/3760

Comment 2 Rahul Hinduja 2012-11-02 06:54:40 UTC
Pranith,

Can you update the bug with steps to reproduce?

Comment 3 Pranith Kumar K 2012-11-02 08:54:02 UTC
Rahul,
   This bug is not easy to recreate, not even with error-gen. I tested it by putting a breakpoint in the response code path and mangling the variables of the second response to simulate the failure code path.

Pranith.