Description of problem:
Crashes have a backtrace pattern similar to this:

(gdb) bt
#0  0x00000035d7235965 in raise () from /lib64/libc.so.6
#1  0x00000035d7237118 in abort () from /lib64/libc.so.6
#2  0x00000035d722e6e2 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000035d722e792 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f094c2d26e0 in afr_inode_set_ctx (this=0x8b8530, inode=0x0, params=0x7fff40e41320) at afr-common.c:399
#5  0x00007f094c2d2aac in afr_inode_set_read_ctx (this=0x8b8530, inode=0x0, read_child=0, fresh_children=0x7f09400017e0) at afr-common.c:491
#6  0x00007f094c2d2f68 in afr_set_read_ctx_from_policy (this=0x8b8530, inode=0x0, fresh_children=0x7f09400017e0, prev_read_child=0, config_read_child=0, gfid=0x7f0946ce268c "\252U\260N\241{OnjG؊\224\236\242\264\002\b") at afr-common.c:661
#7  0x00007f094c281bee in afr_create_wind_cbk (frame=0x7f094f1587e4, cookie=0x1, this=0x8b8530, op_ret=-1, op_errno=12, fd=0x0, inode=0x0, buf=0x0, preparent=0x0, postparent=0x0, xdata=0x0) at afr-dir-write.c:208
#8  0x00007f094c51dbd4 in client3_3_create_cbk (req=0x7f0946bc4a60, iov=0x7f0946bc4aa0, count=1, myframe=0x7f094f36158c) at client-rpc-fops.c:2054
#9  0x00007f095031ba44 in rpc_clnt_handle_reply (clnt=0x8e5380, pollin=0x9192f0) at rpc-clnt.c:784
#10 0x00007f095031bdb9 in rpc_clnt_notify (trans=0x8f4ea0, mydata=0x8e53b0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x9192f0) at rpc-clnt.c:903
#11 0x00007f0950317f4b in rpc_transport_notify (this=0x8f4ea0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x9192f0) at rpc-transport.c:495
#12 0x00007f094d177e21 in socket_event_poll_in (this=0x8f4ea0) at socket.c:1964
#13 0x00007f094d178391 in socket_event_handler (fd=10, idx=2, data=0x8f4ea0, poll_in=1, poll_out=0, poll_err=0)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
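The backtrace shows afr_create_wind_cbk (frame 7) invoked with op_ret=-1, op_errno=12 (ENOMEM) and inode=0x0, and that NULL inode being passed through afr_set_read_ctx_from_policy into afr_inode_set_ctx, whose assert then aborts the process. Below is a self-contained sketch of the failure and of the kind of guard that avoids it. This is an illustration only, not the actual patch posted for this bug; create_cbk and the stub inode_t are stand-ins, with the argument values taken from frame 7:

#include <assert.h>
#include <stdio.h>

typedef struct inode { int dummy; } inode_t;

static void
afr_inode_set_read_ctx (inode_t *inode, int read_child)
{
        /* Mirrors the assert at afr-common.c:399 in the backtrace. */
        assert (inode != NULL);
        (void) inode;
        (void) read_child;
}

static void
create_cbk (int op_ret, int op_errno, inode_t *inode)
{
        /* Guard: a failed create (op_ret < 0) carries no inode, so
         * updating the read context would trip the assert above. */
        if (op_ret < 0 || inode == NULL) {
                fprintf (stderr, "create failed (op_errno %d); "
                         "skipping read-ctx update\n", op_errno);
                return;
        }
        afr_inode_set_read_ctx (inode, 0);
}

int
main (void)
{
        /* Simulate the failing response from frame 7 of the backtrace:
         * op_ret = -1, op_errno = 12 (ENOMEM), inode = NULL. */
        create_cbk (-1, 12, NULL);
        return 0;
}

With the guard in place, the failed response is logged and the callback returns instead of aborting on the assert.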
http://review.gluster.org/3760
Pranith, can you update the bug with steps to reproduce?
Rahul, this bug is not easy to recreate, not even with error-gen. I tested it by putting a breakpoint in the response code path and mangling the variables of the second response to simulate the failure code path. Pranith.
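For the record, the simulation described above could look roughly like the following gdb session. The breakpoint location is taken from frame 8 of the backtrace; the rsp.op_ret / rsp.op_errno locals are an assumption about how client3_3_create_cbk names its decoded response:

(gdb) break client-rpc-fops.c:2054
(gdb) continue
   ... let the first brick's create response through; on the second hit:
(gdb) set variable rsp.op_ret = -1
(gdb) set variable rsp.op_errno = 12
(gdb) continue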