Description of problem:
=======================
glusterd crashed after performing rebalance and volume delete and create operations.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.4.0.44.1u2rhs

How reproducible:
================
Faced it once till now.

Steps to Reproduce:
====================
1. Create a distributed volume and start it.
2. FUSE mount the volume and create some files.
3. Add bricks and perform fix-layout multiple times and check rebalance status (with glusterd restarted a couple of times before checking rebalance status).
4. Start rebalance and check status.
5. Perform rebalance start force and check status.
6. Stop the volume and delete it.
7. Create another volume and mount it.
8. Add 2 more bricks and check rebalance status.

Below is the output. There were no nodes listed in the output:

gluster v rebalance vol1 status
Node  Rebalanced-files  size  scanned  failures  skipped  status  run time in secs
----  ----------------  ----  -------  --------  -------  ------  ----------------

--------part of glusterd log --------------------
[2013-12-03 12:29:26.906589] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 12:29:29.907000] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 12:29:29.907066] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 12:29:32.907509] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 12:29:32.907611] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-12-03 12:29:35
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.44.1u2rhs
/lib64/libc.so.6[0x30d3c32960]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/xlator/mgmt/glusterd.so(__glusterd_defrag_notify+0x1d0)[0x7f6a194fd550]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f6a194ad830]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x109)[0x7f6a1ccdf2e9]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f6a1ccdab78]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/rpc-transport/socket.so(+0x557c)[0x7f6a17d5057c]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/rpc-transport/socket.so(+0xa5b8)[0x7f6a17d555b8]
/usr/lib64/libglusterfs.so.0(+0x62327)[0x7f6a1cf4a327]
/usr/sbin/glusterd(main+0x6c7)[0x4069d7]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x30d3c1ecdd]
/usr/sbin/glusterd[0x404619]
--------------------------------------------------------------------

Actual results:

Expected results:

Additional info:
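For anyone retrying this, the Steps to Reproduce could be scripted roughly as below. This is only a sketch: the host name `server1`, the brick paths, the mount directories, the first volume name `vol0`, the number of files and the number of fix-layout rounds are all placeholders that are not in the report (only `vol1` is). It assumes the mount directories already exist, and it starts the second volume before mounting it, which the report does not state explicitly. The script only prints the commands; it does not run them.

```python
import shlex


def repro_commands(host="server1", brick_root="/bricks", mount_root="/mnt"):
    """Return the commands for the reported steps, in order, without running them."""

    def brick(vol, n):
        # Hypothetical brick layout: <host>:<brick_root>/<vol>_b<n>
        return f"{host}:{brick_root}/{vol}_b{n}"

    def gluster(*args):
        # --mode=script suppresses the interactive confirmation prompts
        return ["gluster", "--mode=script", "volume", *args]

    def status(vol):
        return gluster("rebalance", vol, "status")

    cmds = []
    # Step 1: create a distributed volume and start it
    cmds.append(gluster("create", "vol0", brick("vol0", 1), brick("vol0", 2)))
    cmds.append(gluster("start", "vol0"))
    # Step 2: FUSE mount the volume and create some files
    cmds.append(["mount", "-t", "glusterfs", f"{host}:/vol0", f"{mount_root}/vol0"])
    cmds.append(["touch"] + [f"{mount_root}/vol0/file{i}" for i in range(10)])
    # Step 3: add bricks and run fix-layout, restarting glusterd before each status check
    for n in (3, 4):
        cmds.append(gluster("add-brick", "vol0", brick("vol0", n)))
        cmds.append(gluster("rebalance", "vol0", "fix-layout", "start"))
        cmds.append(["service", "glusterd", "restart"])
        cmds.append(status("vol0"))
    # Step 4: start rebalance and check status
    cmds.append(gluster("rebalance", "vol0", "start"))
    cmds.append(status("vol0"))
    # Step 5: rebalance start force and check status
    cmds.append(gluster("rebalance", "vol0", "start", "force"))
    cmds.append(status("vol0"))
    # Step 6: stop the volume and delete it
    cmds.append(gluster("stop", "vol0"))
    cmds.append(gluster("delete", "vol0"))
    # Step 7: create another volume and mount it (started first, by assumption)
    cmds.append(gluster("create", "vol1", brick("vol1", 1), brick("vol1", 2)))
    cmds.append(gluster("start", "vol1"))
    cmds.append(["mount", "-t", "glusterfs", f"{host}:/vol1", f"{mount_root}/vol1"])
    # Step 8: add 2 more bricks and check rebalance status
    cmds.append(gluster("add-brick", "vol1", brick("vol1", 3), brick("vol1", 4)))
    cmds.append(status("vol1"))
    return cmds


if __name__ == "__main__":
    for cmd in repro_commands():
        print(shlex.join(cmd))
```

Since the crash was seen only once, running this sequence is not guaranteed to hit it.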
sosreports : http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1037597/
glusterd crashed while stopping and deleting volumes:

Steps:
======
While trying to install rpms (from glusterfs 3.4.0.44.1u2rhs to glusterfs-3.4.0.52rhs), the volumes were in the started state, so I tried to stop and delete the volumes from the other nodes, and glusterd crashed on one of the nodes.

(gdb) bt
#0 __glusterd_defrag_notify (rpc=0x15b2970, mydata=0x14fdc20, event=RPC_CLNT_CONNECT, data=<value optimized out>) at glusterd-rebalance.c:119
#1 0x00007f81df8ac830 in glusterd_big_locked_notify (rpc=0x15b2970, mydata=0x14fdc20, event=RPC_CLNT_CONNECT, data=0x0, notify_fn=0x7f81df8fc380 <__glusterd_defrag_notify>) at glusterd-handler.c:66
#2 0x0000003701a0f2e9 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x15b29a0, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:937
#3 0x0000003701a0ab78 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
#4 0x00007f81de14f57c in socket_connect_finish (this=0x15a90e0) at socket.c:2192
#5 0x00007f81de1545b8 in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x15a90e0, poll_in=0, poll_out=4, poll_err=16) at socket.c:2222
#6 0x0000003701662327 in event_dispatch_epoll_handler (event_pool=0x14d8ee0) at event-epoll.c:384
#7 event_dispatch_epoll (event_pool=0x14d8ee0) at event-epoll.c:445
#8 0x00000000004069d7 in main (argc=2, argv=0x7fffcceffaa8) at glusterfsd.c:2048
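To compare backtraces like the one above across core files, a small parser can pull out the function, source file and line of each frame. This is a rough sketch, not a general gdb parser: the regular expression and the names `FRAME_RE`, `SAMPLE` and `parse_gdb_frames` are my own, and it assumes frames of the form `#N [0xADDR in] func (args) at file:line` with no nested parentheses in the arguments, which holds for this trace. `SAMPLE` is the first two frames copied from the backtrace.

```python
import re

# Matches frames such as:
#   #1 0x00007f81df8ac830 in glusterd_big_locked_notify (...) at glusterd-handler.c:66
# The address part is optional (frames #0 and #7 in the trace have none).
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+"
    r"(?:(?P<addr>0x[0-9a-f]+) in )?"
    r"(?P<func>\w+) \((?P<args>.*?)\) at (?P<file>[\w.-]+):(?P<line>\d+)"
)

# First two frames of the backtrace in this comment.
SAMPLE = (
    "#0 __glusterd_defrag_notify (rpc=0x15b2970, mydata=0x14fdc20, "
    "event=RPC_CLNT_CONNECT, data=<value optimized out>) at glusterd-rebalance.c:119 "
    "#1 0x00007f81df8ac830 in glusterd_big_locked_notify (rpc=0x15b2970, "
    "mydata=0x14fdc20, event=RPC_CLNT_CONNECT, data=0x0, "
    "notify_fn=0x7f81df8fc380 <__glusterd_defrag_notify>) at glusterd-handler.c:66"
)


def parse_gdb_frames(text):
    """Return one dict per frame found in text, in order of appearance."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append(
            {
                "index": int(m.group("index")),
                "addr": m.group("addr"),  # None when gdb printed no address
                "func": m.group("func"),
                "file": m.group("file"),
                "line": int(m.group("line")),
            }
        )
    return frames


if __name__ == "__main__":
    for frame in parse_gdb_frames(SAMPLE):
        print(frame["index"], frame["func"], f'{frame["file"]}:{frame["line"]}')
```

The top frame here is `__glusterd_defrag_notify`, the same function that appears directly under the signal handler frame in the log backtrace from the original description.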
core file and glusterd logs for comment 3 : http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1037597_26dec/
Cloning this to 3.1, to be fixed in a future release.