Description of problem:
=======================
Rebooting a node while the scheduler was creating snapshots on a volume which has bit rot enabled resulted in a glusterd crash.

Version-Release number of selected component (if applicable):
=============================================================
gluster --version
glusterfs 3.7.0beta1 built on May 1 2015

How reproducible:
================
1/1

Steps to Reproduce:
===================
1. Create a dist-rep volume, a disperse volume (4 redundant bricks) and a distribute volume
2. Enable USS, quota and bitrot on all the volumes
3. Add a job which creates snapshots every 5 mins on the volumes
4. While it is in progress, reboot rhs-arch-srv2.lab.eng.blr.redhat.com
5. Check gluster status after the node comes back.

2015-05-05 07:16:54
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.0beta1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x3b7c821e16]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x3b7c83db2f]
/lib64/libc.so.6[0x3a964326a0]
/usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so(glusterd_bitdsvc_manager+0x56)[0x7f31e44b34c6]
/usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so(glusterd_svcs_manager+0xdd)[0x7f31e44b1bed]
/usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so(glusterd_compare_friend_data+0x332)[0x7f31e44276f2]
/usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so(+0x4f9f3)[0x7f31e43fe9f3]
/usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x170)[0x7f31e43ff6e0]
/usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so(__glusterd_handle_incoming_friend_req+0x232)[0x7f31e43fdad2]
/usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7f31e43e4ebf]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x295)[0x3b7cc09c85]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103)[0x3b7cc09ec3]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x3b7cc0b7b8]
/usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so(+0x9bcd)[0x7f31e3411bcd]
/usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so(+0xb6fd)[0x7f31e34136fd]
/usr/lib64/libglusterfs.so.0[0x3b7c87d4b0]
/lib64/libpthread.so.0[0x3a968079d1]
/lib64/libc.so.6(clone+0x6d)[0x3a964e89dd]

bt:
===
Core was generated by `/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f31e44b34c6 in glusterd_bitdsvc_manager () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
Missing separate debuginfos, use: debuginfo-install glusterfs-3.7.0beta1-0.3.git7aeae00.el6.x86_64
(gdb) bt
#0  0x00007f31e44b34c6 in glusterd_bitdsvc_manager () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#1  0x00007f31e44b1bed in glusterd_svcs_manager () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#2  0x00007f31e44276f2 in glusterd_compare_friend_data () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#3  0x00007f31e43fe9f3 in ?? () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#4  0x00007f31e43ff6e0 in glusterd_friend_sm () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#5  0x00007f31e43fdad2 in __glusterd_handle_incoming_friend_req () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#6  0x00007f31e43e4ebf in glusterd_big_locked_handler () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#7  0x0000003b7cc09c85 in rpcsvc_handle_rpc_call () from /usr/lib64/libgfrpc.so.0
#8  0x0000003b7cc09ec3 in rpcsvc_notify () from /usr/lib64/libgfrpc.so.0
#9  0x0000003b7cc0b7b8 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
#10 0x00007f31e3411bcd in ?? () from /usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so
#11 0x00007f31e34136fd in ?? () from /usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so
#12 0x0000003b7c87d4b0 in ?? () from /usr/lib64/libglusterfs.so.0
#13 0x0000003a968079d1 in start_thread () from /lib64/libpthread.so.0
#14 0x0000003a964e89dd in clone () from /lib64/libc.so.6

Actual results:

Expected results:

Additional info:
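
For anyone trying to reproduce this, steps 2 and 3 above can be sketched as a dry-run script that only prints the commands. This is a rough sketch, not the exact commands used in the original run: the volume names and job names are hypothetical, and it assumes the snapshot scheduler (snap_scheduler.py) has already been initialised and enabled on the nodes.

```python
import shlex

# Hypothetical volume names; the report does not give the real ones.
VOLUMES = ["distrep_vol", "disperse_vol", "dist_vol"]


def repro_commands(volumes):
    """Return the commands for steps 2 and 3 as argv lists, without running them."""
    cmds = []
    for vol in volumes:
        # Step 2: enable USS, quota and bitrot on each volume.
        cmds.append(["gluster", "volume", "set", vol, "features.uss", "enable"])
        cmds.append(["gluster", "volume", "quota", vol, "enable"])
        cmds.append(["gluster", "volume", "bitrot", vol, "enable"])
        # Step 3: schedule a snapshot every 5 minutes (cron syntax).
        # The job name "snap_<volume>" is made up for this sketch.
        cmds.append(["snap_scheduler.py", "add", f"snap_{vol}", "*/5 * * * *", vol])
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before use.
    for cmd in repro_commands(VOLUMES):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the script safe to run anywhere; the output can be pasted into a shell on a test cluster once the names are adjusted.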
Version :
=======
gluster --version
glusterfs 3.7.0beta1 built on May 1 2015

Faced a similar crash while attaching another node to the cluster where volumes had bit rot enabled.

[root@inception ~]# gluster peer probe snapshot11.lab.eng.blr.redhat.com
peer probe: failed: Probe returned with unknown errno -1

(gdb) bt
#0  0x00007f750ca2a4c6 in glusterd_bitdsvc_manager () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#1  0x00007f750ca28bed in glusterd_svcs_manager () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#2  0x00007f750c99e6f2 in glusterd_compare_friend_data () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#3  0x00007f750c9759f3 in ?? () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#4  0x00007f750c9766e0 in glusterd_friend_sm () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#5  0x00007f750c974ad2 in __glusterd_handle_incoming_friend_req () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#6  0x00007f750c95bebf in glusterd_big_locked_handler () from /usr/lib64/glusterfs/3.7.0beta1/xlator/mgmt/glusterd.so
#7  0x00000031f6c09c85 in rpcsvc_handle_rpc_call () from /usr/lib64/libgfrpc.so.0
#8  0x00000031f6c09ec3 in rpcsvc_notify () from /usr/lib64/libgfrpc.so.0
#9  0x00000031f6c0b7b8 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
#10 0x00007f750bbd9bcd in ?? () from /usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so
#11 0x00007f750bbdb6fd in ?? () from /usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so
#12 0x00000031f647d4b0 in ?? () from /usr/lib64/libglusterfs.so.0
#13 0x00000032284079d1 in start_thread () from /lib64/libpthread.so.0
#14 0x00000032280e88fd in clone () from /lib64/libc.so.6
Gaurav, Mind having a look at this?
Patch http://review.gluster.org/#/c/10664/ should fix this issue, hence moving the status of the bug to closed. If you find the issue again, please reopen this bug.