Description of problem:
=======================
bitd crashed.

Version-Release number of selected component (if applicable):
=============================================================
0.803.gitf64666f.el6.x86_64

How reproducible:
=================
Intermittent

Steps to Reproduce:
===================
Exact steps to reproduce are not known.

1. Created 2 volumes in the cluster and enabled bitrot on both.

[root@rhs-client37 ~]# gluster v info BitRot1

Volume Name: BitRot1
Type: Distributed-Replicate
Volume ID: a311984b-5978-4041-91fd-be627c616bea
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: rhs-client44:/pavanbrick6/br1
Brick2: rhs-client44:/pavanbrick6/br2
Brick3: rhs-client44:/pavanbrick6/br3
Brick4: rhs-client44:/pavanbrick6/br4
Brick5: rhs-client44:/pavanbrick6/br5
Brick6: rhs-client44:/pavanbrick6/br6
Options Reconfigured:
features.bitrot: on
performance.open-behind: off

[root@rhs-client37 ~]# gluster v info rac1

Volume Name: rac1
Type: Distribute
Volume ID: d462f6c7-809f-4eb1-9517-7947527c5415
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: rhs-client44:/pavanbrick7/r1
Brick2: rhs-client37:/pavanbrick7/r1
Brick3: rhs-client38:/pavanbrick7/r1
Options Reconfigured:
features.bitrot: on

2. Created a few files; while verifying bitrot functionality, found that bitd had crashed on one of the machines (rhs-client44).

Actual results:
===============
bitd crashed

Additional info:
================
bt:-

#0  0x00007f63fc558434 in gf_changelog_reborp_rpcsvc_notify (rpc=<value optimized out>, mydata=0x7f63c40016e0, event=<value optimized out>, data=<value optimized out>) at gf-changelog-reborp.c:161
161             gf_log (this->name, GF_LOG_WARNING, "failed to unlink "
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.149.el6_6.5.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-33.el6.x86_64 libcom_err-1.41.12-21.el6.x86_64 libgcc-4.4.7-11.el6.x86_64 libselinux-2.0.94-5.8.el6.x86_64 openssl-1.0.1e-30.el6_6.5.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x00007f63fc558434 in gf_changelog_reborp_rpcsvc_notify (rpc=<value optimized out>, mydata=0x7f63c40016e0, event=<value optimized out>, data=<value optimized out>) at gf-changelog-reborp.c:161
#1  0x0000003712c09ea4 in rpcsvc_program_notify (trans=<value optimized out>, mydata=<value optimized out>, event=<value optimized out>, data=0x7f63b8039900) at rpcsvc.c:327
#2  rpcsvc_accept (trans=<value optimized out>, mydata=<value optimized out>, event=<value optimized out>, data=0x7f63b8039900) at rpcsvc.c:350
#3  rpcsvc_notify (trans=<value optimized out>, mydata=<value optimized out>, event=<value optimized out>, data=0x7f63b8039900) at rpcsvc.c:775
#4  0x0000003712c0b7c8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:543
#5  0x00007f63fd7d032e in socket_server_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x7f63c4008a60, poll_in=<value optimized out>, poll_out=<value optimized out>, poll_err=<value optimized out>) at socket.c:2820
#6  0x000000371247cee0 in event_dispatch_epoll_handler (data=0x7f63f8002250) at event-epoll.c:572
#7  event_dispatch_epoll_worker (data=0x7f63f8002250) at event-epoll.c:674
#8  0x0000003ac86079d1 in start_thread () from /lib64/libpthread.so.0
#9  0x0000003ac7ee88fd in clone () from /lib64/libc.so.6

log snippet:-

[2015-03-26 13:55:57.601516] E [socket.c:823:__socket_server_bind] 0-socket.gfchangelog: binding to failed: Address already in use
[2015-03-26 13:55:57.601534] E [socket.c:826:__socket_server_bind] 0-socket.gfchangelog: Port is already in use
[2015-03-26 13:55:57.601546] W [rpcsvc.c:1583:rpcsvc_transport_create] 0-rpc-service: listening on transport failed
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 7
time of crash:
2015-03-26 13:55:57
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7dev
[2015-03-26 13:55:57.601685] E [gf-changelog.c:543:gf_changelog_register_generic] 0-gfchangelog: Error registering with changelog xlator
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
[2015-03-26 13:55:57.601873] E [bit-rot.c:1018:br_enact_signer] 0-bit-rot: Register to changelog failed [Reason: Address already in use]
[2015-03-26 13:55:57.602010] E [bit-rot.c:1166:br_handle_events] 0-bit-rot: failed to connect to the child (subvolume: BitRot1-client-0)
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f92d80e1126]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f92d80fcd5f]
/lib64/libc.so.6[0x3ac7e326a0]
/usr/lib64/libgfchangelog.so.0(gf_changelog_reborp_rpcsvc_notify+0xd4)[0x7f92ccc7a434]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0xa4)[0x7f92d7eaeea4]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f92d7eb07c8]
/usr/lib64/glusterfs/3.7dev/rpc-transport/socket.so(+0x832e)[0x7f92cdef232e]
/usr/lib64/libglusterfs.so.0(+0x7cee0)[0x7f92d813bee0]
/lib64/libpthread.so.0[0x3ac86079d1]
/lib64/libc.so.6(clone+0x6d)[0x3ac7ee88fd]
[2015-03-27 06:03:28.691590] I [MSGID: 100030] [glusterfsd.c:2288:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7dev (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/bitd -p /var/lib/glusterd/bitd/run/bitd.pid -l /var/log/glusterfs/bitd.log -S /var/run/gluster/a2f42fcde27fccc09d3f5318ab8b9ed2.socket)
[2015-03-27 06:03:28.868332] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-03-27 06:03:30.205733] I [bit-rot.c:1419:init] 0-bit-rot: bit-rot xlator loaded in "SIGNER" mode
[2015-03-27 06:03:30.205955] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-03-27 06:03:30.208236] I [client.c:2389:notify] 0-BitRot1-client-0: parent translators are ready, attempting connect on transport
[2015-03-27 06:03:30.212406] I [client.c:2389:notify] 0-BitRot1-client-1: parent translators are ready, attempting connect on transport
[2015-03-27 06:03:30.212669] I [rpc-clnt.c:1806:rpc_clnt_reconfig] 0-BitRot1-client-0: changing port to 49154 (from 0)
[2015-03-27 06:03:30.216228] I [client.c:2389:notify] 0-BitRot1-client-2: parent translators are ready, attempting connect on transport
[2015-03-27 06:03:30.219695] I [client-handshake.c:1414:select_server_supported_programs] 0-BitRot1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-03-27 06:03:30.219788] I [rpc-clnt.c:1806:rpc_clnt_reconfig] 0-BitRot1-client-1: changing port to 49155 (from 0)
[2015-03-27 06:03:30.222783] I [client.c:2389:notify] 0-BitRot1-client-3: parent translators are ready, attempting connect on transport
[2015-03-27 06:03:30.226078] I [client-handshake.c:1202:client_setvolume_cbk] 0-BitRot1-client-0: Connected to BitRot1-client-0, attached to remote volume '/pavanbrick6/br1'.
[2015-03-27 06:03:30.226102] I [client-handshake.c:1212:client_setvolume_cbk] 0-BitRot1-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-03-27 06:03:30.227889] I [client-handshake.c:187:client_set_lk_version_cbk] 0-BitRot1-client-0: Server lk version = 1
[2015-03-27 06:03:30.228003] I [client-handshake.c:1414:select_server_supported_programs] 0-BitRot1-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-03-27 06:03:30.228077] I [rpc-clnt.c:1806:rpc_clnt_reconfig] 0-BitRot1-client-2: changing port to 49156 (from 0)
[2015-03-27 06:03:30.229337] I [client.c:2389:notify] 0-BitRot1-client-4: parent translators are ready, attempting connect on transport
[2015-03-27 06:03:30.232713] I [client-handshake.c:1202:client_setvolume_cbk] 0-BitRot1-client-1: Connected to BitRot1-client-1, attached to remote volume '/pavanbrick6/br2'.
[2015-03-27 06:03:30.232746] I [client-handshake.c:1212:client_setvolume_cbk] 0-BitRot1-client-1: Server and Client lk-version numbers are not same, reopening the fds

Not seen this in a long time now.