Created attachment 954336 [details]
Log file for brick daemon

Description of problem:
Brick process crashed.

Version-Release number of selected component (if applicable):

How reproducible:
First time I've seen this and restarting glusterd brought the brick back online

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Log file attached
I [socket.c:3134:socket_submit_reply] 0-tcp.data-server: not connected (priv->connected = -1)
E [rpcsvc.c:1258:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x94c32, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.data-server)
E [server.c:190:server_submit_reply] (-->/usr/lib64/glusterfs/3.5.2/xlator/features/marker.so(marker_lookup_cbk+0x10e) [0x7f69045e814e] (-->/usr/lib64/glusterfs/3.5.2/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x113) [0x7f69041b1c63] (-->/usr/lib64/glusterfs/3.5.2/xlator/protocol/server.so(server_lookup_cbk+0x34d) [0x7f68ffdf0e5d]))) 0-: Reply submission failed
E [server-helpers.c:381:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103) [0x7f690be609e3] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x295) [0x7f690be607a5] (-->/usr/lib64/glusterfs/3.5.2/xlator/protocol/server.so(server3_3_lookup+0x9d) [0x7f68ffdf147d]))) 0-server: invalid argument: client
E [rpcsvc.c:547:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
I [server-handshake.c:575:server_setvolume] 0-data-server: accepted client from delta30-2273-2014/11/05-00:09:22:351963-data-client-13-0 (version: 3.5.1)
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection delta30-2273-2014/11/05-00:09:22:351963-data-client-13-0
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection delta30-2273-2014/11/05-00:09:22:351963-data-client-13-0
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection pÔ
pending frames:
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(14)
frame : type(0) op(27)
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2014-11-06 06:01:26
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.2
/lib64/libc.so.6(+0x329a0)[0x7f690a6aa9a0]
/lib64/libc.so.6(gsignal+0x35)[0x7f690a6aa925]
/lib64/libc.so.6(abort+0x175)[0x7f690a6ac105]
/lib64/libc.so.6(+0x70837)[0x7f690a6e8837]
/lib64/libc.so.6(+0x76166)[0x7f690a6ee166]
/lib64/libc.so.6(+0x79f1f)[0x7f690a6f1f1f]
/lib64/libc.so.6(__libc_malloc+0x71)[0x7f690a6f2991]
/lib64/libc.so.6(xdr_bytes+0xf0)[0x7f690a790180]
/usr/lib64/libgfxdr.so.0(xdr_gfs3_readdirp_req+0x7d)[0x7f690bc4ab2d]
/usr/lib64/libgfxdr.so.0(xdr_to_generic+0x73)[0x7f690bc48aa3]
/usr/lib64/glusterfs/3.5.2/xlator/protocol/server.so(server3_3_readdirp+0x65)[0x7f68ffdda885]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x295)[0x7f690be607a5]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103)[0x7f690be609e3]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f690be62328]
/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(+0x8fb5)[0x7f69076c0fb5]
/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(+0xa9fd)[0x7f69076c29fd]
/usr/lib64/libglusterfs.so.0(+0x67cd7)[0x7f690c0d7cd7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f690a696d1d]
/usr/sbin/glusterfsd[0x404679]
This might be related to the name of the client in this message: [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection
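To check how often the client name in that message is missing or corrupted, the brick log can be scanned for "Shutting down connection" entries. The following is only a rough sketch: it assumes one log entry per line, the helper name `classify_client_id` is hypothetical, and treating any non-printable or non-ASCII byte as "garbled" is a heuristic of this sketch, not something GlusterFS defines.

```python
import re

MARKER = "Shutting down connection"

def classify_client_id(line):
    """Classify the client id that follows the gf_client_unref message.

    Returns None if the line is not a 'Shutting down connection' entry,
    otherwise one of 'ok', 'empty' or 'garbled'.
    """
    idx = line.find(MARKER)
    if idx < 0:
        return None
    client_id = line[idx + len(MARKER):].strip()
    if not client_id:
        return "empty"
    # Heuristic (assumption): a sane client id is printable ASCII only.
    if re.fullmatch(r"[\x21-\x7e]+", client_id):
        return "ok"
    return "garbled"

# Sample entries copied from the log in this report; "p\u00d4" is the
# two-character debris that appears after the message in the later entries.
sample = [
    "I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection "
    "delta30-2273-2014/11/05-00:09:22:351963-data-client-13-0",
    "I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection",
    "I [client_t.c:417:gf_client_unref] 0-data-server: Shutting down connection p\u00d4",
]

for entry in sample:
    print(classify_client_id(entry))
```

In the log above, the name goes from a valid client id, to empty, to garbage bytes, which would be consistent with the client structure being used after it was freed, though the log alone does not prove that.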
Do you have the core file from this crash? Thanks.
REVIEW: http://review.gluster.org/9096 (libglusterfs: brick crashed after failing to send RPC reply, client_t) posted (#1) for review on release-3.5 by Kaleb KEITHLEY (kkeithle)
core file is 20M compressed, here is the backtrace.
#0 0x00007f690a6aa925 in raise () from /lib64/libc.so.6
#1 0x00007f690a6ac105 in abort () from /lib64/libc.so.6
#2 0x00007f690a6e8837 in __libc_message () from /lib64/libc.so.6
#3 0x00007f690a6ee166 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f690a6f1f1f in _int_malloc () from /lib64/libc.so.6
#5 0x00007f690a6f2991 in malloc () from /lib64/libc.so.6
#6 0x00007f690a790180 in xdr_bytes_internal () from /lib64/libc.so.6
#7 0x00007f690bc4ab2d in xdr_gfs3_readdirp_req () from /usr/lib64/libgfxdr.so.0
#8 0x00007f690bc48aa3 in xdr_to_generic () from /usr/lib64/libgfxdr.so.0
#9 0x00007f68ffdda885 in server3_3_readdirp () from /usr/lib64/glusterfs/3.5.2/xlator/protocol/server.so
#10 0x00007f690be607a5 in rpcsvc_handle_rpc_call () from /usr/lib64/libgfrpc.so.0
#11 0x00007f690be609e3 in rpcsvc_notify () from /usr/lib64/libgfrpc.so.0
#12 0x00007f690be62328 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
#13 0x00007f69076c0fb5 in ?? () from /usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so
#14 0x00007f69076c29fd in ?? () from /usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so
#15 0x00007f690c0d7cd7 in ?? () from /usr/lib64/libglusterfs.so.0
#16 0x00000000004075e4 in main ()
Same brick has crashed again.
#0 0x00007f4322aee3a0 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1 0x00007f431c92f4b6 in pl_inodelk_client_cleanup () from /usr/lib64/glusterfs/3.5.2/xlator/features/locks.so
#2 0x00007f431c924eea in ?? () from /usr/lib64/glusterfs/3.5.2/xlator/features/locks.so
#3 0x00007f431c924f2a in ?? () from /usr/lib64/glusterfs/3.5.2/xlator/features/locks.so
#4 0x00007f4323dcfc75 in gf_client_unref () from /usr/lib64/libglusterfs.so.0
#5 0x00007f4317bbb21c in server_submit_reply () from /usr/lib64/glusterfs/3.5.2/xlator/protocol/server.so
#6 0x00007f4317bc7d4f in server_statfs_cbk () from /usr/lib64/glusterfs/3.5.2/xlator/protocol/server.so
#7 0x00007f4317df6b96 in io_stats_statfs_cbk () from /usr/lib64/glusterfs/3.5.2/xlator/debug/io-stats.so
#8 0x00007f431c70adb2 in iot_statfs_cbk () from /usr/lib64/glusterfs/3.5.2/xlator/performance/io-threads.so
#9 0x00007f431d37b3cc in posix_statfs () from /usr/lib64/glusterfs/3.5.2/xlator/storage/posix.so
#10 0x00007f4323d8a606 in default_statfs () from /usr/lib64/libglusterfs.so.0
#11 0x00007f4323d8a606 in default_statfs () from /usr/lib64/libglusterfs.so.0
#12 0x00007f4323d8a606 in default_statfs () from /usr/lib64/libglusterfs.so.0
#13 0x00007f431c70f0e4 in iot_statfs_wrapper () from /usr/lib64/glusterfs/3.5.2/xlator/performance/io-threads.so
#14 0x00007f4323d9ea06 in call_resume () from /usr/lib64/libglusterfs.so.0
#15 0x00007f431c716ee8 in iot_worker () from /usr/lib64/glusterfs/3.5.2/xlator/performance/io-threads.so
#16 0x00007f4322aec9d1 in start_thread () from /lib64/libpthread.so.0
#17 0x00007f432245ab6d in clone () from /lib64/libc.so.6
Created attachment 958094 [details]
Brick log

Here is the brick log for the second crash
More gdb info.
  25 Thread 0x7f4316444700 (LWP 32288) 0x00007f4322af05bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  24 Thread 0x7f4320dc7700 (LWP 32261) 0x00007f4322af44b5 in sigwait () from /lib64/libpthread.so.0
  23 Thread 0x7f4317648700 (LWP 32284) 0x00007f4322af05bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  22 Thread 0x7f4314f78700 (LWP 32302) 0x00007f4322af36fd in write () from /lib64/libpthread.so.0
  21 Thread 0x7f4314d76700 (LWP 32304) 0x00007f4322af36fd in write () from /lib64/libpthread.so.0
  20 Thread 0x7f4314172700 (LWP 32377) 0x00007f4322af36fd in write () from /lib64/libpthread.so.0
  19 Thread 0x7f431557e700 (LWP 32291) 0x00007f4322af36fd in write () from /lib64/libpthread.so.0
  18 Thread 0x7f431547d700 (LWP 32292) 0x00007f4322af3264 in __lll_lock_wait () from /lib64/libpthread.so.0
  17 Thread 0x7f431dd97700 (LWP 32264) 0x00007f4322af3f3d in nanosleep () from /lib64/libpthread.so.0
  16 Thread 0x7f4317547700 (LWP 32285) 0x00007f4322af3264 in __lll_lock_wait () from /lib64/libpthread.so.0
  15 Thread 0x7f4317446700 (LWP 32286) 0x00007f432241ecdd in nanosleep () from /lib64/libc.so.6
  14 Thread 0x7f4316c45700 (LWP 32287) 0x00007f4322af098e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  13 Thread 0x7f4324206700 (LWP 32260) 0x00007f432245b163 in epoll_wait () from /lib64/libc.so.6
  12 Thread 0x7f431567f700 (LWP 32290) 0x00007f4322af098e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  11 Thread 0x7f4315079700 (LWP 32301) 0x00007f4322af3264 in __lll_lock_wait () from /lib64/libpthread.so.0
  10 Thread 0x7f431fdc5700 (LWP 32263) 0x00007f4322af098e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 9 Thread 0x7f4315780700 (LWP 32289) 0x00007f4322af3264 in __lll_lock_wait () from /lib64/libpthread.so.0
  8 Thread 0x7f4314475700 (LWP 32374) 0x00007f4322af36fd in write () from /lib64/libpthread.so.0
  7 Thread 0x7f431517a700 (LWP 32295) 0x00007f4322498be0 in _dl_addr () from /lib64/libc.so.6
  6 Thread 0x7f4314374700 (LWP 32375) 0x00007f4322452c57 in writev () from /lib64/libc.so.6
  5 Thread 0x7f4314273700 (LWP 32376) 0x00007f4322452c57 in writev () from /lib64/libc.so.6
  4 Thread 0x7f431527b700 (LWP 32294) 0x00007f4322af098e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3 Thread 0x7f4314e77700 (LWP 32303) 0x00007f4322af098e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x7f43205c6700 (LWP 32262) 0x00007f4322af098e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  1 Thread 0x7f431537c700 (LWP 32293) 0x00007f4322aee3a0 in pthread_mutex_lock () from /lib64/libpthread.so.0
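A thread list like the one above is easier to read when grouped by the function each thread is stopped in. The snippet below is a minimal sketch for doing that on saved gdb `info threads` output; the name `summarize_threads` and the regular expression are assumptions of this sketch and are not part of gdb or GlusterFS. The pattern does not depend on line breaks, so it also works on output that was pasted as a single line.

```python
import re
from collections import Counter

# Matches one thread entry, for example:
#   9 Thread 0x7f4315780700 (LWP 32289) 0x00007f4322af3264 in __lll_lock_wait () from ...
# Group 1 is the LWP id, group 2 the function at the top of the stack.
THREAD_RE = re.compile(
    r"Thread 0x[0-9a-f]+ \(LWP (\d+)\)\s+0x[0-9a-f]+ in (\S+) \(\)"
)

def summarize_threads(info_threads_text):
    """Count threads by the function at the top of each thread's stack."""
    return Counter(func for _lwp, func in THREAD_RE.findall(info_threads_text))

# A few entries copied from the thread list in this report.
sample = """\
  22 Thread 0x7f4314f78700 (LWP 32302) 0x00007f4322af36fd in write () from /lib64/libpthread.so.0
  18 Thread 0x7f431547d700 (LWP 32292) 0x00007f4322af3264 in __lll_lock_wait () from /lib64/libpthread.so.0
  16 Thread 0x7f4317547700 (LWP 32285) 0x00007f4322af3264 in __lll_lock_wait () from /lib64/libpthread.so.0
* 9 Thread 0x7f4315780700 (LWP 32289) 0x00007f4322af3264 in __lll_lock_wait () from /lib64/libpthread.so.0
  1 Thread 0x7f431537c700 (LWP 32293) 0x00007f4322aee3a0 in pthread_mutex_lock () from /lib64/libpthread.so.0
"""

for func, count in summarize_threads(sample).most_common():
    print(count, func)
```

In the full list, four threads (18, 16, 11 and 9) are in `__lll_lock_wait` and thread 1 is in `pthread_mutex_lock`, so several threads were waiting on or taking locks at the time of the crash.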
This bug is being closed because the 3.5 release is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.