Description of problem: The Glusterfsd process was crashed while doing heavy operations. Version-Release number of selected component (if applicable):RHS-2.0.z How reproducible: It might hard to reproduce. Steps to Reproduce: 1.Create distribut-replicate volume. 2.Run creating and deletion of file for some 2 weeks. 3.Run kernel untar on some clients parallely, Eg: while : ; do mkdir client1 ; tar -xvf linux-2.6.39.tar.gz -C client1 ; rm -rf client1; done Like this run on multiple fuse clients parallely. 4.On one of the other client run fs_mark to create file with -F option in loop. Eg: while :; do rm -rvf $MNT_PT/*; ./fs_mark -s 100000 -D 1000 -N 1000 -L 10 -n 10 -t 50 -F -d $MNT_PT; rm -rvf $MNT_PT/*; done 5. This is most probable steps. Actual results:The glusterfsd process crashed. Expected results: No process should crash. Additional info: ############################################################### Back trace in the log file : ############################################################### patchset: git://git.gluster.com/glusterfs.git signal received: 11 time of crash: 2012-09-07 07:25:09 configuration details: argp 1 backtrace 1 dlfcn 1 fdatasync 1 libpthread 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 package-string: glusterfs 3.3.0rhs /lib64/libc.so.6[0x3d9fe32900] /lib64/libc.so.6(cfree+0x1c)[0x3d9fe7a5ac] /lib64/libc.so.6(_IO_free_backup_area+0x19)[0x3d9fe73979] /lib64/libc.so.6(_IO_file_overflow+0x20a)[0x3d9fe718da] /lib64/libc.so.6(_IO_file_xsputn+0x6d)[0x3d9fe70b8d] /lib64/libc.so.6(_IO_vfprintf+0x418e)[0x3d9fe47e7e] /lib64/libc.so.6(__fprintf_chk+0xeb)[0x3d9fefd1ab] /usr/lib64/libglusterfs.so.0(_gf_log+0x393)[0x3d7b419553] /usr/lib64/glusterfs/3.3.0rhs/xlator/protocol/server.so(server_unlink_cbk+0x22b)[0x7f5e87bd5c2b] /usr/lib64/glusterfs/3.3.0rhs/xlator/debug/io-stats.so(io_stats_unlink_cbk+0xec)[0x7f5e87df744c] /usr/lib64/glusterfs/3.3.0rhs/xlator/features/marker.so(marker_unlink_stat_cbk+0x200)[0x7f5e8c0846f0] /usr/lib64/libglusterfs.so.0(default_stat_cbk+0xad)[0x3d7b4265fd] /usr/lib64/glusterfs/3.3.0rhs/xlator/performance/io-threads.so(iot_stat_cbk+0xad)[0x7f5e8c4a7aed] /usr/lib64/libglusterfs.so.0(default_stat_cbk+0xad)[0x3d7b4265fd] ############################################################### Backtrace from the core ############################################################### Program terminated with signal 11, Segmentation fault. #0 0x0000003d9fe7a5ac in free () from /lib64/libc.so.6 Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.12.x86_64 libgcc-4.4.6-3.el6.x86_64 openssl-1.0.0-20.el6_2.5.x86_64 zlib-1.2.3-27.el6.x86_64 (gdb) bt #0 0x0000003d9fe7a5ac in free () from /lib64/libc.so.6 #1 0x0000003d9fe73979 in _IO_free_backup_area_internal () from /lib64/libc.so.6 #2 0x0000003d9fe718da in _IO_new_file_overflow () from /lib64/libc.so.6 #3 0x0000003d9fe70b8d in _IO_new_file_xsputn () from /lib64/libc.so.6 #4 0x0000003d9fe47e7e in vfprintf () from /lib64/libc.so.6 #5 0x0000003d9fefd1ab in __fprintf_chk () from /lib64/libc.so.6 #6 0x0000003d7b419553 in fprintf (domain=<value optimized out>, file=0x7f5e87bdebff "server3_1-fops.c", function=<value optimized out>, line=1085, level=GF_LOG_INFO, fmt=0x7f5e87bdeae0 "%ld: UNLINK %s (%s) ==> %d (%s)") at /usr/include/bits/stdio2.h:98 #7 _gf_log (domain=<value optimized out>, file=0x7f5e87bdebff "server3_1-fops.c", function=<value optimized out>, line=1085, level=GF_LOG_INFO, fmt=0x7f5e87bdeae0 "%ld: UNLINK %s (%s) ==> %d (%s)") at logging.c:579 #8 0x00007f5e87bd5c2b in server_unlink_cbk (frame=0x7f5e8fcb5f34, cookie=<value optimized out>, this=0x194cd20, op_ret=-1, op_errno=2, preparent=<value optimized out>, postparent=0x0, xdata=0x0) at server3_1-fops.c:1081 #9 0x00007f5e87df744c in io_stats_unlink_cbk (frame=0x7f5e8fef625c, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=2, preparent=0x0, postparent=0x0, xdata=0x0) at io-stats.c:1430 #10 0x00007f5e8c0846f0 in marker_unlink_stat_cbk (frame=0x7f5e8ff290ac, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at marker.c:833 #11 0x0000003d7b4265fd in default_stat_cbk (frame=0x7f5e8feabc78, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at defaults.c:47 #12 0x00007f5e8c4a7aed in iot_stat_cbk (frame=0x7f5e8ff33344, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at io-threads.c:1170 #13 0x0000003d7b4265fd in default_stat_cbk (frame=0x7f5e8fed437c, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at defaults.c:47 #14 0x0000003d7b4265fd in default_stat_cbk (frame=0x7f5e8ff30ba0, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at defaults.c:47 #15 0x00007f5e8cae64e0 in posix_stat (frame=0x7f5e8fec7ff8, this=<value optimized out>, loc=<value optimized out>, xdata=<value optimized out>) at posix.c:231 #16 0x0000003d7b41b254 in default_stat (frame=<value optimized out>, this=0x1946680, loc=0x7f5e8fba37c8, xdata=0x0) at defaults.c:1231 #17 0x0000003d7b41b254 in default_stat (frame=<value optimized out>, this=0x1947740, loc=0x7f5e8fba37c8, xdata=0x0) at defaults.c:1231 #18 0x00007f5e8c4ab754 in iot_stat_wrapper (frame=<value optimized out>, this=0x19487e0, loc=0x7f5e8fba37c8, xdata=0x0) at io-threads.c:1178 #19 0x0000003d7b42e5ca in call_resume_wind (stub=0x7f5e8fba3788) at call-stub.c:2394 #20 call_resume (stub=0x7f5e8fba3788) at call-stub.c:4151 #21 0x00007f5e8c4b1f1b in iot_worker (data=0x195a9a0) at io-threads.c:131 #22 0x0000003da06077f1 in start_thread () from /lib64/libpthread.so.0 #23 0x0000003d9fee5ccd in clone () from /lib64/libc.so.6 (gdb) f 0 #0 0x0000003d9fe7a5ac in free () from /lib64/libc.so.6 (gdb) f 1 #1 0x0000003d9fe73979 in _IO_free_backup_area_internal () from /lib64/libc.so.6 (gdb) f 8 #8 0x00007f5e87bd5c2b in server_unlink_cbk (frame=0x7f5e8fcb5f34, cookie=<value optimized out>, this=0x194cd20, op_ret=-1, op_errno=2, preparent=<value optimized out>, postparent=0x0, xdata=0x0) at server3_1-fops.c:1081 1081 gf_log (this->name, GF_LOG_INFO, (gdb) #8 0x00007f5e87bd5c2b in server_unlink_cbk (frame=0x7f5e8fcb5f34, cookie=<value optimized out>, this=0x194cd20, op_ret=-1, op_errno=2, preparent=<value optimized out>, postparent=0x0, xdata=0x0) at server3_1-fops.c:1081 1081 gf_log (this->name, GF_LOG_INFO, (gdb) l 1076 1077 gf_stat_from_iatt (&rsp.preparent, preparent); 1078 gf_stat_from_iatt (&rsp.postparent, postparent); 1079 1080 } else { 1081 gf_log (this->name, GF_LOG_INFO, 1082 "%"PRId64": UNLINK %s (%s) ==> %"PRId32" (%s)", 1083 frame->root->unique, state->loc.path, 1084 state->loc.inode ? uuid_utoa (state->loc.inode->gfid) : 1085 "--", op_ret, strerror (op_errno)); (gdb) f 9 #9 0x00007f5e87df744c in io_stats_unlink_cbk (frame=0x7f5e8fef625c, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=2, preparent=0x0, postparent=0x0, xdata=0x0) at io-stats.c:1430 1430 STACK_UNWIND_STRICT (unlink, frame, op_ret, op_errno, (gdb) f 10 #10 0x00007f5e8c0846f0 in marker_unlink_stat_cbk (frame=0x7f5e8ff290ac, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, buf=<value optimized out>, xdata=0x0) at marker.c:833 833 STACK_UNWIND_STRICT (unlink, frame, -1, op_errno, NULL, NULL, NULL);
With master branch not seen this happening anymore, running the similar type of tests in longevity test-bed for more than 2weeks and this issue is not seen. Marking it as WORKSFORME (with Fixed in version as 3.4.0qa4), please feel free to reopen if seen again.
I wonder if this one is related, it's release-3.3 git build: patchset: git://git.gluster.com/glusterfs.git signal received: 6 time of crash: 2013-06-30 11:41:46 configuration details: argp 1 backtrace 1 dlfcn 1 fdatasync 1 libpthread 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 package-string: glusterfs 3.3.1 /lib64/libc.so.6[0x3932a32920] /lib64/libc.so.6(gsignal+0x35)[0x3932a328a5] /lib64/libc.so.6(abort+0x175)[0x3932a34085] /lib64/libc.so.6[0x3932a707b7] /lib64/libc.so.6[0x3932a760e6] /lib64/libc.so.6(_IO_free_backup_area+0x19)[0x3932a747c9] /lib64/libc.so.6(_IO_file_overflow+0x20a)[0x3932a7271a] /lib64/libc.so.6(_IO_file_xsputn+0x6d)[0x3932a719cd] /lib64/libc.so.6(_IO_vfprintf+0x41c0)[0x3932a48410] /lib64/libc.so.6(__fprintf_chk+0xeb)[0x3932affdcb] /usr/lib64/libglusterfs.so.0(_gf_log+0x393)[0x3350c19553] /usr/lib64/glusterfs/3.3.1/xlator/storage/posix.so(posix_create+0x7d0)[0x7f7a2eaae650] /usr/lib64/glusterfs/3.3.1/xlator/features/access-control.so(posix_acl_create+0x291)[0x7f7a2e89acd1] /usr/lib64/glusterfs/3.3.1/xlator/features/locks.so(pl_create+0x14a)[0x7f7a2e6802da] /usr/lib64/glusterfs/3.3.1/xlator/performance/io-threads.so(iot_create_wrapper+0x13c)[0x7f7a2e46d16c] /usr/lib64/libglusterfs.so.0(call_resume+0x84f)[0x3350c2ebef] /usr/lib64/glusterfs/3.3.1/xlator/performance/io-threads.so(iot_worker+0x15b)[0x7f7a2e472f1b] /lib64/libpthread.so.0[0x3933207851] /lib64/libc.so.6(clone+0x6d)[0x3932ae890d] again the crash is in _gf_log, actualy this one was exactly at the time of log rotation.