Description of problem:

2x2 distributed-replicate volume, 1 FUSE client. Ran the sanity script on the FUSE mount and, after the tests finished, took a statedump of the volume (bricks). The glusterfs server processes on all peers of the cluster segfaulted.

Backtrace of the core:

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id mirror.10.1.11.130.export-'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f8834cee99e in server_priv (this=0x118df40) at ../../../../../xlators/protocol/server/src/server.c:336
336             total_read += xprt->total_bytes_read;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6_1.3.x86_64 libgcc-4.4.5-6.el6.x86_64
(gdb) bt
#0  0x00007f8834cee99e in server_priv (this=0x118df40) at ../../../../../xlators/protocol/server/src/server.c:336
#1  0x00007f883bbfc1bf in gf_proc_dump_xlator_info (top=0x118df40) at ../../../libglusterfs/src/statedump.c:403
#2  0x00007f883bbfcc15 in gf_proc_dump_info (signum=10) at ../../../libglusterfs/src/statedump.c:655
#3  0x000000000040738d in glusterfs_sigwaiter (arg=0x7fff6c0e2a10) at ../../../glusterfsd/src/glusterfsd.c:1342
#4  0x00000034c2a077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x00000034c22e577d in clone () from /lib64/libc.so.6
(gdb) f 0
#0  0x00007f8834cee99e in server_priv (this=0x118df40) at ../../../../../xlators/protocol/server/src/server.c:336
336             total_read += xprt->total_bytes_read;
(gdb) p xprt
$1 = (rpc_transport_t *) 0xcafebabe00007ce8
(gdb) l
331             conf = this->private;
332             if (!conf)
333                     return 0;
334
335             list_for_each_entry (xprt, &conf->xprt_list, list) {
336                     total_read += xprt->total_bytes_read;
337                     total_write += xprt->total_bytes_write;
338             }
339
340             gf_proc_dump_build_key(key, "server", "total-bytes-read");
(gdb) p conf->xprt_list
$2 = {next = 0x11808a0, prev = 0x11c27a0}
(gdb) p *conf
$3 = {rpc = 0x11913b0, rpc_conf = {max_block_size = 4194304}, inode_lru_limit = 1024, verify_volfile = _gf_true, trace = _gf_false, conf_dir = 0x7f8834d100a8 "/usr/local/etc/glusterfs", volfile = 0x0, grace_tv = {tv_sec = 10, tv_usec = 0}, auth_modules = 0x118e9f0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, conns = {next = 0x11bee90, prev = 0x11a25a0}, xprt_list = {next = 0x11808a0, prev = 0x11c27a0}}
(gdb) p *xprt
Cannot access memory at address 0xcafebabe00007ce8
(gdb) info thr
  8 Thread 0x7f883455f700 (LWP 13920)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 0x7f88367da700 (LWP 13910)  0x00000034c2a0ecbd in nanosleep () from /lib64/libpthread.so.0
  6 Thread 0x7f882f3fc700 (LWP 14060)  0x00000034c2a0b74b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5 Thread 0x7f8838c74700 (LWP 13906)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 0x7f883b764700 (LWP 13904)  0x00000034c22e5d73 in epoll_wait () from /lib64/libc.so.6
  3 Thread 0x7f882ffff700 (LWP 13922)  0x00000034c2a0b74b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x7f8838273700 (LWP 13907)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 0x7f8839675700 (LWP 13905)  0x00007f8834cee99e in server_priv (this=0x118df40) at ../../../../../xlators/protocol/server/src/server.c:336

Version-Release number of selected component (if applicable):
glusterfs 3.3.0qa23

How reproducible:

Steps to Reproduce:
1. Run the sanity script on the FUSE mount.
2. Take a statedump of the volume.

Actual results:
The glusterfs server processes crashed.

Expected results:
The glusterfs server processes should not crash.

Additional info:
Logs of the crashed server:
[2012-02-22 02:42:31.019360] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4167147: FINODELK 113 (daa452b7-447f-4e7f-bae7-084c5fb4435a) ==> -1 (Invalid argument)
[2012-02-22 02:42:31.584372] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4167884: FINODELK 113 (f3855d49-c316-45eb-a6f1-02b320a5860b) ==> -1 (Invalid argument)
[2012-02-22 02:42:40.034678] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4176969: FINODELK 43 (714e946a-26d1-4c4a-ae7d-abb98666b0bb) ==> -1 (Invalid argument)
[2012-02-22 02:43:40.739659] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4213829: FINODELK 113 (a573ad13-4ae7-45e9-a2b4-ebf5c2f87a4f) ==> -1 (Invalid argument)
[2012-02-22 02:44:36.537695] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4250861: FINODELK 43 (b442c196-c9c4-44e8-a6f3-c9afe0713c07) ==> -1 (Invalid argument)
[2012-02-22 02:45:25.625239] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4293977: FINODELK 113 (4f20a9cd-29f2-46cc-b321-92e187c3394b) ==> -1 (Invalid argument)
[2012-02-22 02:46:01.620192] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4323819: FINODELK 43 (aff78366-4122-4570-89e8-95b81dcb30d0) ==> -1 (Invalid argument)
[2012-02-22 02:47:02.022653] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4362653: FINODELK 113 (93bf7adf-174f-40dc-ade0-0f07d42c4f88) ==> -1 (Invalid argument)
[2012-02-22 02:47:07.644661] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4367012: FINODELK 113 (ed78b83b-e388-4118-9983-bf06b5584b81) ==> -1 (Invalid argument)
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-02-22 04:20:11
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0qa23
/lib64/libc.so.6[0x34c2232980]
/usr/local/lib/glusterfs/3.3.0qa23/xlator/protocol/server.so(server_priv+0xfa)[0x7f8834cee99e]
/usr/local/lib/libglusterfs.so.0(gf_proc_dump_xlator_info+0x151)[0x7f883bbfc1bf]
/usr/local/lib/libglusterfs.so.0(gf_proc_dump_info+0x23e)[0x7f883bbfcc15]
/usr/local/sbin/glusterfsd(glusterfs_sigwaiter+0x11a)[0x40738d]
/lib64/libpthread.so.0[0x34c2a077e1]
/lib64/libc.so.6(clone+0x6d)[0x34c22e577d]
This should be fixed by http://review.gluster.com/2806. Please verify the behavior with 3.3.0qa25.
After the patch http://review.gluster.com/2911, this should no longer happen.
Checked with glusterfs-3.3.0qa40. The server processes no longer crash after taking statedumps.