Description of problem:
A 3-replica volume with 3 fuse and 3 nfs clients. One of the fuse clients crashed because all the bricks of the volume were down. (Actually, 2 of the bricks had crashed, and 1 brick had been killed before the crashes to test self-heal.)

This is the backtrace of the crash.

GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/sbin/glusterfs...done.
[New Thread 21101]
[New Thread 21103]
[New Thread 21109]
[New Thread 21107]
[New Thread 21102]
[New Thread 21108]
[New Thread 21104]
Reading symbols from /usr/local/lib/libglusterfs.so.0...done.
Loaded symbols for /usr/local/lib/libglusterfs.so.0
Reading symbols from /usr/local/lib/libgfrpc.so.0...done.
Loaded symbols for /usr/local/lib/libgfrpc.so.0
Reading symbols from /usr/local/lib/libgfxdr.so.0...done.
Loaded symbols for /usr/local/lib/libgfxdr.so.0
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/mount/fuse.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/mount/fuse.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/rpc-transport/socket.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/rpc-transport/socket.so
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/protocol/client.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/protocol/client.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/cluster/replicate.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/cluster/replicate.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/write-behind.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/write-behind.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/read-ahead.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/read-ahead.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/io-cache.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/io-cache.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/quick-read.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/quick-read.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/md-cache.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/performance/md-cache.so
Reading symbols from /usr/local/lib/glusterfs/3.3.0qa28/xlator/debug/io-stats.so...done.
Loaded symbols for /usr/local/lib/glusterfs/3.3.0qa28/xlator/debug/io-stats.so
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
Core was generated by `/usr/local/sbin/glusterfs --volfile-id=mirror --volfile-server=10.16.156.15 /mn'.
Program terminated with signal 6, Aborted.
#0 0x0000003a29c32885 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.5.x86_64 libgcc-4.4.6-3.el6.x86_64
(gdb) bt
#0 0x0000003a29c32885 in raise () from /lib64/libc.so.6
#1 0x0000003a29c34065 in abort () from /lib64/libc.so.6
#2 0x0000003a29c2b9fe in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000003a29c2bac0 in __assert_fail () from /lib64/libc.so.6
#4 0x00007f4fc4063c94 in afr_sh_save_child_iatts_from_policy (children=0x7f4fb800fe50, bufs=0x7f4fb800a510, save=0x7f4fbc1acb30, child_count=3) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1588
#5 0x00007f4fc4063f11 in afr_sh_children_lookup_done (frame=0x7f4fc747a894, this=0x1f3e270, op_ret=0, op_errno=107) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1635
#6 0x00007f4fc4062f62 in afr_sh_common_lookup_cbk (frame=0x7f4fc747a894, cookie=0x1, this=0x1f3e270, op_ret=0, op_errno=107, inode=0x7f4fb652d174, buf=0x7fffcd1c1d90, xattr=0x0, postparent=0x7fffcd1c1d20) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1316
#7 0x00007f4fc42cd597 in client3_1_lookup_cbk (req=0x7f4fb7ef51ec, iov=0x7fffcd1c1fa0, count=1, myframe=0x7f4fc7681c64) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:2185
#8 0x00007f4fc8626ab4 in saved_frames_unwind (saved_frames=0x205b210) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:387
#9 0x00007f4fc8626b63 in saved_frames_destroy (frames=0x205b210) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:405
#10 0x00007f4fc862711d in rpc_clnt_connection_cleanup (conn=0x1f91a40) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:567
#11 0x00007f4fc8627bfe in rpc_clnt_notify (trans=0x1fa1550, mydata=0x1f91a40, event=RPC_TRANSPORT_DISCONNECT, data=0x1fa1550) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:870
#12 0x00007f4fc8623e78 in rpc_transport_notify (this=0x1fa1550, event=RPC_TRANSPORT_DISCONNECT, data=0x1fa1550) at ../../../../rpc/rpc-lib/src/rpc-transport.c:498
#13 0x00007f4fc510a1c7 in socket_event_poll_err (this=0x1fa1550) at ../../../../../rpc/rpc-transport/socket/src/socket.c:694
#14 0x00007f4fc510e880 in socket_event_handler (fd=14, idx=7, data=0x1fa1550, poll_in=1, poll_out=0, poll_err=16) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1808
#15 0x00007f4fc887e0c4 in event_dispatch_epoll (event_pool=0x1dc8c30) at ../../../libglusterfs/src/event.c:816
#16 0x00007f4fc887e2e7 in event_pool_new (count=1) at ../../../libglusterfs/src/event.c:893
#17 0x00007f4fc887e672 in list_del (old=0xffffffff29a21188) at ../../../libglusterfs/src/list.h:61
#18 0x0000000000407ecd in main (argc=4, argv=0x7fffcd1c2618) at ../../../glusterfsd/src/glusterfsd.c:1609
(gdb) f 4
#4 0x00007f4fc4063c94 in afr_sh_save_child_iatts_from_policy (children=0x7f4fb800fe50, bufs=0x7f4fb800a510, save=0x7f4fbc1acb30, child_count=3) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1588
1588            GF_ASSERT (saved);
(gdb) p saved
$1 = _gf_false
(gdb) l afr_sh_save_child_iatts_from_policy
1567
1568    void
1569    afr_sh_save_child_iatts_from_policy (int32_t *children, struct iatt *bufs,
1570                                         struct iatt *save,
1571                                         unsigned int child_count)
1572    {
1573            int             i = 0;
1574            int             child = 0;
1575            gf_boolean_t    saved = _gf_false;
1576
(gdb)
1577            GF_ASSERT (save);
1578            //if iatt buf with gfid exists sets it
1579            for (i = 0; i < child_count; i++) {
1580                    child = children[i];
1581                    if (child == -1)
1582                            break;
1583                    *save = bufs[child];
1584                    saved = _gf_true;
1585                    if (!uuid_is_null (save->ia_gfid))
1586                            break;
(gdb)
1587            }
1588            GF_ASSERT (saved);
1589    }
1590
1591    void
1592    afr_get_children_of_fresh_parent_dirs (afr_self_heal_t *sh,
1593                                           unsigned int child_count)
1594    {
1595            afr_children_intersection_get (sh->success_children,
1596                                           sh->fresh_parent_dirs,
(gdb) p children[0]
$2 = -1
(gdb) p children[1]
$3 = -1
(gdb) p children[2]
$4 = -1
(gdb) p child_count
$5 = 3
(gdb) info thr
  7 Thread 0x7f4fc5d19700 (LWP 21104) 0x0000003a2a40b3dc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 0x7f4fbe29a700 (LWP 21108) 0x0000003a29cddae7 in readv () from /lib64/libc.so.6
  5 Thread 0x7f4fc711b700 (LWP 21102) 0x0000003a2a40f245 in sigwait () from /lib64/libpthread.so.0
  4 Thread 0x7f4fc4eee700 (LWP 21107) 0x0000003a2a40eccd in nanosleep () from /lib64/libpthread.so.0
  3 Thread 0x7f4fbd899700 (LWP 21109) 0x0000003a2a40e4ed in read () from /lib64/libpthread.so.0
  2 Thread 0x7f4fc671a700 (LWP 21103) 0x0000003a2a40b3dc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 0x7f4fc83f0700 (LWP 21101) 0x0000003a29c32885 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003a29c32885 in raise () from /lib64/libc.so.6
#1 0x0000003a29c34065 in abort () from /lib64/libc.so.6
#2 0x0000003a29c2b9fe in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000003a29c2bac0 in __assert_fail () from /lib64/libc.so.6
#4 0x00007f4fc4063c94 in afr_sh_save_child_iatts_from_policy (children=0x7f4fb800fe50, bufs=0x7f4fb800a510, save=0x7f4fbc1acb30, child_count=3) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1588
#5 0x00007f4fc4063f11 in afr_sh_children_lookup_done (frame=0x7f4fc747a894, this=0x1f3e270, op_ret=0, op_errno=107) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1635
#6 0x00007f4fc4062f62 in afr_sh_common_lookup_cbk (frame=0x7f4fc747a894, cookie=0x1, this=0x1f3e270, op_ret=0, op_errno=107, inode=0x7f4fb652d174, buf=0x7fffcd1c1d90, xattr=0x0, postparent=0x7fffcd1c1d20) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1316
#7 0x00007f4fc42cd597 in client3_1_lookup_cbk (req=0x7f4fb7ef51ec, iov=0x7fffcd1c1fa0, count=1, myframe=0x7f4fc7681c64) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:2185
#8 0x00007f4fc8626ab4 in saved_frames_unwind (saved_frames=0x205b210) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:387
#9 0x00007f4fc8626b63 in saved_frames_destroy (frames=0x205b210) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:405
#10 0x00007f4fc862711d in rpc_clnt_connection_cleanup (conn=0x1f91a40) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:567
#11 0x00007f4fc8627bfe in rpc_clnt_notify (trans=0x1fa1550, mydata=0x1f91a40, event=RPC_TRANSPORT_DISCONNECT, data=0x1fa1550) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:870
#12 0x00007f4fc8623e78 in rpc_transport_notify (this=0x1fa1550, event=RPC_TRANSPORT_DISCONNECT, data=0x1fa1550) at ../../../../rpc/rpc-lib/src/rpc-transport.c:498
#13 0x00007f4fc510a1c7 in socket_event_poll_err (this=0x1fa1550) at ../../../../../rpc/rpc-transport/socket/src/socket.c:694
#14 0x00007f4fc510e880 in socket_event_handler (fd=14, idx=7, data=0x1fa1550, poll_in=1, poll_out=0, poll_err=16) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1808
#15 0x00007f4fc887e0c4 in event_dispatch_epoll (event_pool=0x1dc8c30) at ../../../libglusterfs/src/event.c:816
#16 0x00007f4fc887e2e7 in event_pool_new (count=1) at ../../../libglusterfs/src/event.c:893
#17 0x00007f4fc887e672 in list_del (old=0xffffffff29a21188) at ../../../libglusterfs/src/list.h:61
#18 0x0000000000407ecd in main (argc=4, argv=0x7fffcd1c2618) at ../../../glusterfsd/src/glusterfsd.c:1609
(gdb) f 5
#5 0x00007f4fc4063f11 in afr_sh_children_lookup_done (frame=0x7f4fc747a894, this=0x1f3e270, op_ret=0, op_errno=107) at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1635
1635                    afr_sh_save_child_iatts_from_policy (sh->fresh_children,
(gdb) l
1630                    sh->op_failed = 1;
1631                    afr_sh_purge_entry (frame, this);
1632            } else if (!afr_conflicting_iattrs (sh->buf, sh->fresh_children,
1633                                                priv->child_count, local->loc.path,
1634                                                this->name)) {
1635                    afr_sh_save_child_iatts_from_policy (sh->fresh_children,
1636                                                         sh->buf, &sh->entrybuf,
1637                                                         priv->child_count);
1638                    afr_update_gfid_from_iatts (sh->sh_gfid_req, sh->buf,
1639                                                sh->fresh_children,
(gdb) quit

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
The glusterfs client hit an assertion failure during self-heal because all the bricks were down.

Expected results:
The glusterfs client should not crash.

Additional info:
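The gdb session shows every entry of children being -1, so the loop in afr_sh_save_child_iatts_from_policy breaks on its first iteration and GF_ASSERT (saved) fires. A minimal standalone sketch of that failure mode follows. It is not the GlusterFS code and not taken from any posted patch: struct demo_iatt, demo_gfid_is_null and demo_save_child_iatt are hypothetical stand-ins, and returning a boolean instead of asserting is only one possible way to let a caller handle the "no fresh child" case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iatt; only the gfid is modelled. */
struct demo_iatt {
    unsigned char gfid[16];
};

/* Hypothetical stand-in for uuid_is_null: true when all 16 bytes are zero. */
static bool demo_gfid_is_null(const unsigned char gfid[16])
{
    for (size_t i = 0; i < 16; i++) {
        if (gfid[i] != 0)
            return false;
    }
    return true;
}

/*
 * Same selection policy as the loop in the gdb listing: walk children
 * until the -1 terminator, copy each candidate into *save, and stop at the
 * first one with a non-null gfid. Unlike the listed function, this sketch
 * reports "nothing saved" to the caller instead of asserting on it.
 */
static bool demo_save_child_iatt(const int *children,
                                 const struct demo_iatt *bufs,
                                 struct demo_iatt *save,
                                 unsigned int child_count)
{
    bool saved = false;

    assert(save != NULL);
    for (unsigned int i = 0; i < child_count; i++) {
        int child = children[i];
        if (child == -1)
            break;              /* all bricks down: hit on the first pass */
        *save = bufs[child];
        saved = true;
        if (!demo_gfid_is_null(save->gfid))
            break;
    }
    return saved;
}
```

With children = {-1, -1, -1} and child_count = 3, as in the core file, demo_save_child_iatt returns false without touching *save, which is the state in which the original assertion aborts the client.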
http://review.gluster.org/3092 has been posted for this (but for some reason incorrectly shows up as "rfc" in Gerrit).
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user