Bug 769055 - [255fed3b0d5b9d210d1da47dbd647dd6497cd550] glustershd process crashed with SIGABRT
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 817967
Reported: 2011-12-19 19:04 UTC by Rahul C S
Modified: 2013-07-24 17:37 UTC (History)

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:37:14 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: 1f3a0dd4742a2fcd3215aee4a5e22125d7ea4f4d
Embargoed:



Description Rahul C S 2011-12-19 19:04:33 UTC
Description of problem:
I was running the sanity tests on a distributed-replicate volume while a glusterfs build ran on another mount. I brought a brick down and back up, then issued the gluster volume heal command and also ran a rebalance. Three processes crashed with the same core: glustershd crashed, the rebalance failed, and finally the client on which the glusterfs build was running crashed.

I have all the cores; since they can't be attached here, please contact me for them.

Core backtrace:
Core was generated by `/usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/'.
Program terminated with signal 6, Aborted.
#0  0x00007f4ec6033d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	../nptl/sysdeps/unix/sysv/linux/raise.c: Transport endpoint is not connected.
	in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x00007f4ec6033d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f4ec6037ab6 in abort () at abort.c:92
#2  0x00007f4ec602c7c5 in __assert_fail (assertion=0x7f4ec2d8459b "0", file=<value optimized out>, line=2912, function=<value optimized out>) at assert.c:81
#3  0x00007f4ec2d716c3 in client3_1_unlink (frame=0x7f4ec52a2b60, this=0x98b080, data=0x7fff8835ae70)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:2910
#4  0x00007f4ec2d5c6e7 in client_unlink (frame=0x7f4ec52a2b60, this=0x98b080, loc=0x7f4ebc20c868) at ../../../../../xlators/protocol/client/src/client.c:531
#5  0x00007f4ec2b14c3b in afr_sh_entry_expunge_unlink (expunge_frame=0x7f4ec4f76358, this=0x98e510, active_src=1)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-entry.c:441
#6  0x00007f4ec2b15056 in afr_sh_entry_expunge_remove (expunge_frame=0x7f4ec4f76358, this=0x98e510, active_src=1, buf=0x7fff8835b330)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-entry.c:504
#7  0x00007f4ec2b152f7 in afr_sh_entry_expunge_lookup_cbk (expunge_frame=0x7f4ec4f76358, cookie=0x1, this=0x98e510, op_ret=0, op_errno=22, 
    inode=0x7f4ec05664ec, buf=0x7fff8835b330, x=0x7f4ebc214d70, postparent=0x7fff8835b2c0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-entry.c:559
#8  0x00007f4ec2d6ef6e in client3_1_lookup_cbk (req=0x7f4ec266e04c, iov=0x7f4ec266e08c, count=1, myframe=0x7f4ec52a26ac)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:2254
#9  0x00007f4ec69de9c6 in rpc_clnt_handle_reply (clnt=0x996350, pollin=0x7f4ebc0029b0) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:789
#10 0x00007f4ec69ded28 in rpc_clnt_notify (trans=0x996670, mydata=0x996380, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f4ebc0029b0)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:908
#11 0x00007f4ec69dae3d in rpc_transport_notify (this=0x996670, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f4ebc0029b0)
    at ../../../../rpc/rpc-lib/src/rpc-transport.c:498
#12 0x00007f4ec3c1a359 in socket_event_poll_in (this=0x996670) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1675
#13 0x00007f4ec3c1a8cd in socket_event_handler (fd=15, idx=8, data=0x996670, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1790
#14 0x00007f4ec6c2f4b9 in event_dispatch_epoll_handler (event_pool=0x97b2d0, events=0x97ff00, i=0) at ../../../libglusterfs/src/event.c:794
#15 0x00007f4ec6c2f6d3 in event_dispatch_epoll (event_pool=0x97b2d0) at ../../../libglusterfs/src/event.c:856
#16 0x00007f4ec6c2fa45 in event_dispatch (event_pool=0x97b2d0) at ../../../libglusterfs/src/event.c:956
#17 0x0000000000407d83 in main (argc=11, argv=0x7fff8835b898) at ../../../glusterfsd/src/glusterfsd.c:1601
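
Frames #0 to #2 above (raise, abort, __assert_fail) are the usual shape of a failed assert() in a C program: the assertion handler calls abort(), which raises SIGABRT, which is signal 6 on Linux. The following is a minimal standalone sketch of that mechanism only. It is not GlusterFS code, and the function name signal_from_failed_assert is made up for illustration.

```c
#undef NDEBUG               /* keep assert() active even in release builds */
#include <assert.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that hits assert(0), then report the
 * signal that terminated it. Returns the signal number, 0 if the child
 * exited normally, or -1 if fork()/waitpid() failed.
 */
static int signal_from_failed_assert(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: disable core dumps so the demo leaves no core file. */
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core);

        assert(0);          /* same assertion text ("0") as in frame #2 */
        _exit(0);           /* not reached */
    }

    /* Parent: collect the child's termination status. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling signal_from_failed_assert() from a test program should return SIGABRT, matching the "Program terminated with signal 6, Aborted" line in the core. The child also prints the standard assertion-failure message to stderr.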

log:
[2011-12-20 00:11:45.103353] I [afr-common.c:1297:afr_launch_self_heal] 0-vol-replicate-2: background  entry self-heal triggered. path: /glusterfs-3git/doc, reason: lookup detected pending operations
[2011-12-20 00:11:45.148216] E [afr-self-heal-common.c:1057:afr_sh_common_lookup_resp_handler] 0-vol-replicate-2: path /glusterfs-3git/doc/Makefile on subvolume vol-client-5 => -1 (No such file or directory)
[2011-12-20 00:11:45.231214] I [afr-self-heal-common.c:2060:afr_self_heal_completion_cbk] 0-vol-replicate-2: background  entry self-heal completed on /glusterfs-3git/doc
[2011-12-20 00:11:45.255314] I [afr-common.c:1297:afr_launch_self_heal] 0-vol-replicate-2: background  entry self-heal triggered. path: /glusterfs-3git/doc/examples, reason: lookup detected pending operations
[2011-12-20 00:11:45.341540] E [afr-self-heal-common.c:1057:afr_sh_common_lookup_resp_handler] 0-vol-replicate-2: path /glusterfs-3git/doc/examples/Makefile on subvolume vol-client-5 => -1 (No such file or directory)
[2011-12-20 00:11:45.353802] I [afr-self-heal-common.c:2060:afr_self_heal_completion_cbk] 0-vol-replicate-2: background  entry self-heal completed on /glusterfs-3git/doc/examples
[2011-12-20 00:11:45.400359] I [afr-common.c:1297:afr_launch_self_heal] 0-vol-replicate-2: background  entry self-heal triggered. path: /glusterfs-3git/extras/init.d, reason: lookup detected pending operations
[2011-12-20 00:11:45.520587] I [afr-self-heal-entry.c:642:afr_sh_entry_expunge_entry_cbk] 0-vol-replicate-2: missing entry /glusterfs-3git/extras/init.d/glusterd-SuSE on vol-client-4
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2011-12-20 00:11:45
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/lib/x86_64-linux-gnu/libc.so.6(+0x33d80)[0x7f4ec6033d80]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7f4ec6033d05]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x186)[0x7f4ec6037ab6]
/lib/x86_64-linux-gnu/libc.so.6(__assert_fail+0xf5)[0x7f4ec602c7c5]
/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_unlink+0x13f)[0x7f4ec2d716c3]
/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client_unlink+0x149)[0x7f4ec2d5c6e7]
/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_sh_entry_expunge_unlink+0x30d)[0x7f4ec2b14c3b]
/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_sh_entry_expunge_remove+0xdc)[0x7f4ec2b15056]
/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_sh_entry_expunge_lookup_cbk+0x161)[0x7f4ec2b152f7]
/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_lookup_cbk+0x7ae)[0x7f4ec2d6ef6e]
/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x20e)[0x7f4ec69de9c6]
/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x29f)[0x7f4ec69ded28]
/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x115)[0x7f4ec69dae3d]
/usr/local/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_poll_in+0x54)[0x7f4ec3c1a359]
/usr/local/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_handler+0x21d)[0x7f4ec3c1a8cd]
/usr/local/lib/libglusterfs.so.0(+0x474b9)[0x7f4ec6c2f4b9]
/usr/local/lib/libglusterfs.so.0(+0x476d3)[0x7f4ec6c2f6d3]
/usr/local/lib/libglusterfs.so.0(event_dispatch+0x88)[0x7f4ec6c2fa45]
/usr/local/sbin/glusterfs(main+0x238)[0x407d83]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xff)[0x7f4ec601eeff]
/usr/local/sbin/glusterfs[0x403db9]
------------

Comment 1 Anand Avati 2011-12-22 13:18:16 UTC
CHANGE: http://review.gluster.com/2495 (cluster/afr: Set pargfid when missing) merged in master by Vijay Bellur (vijay)
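
Going only by the change title, the unlink issued from the expunge path appears to have carried a location whose parent GFID was unset, which the client translator rejected with the assertion seen in frame #3. The sketch below illustrates the general "fill in the parent GFID if it is missing" guard. It is an assumption-laden illustration, not the merged patch: gfid_t, sketch_inode, sketch_loc, gfid_is_null and sketch_loc_fill_pargfid are all hypothetical stand-ins, not GlusterFS types or functions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte GFID (UUID). */
typedef unsigned char gfid_t[16];

/* Hypothetical, heavily reduced inode and location structures. */
struct sketch_inode {
    gfid_t gfid;
};

struct sketch_loc {
    const char          *name;     /* entry name inside the parent */
    struct sketch_inode *parent;   /* parent directory inode, may be NULL */
    gfid_t               pargfid;  /* parent GFID, all zeros when unset */
};

/* True when every byte of the GFID is zero, i.e. it was never set. */
static bool gfid_is_null(const gfid_t id)
{
    static const gfid_t zero = { 0 };
    return memcmp(id, zero, sizeof(gfid_t)) == 0;
}

/*
 * Ensure loc->pargfid is populated before an entry operation such as
 * unlink is sent. Returns 0 when pargfid is usable and -1 when it cannot
 * be derived, so the caller can fail the operation instead of asserting.
 */
static int sketch_loc_fill_pargfid(struct sketch_loc *loc)
{
    if (loc == NULL)
        return -1;

    if (!gfid_is_null(loc->pargfid))
        return 0;                            /* already set, nothing to do */

    if (loc->parent == NULL || gfid_is_null(loc->parent->gfid))
        return -1;                           /* no source to copy from */

    memcpy(loc->pargfid, loc->parent->gfid, sizeof(gfid_t));
    return 0;
}
```

In this sketch a caller would invoke sketch_loc_fill_pargfid() on the location before winding the unlink and would unwind with an error when it returns -1, rather than reaching an assert(0) further down the stack.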

Comment 2 Rahul C S 2012-04-05 06:49:28 UTC
No crashes found while doing the above operations with this git head: 1f3a0dd4742a2fcd3215aee4a5e22125d7ea4f4d

