Bug 802015

Summary: core:glusterd crash
Product: [Community] GlusterFS Reporter: Saurabh <saujain>
Component: glusterd Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED WORKSFORME QA Contact:
Severity: medium Docs Contact:
Priority: high    
Version: pre-release CC: amarts, gluster-bugs, mzywusko, nsathyan, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-07-11 02:29:48 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:

Description Saurabh 2012-03-10 04:09:44 EST
Description of problem:
bt
#0  0x000000309d232885 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6.x86_64 libgcc-4.4.6-3.el6.x86_64
(gdb) bt
#0  0x000000309d232885 in raise () from /lib64/libc.so.6
#1  0x000000309d234065 in abort () from /lib64/libc.so.6
#2  0x000000309d22b9fe in __assert_fail_base () from /lib64/libc.so.6
#3  0x000000309d22bac0 in __assert_fail () from /lib64/libc.so.6
#4  0x00007ff18e361b18 in glusterd_delete_brick (volinfo=0x16e1ba0, brickinfo=0x16ec060)
    at glusterd-utils.c:4474
#5  0x00007ff18e361bf3 in glusterd_delete_all_bricks (volinfo=0x16e1ba0) at glusterd-utils.c:4493
#6  0x00007ff18e35aebc in glusterd_delete_stale_volume (stale_volinfo=0x16e1ba0, valid_volinfo=0x1738470)
    at glusterd-utils.c:2331
#7  0x00007ff18e35b034 in glusterd_import_friend_volume (vols=0x16dfe70, count=2) at glusterd-utils.c:2363
#8  0x00007ff18e35b175 in glusterd_import_friend_volumes (vols=0x16dfe70) at glusterd-utils.c:2395
#9  0x00007ff18e35b33b in glusterd_compare_friend_data (vols=0x16dfe70, status=0x7fffb04e4c64)
    at glusterd-utils.c:2443
#10 0x00007ff18e347ac5 in glusterd_ac_handle_friend_add_req (event=0x16dd930, ctx=0x16dfd30)
    at glusterd-sm.c:648
#11 0x00007ff18e3480cc in glusterd_friend_sm () at glusterd-sm.c:994
#12 0x00007ff18e34167e in glusterd_handle_incoming_friend_req (req=0x7ff18e2ac04c) at glusterd-handler.c:1493
#13 0x00007ff1916a50b9 in rpcsvc_handle_rpc_call (svc=0x16cf1e0, trans=0x16dd580, msg=0x16dfdf0)
    at rpcsvc.c:514
#14 0x00007ff1916a545c in rpcsvc_notify (trans=0x16dd580, mydata=0x16cf1e0, event=RPC_TRANSPORT_MSG_RECEIVED, 
    data=0x16dfdf0) at rpcsvc.c:610
#15 0x00007ff1916aadb8 in rpc_transport_notify (this=0x16dd580, event=RPC_TRANSPORT_MSG_RECEIVED, 
    data=0x16dfdf0) at rpc-transport.c:498
#16 0x00007ff18e0a1270 in socket_event_poll_in (this=0x16dd580) at socket.c:1686
#17 0x00007ff18e0a17f4 in socket_event_handler (fd=6, idx=4, data=0x16dd580, poll_in=1, poll_out=0, poll_err=0)
    at socket.c:1801
#18 0x00007ff19190507c in event_dispatch_epoll_handler (event_pool=0x16ca290, events=0x16dc920, i=0)
#19 0x00007ff19190529f in event_dispatch_epoll (event_pool=0x16ca290) at event.c:856
#20 0x00007ff19190562a in event_dispatch (event_pool=0x16ca290) at event.c:956
#21 0x0000000000407dbd in main (argc=2, argv=0x7fffb04e52d8) at glusterfsd.c:1611

Information from glusterd.vol.log:
[2012-03-10 06:34:04.225527] I [glusterd-utils.c:796:glusterd_volume_brickinfo_get] 0-management: Found brick
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2012-03-10 06:34:04
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0qa26
/lib64/libc.so.6[0x309d232900]
/lib64/libc.so.6(gsignal+0x35)[0x309d232885]
/lib64/libc.so.6(abort+0x175)[0x309d234065]
/lib64/libc.so.6[0x309d22b9fe]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x309d22bac0]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_delete_brick+0xc7)[0x7ff18e361b18]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_delete_all_bricks+0x88)[0x7ff18e361bf3]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_delete_stale_volume+0xc8)[0x7ff18e35aebc]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_import_friend_volume+0x11d)[0x7ff18e35b034]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_import_friend_volumes+0x7e)[0x7ff18e35b175]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_compare_friend_data+0x153)[0x7ff18e35b33b]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(+0x37ac5)[0x7ff18e347ac5]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x1e7)[0x7ff18e3480cc]
/root/330/inst/lib/glusterfs/3.3.0qa26/xlator/mgmt/glusterd.so(glusterd_handle_incoming_friend_req+0x155)[0x7ff18e34167e]
/root/330/inst/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x360)[0x7ff1916a50b9]
/root/330/inst/lib/libgfrpc.so.0(rpcsvc_notify+0x181)[0x7ff1916a545c]
/root/330/inst/lib/libgfrpc.so.0(rpc_transport_notify+0x130)[0x7ff1916aadb8]
/root/330/inst/lib/glusterfs/3.3.0qa26/rpc-transport/socket.so(socket_event_poll_in+0x54)[0x7ff18e0a1270]
/root/330/inst/lib/glusterfs/3.3.0qa26/rpc-transport/socket.so(socket_event_handler+0x21d)[0x7ff18e0a17f4]
/root/330/inst/lib/libglusterfs.so.0(+0x4d07c)[0x7ff19190507c]
/root/330/inst/lib/libglusterfs.so.0(+0x4d29f)[0x7ff19190529f]
/root/330/inst/lib/libglusterfs.so.0(event_dispatch+0x88)[0x7ff19190562a]
/root/330/inst/sbin/glusterd(main+0x238)[0x407dbd]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x309d21ecdd]
/root/330/inst/sbin/glusterd[0x403f79]
---------





Version-Release number of selected component (if applicable):
3.3.0qa26


How reproducible:
Seen only once so far.


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 1 Amar Tumballi 2012-03-12 05:47:05 EDT
Please update these bugs with respect to 3.3.0qa27; we need to work on them as per the target milestone set.
Comment 2 krishnan parthasarathi 2012-03-12 13:35:08 EDT
Saurabh, could you add "Steps to Reproduce"?
Comment 3 Saurabh 2012-03-13 03:08:39 EDT
This happened on only one server in a cluster of 4 nodes, and with 3.3.0qa27 it has not been seen so far.

I didn't mention the steps earlier because it was happening on only one server: every time "glusterd" was brought up there, this core was generated.
Comment 4 Amar Tumballi 2012-05-11 02:58:38 EDT
Not reproducible, but I have sent a patch to fix a possible race condition (found by code review) at http://review.gluster.com/3183
Comment 5 Vijay Bellur 2012-05-18 09:00:47 EDT
Removing from blocker list as this has been seen only once.
Comment 6 krishnan parthasarathi 2012-06-25 00:44:50 EDT
Saurabh,
Could you check if this issue is seen in RHS 2.0 RC2?
Comment 7 Saurabh 2012-06-25 07:34:06 EDT
I tried to reproduce the issue on 3.3.0-rc2c, but the issue was not seen.

Test that I tried:
1. Create a distribute volume.
2. Bring down glusterd on one node.
3. Stop and delete the volume from another node.
4. Start glusterd on the other node.
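
For anyone retrying this, the four steps above could be scripted roughly as follows. This is only a dry-run sketch that prints the commands instead of running them. The volume name, node names, and brick paths are hypothetical, the "volume start" step is an assumption (step 3 stops the volume, so it was presumably started), and reading step 4 as restarting glusterd on the node stopped in step 2 is also an assumption. "service glusterd stop/start" assumes a RHEL 6 style init setup like the one in the backtrace.

```python
"""Dry-run sketch of the reproduction attempt: prints the commands, runs nothing."""


def repro_plan(volume, bricks, down_node, peer_node):
    """Return (host, command) pairs in the order of the four listed steps."""
    gluster = "gluster --mode=script"  # script mode skips confirmation prompts
    return [
        # Step 1: with no replica or stripe count, the volume is plain distribute.
        (peer_node, f"{gluster} volume create {volume} {' '.join(bricks)}"),
        # Assumption: the volume was started, since step 3 stops it.
        (peer_node, f"{gluster} volume start {volume}"),
        # Step 2: bring down glusterd on one node.
        (down_node, "service glusterd stop"),
        # Step 3: stop and delete the volume from another node.
        (peer_node, f"{gluster} volume stop {volume}"),
        (peer_node, f"{gluster} volume delete {volume}"),
        # Step 4 (assumed reading): start glusterd again on the node from step 2.
        (down_node, "service glusterd start"),
    ]


if __name__ == "__main__":
    # Hypothetical names; substitute the real hosts and brick paths.
    plan = repro_plan(
        "testvol",
        ["node1:/bricks/b1", "node2:/bricks/b1"],
        down_node="node1",
        peer_node="node2",
    )
    for host, command in plan:
        print(f"[{host}] {command}")
```

With this ordering, the restarted node still has the deleted volume in its local store when it rejoins, which appears to be the situation the glusterd_delete_stale_volume frame in the backtrace handles.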
Comment 8 Amar Tumballi 2012-07-11 02:29:48 EDT
Not able to reproduce the issue. Please re-open if you see the issue again.