Bug 797129

Summary: gluster volume stop <volume_name> failed and crashed glusterd
Product: [Community] GlusterFS
Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Component: glusterd
Assignee: Kaushal <kaushal>
Status: CLOSED WORKSFORME
Severity: medium
Priority: high
Version: mainline
CC: gluster-bugs, vbellur
Doc Type: Bug Fix
Last Closed: 2012-04-16 13:18:14 UTC
Attachments: glusterd log file

Description Shwetha Panduranga 2012-02-24 11:07:16 UTC
Created attachment 565582: glusterd log file

Description of problem:

Core was generated by `glusterd --xlator-option *.brick-with-valgrind=yes'.
Program terminated with signal 6, Aborted.
#0  0x0000003af1a32905 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6_1.3.x86_64 libgcc-4.4.6-3.el6.x86_64
(gdb) bt
#0  0x0000003af1a32905 in raise () from /lib64/libc.so.6
#1  0x0000003af1a340e5 in abort () from /lib64/libc.so.6
#2  0x0000003af1a2b9be in __assert_fail_base () from /lib64/libc.so.6
#3  0x0000003af1a2ba80 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fbcd358c0ef in __gf_free (free_ptr=0x1dd40d0) at mem-pool.c:273
#5  0x00007fbcd3333a90 in saved_frames_destroy (frames=0x1dd40d0) at rpc-clnt.c:406
#6  0x00007fbcd33365b6 in rpc_clnt_destroy (rpc=0x1dbf350) at rpc-clnt.c:1577
#7  0x00007fbcd3336679 in rpc_clnt_unref (rpc=0x1dbf350) at rpc-clnt.c:1603
#8  0x00007fbccffd7352 in glusterd_friend_cleanup (peerinfo=0x1dbca00) at glusterd-utils.c:893
#9  0x00007fbccffc990d in glusterd_ac_friend_remove (event=0x7fbcc8000bb0, ctx=0x0) at glusterd-sm.c:608
#10 0x00007fbccffca074 in glusterd_friend_sm () at glusterd-sm.c:994
#11 0x00007fbcd0012f82 in glusterd_handle_cli_stop_volume (req=0x7fbccff2f930) at glusterd-volume-ops.c:355
#12 0x00007fbcd332b0a9 in rpcsvc_handle_rpc_call (svc=0x1d8d010, trans=0x1d99ac0, msg=0x1d916b0) at rpcsvc.c:514
#13 0x00007fbcd332b44c in rpcsvc_notify (trans=0x1d99ac0, mydata=0x1d8d010, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1d916b0) at rpcsvc.c:610
#14 0x00007fbcd3330da8 in rpc_transport_notify (this=0x1d99ac0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1d916b0) at rpc-transport.c:498
#15 0x00007fbccfd23270 in socket_event_poll_in (this=0x1d99ac0) at socket.c:1686
#16 0x00007fbccfd237f4 in socket_event_handler (fd=11, idx=6, data=0x1d99ac0, poll_in=1, poll_out=0, poll_err=0) at socket.c:1801
#17 0x00007fbcd358b05c in event_dispatch_epoll_handler (event_pool=0x1d81370, events=0x1d90330, i=0) at event.c:794
#18 0x00007fbcd358b27f in event_dispatch_epoll (event_pool=0x1d81370) at event.c:856
#19 0x00007fbcd358b60a in event_dispatch (event_pool=0x1d81370) at event.c:956
#20 0x0000000000407dcc in main (argc=3, argv=0x7fff40096a78) at glusterfsd.c:1612
(gdb) bt full
#0  0x0000003af1a32905 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x0000003af1a340e5 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000003af1a2b9be in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x0000003af1a2ba80 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007fbcd358c0ef in __gf_free (free_ptr=0x1dd40d0) at mem-pool.c:273
        req_size = 0
        ptr = 0x1dd40c4 ""
        type = 0
        xl = 0x0
        __PRETTY_FUNCTION__ = "__gf_free"
#5  0x00007fbcd3333a90 in saved_frames_destroy (frames=0x1dd40d0) at rpc-clnt.c:406
No locals.
#6  0x00007fbcd33365b6 in rpc_clnt_destroy (rpc=0x1dbf350) at rpc-clnt.c:1577
No locals.
#7  0x00007fbcd3336679 in rpc_clnt_unref (rpc=0x1dbf350) at rpc-clnt.c:1603
        count = 0
#8  0x00007fbccffd7352 in glusterd_friend_cleanup (peerinfo=0x1dbca00) at glusterd-utils.c:893
        __PRETTY_FUNCTION__ = "glusterd_friend_cleanup"
        peerctx = 0x1dd4520
#9  0x00007fbccffc990d in glusterd_ac_friend_remove (event=0x7fbcc8000bb0, ctx=0x0) at glusterd-sm.c:608
        ret = 0
        __FUNCTION__ = "glusterd_ac_friend_remove"
#10 0x00007fbccffca074 in glusterd_friend_sm () at glusterd-sm.c:994
        event = 0x7fbcc8000bb0
        tmp = 0x7fbcd02498b0
        ret = -1
        handler = 0x7fbccffc9874 <glusterd_ac_friend_remove>
        state = 0x7fbcd0246740
        peerinfo = 0x1dbca00
        event_type = GD_FRIEND_EVENT_REMOVE_FRIEND
        is_await_conn = _gf_false
        __FUNCTION__ = "glusterd_friend_sm"


Version-Release number of selected component (if applicable):
mainline

How reproducible:
Occasionally

Steps to Reproduce:
[root@APP-SERVER1 ~]# gluster volume replace-brick dist 192.168.2.35:/export1 192.168.2.36:/export1 start
192.168.2.36, is not a friend
[root@APP-SERVER1 ~]# gluster peer probe 192.168.2.36
[root@APP-SERVER1 ~]# echo $?
110
[root@APP-SERVER1 ~]# gluster peer probe 192.168.2.36
Probe on host 192.168.2.36 port 24007 already in peer list
[root@APP-SERVER1 ~]# gluster peer detach 192.168.2.36
[root@APP-SERVER1 ~]# echo $?
110
[root@APP-SERVER1 ~]# gluster peer status
Number of Peers: 1
Hostname: 192.168.2.36
Uuid: 5d16bbd9-065d-486d-9f3b-51d888a73fee
State: Probe Sent to Peer (Connected)
[root@APP-SERVER1 ~]# gluster peer detach 192.168.2.36
[root@APP-SERVER1 ~]# echo $?
110
[root@APP-SERVER1 ~]# gluster volume replace-brick dist 192.168.2.35:/export1 192.168.2.36:/export1 start
192.168.2.36, is not befriended at the moment
[root@APP-SERVER1 ~]# gluster peer probe 192.168.2.35
Probe on localhost not needed
[root@APP-SERVER1 ~]# gluster peer probe 192.168.2.36
Probe on host 192.168.2.36 port 24007 already in peer list
[root@APP-SERVER1 ~]# 
[root@APP-SERVER1 ~]# gluster volume replace-brick dist 192.168.2.35:/export1 192.168.2.36:/export1 start
192.168.2.36, is not befriended at the moment
[root@APP-SERVER1 ~]# gluster volume stop dist
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
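
The transcript above can be replayed as a script. What follows is a minimal sketch, assuming the same volume name and addresses as in the transcript; DRY_RUN, run and repro_steps are hypothetical helper names, not gluster features. By default it only prints the commands. It does not include the steps from Bug 797105, and since the crash was seen only occasionally, a single run is not guaranteed to trigger it.

```shell
#!/bin/sh
# Hedged sketch: replays the CLI sequence from this report.
# The values below mirror the transcript; adjust them for another cluster.
VOL=dist
LOCAL_HOST=192.168.2.35
NEW_PEER=192.168.2.36
SRC_BRICK=$LOCAL_HOST:/export1
DST_BRICK=$NEW_PEER:/export1

# Print the command; execute it only when DRY_RUN=0 (hypothetical switch).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
        echo "exit status: $?"
    fi
}

# Same order of commands as in "Steps to Reproduce".
repro_steps() {
    run gluster volume replace-brick "$VOL" "$SRC_BRICK" "$DST_BRICK" start
    run gluster peer probe "$NEW_PEER"
    run gluster peer probe "$NEW_PEER"
    run gluster peer detach "$NEW_PEER"
    run gluster peer status
    run gluster peer detach "$NEW_PEER"
    run gluster volume replace-brick "$VOL" "$SRC_BRICK" "$DST_BRICK" start
    run gluster peer probe "$LOCAL_HOST"
    run gluster peer probe "$NEW_PEER"
    run gluster volume replace-brick "$VOL" "$SRC_BRICK" "$DST_BRICK" start
    # In a real run this command prompts for confirmation; answer y as in the report.
    run gluster volume stop "$VOL"
}
```

Calling repro_steps with DRY_RUN unset lists the eleven commands without touching the cluster. Calling it with DRY_RUN=0 on a disposable test setup executes them and prints each exit status, which makes the 110 return codes seen above easy to spot.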

Additional info:
See Bug 797105. The steps from that bug were executed before the volume stop.

Comment 1 Amar Tumballi 2012-03-12 09:46:19 UTC
Please update these bugs with respect to 3.3.0qa27; they need to be worked on as per the target milestone set.

Comment 2 Kaushal 2012-04-16 13:18:14 UTC
Cannot reproduce this on the latest mainline. Closing as WORKSFORME.