Bug 1298524 - glusterd service crashed on restarting rpcbind
Summary: glusterd service crashed on restarting rpcbind
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Atin Mukherjee
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-01-14 10:38 UTC by Apeksha
Modified: 2016-09-17 16:43 UTC
CC: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-19 07:20:52 UTC
Target Upstream Version:


Attachments (Terms of Use)
glusterd log file (97.07 KB, text/plain)
2016-01-14 10:38 UTC, Apeksha
no flags

Description Apeksha 2016-01-14 10:38:32 UTC
Created attachment 1114749 [details]
glusterd log file

Description of problem:
glusterd service crashed on restarting rpcbind

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-16.el7rhgs.x86_64

How reproducible:


Steps to Reproduce:
1. Update glusterfs from the 3.7.5-15 build to 3.7.5-16
2. Start glusterd and start the volume
3. Set up ganesha on 4 nodes
4. Add the NLM port in the ganesha.conf file
5. Restart the rpcbind service on all 4 nodes; glusterd stops on all 4 nodes
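The restart step above can be sketched as a small dry-run script. The node names and the NLM port value are illustrative assumptions, not taken from this bug report:

```shell
#!/usr/bin/env bash
# Sketch of steps 4-5: pin the NLM port for ganesha, then restart rpcbind
# on every node. NODES and NLM_PORT are hypothetical values.
set -eu
NODES="node1 node2 node3 node4"
NLM_PORT=32803   # hypothetical port added to ganesha.conf (NFS_Core_Param block)

for node in $NODES; do
    # On a real cluster these would run over ssh; echoed here as a dry run.
    echo "ssh $node systemctl restart rpcbind"
done
echo "then check: systemctl status glusterd   # found stopped on all nodes"
```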

/var/log/glusterfs/etc-glusterfs-glusterd.vol.log
5/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fa47375dbd2] -->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fa4737fcd2a] ) 0-management: Lock for vol testvol not held
[2016-01-14 17:57:19.545973] W [MSGID: 106118] [glusterd-handler.c:5088:__glusterd_peer_rpc_notify] 0-management: Lock not released for testvol
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-01-14 17:57:19
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1

Actual results: glusterd crashed


Expected results: 


Additional info:

Comment 4 Atin Mukherjee 2016-01-14 12:02:15 UTC
I have learned that this was a layered installation and the vdsm package was not installed, so core_pattern was never set and we could not find the core file. Without the core we cannot analyse the cause of the crash. Please try to reproduce this bug with the vdsm package installed and let us know the behaviour. From the look of it, you may not hit the crash every time, so I'd suggest running the steps multiple times.
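For reference, a minimal sketch of setting core_pattern by hand when vdsm has not done it; the directory and pattern below are assumptions, not what vdsm actually configures:

```shell
# Run as root on the affected node. The pattern is illustrative:
# %e = executable name, %p = pid, %t = unix timestamp (see core(5)).
CORE_PATTERN='/var/tmp/core.%e.%p.%t'
ulimit -c unlimited 2>/dev/null || true
# sysctl needs root; errors are ignored so the sketch stays runnable.
sysctl -w kernel.core_pattern="$CORE_PATTERN" 2>/dev/null || true
echo "cores would land at: $CORE_PATTERN"
```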

Comment 5 surabhi 2016-01-19 06:50:42 UTC
While running automation tests on a CIFS mount multiple times, the following crash is seen; here is the backtrace:


(gdb) bt
#0  0x00007f55616460ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#1  0x00007f5561c15ac0 in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7f556eeb1530, mydata=mydata@entry=0x7f556ee83850, event=event@entry=RPC_CLNT_DISCONNECT, 
    data=data@entry=0x0) at glusterd-handler.c:5020
#2  0x00007f5561c0bb6c in glusterd_big_locked_notify (rpc=0x7f556eeb1530, mydata=0x7f556ee83850, event=RPC_CLNT_DISCONNECT, data=0x0, 
    notify_fn=0x7f5561c15a70 <__glusterd_peer_rpc_notify>) at glusterd-handler.c:71
#3  0x00007f556ce7fcf0 in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f556eeb1560, event=RPC_TRANSPORT_DISCONNECT, data=0x7f556eeb5630) at rpc-clnt.c:874
#4  0x00007f556ce7b913 in rpc_transport_notify (this=this@entry=0x7f556eeb5630, event=event@entry=RPC_TRANSPORT_DISCONNECT, data=data@entry=0x7f556eeb5630)
    at rpc-transport.c:545
#5  0x00007f555f0a5352 in socket_event_poll_err (this=0x7f556eeb5630) at socket.c:1151
#6  socket_event_handler (fd=fd@entry=13, idx=idx@entry=2, data=0x7f556eeb5630, poll_in=1, poll_out=0, poll_err=<optimized out>) at socket.c:2356
#7  0x00007f556d1128ca in event_dispatch_epoll_handler (event=0x7f555d0d2e80, event_pool=0x7f556ee31c90) at event-epoll.c:575
#8  event_dispatch_epoll_worker (data=0x7f556ee44fb0) at event-epoll.c:678
#9  0x00007f556bf19dc5 in start_thread (arg=0x7f555d0d3700) at pthread_create.c:308
#10 0x00007f556b8601cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

I am collecting the sosreports and will upload them soon, along with the core dump.
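A backtrace like the one above can be pulled from a core dump non-interactively with gdb in batch mode; the binary and core paths here are hypothetical:

```shell
# Batch-mode gdb over a glusterd core; BIN and CORE paths are assumptions.
BIN=/usr/sbin/glusterd
CORE=/var/tmp/core.glusterd.4242
if [ -r "$BIN" ] && [ -r "$CORE" ]; then
    gdb -batch -ex 'bt' -ex 'thread apply all bt' "$BIN" "$CORE"
else
    echo "need $BIN and $CORE from the affected node"
fi
```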

Comment 7 SATHEESARAN 2016-05-11 13:44:34 UTC
Removing the needinfo, as the bug was already closed.

