This one is cosmetic IMO (low prio). Each time we update, glusterd segfaults in rcu_bp_register():

warning: /var/lib/glusterd/vols/vdisks-h1/trusted-vdisks-h1.tcp-fuse.vol saved as /var/lib/glusterd/vols/vdisks-h1/trusted-vdisks-h1.tcp-fuse.vol.rpmsave
warning: /var/lib/glusterd/vols/vdisks-h1/vdisks-h1-rebalance.vol saved as /var/lib/glusterd/vols/vdisks-h1/vdisks-h1-rebalance.vol.rpmsave
warning: /var/lib/glusterd/vols/vdisks-h1/vdisks-h1.tcp-fuse.vol saved as /var/lib/glusterd/vols/vdisks-h1/vdisks-h1.tcp-fuse.vol.rpmsave
librdmacm: Warning: couldn't read ABI version.
librdmacm: Warning: assuming: 4
librdmacm: Fatal: unable to get RDMA device list
/var/tmp/rpm-tmp.JUQVwZ: line 50: 28247 Segmentation fault   (core dumped) glusterd --xlator-option *.upgrade=on -N   <------------------
  Updating : vdsm-infra-4.17.0-783.git79781a1.el7.noarch      8/37
  Updating : vdsm-python-4.17.0-783.git79781a1.el7.noarch     9/37
  Updating : vdsm-xmlrpc-4.17.0-783.git79781a1.el7.noarch

== Stack Trace ==
(gdb) bt
#0  0x00007f62a1fe2c3b in rcu_bp_register () from /lib64/liburcu-bp.so.1
#1  0x00007f62a1fe2f7e in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#2  0x00007f62a28c18e0 in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7f62b00ff210, mydata=mydata@entry=0x7f62b010a5f0, event=event@entry=RPC_CLNT_CONNECT, data=data@entry=0x0) at glusterd-handler.c:4689
#3  0x00007f62a28b943c in glusterd_big_locked_notify (rpc=0x7f62b00ff210, mydata=0x7f62b010a5f0, event=RPC_CLNT_CONNECT, data=0x0, notify_fn=0x7f62a28c1890 <__glusterd_peer_rpc_notify>) at glusterd-handler.c:71
#4  0x00007f62ad838720 in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f62b00ff240, event=<optimized out>, data=<optimized out>) at rpc-clnt.c:926
#5  0x00007f62ad8345d3 in rpc_transport_notify (this=this@entry=0x7f62b010d840, event=event@entry=RPC_TRANSPORT_CONNECT, data=data@entry=0x7f62b010d840) at rpc-transport.c:543
#6  0x00007f62a02c5cf7 in socket_connect_finish (this=this@entry=0x7f62b010d840) at socket.c:2366
#7  0x00007f62a02cb28f in socket_event_handler (fd=fd@entry=11, idx=idx@entry=2, data=0x7f62b010d840, poll_in=0, poll_out=4, poll_err=0) at socket.c:2396
#8  0x00007f62adac632a in event_dispatch_epoll_handler (event=0x7f629e2cde80, event_pool=0x7f62b00310a0) at event-epoll.c:572
#9  event_dispatch_epoll_worker (data=0x7f62b003fa00) at event-epoll.c:674
#10 0x00007f62acbc7df5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f62ac5061ad in clone () from /lib64/libc.so.6
in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7f62b00ff210, mydata=mydata@entry=0x7f62b010a5f0, event=event@entry=RPC_CLNT_CONNECT, data=data@entry=0x0) at glusterd-handler.c:4689
4684            GF_FREE (peerctx->peername);
4685            GF_FREE (peerctx);
4686            return 0;
4687    }
4688
4689    rcu_read_lock ();    <----------------
4690
4691    peerinfo = glusterd_peerinfo_find_by_generation (peerctx->peerinfo_gen);
4692    if (!peerinfo) {
4693            /* Peerinfo should be available at this point. Not finding it

--- Called from ---

in glusterd_big_locked_notify (rpc=0x7f62b00ff210, mydata=0x7f62b010a5f0, event=RPC_CLNT_CONNECT, data=0x0, notify_fn=0x7f62a28c1890 <__glusterd_peer_rpc_notify>) at glusterd-handler.c:71
66      {
67              glusterd_conf_t *priv = THIS->private;
68              int ret = -1;
69
70              synclock_lock (&priv->big_lock);
71              ret = notify_fn (rpc, mydata, event, data);    <------
72              synclock_unlock (&priv->big_lock);
73
74              return ret;
75      }

--- Called from ---

#6  0x00007f62a02c5cf7 in socket_connect_finish (this=this@entry=0x7f62b010d840) at socket.c:2366
2361            }
2362    unlock:
2363            pthread_mutex_unlock (&priv->lock);
2364
2365            if (notify_rpc) {
2366                    rpc_transport_notify (this, event, this);    <----
2367            }
2368    out:
2369            return ret;
2370    }

--- Called from ---

#7  0x00007f62a02cb28f in socket_event_handler (fd=fd@entry=11, idx=idx@entry=2, data=0x7f62b010d840, poll_in=0, poll_out=4, poll_err=0) at socket.c:2396
(gdb) l
2391            {
2392                    priv->idx = idx;
2393            }
2394            pthread_mutex_unlock (&priv->lock);
2395
2396            ret = (priv->connected == 1) ? 0 : socket_connect_finish (this);    <-----
2397
2398            if (!ret && poll_out) {
2399                    ret = socket_event_poll_out (this);
2400            }
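For context on the top two frames: with the liburcu "bulletproof" (urcu-bp) flavour, the first rcu_read_lock() a thread issues registers that thread with the library via rcu_bp_register(), which is exactly where this crash lands. Below is a minimal usage sketch for illustration only — it is not the glusterd code, and the header name <urcu-bp.h> plus the -lurcu-bp link flag are assumptions about the installed liburcu packaging:

/* Minimal urcu-bp usage sketch -- illustration only, not glusterd code.
 * Assumed build: gcc urcu_bp_demo.c -o urcu_bp_demo -lurcu-bp */
#include <urcu-bp.h>
#include <stdio.h>

int main (void)
{
        /* The first rcu_read_lock () in a thread registers it with urcu-bp
         * internally (the rcu_bp_register () frame seen in the backtrace);
         * no explicit rcu_register_thread () call is needed with this flavour. */
        rcu_read_lock ();
        printf ("inside an RCU read-side critical section\n");
        rcu_read_unlock ();
        return 0;
}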
Could you provide information on which version you are updating from and which version you are updating to?
This bug is similar to bug https://bugzilla.redhat.com/show_bug.cgi?id=1209461.
As I can see from the mail thread on the gluster-devel mailing list, this bug is related to upstream, so I am moving it to the correct component.
Hi Christopher,

As Anand has already mentioned, this bug is most likely the same as bug 1209461. To verify, could you run `thread apply all bt` on the core file? If another thread's stack contains `cleanup_and_exit` or `exit`, then this bug is the same as 1209461 and we can close it as a duplicate. You can take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1209461#c4 for a sample backtrace and more explanation of the cause of the crash. We have a fix for this at http://review.gluster.org/10758, but it causes some other components to fail. We are working on it and should hopefully have the fix merged for the next release.
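In case it helps, a rough outline of how to check this on the core file — the binary and core paths below are placeholders, substitute the real ones:

$ gdb /usr/sbin/glusterd /path/to/core          # placeholder paths
(gdb) thread apply all bt
# Look through the output for a thread whose stack contains
# cleanup_and_exit () or exit (); if one is present, this is the
# same crash as bug 1209461.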
Hi Kaushal,

I lost the core file after reinstalling, and I'm now testing Gluster 3.7. Since it looks very similar to BZ 1209461 (and it's low prio), I will close this as a duplicate and reopen with a 'thread apply all bt' if I see it again.

*** This bug has been marked as a duplicate of bug 1209461 ***