Bug 1220623 - Seg. Fault during yum update
Summary: Seg. Fault during yum update
Keywords:
Status: CLOSED DUPLICATE of bug 1209461
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-05-12 03:21 UTC by Christopher Pereira
Modified: 2015-05-25 21:33 UTC
CC List: 6 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-05-25 21:33:03 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Christopher Pereira 2015-05-12 03:21:32 UTC
This one is cosmetic IMO (low prio).

Each time we update, glusterd segfaults in rcu_bp_register():

    warning: /var/lib/glusterd/vols/vdisks-h1/trusted-vdisks-h1.tcp-fuse.vol saved as /var/lib/glusterd/vols/vdisks-h1/trusted-vdisks-h1.tcp-fuse.vol.rpmsave
    warning: /var/lib/glusterd/vols/vdisks-h1/vdisks-h1-rebalance.vol saved as /var/lib/glusterd/vols/vdisks-h1/vdisks-h1-rebalance.vol.rpmsave
    warning: /var/lib/glusterd/vols/vdisks-h1/vdisks-h1.tcp-fuse.vol saved as /var/lib/glusterd/vols/vdisks-h1/vdisks-h1.tcp-fuse.vol.rpmsave
    librdmacm: Warning: couldn't read ABI version.
    librdmacm: Warning: assuming: 4
    librdmacm: Fatal: unable to get RDMA device list
    /var/tmp/rpm-tmp.JUQVwZ: line 50: 28247 Segmentation fault      (core dumped) glusterd --xlator-option *.upgrade=on -N <------------------
      Updating   : vdsm-infra-4.17.0-783.git79781a1.el7.noarch                                                                                                         8/37
      Updating   : vdsm-python-4.17.0-783.git79781a1.el7.noarch                                                                                                        9/37
      Updating   : vdsm-xmlrpc-4.17.0-783.git79781a1.el7.noarch  

== Stack Trace ==

(gdb) bt
#0  0x00007f62a1fe2c3b in rcu_bp_register () from /lib64/liburcu-bp.so.1
#1  0x00007f62a1fe2f7e in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#2  0x00007f62a28c18e0 in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7f62b00ff210, mydata=mydata@entry=0x7f62b010a5f0, event=event@entry=RPC_CLNT_CONNECT,
    data=data@entry=0x0) at glusterd-handler.c:4689
#3  0x00007f62a28b943c in glusterd_big_locked_notify (rpc=0x7f62b00ff210, mydata=0x7f62b010a5f0, event=RPC_CLNT_CONNECT, data=0x0,
    notify_fn=0x7f62a28c1890 <__glusterd_peer_rpc_notify>) at glusterd-handler.c:71
#4  0x00007f62ad838720 in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f62b00ff240, event=<optimized out>, data=<optimized out>) at rpc-clnt.c:926
#5  0x00007f62ad8345d3 in rpc_transport_notify (this=this@entry=0x7f62b010d840, event=event@entry=RPC_TRANSPORT_CONNECT, data=data@entry=0x7f62b010d840)
    at rpc-transport.c:543
#6  0x00007f62a02c5cf7 in socket_connect_finish (this=this@entry=0x7f62b010d840) at socket.c:2366
#7  0x00007f62a02cb28f in socket_event_handler (fd=fd@entry=11, idx=idx@entry=2, data=0x7f62b010d840, poll_in=0, poll_out=4, poll_err=0) at socket.c:2396
#8  0x00007f62adac632a in event_dispatch_epoll_handler (event=0x7f629e2cde80, event_pool=0x7f62b00310a0) at event-epoll.c:572
#9  event_dispatch_epoll_worker (data=0x7f62b003fa00) at event-epoll.c:674
#10 0x00007f62acbc7df5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f62ac5061ad in clone () from /lib64/libc.so.6
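
For reference on frames #0 and #1: glusterd uses the "bulletproof" URCU flavour (liburcu-bp), where a thread is registered as an RCU reader lazily, on its first rcu_read_lock(); that lazy registration is the rcu_bp_register() frame at the top of the trace. Below is a minimal standalone sketch of that behaviour (not glusterd code; the build command is an assumption based on a typical userspace-rcu install):

    #include <stdio.h>
    #include <pthread.h>
    #include <urcu-bp.h>   /* "bulletproof" flavour: reader threads register themselves */

    static void *worker(void *arg)
    {
            (void)arg;
            /* No explicit rcu_register_thread() is needed with urcu-bp:
             * the first rcu_read_lock() in a thread calls rcu_bp_register()
             * internally to add the thread to the reader registry.  That is
             * the frame where glusterd crashed above. */
            rcu_read_lock();
            /* ... read RCU-protected data here ... */
            rcu_read_unlock();
            return NULL;
    }

    int main(void)
    {
            pthread_t t;
            pthread_create(&t, NULL, worker, NULL);
            pthread_join(t, NULL);
            printf("reader thread registered and exited cleanly\n");
            return 0;
    }

    /* Build sketch (assumed): gcc -o bp-demo bp-demo.c -lurcu-bp -lpthread */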

Comment 1 Christopher Pereira 2015-05-12 03:34:14 UTC
in __glusterd_peer_rpc_notify (
    rpc=rpc@entry=0x7f62b00ff210, mydata=mydata@entry=0x7f62b010a5f0,
    event=event@entry=RPC_CLNT_CONNECT, data=data@entry=0x0)
    at glusterd-handler.c:4689
4684                    GF_FREE (peerctx->peername);
4685                    GF_FREE (peerctx);
4686                    return 0;
4687            }
4688
4689            rcu_read_lock (); <----------------
4690
4691            peerinfo = glusterd_peerinfo_find_by_generation (peerctx->peerinfo_gen);
4692            if (!peerinfo) {
4693                    /* Peerinfo should be available at this point. Not finding it

 --- Called from ---

in glusterd_big_locked_notify (rpc=0x7f62b00ff210,
    mydata=0x7f62b010a5f0, event=RPC_CLNT_CONNECT, data=0x0,
    notify_fn=0x7f62a28c1890 <__glusterd_peer_rpc_notify>)
    at glusterd-handler.c:71

66      {
67              glusterd_conf_t *priv = THIS->private;
68              int              ret   = -1;
69
70              synclock_lock (&priv->big_lock);
71              ret = notify_fn (rpc, mydata, event, data); <------
72              synclock_unlock (&priv->big_lock);
73
74              return ret;
75      }

 --- Called from ---

#6  0x00007f62a02c5cf7 in socket_connect_finish (
    this=this@entry=0x7f62b010d840) at socket.c:2366

2361            }
2362    unlock:
2363            pthread_mutex_unlock (&priv->lock);
2364
2365            if (notify_rpc) {
2366                    rpc_transport_notify (this, event, this); <----
2367            }
2368    out:
2369            return ret;
2370    }

 --- Called from ---

#7  0x00007f62a02cb28f in socket_event_handler (fd=fd@entry=11,
    idx=idx@entry=2, data=0x7f62b010d840, poll_in=0, poll_out=4, poll_err=0)
    at socket.c:2396
(gdb) l
2391            {
2392                    priv->idx = idx;
2393            }
2394            pthread_mutex_unlock (&priv->lock);
2395
2396            ret = (priv->connected == 1) ? 0 : socket_connect_finish(this); <-----
2397
2398            if (!ret && poll_out) {
2399                    ret = socket_event_poll_out (this);
2400            }
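
Condensed, the call chain in the excerpts above has this shape: the epoll worker delivers the connect event, the notify wrapper serialises the callback under the big lock, and the peer handler then enters an RCU read-side critical section, which is where the crash occurs. A sketch of that pattern, with the glusterd/RPC types replaced by stand-ins (the real code is in socket.c, rpc-clnt.c and glusterd-handler.c as quoted):

    #include <pthread.h>
    #include <urcu-bp.h>

    typedef int (*notify_fn_t) (void *rpc, void *mydata, int event, void *data);

    /* Stand-in for priv->big_lock (a synclock_t in glusterd). */
    static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Shape of __glusterd_peer_rpc_notify(): enter an RCU read-side
     * critical section before looking up the peer by generation number.
     * The rcu_read_lock() call is glusterd-handler.c:4689, the crash site. */
    static int peer_rpc_notify (void *rpc, void *mydata, int event, void *data)
    {
            rcu_read_lock ();
            /* peerinfo = glusterd_peerinfo_find_by_generation (...); */
            rcu_read_unlock ();
            return 0;
    }

    /* Shape of glusterd_big_locked_notify(): take the big lock, then
     * dispatch to the real handler (glusterd-handler.c:71). */
    static int big_locked_notify (void *rpc, void *mydata, int event, void *data,
                                  notify_fn_t notify_fn)
    {
            int ret;
            pthread_mutex_lock (&big_lock);
            ret = notify_fn (rpc, mydata, event, data);
            pthread_mutex_unlock (&big_lock);
            return ret;
    }

    int main (void)
    {
            /* The epoll worker would invoke this for RPC_CLNT_CONNECT. */
            return big_locked_notify (NULL, NULL, 1 /* connect */, NULL,
                                      peer_rpc_notify);
    }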

Comment 2 SATHEESARAN 2015-05-12 03:40:13 UTC
Could you provide information on which version you are updating from and which version you are updating to?

Comment 3 Anand Nekkunti 2015-05-12 05:11:34 UTC
This bug is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1209461.

Comment 5 SATHEESARAN 2015-05-19 14:56:35 UTC
As I see from the mail thread on the gluster-devel mailing list, this bug is related to upstream, so I am moving it to the correct component.

Comment 6 Kaushal 2015-05-20 07:16:54 UTC
Hi Christopher,

As Anand has already mentioned, this bug is most likely the same as 1209461. To verify, could you run `thread apply all bt` on the core file? If another thread's stack contains `cleanup_and_exit` or `exit`, then this bug is the same as 1209461, and we can close this as a duplicate. You can take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1209461#c4 for a sample backtrace and more explanation of the cause of the crash.

We have a fix for this at http://review.gluster.org/10758 but it causes some other components to fail. We are working on it, and should hopefully have the fix merged for the next release.
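
For readers hitting this later, the suspected race is between process teardown and a still-running event thread. Here is a schematic sketch of that race class, under the assumption that this crash matches bug 1209461 (one thread calls exit() during the `--xlator-option *.upgrade=on -N` run while an epoll worker takes its first RCU read lock); it illustrates the window and is not a reliable reproducer of the glusterd crash:

    #include <stdlib.h>
    #include <unistd.h>
    #include <pthread.h>
    #include <urcu-bp.h>

    static void *event_worker (void *arg)
    {
            (void)arg;
            /* Stands in for the epoll worker delivering RPC_CLNT_CONNECT:
             * the first rcu_read_lock() here triggers rcu_bp_register(). */
            for (;;) {
                    rcu_read_lock ();
                    rcu_read_unlock ();
            }
            return NULL;
    }

    int main (void)
    {
            pthread_t t;
            pthread_create (&t, NULL, event_worker, NULL);

            /* Stands in for cleanup_and_exit() in the upgrade run: exit()
             * begins process teardown (atexit handlers, library destructors)
             * while the worker thread is still running, opening the window
             * for the segfault seen in the description. */
            usleep (1000);
            exit (EXIT_SUCCESS);
    }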

Comment 7 Christopher Pereira 2015-05-25 21:33:03 UTC
Hi Kaushal,

I lost the core file after reinstalling.
I'm now testing Gluster 3.7.
Since it looks pretty similar to BZ 1209461 (and it's low prio), I will close it as a DUP and reopen it with a `thread apply all bt` if I see it again.

*** This bug has been marked as a duplicate of bug 1209461 ***

