Bug 1442928 - Brick Multiplexing: Glusterd crashed when stopping volumes
Summary: Brick Multiplexing: Glusterd crashed when stopping volumes
Keywords:
Status: CLOSED DUPLICATE of bug 1238067
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Atin Mukherjee
QA Contact: Bala Konda Reddy M
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-04-18 06:15 UTC by Nag Pavan Chilakam
Modified: 2018-11-30 05:38 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-18 07:15:21 UTC
Embargoed:


Attachments
core (1.69 MB, application/x-gzip)
2017-04-18 06:15 UTC, Nag Pavan Chilakam

Description Nag Pavan Chilakam 2017-04-18 06:15:16 UTC
Created attachment 1272221
core

Description of problem:
=================
Glusterd crashed while stopping volumes.
I had about 20 volumes and was stopping them one after another, after hitting BZ# 1442787 - Brick Multiplexing: During Remove brick when glusterd of a node is stopped, the brick process gets disconnected from glusterd purview and hence losing multiplexing feature.

The backtrace here is different from the one in BZ# 1437957 - Brick Multiplexing: Glusterd crashed when stopping volumes, so I am raising a new BZ.




[root@dhcp35-130 ~]# file /core.19886 
/core.19886: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/usr/sbin/glusterd', platform: 'x86_64'
[root@dhcp35-130 ~]# gdb /usr/sbin/glusterd /core.19886 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.

warning: core file may not match specified executable file.
[New LWP 20094]
[New LWP 19891]
[New LWP 19890]
[New LWP 19887]
[New LWP 20093]
[New LWP 19889]
[New LWP 19888]
[New LWP 19886]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fcd547110ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 device-mapper-event-libs-1.02.135-1.el7_3.3.x86_64 device-mapper-libs-1.02.135-1.el7_3.3.x86_64 elfutils-libelf-0.166-2.el7.x86_64 elfutils-libs-0.166-2.el7.x86_64 glibc-2.17-157.el7_3.1.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.14.1-27.el7_3.x86_64 libattr-2.4.46-12.el7.x86_64 libblkid-2.23.2-33.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-9.el7.x86_64 libgcc-4.8.5-11.el7.x86_64 libselinux-2.5-6.el7.x86_64 libsepol-2.5-6.el7.x86_64 libuuid-2.23.2-33.el7.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 lvm2-libs-2.02.166-1.el7_3.3.x86_64 openssl-libs-1.0.1e-60.el7_3.1.x86_64 pcre-8.32-15.el7_2.1.x86_64 systemd-libs-219-30.el7_3.7.x86_64 userspace-rcu-0.7.9-2.el7rhgs.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) 
(gdb) bt
#0  0x00007fcd547110ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#1  0x00007fcd54ca990e in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7fcd40199290, mydata=mydata@entry=0x7fcd40198440, event=event@entry=RPC_CLNT_DISCONNECT, data=data@entry=0x0) at glusterd-handler.c:5807
#2  0x00007fcd54c9fc3c in glusterd_big_locked_notify (rpc=0x7fcd40199290, mydata=0x7fcd40198440, event=RPC_CLNT_DISCONNECT, data=0x0, notify_fn=0x7fcd54ca98c0 <__glusterd_peer_rpc_notify>)
    at glusterd-handler.c:69
#3  0x00007fcd5ff44a13 in rpc_clnt_handle_disconnect (conn=0x7fcd401992c0, clnt=0x7fcd40199290) at rpc-clnt.c:892
#4  rpc_clnt_notify (trans=<optimized out>, mydata=0x7fcd401992c0, event=<optimized out>, data=0x7fcd40199490) at rpc-clnt.c:955
#5  0x00007fcd5ff409f3 in rpc_transport_notify (this=this@entry=0x7fcd40199490, event=event@entry=RPC_TRANSPORT_DISCONNECT, data=data@entry=0x7fcd40199490) at rpc-transport.c:538
#6  0x00007fcd5210f822 in socket_event_poll_err (this=0x7fcd40199490) at socket.c:1184
#7  socket_event_handler (fd=<optimized out>, idx=6, data=0x7fcd40199490, poll_in=1, poll_out=0, poll_err=<optimized out>) at socket.c:2418
#8  0x00007fcd601d4e50 in event_dispatch_epoll_handler (event=0x7fcd4bffee80, event_pool=0x7fcd61ef6f00) at event-epoll.c:572
#9  event_dispatch_epoll_worker (data=0x7fcd61f4c490) at event-epoll.c:675
#10 0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007fcd5e91f73d in clone () from /lib64/libc.so.6
(gdb) t a a bt

Thread 8 (Thread 0x7fcd60654780 (LWP 19886)):
#0  0x00007fcd5efdbef7 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007fcd601d52e0 in event_dispatch_epoll (event_pool=0x7fcd61ef6f00) at event-epoll.c:759
#2  0x00007fcd6066ed95 in main (argc=5, argv=<optimized out>) at glusterfsd.c:2464

Thread 7 (Thread 0x7fcd56ffd700 (LWP 19888)):
#0  0x00007fcd5450af04 in _fini () from /lib64/liburcu-cds.so.1
#1  0x00007fcd60455878 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#2  0x00007fcd5e860a49 in __run_exit_handlers () from /lib64/libc.so.6
#3  0x00007fcd5e860a95 in exit () from /lib64/libc.so.6
#4  0x00007fcd60671e16 in cleanup_and_exit (signum=15) at glusterfsd.c:1342
#5  0x00007fcd60671f05 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2063
#6  0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fcd5e91f73d in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7fcd567fc700 (LWP 19889)):
#0  0x00007fcd5e8e666d in nanosleep () from /lib64/libc.so.6
#1  0x00007fcd5e8e6504 in sleep () from /lib64/libc.so.6
#2  0x00007fcd601a182d in pool_sweeper (arg=<optimized out>) at mem-pool.c:464
#3  0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fcd5e91f73d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7fcd50eff700 (LWP 20093)):
#0  0x00007fcd5efde6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fcd54d53c43 in hooks_worker (args=<optimized out>) at glusterd-hooks.c:531
#2  0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fcd5e91f73d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7fcd577fe700 (LWP 19887)):
#0  0x00007fcd5efe1bdd in nanosleep () from /lib64/libpthread.so.0
#1  0x00007fcd60188306 in gf_timer_proc (data=0x7fcd61efe770) at timer.c:176
#2  0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fcd5e91f73d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7fcd55ffb700 (LWP 19890)):
#0  0x00007fcd5efdea82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fcd601b3898 in syncenv_task (proc=proc@entry=0x7fcd61efefc0) at syncop.c:603
#2  0x00007fcd601b46e0 in syncenv_processor (thdata=0x7fcd61efefc0) at syncop.c:695
#3  0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fcd5e91f73d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fcd557fa700 (LWP 19891)):
#0  0x00007fcd5efdea82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fcd601b3898 in syncenv_task (proc=proc@entry=0x7fcd61eff380) at syncop.c:603
#2  0x00007fcd601b46e0 in syncenv_processor (thdata=0x7fcd61eff380) at syncop.c:695
#3  0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fcd5e91f73d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fcd4bfff700 (LWP 20094)):
#0  0x00007fcd547110ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#1  0x00007fcd54ca990e in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7fcd40199290, mydata=mydata@entry=0x7fcd40198440, event=event@entry=RPC_CLNT_DISCONNECT, data=data@entry=0x0) at glusterd-handler.c:5807
#2  0x00007fcd54c9fc3c in glusterd_big_locked_notify (rpc=0x7fcd40199290, mydata=0x7fcd40198440, event=RPC_CLNT_DISCONNECT, data=0x0, notify_fn=0x7fcd54ca98c0 <__glusterd_peer_rpc_notify>)
    at glusterd-handler.c:69
#3  0x00007fcd5ff44a13 in rpc_clnt_handle_disconnect (conn=0x7fcd401992c0, clnt=0x7fcd40199290) at rpc-clnt.c:892
#4  rpc_clnt_notify (trans=<optimized out>, mydata=0x7fcd401992c0, event=<optimized out>, data=0x7fcd40199490) at rpc-clnt.c:955
#5  0x00007fcd5ff409f3 in rpc_transport_notify (this=this@entry=0x7fcd40199490, event=event@entry=RPC_TRANSPORT_DISCONNECT, data=data@entry=0x7fcd40199490) at rpc-transport.c:538
#6  0x00007fcd5210f822 in socket_event_poll_err (this=0x7fcd40199490) at socket.c:1184
#7  socket_event_handler (fd=<optimized out>, idx=6, data=0x7fcd40199490, poll_in=1, poll_out=0, poll_err=<optimized out>) at socket.c:2418
#8  0x00007fcd601d4e50 in event_dispatch_epoll_handler (event=0x7fcd4bffee80, event_pool=0x7fcd61ef6f00) at event-epoll.c:572
#9  event_dispatch_epoll_worker (data=0x7fcd61f4c490) at event-epoll.c:675
#10 0x00007fcd5efdadc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007fcd5e91f73d in clone () from /lib64/libc.so.6




Version-Release number of selected component (if applicable):
======
3.8.4-22

Comment 2 Atin Mukherjee 2017-04-18 07:15:21 UTC

*** This bug has been marked as a duplicate of bug 1238067 ***

