Bug 1038051 - glusterd may crash when volume is deleted after performing a rebalance operation
Summary: glusterd may crash when volume is deleted after performing a rebalance operation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-12-04 10:33 UTC by krishnan parthasarathi
Modified: 2015-11-03 23:05 UTC (History)

Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-04-17 11:52:01 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description krishnan parthasarathi 2013-12-04 10:33:56 UTC
Description of problem:
glusterd crashes when a volume is deleted after a rebalance operation on that volume was stopped or completed. This happens especially when one or more glusterd instances in the cluster go down and come back up, whether because a node went down or because of a network partition.

Version-Release number of selected component (if applicable):
master

How reproducible:
Nearly always.

Steps to Reproduce:
1. Create a volume.
2. Start the volume.
3. Create some data in the volume to ensure the rebalance operation doesn't finish too soon.
4. Start a rebalance operation.
5. Bring down one or more peers and bring them back up before the rebalance operation completes.
6. Keep performing other operations on the cluster. Within a matter of minutes, one of the glusterd instances in the cluster may have crashed.

Actual results:
One or more glusterd(s) may crash.

Expected results:
glusterd shouldn't crash.


Additional info:
Backtrace from the crash

pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-12-03 12:29:35
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1

/lib64/libc.so.6[0x30d3c32960]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/xlator/mgmt/glusterd.so(__glusterd_defrag_notify+0x1d0)[0x7f6a194fd550]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f6a194ad830]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x109)[0x7f6a1ccdf2e9]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f6a1ccdab78]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/rpc-transport/socket.so(+0x557c)[0x7f6a17d5057c]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/rpc-transport/socket.so(+0xa5b8)[0x7f6a17d555b8]
/usr/lib64/libglusterfs.so.0(+0x62327)[0x7f6a1cf4a327]
/usr/sbin/glusterd(main+0x6c7)[0x4069d7]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x30d3c1ecdd]
/usr/sbin/glusterd[0x404619]

Comment 1 krishnan parthasarathi 2013-12-04 10:39:29 UTC
Root cause analysis:
--------------------
Important observations that led to the resolution.
---------------------------------------------------
1) Backtrace has a rebalance notification call stack of a volume that has been deleted.

2) Glusterd logs indicate that there are two (or more) connections to the rebalance process. We know this from the following:

<log snip>

[2013-12-03 09:48:40.120897] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:40.120974] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:43.121833] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:43.121884] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:46.122443] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:46.122514] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now

</log snip>
We see two connections attempting to reconnect every 3 seconds, in tandem.

3) The following logs confirm that there is more than one rpc object (read: unix domain socket connection) to the same (dead) rebalance process:

<log snip>

====> [2013-12-03 09:48:36.495445] W [socket.c:522:__socket_rwv] 0-management: readv on /var/lib/glusterd/vols/vol1/rebalance/b2461aa6-24bb-4d70-b43b-d6a73ab84698.sock failed (No data available)
====> [2013-12-03 09:48:36.506183] W [socket.c:522:__socket_rwv] 0-management: readv on /var/lib/glusterd/vols/vol1/rebalance/b2461aa6-24bb-4d70-b43b-d6a73ab84698.sock failed (No data available)
[2013-12-03 09:48:36.506238] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=2236 max=1 total=1
[2013-12-03 09:48:36.506254] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=124 max=1 total=1

</log snip>

The arrows (====>) mark the evidence.

Now for the root cause:
Every time a peer (re)joins the cluster (for example, after a network partition or node reboot), each member of the cluster checks whether it needs to restart daemons for all the started volumes in the cluster.

Rebalance is one such daemon/process, and each member restarts it conditionally. However, the corresponding management connection to the rebalance process was being (re)created unconditionally, 'leaking' the previously created rpc objects. These 'ghost' rpc objects go into a reconnect loop, retrying once every 3 seconds. With the rebalance process stopped/dead and the volume itself stopped and deleted, they hold a stale reference to the deleted volume. Any access through this reference is a potential segmentation fault (SIGSEGV).
Finally, justice is delivered: glusterd segfaults, and thus we see the crash.
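
The leak described above, and the "create only if absent" guard named in the fix's title, can be illustrated with a minimal sketch. This is not glusterd's code: `struct rpc_conn`, `struct defrag_info`, `rpc_conn_new`, and both `defrag_connect_*` functions are hypothetical stand-ins, and `conn_created` exists only to make the leak countable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an rpc object (one management connection). */
struct rpc_conn {
    int id;
};

/* Hypothetical stand-in for per-volume rebalance state. */
struct defrag_info {
    struct rpc_conn *rpc; /* connection to the rebalance process, or NULL */
};

static int conn_created = 0; /* number of rpc objects allocated so far */

static struct rpc_conn *rpc_conn_new(void)
{
    struct rpc_conn *c = calloc(1, sizeof(*c));
    if (c)
        c->id = ++conn_created;
    return c;
}

/* Buggy pattern: every call overwrites the pointer, so the previous
 * object is leaked and, in the scenario above, keeps reconnecting. */
static int defrag_connect_unconditional(struct defrag_info *d)
{
    assert(d != NULL);
    d->rpc = rpc_conn_new();
    return d->rpc ? 0 : -1;
}

/* Guarded pattern: reuse the existing object if one is already present. */
static int defrag_connect_if_absent(struct defrag_info *d)
{
    assert(d != NULL);
    if (d->rpc)
        return 0; /* already connected, nothing to create */
    d->rpc = rpc_conn_new();
    return d->rpc ? 0 : -1;
}
```

Calling `defrag_connect_if_absent` once per peer (re)join leaves exactly one object per volume, whereas the unconditional variant allocates a new one on every call.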

Comment 2 Anand Avati 2013-12-04 10:50:17 UTC
REVIEW: http://review.gluster.org/6423 (glusterd: create rpc obj for rebalance only if absent) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)

Comment 3 Anand Avati 2013-12-04 10:52:55 UTC
REVIEW: http://review.gluster.org/6424 (glusterd: create rpc obj for rebalance only if absent) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)

Comment 4 Anand Avati 2013-12-04 11:19:45 UTC
REVIEW: http://review.gluster.org/6423 (glusterd: create rpc obj for rebalance only if absent) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)

Comment 5 Anand Avati 2013-12-04 11:21:44 UTC
REVIEW: http://review.gluster.org/6424 (glusterd: create rpc obj for rebalance only if absent) posted (#2) for review on release-3.5 by Krishnan Parthasarathi (kparthas)

Comment 6 Anand Avati 2013-12-04 18:07:49 UTC
COMMIT: http://review.gluster.org/6423 committed in master by Anand Avati (avati) 
------
commit e967e5c5ab42359b765d602abb439b579d7a7423
Author: Krishnan Parthasarathi <kparthas>
Date:   Wed Dec 4 15:55:01 2013 +0530

    glusterd: create rpc obj for rebalance only if absent
    
    Change-Id: Iff305023577ff92a8f43f24dafcf201f86805769
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6423
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>

Comment 7 Anand Avati 2013-12-06 00:44:24 UTC
COMMIT: http://review.gluster.org/6424 committed in release-3.5 by Anand Avati (avati) 
------
commit b58810f5df92873ddd658efaae1caddddce96ae2
Author: Krishnan Parthasarathi <kparthas>
Date:   Wed Dec 4 15:55:01 2013 +0530

    glusterd: create rpc obj for rebalance only if absent
    
    Change-Id: Iff305023577ff92a8f43f24dafcf201f86805769
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6424
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>

Comment 8 Anand Avati 2013-12-19 07:58:49 UTC
REVIEW: http://review.gluster.org/6525 (glusterd: ignore failure to stop a stopped service.) posted (#3) for review on master by Krishnan Parthasarathi (kparthas)

Comment 9 Anand Avati 2013-12-19 07:58:56 UTC
REVIEW: http://review.gluster.org/6521 (glusterd: make volinfo a refcnt'ed object.) posted (#4) for review on master by Krishnan Parthasarathi (kparthas)

Comment 10 Anand Avati 2013-12-19 07:59:04 UTC
REVIEW: http://review.gluster.org/6522 (glusterd: rebalance to ref volinfo before starting) posted (#4) for review on master by Krishnan Parthasarathi (kparthas)

Comment 11 Anand Avati 2013-12-19 11:46:01 UTC
COMMIT: http://review.gluster.org/6525 committed in master by Vijay Bellur (vbellur) 
------
commit 709d9247bb467b801814637bd181bc7cddd36cb5
Author: Krishnan Parthasarathi <kparthas>
Date:   Tue Dec 17 11:43:22 2013 +0530

    glusterd: ignore failure to stop a stopped service.
    
    kill(2) returns -1 with errno set to ESRCH when the pid of the process
    being killed doesn't exist. Failing glusterd_brick_stop on a stopped
    brick could result in volume-stop failing, in commit phase.
    This fix prevents that from happening.
    
    Change-Id: I00f46fa06e489a671efbb8e4119f545f8ccea329
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6525
    Reviewed-by: Vijaikumar Mallikarjuna <vmallika>
    Reviewed-by: Kaushal M <kaushal>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
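
The kill(2) behaviour that the commit message above relies on can be sketched as follows. This is only an illustration of treating ESRCH as success, assuming a hypothetical helper named `stop_service`; it is not the actual glusterd_brick_stop implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper: send `sig` to `pid`, treating "no such process"
 * as success because the service is already stopped. */
static int stop_service(pid_t pid, int sig)
{
    assert(pid > 0); /* avoid signalling process groups by accident */

    if (kill(pid, sig) == 0)
        return 0;  /* signal delivered */
    if (errno == ESRCH)
        return 0;  /* pid does not exist: nothing left to stop */
    return -1;     /* a real failure such as EPERM or EINVAL */
}
```

With this shape, stopping an already stopped service reports success, so a caller in a commit phase has no reason to fail on it.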

Comment 12 Anand Avati 2013-12-19 15:04:19 UTC
REVIEW: http://review.gluster.org/6521 (glusterd: make volinfo a refcnt'ed object.) posted (#5) for review on master by Krishnan Parthasarathi (kparthas)

Comment 13 Anand Avati 2013-12-19 15:04:29 UTC
REVIEW: http://review.gluster.org/6522 (glusterd: rebalance to ref volinfo before starting) posted (#5) for review on master by Krishnan Parthasarathi (kparthas)

Comment 14 Anand Avati 2013-12-20 08:55:50 UTC
COMMIT: http://review.gluster.org/6522 committed in master by Vijay Bellur (vbellur) 
------
commit 79d5a31279825bdc61ad036b30fbe7e41b76fe5e
Author: Krishnan Parthasarathi <kparthas>
Date:   Tue Dec 17 01:12:05 2013 +0530

    glusterd: rebalance to ref volinfo before starting
    
    Change-Id: Ib316897dcbd0748bfb3bfcda186b9fe30c07f80f
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6522
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>

Comment 15 Anand Avati 2013-12-20 09:17:58 UTC
COMMIT: http://review.gluster.org/6521 committed in master by Vijay Bellur (vbellur) 
------
commit 6fcc8df5956501bbb3687331ea518b231611856a
Author: Krishnan Parthasarathi <kparthas>
Date:   Mon Dec 16 10:29:19 2013 +0530

    glusterd: make volinfo a refcnt'ed object.
    
    Add glusterd_volinfo_remove(..) which removes @volinfo from the list
    of volumes in the cluster and performs an unref on @volinfo
    
    Change-Id: I5f546ca58f61bc334ab1bab4c51c4a21e1f66161
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6521
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>
    Reviewed-by: Vijay Bellur <vbellur>
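
The idea of a reference-counted volinfo can be sketched roughly as below. The names `struct volinfo`, `volinfo_new`, `volinfo_ref`, and `volinfo_unref` are hypothetical and the sketch is single-threaded; a real implementation would presumably need locking or atomic counters, and `volinfo_destroyed` is only a hook to observe when the object is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified volume object with a reference count. */
struct volinfo {
    int refcnt;
};

static int volinfo_destroyed = 0; /* counts objects actually freed */

/* A new object starts with one reference, owned by the volume list. */
static struct volinfo *volinfo_new(void)
{
    struct volinfo *v = calloc(1, sizeof(*v));
    if (v)
        v->refcnt = 1;
    return v;
}

/* Take an extra reference, e.g. before starting a rebalance. */
static struct volinfo *volinfo_ref(struct volinfo *v)
{
    assert(v != NULL && v->refcnt > 0);
    v->refcnt++;
    return v;
}

/* Drop a reference; free the object when the last one goes away.
 * Returns NULL once the object has been destroyed. */
static struct volinfo *volinfo_unref(struct volinfo *v)
{
    assert(v != NULL && v->refcnt > 0);
    if (--v->refcnt == 0) {
        free(v);
        volinfo_destroyed++;
        return NULL;
    }
    return v;
}
```

In this sketch, deleting a volume only drops the list's reference; the memory stays valid until a holder such as an in-flight rebalance notification drops its own reference too.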

Comment 16 Anand Avati 2013-12-23 09:00:15 UTC
REVIEW: http://review.gluster.org/6568 (glusterd: ignore failure to stop a stopped service.) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)

Comment 17 Anand Avati 2013-12-23 09:00:25 UTC
REVIEW: http://review.gluster.org/6569 (glusterd: make volinfo a refcnt'ed object.) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)

Comment 18 Anand Avati 2013-12-23 09:00:31 UTC
REVIEW: http://review.gluster.org/6570 (glusterd: rebalance to ref volinfo before starting) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)

Comment 19 Anand Avati 2013-12-23 14:59:16 UTC
COMMIT: http://review.gluster.org/6568 committed in release-3.5 by Vijay Bellur (vbellur) 
------
commit 0a99ef20b8e8f3486f5ada8e82e4634eb9fbf62b
Author: Krishnan Parthasarathi <kparthas>
Date:   Mon Dec 23 14:09:54 2013 +0530

    glusterd: ignore failure to stop a stopped service.
    
            Backport of http://review.gluster.org/6525
    
    kill(2) returns -1 with errno set to ESRCH when the pid of the process
    being killed doesn't exist. Failing glusterd_brick_stop on a stopped
    brick could result in volume-stop failing, in commit phase.
    This fix prevents that from happening.
    
    Change-Id: I00f46fa06e489a671efbb8e4119f545f8ccea329
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6568
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 20 Anand Avati 2013-12-23 15:00:42 UTC
COMMIT: http://review.gluster.org/6569 committed in release-3.5 by Vijay Bellur (vbellur) 
------
commit 94ed403ec213ee955acc55cc4a04f7c39470855b
Author: Krishnan Parthasarathi <kparthas>
Date:   Mon Dec 16 10:29:19 2013 +0530

    glusterd: make volinfo a refcnt'ed object.
    
            Backport of http://review.gluster.org/6521
    
    Add glusterd_volinfo_remove(..) which removes @volinfo from the list
    of volumes in the cluster and performs an unref on @volinfo
    
    Change-Id: I5f546ca58f61bc334ab1bab4c51c4a21e1f66161
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6569
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 21 Anand Avati 2013-12-23 15:00:50 UTC
COMMIT: http://review.gluster.org/6570 committed in release-3.5 by Vijay Bellur (vbellur) 
------
commit 592bf35c4b84df2693f3e113355d86316beaf26d
Author: Krishnan Parthasarathi <kparthas>
Date:   Tue Dec 17 01:12:05 2013 +0530

    glusterd: rebalance to ref volinfo before starting
    
            Backport of http://review.gluster.org/6522
    
    Change-Id: Ib316897dcbd0748bfb3bfcda186b9fe30c07f80f
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6570
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 22 Niels de Vos 2014-04-17 11:52:01 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

