Description of problem:
glusterd crashes when a volume is deleted after a rebalance operation on it was stopped or completed. This happens especially when one or more glusterd(s) in the cluster go down and come back up, whether because a node went down or because of a network partition.

Version-Release number of selected component (if applicable):
master

How reproducible:
Nearly always.

Steps to Reproduce:
1. Create a volume
2. Start the volume
3. Create some data in the volume to ensure the rebalance operation doesn't finish too soon
4. Start a rebalance operation
5. Bring down one or more peers and bring them back up before the rebalance operation completes.
6. Keep performing other operations on the cluster. Within a matter of minutes, one of the glusterd(s) in the cluster may have crashed.

Actual results:
One or more glusterd(s) may crash.

Expected results:
glusterd shouldn't crash.

Additional info:
Backtrace from the crash

pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-12-03 12:29:35
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
/lib64/libc.so.6[0x30d3c32960]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/xlator/mgmt/glusterd.so(__glusterd_defrag_notify+0x1d0)[0x7f6a194fd550]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f6a194ad830]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x109)[0x7f6a1ccdf2e9]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f6a1ccdab78]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/rpc-transport/socket.so(+0x557c)[0x7f6a17d5057c]
/usr/lib64/glusterfs/3.4.0.44.1u2rhs/rpc-transport/socket.so(+0xa5b8)[0x7f6a17d555b8]
/usr/lib64/libglusterfs.so.0(+0x62327)[0x7f6a1cf4a327]
/usr/sbin/glusterd(main+0x6c7)[0x4069d7]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x30d3c1ecdd]
/usr/sbin/glusterd[0x404619]
Root cause analysis:
--------------------

Important observations that lead to the resolution.
---------------------------------------------------

1) The backtrace shows a rebalance notification call stack for a volume that has been deleted.

2) The glusterd logs indicate that there are two (or more) connections to the rebalance process. We know this from the following:

<log snip>
[2013-12-03 09:48:40.120897] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:40.120974] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:43.121833] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:43.121884] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:46.122443] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-03 09:48:46.122514] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
</log snip>

We see two connections attempting to reconnect every 3 seconds, in tandem.

3) The following logs show that there is more than one rpc object (read: unix domain socket connection) to the same (dead) rebalance process:

<log snip>
====> [2013-12-03 09:48:36.495445] W [socket.c:522:__socket_rwv] 0-management: readv on /var/lib/glusterd/vols/vol1/rebalance/b2461aa6-24bb-4d70-b43b-d6a73ab84698.sock failed (No data available)
====> [2013-12-03 09:48:36.506183] W [socket.c:522:__socket_rwv] 0-management: readv on /var/lib/glusterd/vols/vol1/rebalance/b2461aa6-24bb-4d70-b43b-d6a73ab84698.sock failed (No data available)
[2013-12-03 09:48:36.506238] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=2236 max=1 total=1
[2013-12-03 09:48:36.506254] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=124 max=1 total=1
</log snip>

The arrow marks highlight the evidence: two readv failures on the same rebalance socket path.
Now for the root cause.

Every time a peer (re)joins the cluster (probably after a network partition or a node reboot), each member of the cluster checks whether it needs to restart daemons for all the started volumes in the cluster. Rebalance is one such daemon/process, and each member restarts it conditionally. However, the corresponding management connection to the rebalance process was being (re)created unconditionally, 'leaking' the previously created rpc objects.

These 'ghost' rpc objects go into a reconnect loop, once every 3 seconds. With the rebalance process stopped/dead and the volume itself stopped and deleted, they hold a stale reference to the deleted volume. Any access through this reference is a potential segmentation fault (SIGSEGV). Eventually one of the ghost connections delivers a notification for the deleted volume, glusterd dereferences the stale pointer and segfaults, and thus we see the crash.
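
For illustration, the guard implied by the title of the patch posted for this bug ("create rpc obj for rebalance only if absent") could look roughly like the sketch below. This is not the glusterd code: struct demo_rpc, struct demo_defrag and demo_rebalance_rpc_create() are hypothetical stand-ins, and the real rpc setup is replaced by a plain allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for glusterd's rebalance state; not the real structs. */
struct demo_rpc {
    int id;
};

struct demo_defrag {
    struct demo_rpc *rpc;  /* management connection to the rebalance process */
    int rpc_created;       /* how many rpc objects were ever created */
};

/*
 * Create the management connection only if none exists yet, so that a
 * peer (re)join cannot leak a previously created rpc object.
 * Returns 0 on success, -1 on allocation failure.
 */
static int demo_rebalance_rpc_create(struct demo_defrag *defrag)
{
    assert(defrag != NULL);

    if (defrag->rpc != NULL)
        return 0;          /* already present: reuse it, create nothing */

    defrag->rpc = calloc(1, sizeof(*defrag->rpc));
    if (defrag->rpc == NULL)
        return -1;

    defrag->rpc_created++;
    defrag->rpc->id = defrag->rpc_created;
    return 0;
}
```

Calling demo_rebalance_rpc_create() once per peer (re)join then leaves exactly one rpc object per rebalance process, however many times peers come and go.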
REVIEW: http://review.gluster.org/6423 (glusterd: create rpc obj for rebalance only if absent) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6424 (glusterd: create rpc obj for rebalance only if absent) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6423 (glusterd: create rpc obj for rebalance only if absent) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6424 (glusterd: create rpc obj for rebalance only if absent) posted (#2) for review on release-3.5 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/6423 committed in master by Anand Avati (avati)
------
commit e967e5c5ab42359b765d602abb439b579d7a7423
Author: Krishnan Parthasarathi <kparthas>
Date: Wed Dec 4 15:55:01 2013 +0530

    glusterd: create rpc obj for rebalance only if absent

    Change-Id: Iff305023577ff92a8f43f24dafcf201f86805769
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6423
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>

COMMIT: http://review.gluster.org/6424 committed in release-3.5 by Anand Avati (avati)
------
commit b58810f5df92873ddd658efaae1caddddce96ae2
Author: Krishnan Parthasarathi <kparthas>
Date: Wed Dec 4 15:55:01 2013 +0530

    glusterd: create rpc obj for rebalance only if absent

    Change-Id: Iff305023577ff92a8f43f24dafcf201f86805769
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6424
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
REVIEW: http://review.gluster.org/6525 (glusterd: ignore failure to stop a stopped service.) posted (#3) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6521 (glusterd: make volinfo a refcnt'ed object.) posted (#4) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6522 (glusterd: rebalance to ref volinfo before starting) posted (#4) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/6525 committed in master by Vijay Bellur (vbellur)
------
commit 709d9247bb467b801814637bd181bc7cddd36cb5
Author: Krishnan Parthasarathi <kparthas>
Date: Tue Dec 17 11:43:22 2013 +0530

    glusterd: ignore failure to stop a stopped service.

    kill(2) returns -1 with errno set to ESRCH when the pid of the process being killed doesn't exist. Failing glusterd_brick_stop on a stopped brick could result in volume-stop failing, in commit phase. This fix prevents that from happening.

    Change-Id: I00f46fa06e489a671efbb8e4119f545f8ccea329
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6525
    Reviewed-by: Vijaikumar Mallikarjuna <vmallika>
    Reviewed-by: Kaushal M <kaushal>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
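
As a rough sketch of the behaviour that commit message describes, a stop helper can treat ESRCH from kill(2) as "already stopped" rather than as a failure. demo_service_stop() below is a hypothetical name and signature, not the glusterd function; only kill(2) and errno are real APIs.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send 'sig' to 'pid' and report success if the
 * process was signalled or if it no longer exists.
 * Returns 0 on success, -1 on any other kill(2) failure (e.g. EPERM).
 */
static int demo_service_stop(pid_t pid, int sig)
{
    assert(pid > 0);       /* pid <= 0 addresses process groups, not one service */

    if (kill(pid, sig) == 0)
        return 0;          /* signal delivered */

    if (errno == ESRCH)
        return 0;          /* no such process: the service is already stopped */

    return -1;             /* a genuine failure; the caller should report it */
}
```

With this shape, stopping an already-stopped brick no longer propagates an error up into the volume-stop commit phase, while permission errors still surface.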
REVIEW: http://review.gluster.org/6521 (glusterd: make volinfo a refcnt'ed object.) posted (#5) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6522 (glusterd: rebalance to ref volinfo before starting) posted (#5) for review on master by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/6522 committed in master by Vijay Bellur (vbellur)
------
commit 79d5a31279825bdc61ad036b30fbe7e41b76fe5e
Author: Krishnan Parthasarathi <kparthas>
Date: Tue Dec 17 01:12:05 2013 +0530

    glusterd: rebalance to ref volinfo before starting

    Change-Id: Ib316897dcbd0748bfb3bfcda186b9fe30c07f80f
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6522
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>

COMMIT: http://review.gluster.org/6521 committed in master by Vijay Bellur (vbellur)
------
commit 6fcc8df5956501bbb3687331ea518b231611856a
Author: Krishnan Parthasarathi <kparthas>
Date: Mon Dec 16 10:29:19 2013 +0530

    glusterd: make volinfo a refcnt'ed object.

    Add glusterd_volinfo_remove(..) which removes @volinfo from the list of volumes in the cluster and performs an unref on @volinfo

    Change-Id: I5f546ca58f61bc334ab1bab4c51c4a21e1f66161
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6521
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>
    Reviewed-by: Vijay Bellur <vbellur>
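
To show how a reference count keeps a volume object alive while rebalance still points at it, here is a minimal single-threaded sketch. All demo_* names are hypothetical, the 'destroyed' flag exists only so the moment of destruction can be observed, and real code would need to protect the counter with a lock or atomics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted volinfo; not the glusterd struct. */
struct demo_volinfo {
    int  refcnt;     /* number of holders currently using the object */
    int *destroyed;  /* set to 1 on destruction; for illustration only */
};

/* Allocate a volume object holding one reference (the volume list's). */
static struct demo_volinfo *demo_volinfo_new(int *destroyed)
{
    struct demo_volinfo *v = calloc(1, sizeof(*v));
    if (v == NULL)
        return NULL;
    v->refcnt = 1;
    v->destroyed = destroyed;
    return v;
}

/* Take an extra reference, e.g. before starting rebalance. */
static struct demo_volinfo *demo_volinfo_ref(struct demo_volinfo *v)
{
    assert(v != NULL && v->refcnt > 0);
    v->refcnt++;
    return v;
}

/* Drop a reference; frees the object and returns NULL when it was the last. */
static struct demo_volinfo *demo_volinfo_unref(struct demo_volinfo *v)
{
    assert(v != NULL && v->refcnt > 0);
    if (--v->refcnt > 0)
        return v;          /* someone else still holds it */
    if (v->destroyed != NULL)
        *v->destroyed = 1;
    free(v);
    return NULL;
}
```

In this model, deleting the volume only drops the list's reference; the memory is released when the last holder (for instance a rebalance connection) drops its own, so a late notification never sees freed memory.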
REVIEW: http://review.gluster.org/6568 (glusterd: ignore failure to stop a stopped service.) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6569 (glusterd: make volinfo a refcnt'ed object.) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/6570 (glusterd: rebalance to ref volinfo before starting) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/6568 committed in release-3.5 by Vijay Bellur (vbellur)
------
commit 0a99ef20b8e8f3486f5ada8e82e4634eb9fbf62b
Author: Krishnan Parthasarathi <kparthas>
Date: Mon Dec 23 14:09:54 2013 +0530

    glusterd: ignore failure to stop a stopped service.

    Backport of http://review.gluster.org/6525

    kill(2) returns -1 with errno set to ESRCH when the pid of the process being killed doesn't exist. Failing glusterd_brick_stop on a stopped brick could result in volume-stop failing, in commit phase. This fix prevents that from happening.

    Change-Id: I00f46fa06e489a671efbb8e4119f545f8ccea329
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6568
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

COMMIT: http://review.gluster.org/6569 committed in release-3.5 by Vijay Bellur (vbellur)
------
commit 94ed403ec213ee955acc55cc4a04f7c39470855b
Author: Krishnan Parthasarathi <kparthas>
Date: Mon Dec 16 10:29:19 2013 +0530

    glusterd: make volinfo a refcnt'ed object.

    Backport of http://review.gluster.org/6521

    Add glusterd_volinfo_remove(..) which removes @volinfo from the list of volumes in the cluster and performs an unref on @volinfo

    Change-Id: I5f546ca58f61bc334ab1bab4c51c4a21e1f66161
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6569
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

COMMIT: http://review.gluster.org/6570 committed in release-3.5 by Vijay Bellur (vbellur)
------
commit 592bf35c4b84df2693f3e113355d86316beaf26d
Author: Krishnan Parthasarathi <kparthas>
Date: Tue Dec 17 01:12:05 2013 +0530

    glusterd: rebalance to ref volinfo before starting

    Backport of http://review.gluster.org/6522

    Change-Id: Ib316897dcbd0748bfb3bfcda186b9fe30c07f80f
    BUG: 1038051
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6570
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user