Description of problem:
=======================
glusterd crashed on a few nodes, and the geo-replication status was Created/Active instead of Active/Passive. The geo-replication session had been started, and the session status showed the following:

[root@dhcp41-226 scripts]# gluster volume geo-replication master 10.70.41.160::slave status

MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.41.226    master        /rhs/brick3/b7    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.226    master        /rhs/brick1/b1    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.230    master        /rhs/brick2/b5    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.229    master        /rhs/brick2/b4    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.219    master        /rhs/brick2/b6    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.227    master        /rhs/brick3/b8    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.227    master        /rhs/brick1/b2    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.228    master        /rhs/brick3/b9    root          10.70.41.160::slave    10.70.41.160    Active     Changelog Crawl    2018-04-23 06:13:53
10.70.41.228    master        /rhs/brick1/b3    root          10.70.41.160::slave    10.70.42.79     Active     Changelog Crawl    2018-04-23 06:13:53

glusterd logs:
--------------
[2018-04-23 07:34:16.850166] E [mem-pool.c:307:__gf_free] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x419cf) [0x7f98a9e619cf] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x44ca5) [0x7f98a9e64ca5] -->/lib64/libglusterfs.so.0(__gf_free+0xac) [0x7f98b53e268c] ) 0-: Assertion failed: GF_MEM_HEADER_MAGIC == header->magic
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 6
time of crash:
2018-04-23 07:34:16
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f98b53ba4d0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f98b53c4414]
/lib64/libc.so.6(+0x36280)[0x7f98b3a19280]
/lib64/libc.so.6(gsignal+0x37)[0x7f98b3a19207]
/lib64/libc.so.6(abort+0x148)[0x7f98b3a1a8f8]
/lib64/libc.so.6(+0x78cc7)[0x7f98b3a5bcc7]
/lib64/libc.so.6(+0x7f574)[0x7f98b3a62574]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x44ca5)[0x7f98a9e64ca5]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x419cf)[0x7f98a9e619cf]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1bdc2)[0x7f98a9e3bdc2]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x23b6e)[0x7f98a9e43b6e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f98b53f3250]
/lib64/libc.so.6(+0x47fc0)[0x7f98b3a2afc0]
---------

Version-Release number of selected component (if applicable):
=============================================================
mainline

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Create a master and a slave cluster of 6 nodes each
2. Create and start the master volume (tiered: cold tier 1x(4+2), hot tier 1x3)
3. Create and start the slave volume (tiered: cold tier 1x(4+2), hot tier 1x3)
4. Enable quota on the master volume
5. Enable shared storage on the master volume
6. Set up a geo-rep session between the master and slave volumes
7. Mount the master volume on a client
8. Create data from the master client

Actual results:
===============
glusterd crashed on a few nodes.
The geo-rep session was in Created/Active state.

Expected results:
=================
glusterd should not crash.
A geo-rep session which has been started should be in Active/Passive state.
(gdb) bt
#0  0x00007f3fbd4d7e4d in __gf_free () from /lib64/libglusterfs.so.0
#1  0x00007f3fb1ff63de in gd_sync_task_begin () from /usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#2  0x00007f3fb1ff6c50 in glusterd_op_begin_synctask () from /usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#3  0x00007f3fb1fc3d98 in __glusterd_handle_gsync_set () from /usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#4  0x00007f3fb1f38b1e in glusterd_big_locked_handler () from /usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#5  0x00007f3fbd4e8ad0 in synctask_wrap () from /lib64/libglusterfs.so.0
#6  0x00007f3fbbb1ffc0 in ?? () from /lib64/libc.so.6
#7  0x0000000000000000 in ?? ()
(gdb)
REVIEW: https://review.gluster.org/19993 (glusterd/geo-rep: Fix glusterd crash) posted (#1) for review on master by Kotresh HR
COMMIT: https://review.gluster.org/19993 committed in master by "Amar Tumballi" <amarts> with a commit message:

glusterd/geo-rep: Fix glusterd crash

Using strdup instead of gf_strdup crashes during free if a mempool is
being used. gf_free checks the magic number in the header, which is not
set up when strdup is used.

fixes: bz#1576392
Change-Id: Iab36496554b838a036af9d863e3f5fd07fd9780e
Signed-off-by: Kotresh HR <khiremat>
REVISION POSTED: https://review.gluster.org/20019 (glusterd/geo-rep: Fix glusterd crash) posted (#2) for review on release-3.12 by Kotresh HR
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/