Created attachment 1481296 [details]
Coredump of glusterd (on master server)

Description of problem:
The Gluster daemon (glusterd) on the master servers crashes when a geo-replication session is created.

Version-Release number of selected component (if applicable):
4.1.3

How reproducible:
Always

Steps to Reproduce:

Master volume setup:
# gluster v create VOLUME2 replica 3 arbiter 1 transport tcp clrv0000110367:/gluster/VOLUME2/export clrv0000110371:/gluster/VOLUME2/export clrv0000110389:/gluster/VOLUME2/export
# gluster v start VOLUME2
# gluster volume set all cluster.enable-shared-storage enable

Slave volume setup:
# gluster v create VOLUME2 replica 3 arbiter 1 transport tcp clrv0000110605:/gluster/VOLUME2/export clrv0000110608:/gluster/VOLUME2/export clrv0000110606:/gluster/VOLUME2/export
# gluster v start VOLUME2
# gluster volume set all cluster.enable-shared-storage enable

On the master server:
# ssh-keygen (accepting all defaults)
# ssh-copy-id clrv0000110605 (one of the slave servers)
# gluster-georep-sshkey generate
# gluster volume geo-replication VOLUME2 clrv0000110605.ic.ing.net::VOLUME2 create push-pem

Actual results:
glusterd crashes on the master servers.

/var/log/glusterfs/glusterd.log has:
----
[2018-09-06 12:43:13.426622] E [mem-pool.c:335:__gf_free] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e) [0x7f1e1bf3536e] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x20e10) [0x7f1e1bf31e10] -->/lib64/libglusterfs.so.0(__gf_free+0x104) [0x7f1e274d44f4] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
[2018-09-06 12:43:13.452244] E [mem-pool.c:335:__gf_free] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e) [0x7f1e1bf3536e] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x20e10) [0x7f1e1bf31e10] -->/lib64/libglusterfs.so.0(__gf_free+0x104) [0x7f1e274d44f4] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
The message "I [MSGID: 106584] [glusterd-handler.c:5904:__glusterd_handle_get_state] 0-management: Received request to get state for glusterd" repeated 3 times between [2018-09-06 12:42:30.427752] and [2018-09-06 12:43:13.451952]
[2018-09-06 12:43:15.475145] I [MSGID: 106495] [glusterd-handler.c:3073:__glusterd_handle_getwd] 0-glusterd: Received getwd req
[2018-09-06 12:43:21.173767] I [MSGID: 106487] [glusterd-handler.c:1486:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2018-09-06 12:43:26.112439] I [MSGID: 106495] [glusterd-handler.c:3073:__glusterd_handle_getwd] 0-glusterd: Received getwd req
[2018-09-06 12:43:27.326758] I [MSGID: 106494] [glusterd-handler.c:3024:__glusterd_handle_cli_profile_volume] 0-management: Received volume profile req for volume VOLUME1
[2018-09-06 12:43:30.827592] W [MSGID: 106028] [glusterd-geo-rep.c:2568:glusterd_get_statefile_name] 0-management: Config file (/var/lib/glusterd/geo-replication/VOLUME2_clrv0000110605.ic.ing.net_VOLUME2/gsyncd.conf) missing. Looking for template config file (/var/lib/glusterd/geo-replication/gsyncd_template.conf) [No such file or directory]
[2018-09-06 12:43:30.827709] I [MSGID: 106294] [glusterd-geo-rep.c:2577:glusterd_get_statefile_name] 0-management: Using default config template(/var/lib/glusterd/geo-replication/gsyncd_template.conf).
[2018-09-06 12:43:34.309355] I [MSGID: 106131] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2018-09-06 12:43:34.309437] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: nfs service is stopped
[2018-09-06 12:43:34.309491] I [MSGID: 106599] [glusterd-nfs-svc.c:82:glusterd_nfssvc_manager] 0-management: nfs/server.so xlator is not installed
[2018-09-06 12:43:34.312800] I [MSGID: 106568] [glusterd-proc-mgmt.c:87:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 7655
[2018-09-06 12:43:35.313131] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: glustershd service is stopped
[2018-09-06 12:43:35.313365] I [MSGID: 106567] [glusterd-svc-mgmt.c:203:glusterd_svc_start] 0-management: Starting glustershd service
[2018-09-06 12:43:35.320449] I [MSGID: 106131] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2018-09-06 12:43:35.320552] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: bitd service is stopped
[2018-09-06 12:43:35.320778] I [MSGID: 106131] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2018-09-06 12:43:35.320844] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: scrub service is stopped
[2018-09-06 12:43:35.352791] I [run.c:241:runner_log] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0xe2b1a) [0x7f1e1bff3b1a] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0xe25e5) [0x7f1e1bff35e5] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f1e274ff0c5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh --volname=VOLUME2 is_push_pem=0,pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub,slave_user=root,slave_ip=clrv0000110605.ic.ing.net,slave_vol=VOLUME2,ssh_port=22
[2018-09-06 12:43:37.336124] I [MSGID: 106584] [glusterd-handler.c:5904:__glusterd_handle_get_state] 0-management: Received request to get state for glusterd
[2018-09-06 12:43:37.336509] E [mem-pool.c:335:__gf_free] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e) [0x7f1e1bf3536e] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x20e10) [0x7f1e1bf31e10] -->/lib64/libglusterfs.so.0(__gf_free+0x104) [0x7f1e274d44f4] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
[2018-09-06 12:43:37.580687] I [MSGID: 106327] [glusterd-geo-rep.c:4482:glusterd_read_status_file] 0-management: Using passed config template(/var/lib/glusterd/geo-replication/VOLUME2_clrv0000110605.ic.ing.net_VOLUME2/gsyncd.conf).
[2018-09-06 12:43:37.895299] E [mem-pool.c:326:__gf_free] (-->/lib64/libglusterfs.so.0(+0x1a2c0) [0x7f1e2749e2c0] -->/lib64/libglusterfs.so.0(data_destroy+0x5d) [0x7f1e2749d92d] -->/lib64/libglusterfs.so.0(__gf_free+0xa4) [0x7f1e274d4494] ) 0-: Assertion failed: GF_MEM_HEADER_MAGIC == header->magic
The message "E [MSGID: 106332] [glusterd-utils.c:12886:glusterd_get_value_for_vme_entry] 0-management: Failed to get option for nufa key" repeated 2 times between [2018-09-06 12:42:30.567851] and [2018-09-06 12:42:30.743621]
[2018-09-06 12:42:30.802375] E [MSGID: 106332] [glusterd-utils.c:12886:glusterd_get_value_for_vme_entry] 0-management: Failed to get option for max-op-version key
The message "E [MSGID: 106332] [glusterd-utils.c:12886:glusterd_get_value_for_vme_entry] 0-management: Failed to get option for localtime-logging key" repeated 2 times between [2018-09-06 12:42:30.654159] and [2018-09-06 12:42:30.829908]
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2018-09-06 12:43:37
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 4.1.3
/lib64/libglusterfs.so.0(+0x25920)[0x7f1e274a9920]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f1e274b3874]
/lib64/libc.so.6(+0x36280)[0x7f1e25b0e280]
/lib64/libglusterfs.so.0(__gf_free+0xb5)[0x7f1e274d44a5]
/lib64/libglusterfs.so.0(data_destroy+0x5d)[0x7f1e2749d92d]
/lib64/libglusterfs.so.0(+0x1a2c0)[0x7f1e2749e2c0]
/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x22254)[0x7f1e1bf33254]
/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e)[0x7f1e1bf3536e]
/lib64/libglusterfs.so.0(+0x622b0)[0x7f1e274e62b0]
/lib64/libc.so.6(+0x47fc0)[0x7f1e25b1ffc0]
---------

Expected results:
A geo-replication session is created for the given volume.

Additional info:
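For context on the crash signature: the "Assertion failed: GF_MEM_TRAILER_MAGIC == ..." and "GF_MEM_HEADER_MAGIC == header->magic" errors come from the allocation checks in mem-pool.c, which store a magic value before (header) and after (trailer) each allocation and verify both when the buffer is freed. A failed check means some code wrote outside its allocation, and the later SIGSEGV inside __gf_free()/data_destroy() is consistent with that kind of heap corruption. The sketch below shows only the general header/trailer canary technique; it is not the mem-pool.c code, and canary_malloc/canary_free plus the magic constants are illustrative names made up for this example.

/*
 * Illustrative sketch only -- not the glusterfs mem-pool implementation.
 * An allocator stores a magic value before and after each user buffer;
 * a write past the end of the buffer destroys the trailer magic, which
 * is detected when the buffer is freed.
 */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_MAGIC  0xCAFEBABEu   /* hypothetical stand-in for GF_MEM_HEADER_MAGIC  */
#define TRAILER_MAGIC 0xBAADF00Du   /* hypothetical stand-in for GF_MEM_TRAILER_MAGIC */

struct mem_header {
    uint32_t magic;   /* checked on free: detects corruption before the buffer */
    size_t   size;    /* user-visible size, used to locate the trailer */
};

static void *canary_malloc(size_t size)
{
    char *block = malloc(sizeof(struct mem_header) + size + sizeof(uint32_t));
    if (!block)
        return NULL;

    struct mem_header *hdr = (struct mem_header *)block;
    hdr->magic = HEADER_MAGIC;
    hdr->size  = size;

    /* trailer canary sits immediately after the user region */
    uint32_t trailer = TRAILER_MAGIC;
    memcpy(block + sizeof(struct mem_header) + size, &trailer, sizeof(trailer));

    return block + sizeof(struct mem_header);
}

static void canary_free(void *ptr)
{
    if (!ptr)
        return;

    struct mem_header *hdr =
        (struct mem_header *)((char *)ptr - sizeof(struct mem_header));

    /* Same idea as the failed assertions in the log: a clobbered header or
     * trailer magic means something wrote outside the buffer it allocated. */
    assert(HEADER_MAGIC == hdr->magic);

    uint32_t trailer;
    memcpy(&trailer, (char *)ptr + hdr->size, sizeof(trailer));
    assert(TRAILER_MAGIC == trailer);

    free(hdr);
}

int main(void)
{
    char *buf = canary_malloc(8);
    memcpy(buf, "12345678X", 9);   /* one byte too many: overwrites the trailer canary */
    canary_free(buf);              /* trailer check fails here, at free time */
    return 0;
}

Running the sketch aborts inside canary_free(), analogous to the way the corruption in this report only becomes visible later, when glusterd frees the affected buffer.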
Sanju, did we recently fix this?
Sanju, looks like https://review.gluster.org/#/c/glusterfs/+/20461/ (https://bugzilla.redhat.com/show_bug.cgi?id=1598345). Please confirm and close this.
Yes Sunny, this is the same bug. Closing this bug.

*** This bug has been marked as a duplicate of bug 1598345 ***