Bug 1626043 - Geo-replication setup fails, glusterd crashes when replication is set up
Summary: Geo-replication setup fails, glusterd crashes when replication is set up
Keywords:
Status: CLOSED DUPLICATE of bug 1598345
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Shwetha K Acharya
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-09-06 13:00 UTC by Nico van Roijen
Modified: 2019-09-26 12:37 UTC
6 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-09-26 12:37:40 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)
Coredump of glusterd (on master server) (638.21 KB, application/x-gzip)
2018-09-06 13:00 UTC, Nico van Roijen

Description Nico van Roijen 2018-09-06 13:00:10 UTC
Created attachment 1481296 [details]
Coredump of glusterd (on master server)

Description of problem:
The gluster daemon (glusterd) on the master servers crashes when a geo-replication session is created.

Version-Release number of selected component (if applicable):
4.1.3

How reproducible:
Always

Steps to Reproduce:
Master volume setup:
# gluster v create VOLUME2 replica 3 arbiter 1 transport tcp clrv0000110367:/gluster/VOLUME2/export clrv0000110371:/gluster/VOLUME2/export clrv0000110389:/gluster/VOLUME2/export
# gluster v start VOLUME2
# gluster volume set all cluster.enable-shared-storage enable

Slave volume setup:
# gluster v create VOLUME2 replica 3 arbiter 1 transport tcp clrv0000110605:/gluster/VOLUME2/export clrv0000110608:/gluster/VOLUME2/export clrv0000110606:/gluster/VOLUME2/export
# gluster v start VOLUME2
# gluster volume set all cluster.enable-shared-storage enable

On master server:
# ssh-keygen   (accepting all defaults)
# ssh-copy-id  clrv0000110605    (one of the slave servers)
# gluster-georep-sshkey generate
# gluster volume geo-replication VOLUME2 clrv0000110605.ic.ing.net::VOLUME2 create push-pem

Actual results:
Glusterd crashes (on master servers).

/var/log/glusterfs/glusterd.log contains:
----
[2018-09-06 12:43:13.426622] E [mem-pool.c:335:__gf_free] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e) [0x7f1e1bf3536e] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x20e10) [0x7f1e1bf31e10] -->/lib64/libglusterfs.so.0(__gf_free+0x104) [0x7f1e274d44f4] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
[2018-09-06 12:43:13.452244] E [mem-pool.c:335:__gf_free] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e) [0x7f1e1bf3536e] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x20e10) [0x7f1e1bf31e10] -->/lib64/libglusterfs.so.0(__gf_free+0x104) [0x7f1e274d44f4] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
The message "I [MSGID: 106584] [glusterd-handler.c:5904:__glusterd_handle_get_state] 0-management: Received request to get state for glusterd" repeated 3 times between [2018-09-06 12:42:30.427752] and [2018-09-06 12:43:13.451952]
[2018-09-06 12:43:15.475145] I [MSGID: 106495] [glusterd-handler.c:3073:__glusterd_handle_getwd] 0-glusterd: Received getwd req
[2018-09-06 12:43:21.173767] I [MSGID: 106487] [glusterd-handler.c:1486:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2018-09-06 12:43:26.112439] I [MSGID: 106495] [glusterd-handler.c:3073:__glusterd_handle_getwd] 0-glusterd: Received getwd req
[2018-09-06 12:43:27.326758] I [MSGID: 106494] [glusterd-handler.c:3024:__glusterd_handle_cli_profile_volume] 0-management: Received volume profile req for volume VOLUME1
[2018-09-06 12:43:30.827592] W [MSGID: 106028] [glusterd-geo-rep.c:2568:glusterd_get_statefile_name] 0-management: Config file (/var/lib/glusterd/geo-replication/VOLUME2_clrv0000110605.ic.ing.net_VOLUME2/gsyncd.conf) missing. Looking for template config file (/var/lib/glusterd/geo-replication/gsyncd_template.conf) [No such file or directory]
[2018-09-06 12:43:30.827709] I [MSGID: 106294] [glusterd-geo-rep.c:2577:glusterd_get_statefile_name] 0-management: Using default config template(/var/lib/glusterd/geo-replication/gsyncd_template.conf).
[2018-09-06 12:43:34.309355] I [MSGID: 106131] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2018-09-06 12:43:34.309437] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: nfs service is stopped
[2018-09-06 12:43:34.309491] I [MSGID: 106599] [glusterd-nfs-svc.c:82:glusterd_nfssvc_manager] 0-management: nfs/server.so xlator is not installed
[2018-09-06 12:43:34.312800] I [MSGID: 106568] [glusterd-proc-mgmt.c:87:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 7655
[2018-09-06 12:43:35.313131] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: glustershd service is stopped
[2018-09-06 12:43:35.313365] I [MSGID: 106567] [glusterd-svc-mgmt.c:203:glusterd_svc_start] 0-management: Starting glustershd service
[2018-09-06 12:43:35.320449] I [MSGID: 106131] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2018-09-06 12:43:35.320552] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: bitd service is stopped
[2018-09-06 12:43:35.320778] I [MSGID: 106131] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2018-09-06 12:43:35.320844] I [MSGID: 106568] [glusterd-svc-mgmt.c:235:glusterd_svc_stop] 0-management: scrub service is stopped
[2018-09-06 12:43:35.352791] I [run.c:241:runner_log] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0xe2b1a) [0x7f1e1bff3b1a] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0xe25e5) [0x7f1e1bff35e5] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f1e274ff0c5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh --volname=VOLUME2 is_push_pem=0,pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub,slave_user=root,slave_ip=clrv0000110605.ic.ing.net,slave_vol=VOLUME2,ssh_port=22
[2018-09-06 12:43:37.336124] I [MSGID: 106584] [glusterd-handler.c:5904:__glusterd_handle_get_state] 0-management: Received request to get state for glusterd
[2018-09-06 12:43:37.336509] E [mem-pool.c:335:__gf_free] (-->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e) [0x7f1e1bf3536e] -->/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x20e10) [0x7f1e1bf31e10] -->/lib64/libglusterfs.so.0(__gf_free+0x104) [0x7f1e274d44f4] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
[2018-09-06 12:43:37.580687] I [MSGID: 106327] [glusterd-geo-rep.c:4482:glusterd_read_status_file] 0-management: Using passed config template(/var/lib/glusterd/geo-replication/VOLUME2_clrv0000110605.ic.ing.net_VOLUME2/gsyncd.conf).
[2018-09-06 12:43:37.895299] E [mem-pool.c:326:__gf_free] (-->/lib64/libglusterfs.so.0(+0x1a2c0) [0x7f1e2749e2c0] -->/lib64/libglusterfs.so.0(data_destroy+0x5d) [0x7f1e2749d92d] -->/lib64/libglusterfs.so.0(__gf_free+0xa4) [0x7f1e274d4494] ) 0-: Assertion failed: GF_MEM_HEADER_MAGIC == header->magic
The message "E [MSGID: 106332] [glusterd-utils.c:12886:glusterd_get_value_for_vme_entry] 0-management: Failed to get option for nufa key" repeated 2 times between [2018-09-06 12:42:30.567851] and [2018-09-06 12:42:30.743621]
[2018-09-06 12:42:30.802375] E [MSGID: 106332] [glusterd-utils.c:12886:glusterd_get_value_for_vme_entry] 0-management: Failed to get option for max-op-version key
The message "E [MSGID: 106332] [glusterd-utils.c:12886:glusterd_get_value_for_vme_entry] 0-management: Failed to get option for localtime-logging key" repeated 2 times between [2018-09-06 12:42:30.654159] and [2018-09-06 12:42:30.829908]
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2018-09-06 12:43:37
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 4.1.3
/lib64/libglusterfs.so.0(+0x25920)[0x7f1e274a9920]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f1e274b3874]
/lib64/libc.so.6(+0x36280)[0x7f1e25b0e280]
/lib64/libglusterfs.so.0(__gf_free+0xb5)[0x7f1e274d44a5]
/lib64/libglusterfs.so.0(data_destroy+0x5d)[0x7f1e2749d92d]
/lib64/libglusterfs.so.0(+0x1a2c0)[0x7f1e2749e2c0]
/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x22254)[0x7f1e1bf33254]
/usr/lib64/glusterfs/4.1.3/xlator/mgmt/glusterd.so(+0x2436e)[0x7f1e1bf3536e]
/lib64/libglusterfs.so.0(+0x622b0)[0x7f1e274e62b0]
/lib64/libc.so.6(+0x47fc0)[0x7f1e25b1ffc0]
---------

Expected results:
A geo-replication session is created for the given volume, and glusterd does not crash.

Additional info:

Comment 1 Shwetha K Acharya 2019-09-26 10:01:45 UTC
Sanju, did we recently fix this?

Comment 2 Sunny Kumar 2019-09-26 10:06:11 UTC
Sanju,

Looks like https://review.gluster.org/#/c/glusterfs/+/20461/.

https://bugzilla.redhat.com/show_bug.cgi?id=1598345.

Please confirm and close this.

Comment 3 Sanju 2019-09-26 12:37:40 UTC
Yes Sunny, this is the same bug. Closing this bug.

*** This bug has been marked as a duplicate of bug 1598345 ***
