Bug 1173732 - Glusterd fails when script set_geo_rep_pem_keys.sh is executed on peer
Summary: Glusterd fails when script set_geo_rep_pem_keys.sh is executed on peer
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.6.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Duplicates: 1173725
Depends On:
Blocks:
 
Reported: 2014-12-12 19:29 UTC by vnosov
Modified: 2016-08-19 11:43 UTC (History)
CC: 4 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-08-19 11:41:35 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description vnosov 2014-12-12 19:29:42 UTC
Description of problem:

Geo-replication is in the Active state. The slave system has 2 nodes, and the slave volume is replicated between them. Running the script "/usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh geoaccount" on one of the slave nodes causes glusterd on the other slave node to hit an assertion and crash.
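
For context, this is roughly what the script does on the node where it is run (a paraphrase from memory, not verbatim; exact file names and arguments may differ in 3.6.1). The gluster commands at the end are what get propagated to the peer's glusterd, which appears to match the __glusterd_handle_commit_op frame in the backtrace below:

# Rough paraphrase of set_geo_rep_pem_keys.sh (illustrative, not the exact 3.6.1 script)
user=$1                                                  # e.g. "geoaccount"
home_dir=$(getent passwd "$user" | cut -d: -f6)
# Copy the collected public keys into glusterd's working directory ...
cp "$home_dir/${user}_common_secret.pem.pub" /var/lib/glusterd/geo-replication/
# ... then ask glusterd to distribute them to all peers and append them to authorized_keys
gluster system:: copy file "/geo-replication/${user}_common_secret.pem.pub"
gluster system:: execute add_secret_pub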


Version-Release number of selected component (if applicable): 3.6.1


How reproducible: 100%


Steps to Reproduce:
1. On one slave node, run: [root@SC-10-10-200-142 log]# /usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh geoaccount
2. Observe glusterd on the other slave node (192.168.5.141): it crashes (a quick check is sketched below).
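
A quick way to confirm the failure on the peer node (log path as used in this report; the grep patterns match the messages quoted under "Actual results"):

# On the other slave node (192.168.5.141), after running the script on its peer:
pidof glusterd || echo "glusterd is not running"
grep -E "Assertion failed|signal received" /var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log | tail -n 5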

Actual results:
Contents of the log file "/var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log" from the failed node "192.168.5.141":

[2014-12-12 18:14:33.975724] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
The message "I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 39 times between [2014-12-12 18:12:36.960062] and [2014-12-12 18:14:33.975855]
[2014-12-12 18:14:36.976121] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:36.976221] I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd.
[2014-12-12 18:14:39.976446] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:42.976771] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:45.977159] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:48.977489] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:49.050113] E [mem-pool.c:242:__gf_free] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x1ab)[0x7f44075f8ffb] (--> /usr/local/lib/libglusterfs.so.0(__gf_free+0xb4)[0x7f4407626084] (--> /usr/local/lib/libglusterfs.so.0(data_destroy+0x55)[0x7f44075f36e5] (--> /usr/local/lib/libglusterfs.so.0(dict_destroy+0x3e)[0x7f44075f40ae] (--> /usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_destroy_req_ctx+0x17)[0x7f43fd2bba57] ))))) 0-: Assertion failed: GF_MEM_HEADER_MAGIC == *(uint32_t *)ptr
The message "I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 4 times between [2014-12-12 18:14:36.976221] and [2014-12-12 18:14:48.977580]
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2014-12-12 18:14:49
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.1
/usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f44075f8c26]
/usr/local/lib/libglusterfs.so.0(gf_print_trace+0x2ed)[0x7f440761315d]
/lib64/libc.so.6[0x30008329a0]
/usr/local/lib/libglusterfs.so.0(__gf_free+0xcc)[0x7f440762609c]
/usr/local/lib/libglusterfs.so.0(data_destroy+0x55)[0x7f44075f36e5]
/usr/local/lib/libglusterfs.so.0(dict_destroy+0x3e)[0x7f44075f40ae]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_destroy_req_ctx+0x17)[0x7f43fd2bba57]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_op_sm+0x2f8)[0x7f43fd2c0198]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(__glusterd_handle_commit_op+0x108)[0x7f43fd2a3af8]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7f43fd2a106f]
/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x12)[0x7f4407635762]
/lib64/libc.so.6[0x3000843bf0]
---------
[2014-12-12 18:20:21.089219] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd version 3.6.1 (args: /usr/local/sbin/glusterd --pid-file=/run/glusterd.pid)
[2014-12-12 18:20:21.095768] I [glusterd.c:1214:init] 0-management: Maximum allowed open file descriptors set to 65536
[2014-12-12 18:20:21.095863] I [glusterd.c:1259:init] 0-management: Using /var/lib/glusterd as working directory
[2014-12-12 18:20:21.103419] E [rpc-transport.c:266:rpc_transport_load] 0-rpc-transport: /usr/local/lib/glusterfs/3.6.1/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2014-12-12 18:20:21.103476] W [rpc-transport.c:270:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2014-12-12 18:20:21.103550] W [rpcsvc.c:1524:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2014-12-12 18:20:28.294458] I [glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
[2014-12-12 18:20:29.175538] I [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2014-12-12 18:20:29.175663] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-12-12 18:20:29.181732] I [glusterd.c:146:glusterd_uuid_init] 0-management: retrieved UUID: c0efccec-b0a0-a091-e517-00114331bbc4
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option transport.socket.listen-backlog 128
  7:     option rpc-auth-allow-insecure on
  8:     option geo-replication-log-group geogroup
  9:     option mountbroker-geo-replication.geoaccount nas-volume-0001
 10:     option mountbroker-root /var/mountbroker-root
 11:     option ping-timeout 30
 12:     option transport.socket.read-fail-log off
 13:     option transport.socket.keepalive-interval 2
 14:     option transport.socket.keepalive-time 10
 15:     option transport-type rdma
 16:     option working-directory /var/lib/glusterd
 17: end-volume
 18:
+------------------------------------------------------------------------------+

On Slave system:

[root@SC-10-10-200-141 log]# gluster volume info

Volume Name: gl-eae5fffa4556b4602-804-1418085976-nas-metadata
Type: Replicate
Volume ID: 41a6696e-780c-4582-b773-b544ce81dc53
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.5.141:/exports/nas-metadata-on-SC-10.10.200.141/nas-metadata
Brick2: 192.168.5.142:/exports/nas-metadata-on-SC-10.10.200.142/nas-metadata
Options Reconfigured:
performance.read-ahead: off
performance.write-behind: off
performance.stat-prefetch: off
nfs.disable: on
network.frame-timeout: 5
network.ping-timeout: 5
nfs.addr-namelookup: off

Volume Name: nas-volume-0001
Type: Replicate
Volume ID: 8eb4615c-0b1e-4eac-905a-b03c24e934f7
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.5.142:/exports/nas-segment-0008/nas-volume-0001
Brick2: 192.168.5.141:/exports/nas-segment-0003/nas-volume-0001
Options Reconfigured:
performance.read-ahead: off
performance.write-behind: off
performance.stat-prefetch: off
nfs.disable: on
nfs.addr-namelookup: off


On Master system:
[root@SC-10-10-200-182 log]# gluster volume geo-replication nas-volume-loc geoaccount.5.141::nas-volume-0001 status detail

MASTER NODE                     MASTER VOL        MASTER BRICK                                SLAVE                             STATUS    CHECKPOINT STATUS    CRAWL STATUS       FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SC-10-10-200-182.example.com    nas-volume-loc    /exports/nas-segment-0001/nas-volume-loc    192.168.5.141::nas-volume-0001    Active    N/A                  Changelog Crawl    39             0                0                0                  0


Expected results:
glusterd keeps running on all slave nodes and the pem keys are distributed to every peer; no assertion failure or crash.
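
A minimal check of that outcome, assuming the working directory /var/lib/glusterd reported in the glusterd log above, the hypothetical *.pem.pub file name pattern, and passwordless root ssh between the slave nodes:

for node in 192.168.5.141 192.168.5.142; do
    ssh root@$node 'pidof glusterd >/dev/null && echo "glusterd running" || echo "glusterd DOWN"'
    ssh root@$node 'ls /var/lib/glusterd/geo-replication/*.pem.pub'
done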


Additional info:

Comment 1 vnosov 2014-12-12 19:33:37 UTC
*** Bug 1173725 has been marked as a duplicate of this bug. ***

Comment 2 Aravinda VK 2016-08-19 11:41:35 UTC
GlusterFS-3.6 is nearing its End-Of-Life; only important security bugs still have a chance of getting fixed. Moving this to the mainline 'version'. If this needs to get fixed in 3.7 or 3.8, this bug should be cloned.

Comment 3 Aravinda VK 2016-08-19 11:43:28 UTC
This bug is being closed as GlusterFS-3.6 is nearing its End-Of-Life and only important security bugs will be fixed. This bug has been fixed in more recent GlusterFS releases. If you still face this bug with newer GlusterFS versions, please open a new bug.

