Bug 1341108 - [geo-rep]: If the session is renamed, geo-rep configuration are not retained
Summary: [geo-rep]: If the session is renamed, geo-rep configuration are not retained
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.8.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Saravanakumar
QA Contact:
URL:
Whiteboard:
Depends On: 1340383 1340853
Blocks: 1311817 1341121
 
Reported: 2016-05-31 09:44 UTC by Saravanakumar
Modified: 2016-06-16 12:32 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.8.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1340853
Clones: 1341121
Environment:
Last Closed: 2016-06-16 12:32:47 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Saravanakumar 2016-05-31 09:44:35 UTC
Description of problem:
=======================

With the recent changes, we support renaming an existing geo-rep session from one slave host to another. The expectation is that only the session is renamed and that all the configuration/status of the previous session is retained.

But the older configuration is not retained, which silently breaks the geo-rep functionality for this use case.

Existing session between: baloo 10.70.37.88::bagheera

[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.88::bagheera config change_detector
xsync
[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.88::bagheera config ignore_deletes
true
[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.88::bagheera config use_meta_volume
true
[root@dhcp37-162 ~]#

New Session between: baloo 10.70.37.43::bagheera 

[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.43::bagheera config use_meta_volume
[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.43::bagheera config ignore_deletes
false
[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.43::bagheera config change_detector
changelog
[root@dhcp37-162 ~]# 



Version-Release number of selected component (if applicable):
=============================================================


How reproducible:
=================

Always


Steps to Reproduce:
===================
1. Create a geo-rep session between the master and slave using slavehost1
2. Update the configs for this session
3. Stop the existing session
4. Recreate the session between the master and slave using slavehost2
5. Start the session
6. Verify the configs set in step 2

Actual results:
===============

Config options are reset


Expected results:
=================

Since this is a rename of a session and not a new session, all config options should be retained.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-05-27 04:52:05 EDT ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Rahul Hinduja on 2016-05-27 04:56:14 EDT ---

[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.88::bagheera status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                    SLAVE NODE      STATUS     CRAWL STATUS    LAST_SYNCED                  
----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.162    baloo         /rhs/brick1/b1    root          10.70.37.88::bagheera    10.70.37.213    Active     Hybrid Crawl    2016-05-26 14:41:23          
10.70.37.162    baloo         /rhs/brick2/b7    root          10.70.37.88::bagheera    10.70.37.88     Active     Hybrid Crawl    2016-05-26 14:41:23          
10.70.37.116    baloo         /rhs/brick1/b3    root          10.70.37.88::bagheera    10.70.37.43     Active     Hybrid Crawl    2016-05-26 14:41:23          
10.70.37.121    baloo         /rhs/brick1/b5    root          10.70.37.88::bagheera    10.70.37.200    Passive    N/A             N/A                          
10.70.37.190    baloo         /rhs/brick1/b6    root          10.70.37.88::bagheera    10.70.37.213    Active     Hybrid Crawl    2016-05-26 14:41:23          
10.70.37.189    baloo         /rhs/brick1/b4    root          10.70.37.88::bagheera    10.70.37.52     Passive    N/A             N/A                          
10.70.37.40     baloo         /rhs/brick1/b2    root          10.70.37.88::bagheera    10.70.37.88     Passive    N/A             N/A                          
10.70.37.40     baloo         /rhs/brick2/b8    root          10.70.37.88::bagheera    10.70.37.43     Passive    N/A             N/A                          
[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.88::bagheera config 
special_sync_mode: partial
session_owner: 11b99a73-649f-4439-abc4-1eac15943f0e
state_socket_unencoded: /var/lib/glusterd/geo-replication/baloo_10.70.37.88_bagheera/ssh%3A%2F%2Froot%4010.70.37.88%3Agluster%3A%2F%2F127.0.0.1%3Abagheera.socket
gluster_log_file: /var/log/glusterfs/geo-replication/baloo/ssh%3A%2F%2Froot%4010.70.37.88%3Agluster%3A%2F%2F127.0.0.1%3Abagheera.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
ignore_deletes: true
change_detector: xsync
gluster_command_dir: /usr/sbin/
state_file: /var/lib/glusterd/geo-replication/baloo_10.70.37.88_bagheera/monitor.status
remote_gsyncd: /nonexistent/gsyncd
log_file: /var/log/glusterfs/geo-replication/baloo/ssh%3A%2F%2Froot%4010.70.37.88%3Agluster%3A%2F%2F127.0.0.1%3Abagheera.log
changelog_log_file: /var/log/glusterfs/geo-replication/baloo/ssh%3A%2F%2Froot%4010.70.37.88%3Agluster%3A%2F%2F127.0.0.1%3Abagheera-changes.log
socketdir: /var/run/gluster
working_dir: /var/lib/misc/glusterfsd/baloo/ssh%3A%2F%2Froot%4010.70.37.88%3Agluster%3A%2F%2F127.0.0.1%3Abagheera
state_detail_file: /var/lib/glusterd/geo-replication/baloo_10.70.37.88_bagheera/ssh%3A%2F%2Froot%4010.70.37.88%3Agluster%3A%2F%2F127.0.0.1%3Abagheera-detail.status
use_meta_volume: true
ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
pid_file: /var/lib/glusterd/geo-replication/baloo_10.70.37.88_bagheera/monitor.pid
georep_session_working_dir: /var/lib/glusterd/geo-replication/baloo_10.70.37.88_bagheera/
gluster_params: aux-gfid-mount acl
volume_id: 11b99a73-649f-4439-abc4-1eac15943f0e
[root@dhcp37-162 ~]#



[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.43::bagheera status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                    SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.162    baloo         /rhs/brick1/b1    root          10.70.37.43::bagheera    10.70.37.213    Active     Changelog Crawl    2016-05-26 14:41:29          
10.70.37.162    baloo         /rhs/brick2/b7    root          10.70.37.43::bagheera    10.70.37.88     Active     Changelog Crawl    2016-05-26 14:41:28          
10.70.37.116    baloo         /rhs/brick1/b3    root          10.70.37.43::bagheera    10.70.37.43     Active     Changelog Crawl    2016-05-26 14:41:37          
10.70.37.121    baloo         /rhs/brick1/b5    root          10.70.37.43::bagheera    10.70.37.200    Active     Changelog Crawl    2016-05-26 14:41:22          
10.70.37.190    baloo         /rhs/brick1/b6    root          10.70.37.43::bagheera    10.70.37.213    Passive    N/A                N/A                          
10.70.37.189    baloo         /rhs/brick1/b4    root          10.70.37.43::bagheera    10.70.37.52     Passive    N/A                N/A                          
10.70.37.40     baloo         /rhs/brick1/b2    root          10.70.37.43::bagheera    10.70.37.88     Passive    N/A                N/A                          
10.70.37.40     baloo         /rhs/brick2/b8    root          10.70.37.43::bagheera    10.70.37.43     Passive    N/A                N/A                          
[root@dhcp37-162 ~]# gluster volume geo-replication baloo 10.70.37.43::bagheera config
special_sync_mode: partial
state_socket_unencoded: /var/lib/glusterd/geo-replication/baloo_10.70.37.43_bagheera/ssh%3A%2F%2Froot%4010.70.37.43%3Agluster%3A%2F%2F127.0.0.1%3Abagheera.socket
gluster_log_file: /var/log/glusterfs/geo-replication/baloo/ssh%3A%2F%2Froot%4010.70.37.43%3Agluster%3A%2F%2F127.0.0.1%3Abagheera.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
ignore_deletes: false
change_detector: changelog
gluster_command_dir: /usr/sbin/
state_file: /var/lib/glusterd/geo-replication/baloo_10.70.37.43_bagheera/monitor.status
remote_gsyncd: /nonexistent/gsyncd
log_file: /var/log/glusterfs/geo-replication/baloo/ssh%3A%2F%2Froot%4010.70.37.43%3Agluster%3A%2F%2F127.0.0.1%3Abagheera.log
changelog_log_file: /var/log/glusterfs/geo-replication/baloo/ssh%3A%2F%2Froot%4010.70.37.43%3Agluster%3A%2F%2F127.0.0.1%3Abagheera-changes.log
socketdir: /var/run/gluster
working_dir: /var/lib/misc/glusterfsd/baloo/ssh%3A%2F%2Froot%4010.70.37.43%3Agluster%3A%2F%2F127.0.0.1%3Abagheera
state_detail_file: /var/lib/glusterd/geo-replication/baloo_10.70.37.43_bagheera/ssh%3A%2F%2Froot%4010.70.37.43%3Agluster%3A%2F%2F127.0.0.1%3Abagheera-detail.status
session_owner: 11b99a73-649f-4439-abc4-1eac15943f0e
ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
pid_file: /var/lib/glusterd/geo-replication/baloo_10.70.37.43_bagheera/monitor.pid
georep_session_working_dir: /var/lib/glusterd/geo-replication/baloo_10.70.37.43_bagheera/
gluster_params: aux-gfid-mount acl
volume_id: 11b99a73-649f-4439-abc4-1eac15943f0e
[root@dhcp37-162 ~]#



RCA:
In gsyncd.conf, the "peers" section name contains the old slave host details.

Example:
peers gluster%3A%2F%2F127.0.0.1%3Atv1 ssh%3A%2F%2Froot%40192.168.122.186%3Agluster%3A%2F%2F127.0.0.1%3Atv2

Here, 192.168.122.186 is the old slave host's IP address.

Once the geo-rep session is renamed, the old host details are no longer valid, so with the new host it is NOT possible to retrieve the config details.

Solution:
Remove the old host details from the peers section.

Use only the master volume and slave volume as part of the peers section, and drop the slave host detail.
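
To make the failure mode concrete, here is a minimal Python sketch (illustrative only, not gsyncd's actual code; the section names and host addresses are the hypothetical ones from the example above) of why a section keyed by the slave host stops resolving after a rename, while a section keyed by volumes alone keeps working:

from configparser import ConfigParser

conf = ConfigParser()

# Old scheme: the slave host (192.168.122.186) is baked into the
# "peers" section name along with the master and slave volume URLs.
old = "peers gluster://127.0.0.1:tv1 ssh://root@192.168.122.186:gluster://127.0.0.1:tv2"
conf.add_section(old)
conf.set(old, "ignore_deletes", "true")

# After the session is renamed to a new slave host, the lookup key no
# longer matches the stored section, so every option appears unset and
# falls back to its default -- exactly the behaviour reported above.
new = "peers gluster://127.0.0.1:tv1 ssh://root@192.168.122.200:gluster://127.0.0.1:tv2"
print(conf.has_section(new))        # False -> configs look "reset"

# Proposed scheme: key the section by master and slave volume only, so
# the same section resolves no matter which slave host serves the session.
by_volume = "peers tv1 tv2"
conf.add_section(by_volume)
conf.set(by_volume, "ignore_deletes", "true")
print(conf.has_section(by_volume))  # True, before and after a rename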

Comment 1 Vijay Bellur 2016-05-31 09:46:13 UTC
REVIEW: http://review.gluster.org/14565 (geo-rep: update peers section in gsyncd conf) posted (#2) for review on release-3.8 by Saravanakumar Arumugam (sarumuga)

Comment 2 Vijay Bellur 2016-05-31 10:14:57 UTC
REVIEW: http://review.gluster.org/14565 (geo-rep: update peers section in gsyncd conf) posted (#3) for review on release-3.8 by Saravanakumar Arumugam (sarumuga)

Comment 3 Vijay Bellur 2016-06-02 10:53:32 UTC
COMMIT: http://review.gluster.org/14565 committed in release-3.8 by Aravinda VK (avishwan) 
------
commit a8d86daf45d4999b885c799307df4ab1abfd25f4
Author: Saravanakumar Arumugam <sarumuga>
Date:   Mon May 30 17:34:24 2016 +0530

    geo-rep: update peers section in gsyncd conf
    
    Problem:
    Once Slave volume uuid is involved as part of a geo-rep session, it is
    possible to create the same geo-rep session with different (slave)host.
    
    But, it reflects default values for geo-rep configuration values originally
    configured for old geo-rep session.
    Reason is, slave host is used while saving config options in gsyncd.conf.
    With new slave host, it is not possible to retrieve those config values.
    
    Solution:
    Remove slave host related information from gsyncd.conf and have only master
    volume and slave volume as part of peers section.
    
    Also, during upgrade from old geo-rep session, update peers section to
    reflect only master volume and slave volume.
    
    Change-Id: I7debf35a09a28d030b706b0c3e5d82c9b0467d0e
    BUG: 1341108
    Reviewed-on: http://review.gluster.org/#/c/14558/
    Signed-off-by: Saravanakumar Arumugam <sarumuga>
    Reviewed-on: http://review.gluster.org/14565
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Aravinda VK <avishwan>
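
The commit message also mentions updating the peers section during upgrade from an old geo-rep session. A hedged sketch of what such a migration could look like (a hypothetical helper, not the actual patch; it assumes old-style section names carry URL-encoded master/slave URLs, as in the config dumps above):

from configparser import ConfigParser
from urllib.parse import unquote

def migrate_peers_sections(conf_path):
    """Rewrite old-style 'peers <master-url> <slave-url>' sections so
    they are keyed by master and slave volume names only.
    Hypothetical upgrade helper, not the actual gsyncd patch."""
    conf = ConfigParser()
    conf.read(conf_path)
    for section in list(conf.sections()):
        parts = section.split(" ")
        if parts[0] != "peers" or len(parts) != 3:
            continue
        # Old section names store URL-encoded URLs such as
        # gluster%3A%2F%2F127.0.0.1%3Atv1; decode, then take the volume
        # name after the last ':' ("gluster://127.0.0.1:tv1" -> "tv1").
        master_vol = unquote(parts[1]).rsplit(":", 1)[-1]
        slave_vol = unquote(parts[2]).rsplit(":", 1)[-1]
        new_section = "peers %s %s" % (master_vol, slave_vol)
        # Already volume-keyed (or a duplicate): nothing to migrate.
        if new_section == section or conf.has_section(new_section):
            continue
        conf.add_section(new_section)
        for key, value in conf.items(section):
            conf.set(new_section, key, value)
        conf.remove_section(section)
    with open(conf_path, "w") as f:
        conf.write(f)

The helper is idempotent: a section that is already keyed by volume names splits into the same three parts and maps onto itself, so rerunning the migration is harmless.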

Comment 4 Niels de Vos 2016-06-16 12:32:47 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

