Bug 1030052

Summary: dist-geo-rep: When use-tarssh is set to "true" while it is already using tar+ssh, gsyncd still restarts
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: M S Vishwanath Bhat <vbhat>
Component: geo-replication
Assignee: Kotresh HR <khiremat>
Status: CLOSED ERRATA
QA Contact: Bhaskar Bandari <bbandari>
Severity: medium
Docs Contact:
Priority: high
Version: 2.1
CC: aavati, csaba, david.macdonald, mzywusko, nsathyan, psriniva, vagarwal
Target Milestone: ---
Target Release: RHGS 3.0.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
During a Geo-replication session, the gsyncd process restarts when you set the use-tarssh Geo-replication configuration option to "true", even if it is already set to that value.
Story Points: ---
Clone Of:
Clones: 1060797
Environment:
Last Closed: 2014-09-22 19:29:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1060797

Description M S Vishwanath Bhat 2013-11-13 20:01:09 UTC
Description of problem:
gsyncd needs to be restarted whenever the sync method is changed from tar+ssh to rsync or vice versa. But when use-tarssh is set to true via the geo-rep config CLI while it is already true, gsyncd still restarts. Ideally this should not happen: gsyncd should restart only when the sync method actually changes.
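
The fix this implies is a guard in the config-set path: compare the incoming value with the stored one and skip the gsyncd restart when they match. Below is a minimal Python sketch of such a guard; the function name, config layout, and return convention are hypothetical and do not reflect gsyncd's actual code.

# Hypothetical sketch: restart gsyncd only when a config value actually changes.
import configparser

def set_config_option(conf_path, section, option, new_value):
    """Write a geo-rep config option; return True only if a restart is needed."""
    cp = configparser.ConfigParser()
    cp.read(conf_path)
    if cp.get(section, option, fallback=None) == new_value:
        return False  # value already in effect: write nothing, no restart
    if not cp.has_section(section):
        cp.add_section(section)
    cp.set(section, option, new_value)
    with open(conf_path, "w") as f:
        cp.write(f)
    return True  # sync method changed (e.g. rsync <-> tar+ssh): caller restarts gsyncd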

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.44rhs-1.el6rhs.x86_64


How reproducible:
Always

Steps to Reproduce:
1. gluster v geo master slave_node::slave config use-tarssh true
2. gluster v geo master slave_node::slave config use-tarssh true
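
To confirm whether step 2 restarted gsyncd, one way is to compare the set of gsyncd PIDs before and after the redundant config call. The following is a rough Python sketch, not part of the original report; it assumes pgrep is available and reuses the master/slave names from the session below.

# Hypothetical check: did the redundant config call restart gsyncd?
import subprocess
import time

def gsyncd_pids():
    """Return the set of PIDs whose command line mentions gsyncd."""
    out = subprocess.run(["pgrep", "-f", "gsyncd"],
                         capture_output=True, text=True)
    return set(out.stdout.split())

before = gsyncd_pids()
subprocess.run(["gluster", "volume", "geo-replication", "master",
                "euclid::slave", "config", "use-tarssh", "true"], check=True)
time.sleep(10)  # give the monitor time to respawn worker processes
after = gsyncd_pids()
print("gsyncd restarted" if before != after else "gsyncd kept running")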

Actual results:


Status before running the config command:


[root@pythagoras ]# gluster v geo master euclid::slave status

MASTER NODE                  MASTER VOL    MASTER BRICK          SLAVE             STATUS     CHECKPOINT STATUS    CRAWL STATUS
-----------------------------------------------------------------------------------------------------------------------------------
pythagoras.blr.redhat.com    master        /rhs/bricks/brick0    euclid::slave     Active     N/A                  Changelog Crawl
aryabhatta.blr.redhat.com    master        /rhs/bricks/brick1    gauss::slave      Passive    N/A                  N/A
ramanujan.blr.redhat.com     master        /rhs/bricks/brick2    riemann::slave    Active     N/A                  Changelog Crawl
archimedes.blr.redhat.com    master        /rhs/bricks/brick3    euler::slave      Passive    N/A                  N/A




[root@pythagoras ]# gluster v geo master euclid::slave config use-tarssh true
geo-replication config updated successfully


gsyncd restarts after the config change:


[root@pythagoras ]# gluster v geo master euclid::slave status

MASTER NODE                  MASTER VOL    MASTER BRICK          SLAVE             STATUS             CHECKPOINT STATUS    CRAWL STATUS
----------------------------------------------------------------------------------------------------------------------------------------
pythagoras.blr.redhat.com    master        /rhs/bricks/brick0    euclid::slave     Initializing...    N/A                  N/A
ramanujan.blr.redhat.com     master        /rhs/bricks/brick2    riemann::slave    Initializing...    N/A                  N/A
aryabhatta.blr.redhat.com    master        /rhs/bricks/brick1    gauss::slave      Initializing...    N/A                  N/A
archimedes.blr.redhat.com    master        /rhs/bricks/brick3    euler::slave      Initializing...    N/A                  N/A


After the restart, the session becomes stable again:


[root@pythagoras ]# gluster v geo master euclid::slave status

MASTER NODE                  MASTER VOL    MASTER BRICK          SLAVE             STATUS     CHECKPOINT STATUS    CRAWL STATUS
-----------------------------------------------------------------------------------------------------------------------------------
pythagoras.blr.redhat.com    master        /rhs/bricks/brick0    euclid::slave     Active     N/A                  Changelog Crawl
aryabhatta.blr.redhat.com    master        /rhs/bricks/brick1    gauss::slave      Passive    N/A                  N/A
archimedes.blr.redhat.com    master        /rhs/bricks/brick3    euler::slave      Passive    N/A                  N/A
ramanujan.blr.redhat.com     master        /rhs/bricks/brick2    riemann::slave    Active     N/A                  Changelog Crawl


Run the config command to set use-tarssh to true again:


[root@pythagoras ]# gluster v geo master euclid::slave config use-tarssh true
geo-replication config updated successfully


Status after setting use-tarssh to true again:
[root@pythagoras ]# gluster v geo master euclid::slave status

MASTER NODE                  MASTER VOL    MASTER BRICK          SLAVE             STATUS             CHECKPOINT STATUS    CRAWL STATUS
----------------------------------------------------------------------------------------------------------------------------------------
pythagoras.blr.redhat.com    master        /rhs/bricks/brick0    euclid::slave     Initializing...    N/A                  N/A
archimedes.blr.redhat.com    master        /rhs/bricks/brick3    euler::slave      Initializing...    N/A                  N/A
ramanujan.blr.redhat.com     master        /rhs/bricks/brick2    riemann::slave    Initializing...    N/A                  N/A
aryabhatta.blr.redhat.com    master        /rhs/bricks/brick1    gauss::slave      Initializing...    N/A                  N/A



gsyncd restarts again, even though the sync method did not change.

Expected results:
gsyncd should restart only when the sync method actually changes; setting use-tarssh to its current value should not trigger a restart.

Additional info:

[root@pythagoras ]# grep "sync engine" /var/log/glusterfs/geo-replication/master/ssh%3A%2F%2Froot%4010.70.37.188%3Agluster%3A%2F%2F127.0.0.1%3Aslave.log 
[2013-11-14 01:12:55.752557] I [master(/rhs/bricks/brick0):352:__init__] _GMaster: using 'rsync' as the sync engine
[2013-11-14 01:12:55.755635] I [master(/rhs/bricks/brick0):352:__init__] _GMaster: using 'rsync' as the sync engine
[2013-11-14 01:16:05.322597] I [master(/rhs/bricks/brick0):349:__init__] _GMaster: using 'tar over ssh' as the sync engine
[2013-11-14 01:16:05.323045] I [master(/rhs/bricks/brick0):349:__init__] _GMaster: using 'tar over ssh' as the sync engine
[2013-11-14 01:17:51.913451] I [master(/rhs/bricks/brick0):349:__init__] _GMaster: using 'tar over ssh' as the sync engine
[2013-11-14 01:17:51.913904] I [master(/rhs/bricks/brick0):349:__init__] _GMaster: using 'tar over ssh' as the sync engine

Comment 2 Kotresh HR 2014-05-19 06:17:58 UTC
Patch http://review.gluster.org/6897 has been merged upstream and is now available downstream.
Fixed in version: glusterfs-3.6.0.2-1

Comment 3 Vivek Agarwal 2014-05-26 06:16:34 UTC
Fixed as part of the rebase; marking the BZ for Denali.

Comment 4 Vijaykumar Koppad 2014-06-04 08:47:37 UTC
Verified on build glusterfs-3.6.0.12-1.el6rhs.

Comment 7 errata-xmlrpc 2014-09-22 19:29:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html