Bug 1437244

Summary: geo-rep not detecting changes
Product: [Community] GlusterFS
Component: geo-replication
Version: 3.10
Hardware: x86_64
OS: Linux
Status: CLOSED EOL
Severity: unspecified
Priority: unspecified
Reporter: jeremiah
Assignee: Pranith Kumar K <pkarampu>
CC: bugs, dimitri.ars, jeremiah, wattersm
Keywords: Triaged
Target Milestone: ---
Target Release: ---
Type: Bug
Last Closed: 2018-06-20 18:26:22 UTC
Attachments (flags: none):
  Master volume log 1
  Master volume log 2
  Master volume log 3
  Updated master log 1
  Updated master log 2
  Updated master log 3

Description jeremiah 2017-03-29 21:30:54 UTC
Description of problem: The initial sync is successful; however, further filesystem changes are not detected or synced.

Version-Release number of selected component (if applicable): Both 3.8 & 3.10

How reproducible: 100%

Steps to Reproduce: The setup consists of two servers, both running fully updated CentOS 7. No SELinux.

* ill: Local server. Master volume named "foobar".
* aws: Remote server. Slave volume named "foobar".

* Both servers are running ntpd.
* Their clocks are in the same time zone and in sync.
* Passwordless SSH is setup and working.
* common_secret.pem.pub was generated on the local host "ill".
* 'create push-pem' ran successfully on the local host "ill". Verified that the remote side had the two expected "command=" entries in ~/.ssh/authorized_keys.
* The geo-rep session between the two volumes was successfully created and started with 'start' (command sequence sketched below, after the status output).
* The local filesystem successfully does an initial sync to the remote filesystem.
* No further changes are detected or synced.

* 'status detail' looks clean but LAST_SYNCED never changes:

MASTER NODE: ill.franz.com
MASTER VOL: foobar
MASTER BRICK: /gv0/foobar
SLAVE USER: root
SLAVE NODE: aws.franz.com::foobar
STATUS: Active
CRAWL STATUS: Changelog Crawl
LAST_SYNCED: 2017-03-29 13:44:23
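
For reference, the setup above corresponds to roughly the following command sequence (a reconstruction, not a verbatim transcript; volume and host names as described above):

[root@ill ~]# gluster system:: execute gsec_create
[root@ill ~]# gluster volume geo-replication foobar aws.franz.com::foobar create push-pem
[root@ill ~]# gluster volume geo-replication foobar aws.franz.com::foobar start
[root@ill ~]# gluster volume geo-replication foobar aws.franz.com::foobar status detail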


Actual results: No changes detected or synced


Expected results: Changes to be detected and synced


Additional info: I went with the default 'config' options. However, during troubleshooting I noticed that some of the default values look suspicious or wrong.

For example, 'remote_gsyncd' is set to '/nonexistent/gsyncd'.

Also, some of the variable values are incorrectly guessed. For example, 'gluster_log_file' is guessed as:

/var/log/glusterfs/geo-replication/foobar/ssh%3A%2F%2Froot%4054.165.144.9%3Agluster%3A%2F%2F127.0.0.1%3Afoobar.gluster.log

but the real file is:

/var/log/glusterfs/geo-replication/foobar/ssh%3A%2F%2Froot%4054.165.144.9%3Agluster%3A%2F%2F127.0.0.1%3Afoobar.%2Fgv0%2Ffoobar.gluster.log

I did try updating these variables to what seemed like more correct values, but none of my changes had any effect on the problem.
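
For reference, the values were listed and overridden via the session's 'config' subcommand, along these lines (the gsyncd path below is only an example of an override, using the usual CentOS location of the gsyncd wrapper, not a known-good value):

[root@ill ~]# gluster volume geo-replication foobar aws.franz.com::foobar config
[root@ill ~]# gluster volume geo-replication foobar aws.franz.com::foobar config remote_gsyncd /usr/libexec/glusterfs/gsyncd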

I tried changing the 'change_detector' to xsync. No change in behavior.

I tried with both ext4 & xfs filesystems. No change in behavior.

I tried setting a checkpoint. No change in behavior.
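
Both the change_detector and checkpoint changes go through the same 'config' interface, roughly:

[root@ill ~]# gluster volume geo-replication foobar aws.franz.com::foobar config change_detector xsync
[root@ill ~]# gluster volume geo-replication foobar aws.franz.com::foobar config checkpoint now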

I'm out of ideas at this point but happy to try anything & provide logs. Thanks so much for your time & help debugging!

Comment 1 jeremiah 2017-03-30 07:41:56 UTC
Created attachment 1267436 [details]
Master volume log 1

Comment 2 jeremiah 2017-03-30 07:42:25 UTC
Created attachment 1267437 [details]
Master volume log 2

Comment 3 jeremiah 2017-03-30 07:42:42 UTC
Created attachment 1267438 [details]
Master volume log 3

Comment 4 jeremiah 2017-04-26 23:50:19 UTC
Created attachment 1274471 [details]
Updated master log 1

Here's the new batch of log files; they no longer contain any of the 'transport endpoint not connected' errors.

Comment 5 jeremiah 2017-04-26 23:50:53 UTC
Created attachment 1274472 [details]
Updated master log 2

Comment 6 jeremiah 2017-04-26 23:51:19 UTC
Created attachment 1274473 [details]
Updated master log 3

Comment 7 Michael Watters 2017-04-28 19:41:58 UTC
I've also just noticed this issue. geo-replication is working according to the status output; however, the data on my slave nodes does *not* match what is on the master volume.

[root@mdct-gluster-srv1 ~]# gluster volume geo-replication gv0 status
 
MASTER NODE          MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                SLAVE NODE           STATUS     CRAWL STATUS       LAST_SYNCED                  
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
mdct-gluster-srv1    gv0           /var/mnt/gluster/brick2    root          ssh://mdct-gluster-srv3::slavevol    mdct-gluster-srv3    Active     Changelog Crawl    2017-04-28 08:35:58          
mdct-gluster-srv2    gv0           /var/mnt/gluster/brick     root          ssh://mdct-gluster-srv3::slavevol    mdct-gluster-srv3    Passive    N/A                N/A                          
mdct-gluster-srv1    gv0           /var/mnt/gluster/brick2    root          ssh://mdct-gluster-srv4::slavevol    mdct-gluster-srv4    Active     Changelog Crawl    2017-04-28 08:35:58          
mdct-gluster-srv2    gv0           /var/mnt/gluster/brick     root          ssh://mdct-gluster-srv4::slavevol    mdct-gluster-srv4    Passive    N/A                N/A

ls shows different data, as shown below.

[root@mdct-00fs-cent7 ~]# ls /var/mnt/shadow/pub/fedora/
dart  releases  updates

[root@mdct-00fs-cent7 ~]# ls /var/mnt/gluster2/pub/fedora/
20  21  22  24  25  dart  README  releases  updates

/var/mnt/shadow is the master volume.
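
The full set of differences between the two mounts can be listed with a recursive, read-only comparison (assuming /var/mnt/gluster2 is a local mount of the slave volume, as the listing above suggests):

[root@mdct-00fs-cent7 ~]# diff -rq /var/mnt/shadow/pub/fedora/ /var/mnt/gluster2/pub/fedora/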

Comment 8 Shyamsundar 2018-06-20 18:26:22 UTC
This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result, this bug is being closed.

If the bug persists on a maintained version of Gluster or against the mainline Gluster repository, please request that it be reopened and that the Version field be updated appropriately.