Description of problem:
When assigning a label to a checkpoint, geo-replication status will throw:
No active geo-replication sessions between [masternode] and [geo-rep target]

Version-Release number of selected component (if applicable):
[root@RHGS1 rep01]# rpm -qa | grep gluster
glusterfs-rdma-3.7.9-10.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-10.el7rhgs.x86_64
glusterfs-libs-3.7.9-10.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-10.el7rhgs.x86_64
python-gluster-3.7.9-10.el7rhgs.noarch
glusterfs-fuse-3.7.9-10.el7rhgs.x86_64
glusterfs-cli-3.7.9-10.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-10.el7rhgs.x86_64
glusterfs-3.7.9-10.el7rhgs.x86_64
glusterfs-api-3.7.9-10.el7rhgs.x86_64
gluster-nagios-addons-0.2.7-1.el7rhgs.x86_64
samba-vfs-glusterfs-4.4.3-7.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.5.el7rhgs.noarch
glusterfs-server-3.7.9-10.el7rhgs.x86_64

[root@RHGS1 rep01]# gluster --version
glusterfs 3.7.9 built on Jun 10 2016 06:32:42
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

How reproducible:
Always

Steps to Reproduce:
1. Set up a geo-replication session
2. Once started, create a checkpoint like this:
   # gluster volume geo-replication rep01 RHGS3::slave config checkpoint chris
3. Run
   # gluster volume geo-replication rep01 RHGS3::slave status

Actual results:
No active geo-replication sessions between rep01 and RHGS3::slave

Expected results:
MASTER NODE    MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE           SLAVE NODE    STATUS     CRAWL STATUS       LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
RHGS1          rep01         /rhs/brick1/rep01    root          RHGS3::slave    RHGS3         Active     Changelog Crawl    2016-09-28 13:13:45
RHGS2          rep01         /rhs/brick1/rep01    root          RHGS3::slave    RHGS4         Passive    N/A                N/A

Additional info:
It seems like the geo-replication continues even though status says there is no active connection :(

[root@RHGS1 rep01]# gluster volume geo-replication rep01 RHGS3::slave status
MASTER NODE    MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE           SLAVE NODE    STATUS     CRAWL STATUS       LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
RHGS1          rep01         /rhs/brick1/rep01    root          RHGS3::slave    RHGS3         Active     Changelog Crawl    2016-09-28 13:13:45
RHGS2          rep01         /rhs/brick1/rep01    root          RHGS3::slave    RHGS4         Passive    N/A                N/A

[root@RHGS1 rep01]# gluster volume geo-replication rep01 RHGS3::slave config checkpoint chris
geo-replication config updated successfully

[root@RHGS1 rep01]# gluster volume geo-replication rep01 RHGS3::slave status
No active geo-replication sessions between rep01 and RHGS3::slave

[root@RHGS1 rep01]# gluster volume geo-replication rep01 RHGS3::slave config checkpoint now
geo-replication config updated successfully

[root@RHGS1 rep01]# gluster volume geo-replication rep01 RHGS3::slave status
MASTER NODE    MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE           SLAVE NODE    STATUS     CRAWL STATUS       LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
RHGS1          rep01         /rhs/brick1/rep01    root          RHGS3::slave    RHGS3         Active     Changelog Crawl    2016-09-28 13:13:45
RHGS2          rep01         /rhs/brick1/rep01    root          RHGS3::slave    RHGS4         Passive    N/A                N/A
Set the checkpoint for the current time using:
gluster volume geo-replication rep01 RHGS3::slave config checkpoint now

As mentioned in the description, we need to validate inputs other than now.
Added validation for the label format. The Geo-rep checkpoint label now accepts only a valid date in the format "YYYY-MM-DD HH:MM:SS", for example "2016-10-25 14:30:45".

Upstream patch sent to fix the issue: http://review.gluster.org/15721
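For illustration only, the kind of check described above could look roughly like the sketch below. This is not the code from the upstream patch; the function name is_valid_checkpoint_label and the constant CHECKPOINT_FORMAT are hypothetical, and the sketch assumes that "now" stays valid alongside the date format.

```python
from datetime import datetime

# Hypothetical constant: the "YYYY-MM-DD HH:MM:SS" format from the comment above.
CHECKPOINT_FORMAT = "%Y-%m-%d %H:%M:%S"


def is_valid_checkpoint_label(label):
    """Return True if label is "now" or parses as a date in CHECKPOINT_FORMAT.

    Illustrative sketch only, not the validation used by the gluster CLI.
    """
    # Assumption: the special value "now" is still accepted.
    if label == "now":
        return True
    try:
        # strptime raises ValueError for free-text labels and impossible dates.
        datetime.strptime(label, CHECKPOINT_FORMAT)
    except ValueError:
        return False
    return True
```

With a check like this, a free-text label such as chris is rejected before the config is updated, while "2016-10-25 14:30:45" and now pass.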
I don't approve with this... A 'label' should be a text string that I can assign (like "pre-prod"), not a very strictly defined date-time-stamp. Why is it not possible to use a string here?
(In reply to Chris Blum from comment #4)
> I don't approve with this... A 'label' should be a text string that I can
> assign (like "pre-prod"), not a very strictly defined date-time-stamp.
> Why is it not possible to use a string here?

Currently, Geo-replication uses the checkpoint date to determine whether sync is complete up to that time. Checkpoint completion means everything created in Master before the checkpoint time is synced to Slave.

Example usage of checkpoint:
gluster volume geo-replication rep01 RHGS3::slave config checkpoint "2016-10-25 20:00:00"

Watch the checkpoint status using the Geo-rep status command. If the status says Checkpoint completed=Yes, then all the files created/modified in the Master volume before 2016-10-25 20:00:00 are synced to the Slave volume.

Maybe I am missing something here. What is the use case for a non-date checkpoint? We can enhance Geo-replication to support that use case.
OK that makes more sense then - so the label is then implemented so that I can find out if things 5 days ago have been properly synced to the other side? Will the 'checkpoint completed' timestamp then show me when the files 5 days ago have been synced? Because why else would I be interested in an earlier date other than now if not?
(In reply to Chris Blum from comment #6)
> OK that makes more sense then - so the label is then implemented so that I
> can find out if things 5 days ago have been properly synced to the other
> side?
> Will the 'checkpoint completed' timestamp then show me when the files 5 days
> ago have been synced? Because why else would I be interested in an earlier
> date other than now if not?

With the "last synced" column in the status output, an earlier checkpoint date is not very useful. If the last synced time from all Active workers is later than the required time, the checkpoint can be considered completed.

A labeled checkpoint is more useful for setting future times. For example, if a checkpoint is required for midnight of the current day, it can be set in advance using a label instead of being set at midnight using now.
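As a rough sketch of the reasoning above (not Geo-replication's own logic), the comparison against the "last synced" column and the midnight label could be expressed as follows. The names checkpoint_reached and next_midnight_label are hypothetical, and treating a last synced time equal to the checkpoint as reached is an assumption.

```python
from datetime import datetime, timedelta

# Same "Y-m-d H:M:S" format as the LAST_SYNCED column in the status output.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def checkpoint_reached(checkpoint, active_last_synced):
    """Return True if every Active worker has synced up to the checkpoint.

    checkpoint: checkpoint time as a string in TIME_FORMAT.
    active_last_synced: LAST_SYNCED strings of the Active workers only.
    Assumption: a last synced time equal to the checkpoint counts as reached.
    """
    target = datetime.strptime(checkpoint, TIME_FORMAT)
    synced = [datetime.strptime(t, TIME_FORMAT) for t in active_last_synced]
    # With no Active workers there is nothing to compare, so report False.
    return bool(synced) and all(t >= target for t in synced)


def next_midnight_label(now):
    """Return the upcoming midnight after `now`, formatted as a checkpoint label."""
    midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return midnight.strftime(TIME_FORMAT)
```

The string returned by next_midnight_label could then be passed, quoted, as the value of config checkpoint during the day, rather than running config checkpoint now at midnight.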
Upstream Patches:
Mainline: http://review.gluster.org/15721
Release 3.7: http://review.gluster.org/15856
Release 3.8: http://review.gluster.org/15855
Release 3.9: http://review.gluster.org/15854

Downstream Patch:
https://code.engineering.redhat.com/gerrit/90316
Verified with the build: glusterfs-geo-replication-3.8.4-17.el7rhgs.x86_64

Checkpoint does not accept values other than now and the format (Y-m-d H:M:S).

3.1.3:
======

[root@dhcp42-195 scripts]# gluster volume geo-replication master 10.70.43.63::slave status
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                 SLAVE NODE      STATUS    CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.195    master        /rhs/brick1/b1    root          10.70.43.63::slave    10.70.42.54     Active    Changelog Crawl    2017-03-05 07:23:15
10.70.42.195    master        /rhs/brick2/b4    root          10.70.43.63::slave    10.70.42.54     Active    Changelog Crawl    2017-03-05 07:23:15
10.70.43.93     master        /rhs/brick1/b3    root          10.70.43.63::slave    10.70.43.178    Active    Changelog Crawl    2017-03-05 07:23:15
10.70.43.93     master        /rhs/brick2/b6    root          10.70.43.63::slave    10.70.43.178    Active    Changelog Crawl    2017-03-05 07:23:15
10.70.43.124    master        /rhs/brick1/b2    root          10.70.43.63::slave    10.70.43.63     Active    Changelog Crawl    2017-03-05 07:23:23
10.70.43.124    master        /rhs/brick2/b5    root          10.70.43.63::slave    10.70.43.63     Active    Changelog Crawl    2017-03-05 07:23:15
[root@dhcp42-195 scripts]# gluster volume geo-replication master 10.70.43.63::slave config checkpoint rahul
geo-replication config updated successfully
[root@dhcp42-195 scripts]# gluster volume geo-replication master 10.70.43.63::slave config checkpoint rahul
[root@dhcp42-195 scripts]# gluster volume geo-replication master 10.70.43.63::slave status
No active geo-replication sessions between master and 10.70.43.63::slave
[root@dhcp42-195 scripts]#

3.2.0:
======

[root@dhcp42-7 scripts]# gluster volume geo-replication master 10.70.43.249::slave config checkpoint rahul
Invalid Checkpoint label. Use format "Y-m-d H:M:S", Example: 2016-10-25 15:30:45
Usage: volume geo-replication [<VOLNAME>] [<SLAVE-URL>] {create [[ssh-port n] [[no-verify]|[push-pem]]] [force]|start [force]|stop [force]|pause [force]|resume [force]|config|status [detail]|delete [reset-sync-time]} [options...]
[root@dhcp42-7 scripts]#

Moving the bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html