Description of problem:
While creating files on the master during a snapshot with geo-rep, a few file entries failed to be captured in the changelogs. In one of the cases, the active replica has no changelog entry for the missing file, but the passive replica does.

File in question and its gfid:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# getfattr -n glusterfs.gfid.string /mnt/master/thread0/level05/level15/539ab42c%%MFQLJBDHI3
getfattr: Removing leading '/' from absolute path names
# file: mnt/master/thread0/level05/level15/539ab42c%%MFQLJBDHI3
glusterfs.gfid.string="cc9ccc81-9af6-4ca4-8e21-4226f08543cf"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

On the active replica:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# find /bricks/ | grep "539ab42c%%MFQLJBDHI3"
/bricks/brick2/master_b7/thread0/level05/level15/539ab42c%%MFQLJBDHI3
[root@redcell ~]# grep "MFQLJBDHI3" /bricks/brick
brick0/ brick1/ brick2/ brick3/
[root@redcell ~]# grep "MFQLJBDHI3" /bricks/brick2/master_b7/.glusterfs/changelogs/*
[root@redcell ~]#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

On the passive replica:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# find /bricks/ | grep "539ab42c%%MFQLJBDHI3"
/bricks/brick2/master_b8/thread0/level05/level15/539ab42c%%MFQLJBDHI3
[root@redeye ~]# grep "MFQLJBDHI3" /bricks/brick2/master_b8/.glusterfs/changelogs/*
Binary file /bricks/brick2/master_b8/.glusterfs/changelogs/CHANGELOG.1402647607 matches
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

An explanation of the debugging is in the additional info.

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.16-1.el6rhs

How reproducible:
Does not happen every time.

Steps to Reproduce:
1. Create and start a geo-rep relationship between the master and the slave.
2. Start creating data on the master using the command
   "crefi -T 10 -n 10 --multi -d 10 -b 10 --random --max=10K --min=1K /mnt/master"
3. While data is being created, pause geo-rep.
4. Create a snapshot of the slave.
5. Create a snapshot of the master.
6. Resume geo-rep.

Actual results:
A few of the files fail to get captured in the changelog.

Expected results:
None of the files should be missed in the changelog.
REVIEW: http://review.gluster.org/8070 (features/changelog: Do not ignore self-heal fops in changelog) posted (#1) for review on master by Kotresh HR (khiremat)
COMMIT: http://review.gluster.org/8070 committed in master by Vijay Bellur (vbellur)
------
commit 62265f40d7201854dbf33d59a74286dda671a129
Author: Kotresh H R <khiremat>
Date:   Mon Jun 16 12:30:39 2014 +0530

    features/changelog: Do not ignore self-heal fops in changelog

    Problem:
    Geo-rep fails to sync some files to the slave because the changelog
    entries for those files are missing.

    Cause:
    The fops happened while the active brick was down and were
    self-healed later, when the brick came back up.

    Solution:
    Capture self-heal fops in the changelog as well, so that those
    entries are not missed.

    Change-Id: Ibc288779421b5156dd1695e529aba0b602a530e0
    BUG: 1109692
    Signed-off-by: Kotresh H R <khiremat>
    Reviewed-on: http://review.gluster.org/8070
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
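The effect of this first fix can be modeled with a minimal Python sketch. This is not the GlusterFS C source: the function `should_record`, the `capture_self_heal` flag, and the `"internal"` fop attribute are invented here purely to illustrate the before/after behavior of filtering self-heal fops out of the changelog.

```python
def should_record(fop, capture_self_heal=True):
    """Decide whether a fop is written to the changelog.

    Illustrative model only. Before the fix, fops tagged as internal
    (issued by the self-heal daemon rather than a client) were skipped,
    so a file healed onto a previously-down brick never appeared in that
    brick's changelog and geo-rep could not sync it. The fix is to
    record self-heal fops too.
    """
    if fop.get("internal") and not capture_self_heal:
        return False  # old behaviour: self-heal writes were dropped
    return True

client_create = {"op": "CREATE", "gfid": "cc9ccc81", "internal": False}
heal_create   = {"op": "CREATE", "gfid": "cc9ccc81", "internal": True}

print(should_record(heal_create, capture_self_heal=False))  # old: False
print(should_record(heal_create, capture_self_heal=True))   # fixed: True
```

With the old filter, the healed file's entry silently vanishes from the changelog of the brick that was down, which matches the symptom above: the active replica's changelogs have no trace of the file while the passive replica's do.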
REVIEW: http://review.gluster.org/8196 (feature/changelog: Fix for missing changelogs at backend.) posted (#1) for review on master by Kotresh HR (khiremat)
REVIEW: http://review.gluster.org/8196 (feature/changelog: Fix for missing changelogs at backend.) posted (#2) for review on master by Kotresh HR (khiremat)
COMMIT: http://review.gluster.org/8196 committed in master by Venky Shankar (vshankar)
------
commit 2417de9c37d83e36567551dc682bb23f851fd2d7
Author: Kotresh H R <khiremat>
Date:   Sat Jun 28 12:18:52 2014 +0530

    feature/changelog: Fix for missing changelogs at backend.

    Problem:
    A few changelog files are missing at the backend during snapshot
    with changelog enabled.

    Cause:
    A race between the actual rollover and the explicit rollover.
    Changelog rollover can happen either as the actual rollover,
    controlled by the tuneable rollover-time (whose minimum granularity
    is 1 second), or as the explicit rollover, which is asynchronous in
    nature and happens during snapshot. Rollover renames the current
    CHANGELOG file to CHANGELOG.TIMESTAMP. Assume that at time 't1' the
    actual and explicit rollovers race against each other and the actual
    rollover wins, renaming the CHANGELOG file to CHANGELOG.t1 and
    opening a new CHANGELOG file. An immediate explicit rollover within
    the same second then renames the new CHANGELOG file to CHANGELOG.t1
    as well, purging the earlier CHANGELOG.t1 created by the actual
    rollover.

    Solution:
    Adding a delay of 1 sec guarantees a unique CHANGELOG.TIMESTAMP
    during the explicit rollover.

    Thanks Venky, for all the help in root-causing the issue.

    Change-Id: I8958824e107e16f61be9f09a11d95f8645ecf34d
    BUG: 1109692
    Signed-off-by: Kotresh H R <khiremat>
    Reviewed-on: http://review.gluster.org/8196
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Venky Shankar <vshankar>
    Tested-by: Venky Shankar <vshankar>
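The rollover collision described in the commit message can be reproduced with a small Python sketch. Again this is a model, not the GlusterFS implementation: the `rollover` helper and the fixed timestamp `t1` are assumptions made for illustration, but the failure mode is the same, since two renames to the same second-granularity CHANGELOG.TIMESTAMP name let the second rename silently replace the first file.

```python
import os
import tempfile

def rollover(changelog_dir, ts):
    """Rename the live CHANGELOG to CHANGELOG.<ts> and open a fresh one.

    <ts> has one-second granularity, mirroring the minimum granularity
    of the real rollover-time tuneable. Two rollovers landing in the
    same second therefore compute the SAME target name, and os.rename()
    silently replaces the earlier CHANGELOG.<ts>.
    """
    src = os.path.join(changelog_dir, "CHANGELOG")
    dst = os.path.join(changelog_dir, "CHANGELOG.%d" % ts)
    os.rename(src, dst)
    with open(src, "w"):        # start a new, empty CHANGELOG
        pass
    return dst

d = tempfile.mkdtemp()
with open(os.path.join(d, "CHANGELOG"), "w") as f:
    f.write("entries recorded before time t1\n")

t1 = 1402647607
first = rollover(d, t1)    # actual rollover wins the race at t1
second = rollover(d, t1)   # explicit (snapshot) rollover, same second
assert first == second     # identical names: the first file is purged

with open(second) as f:
    print(repr(f.read()))  # prints '' -- the earlier entries are gone
```

The fix follows directly from the model: delaying the explicit rollover by one second guarantees its timestamp differs from any timestamp the actual rollover could have used, so the two renames can never target the same CHANGELOG.TIMESTAMP file.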
A beta release for GlusterFS 3.6.0 has been announced [1]. Please verify whether this release resolves the issue in this bug report. If the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users