Bug 1296208 - Geo-Replication Session goes "FAULTY" when application logs rolled on master
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.7.7
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Milind Changire
QA Contact:
Depends On: 1264986
Blocks: 1296206 glusterfs-3.7.9
Reported: 2016-01-06 15:31 UTC by Milind Changire
Modified: 2016-04-19 07:21 UTC

Fixed In Version: glusterfs-3.7.9
Doc Type: Bug Fix
Doc Text:
Clone Of: 1264986
Last Closed: 2016-03-22 08:15:30 UTC
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:

Comment 2 Vijay Bellur 2016-03-02 04:50:12 UTC
REVIEW: http://review.gluster.org/13571 (georep: avoid creating multiple entries with same gfid) posted (#1) for review on release-3.7 by Milind Changire (mchangir@redhat.com)

Comment 3 Kotresh HR 2016-03-02 05:16:51 UTC
Description of problem: 
When an application rolls its logs on the master, the geo-replication session goes FAULTY.

Investigation revealed a problem in how geo-rep replays CREATE + RENAME changelog entries.

Comment 4 Vijay Bellur 2016-03-08 16:35:03 UTC
COMMIT: http://review.gluster.org/13571 committed in release-3.7 by Vijay Bellur (vbellur@redhat.com) 
commit 16f42cdef539d5c63784f989af9ae877a94d72e7
Author: Milind Changire <mchangir@redhat.com>
Date:   Fri Jan 29 13:53:07 2016 +0530

    georep: avoid creating multiple entries with same gfid
    CREATE + RENAME changelogs replayed by geo-replication cause
    stale old-name entries with same gfid on slave nodes.
    A gfid is a unique key in the file-system and should not be
    assigned to multiple entries.
    Create entry on slave only if lstat(gfid) at aux-mount fails.
    This applies to files as well as directories.
    Change-Id: Ice3340f4ae1251c2dcef024a2388c4d33b5d4919
    BUG: 1296208
    Signed-off-by: Milind Changire <mchangir@redhat.com>
    Reviewed-on: http://review.gluster.org/13316
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Kotresh HR <khiremat@redhat.com>
    Reviewed-by: Aravinda VK <avishwan@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    (cherry picked from commit 87d93fac9fcc4b258b7eb432ac4151cdd043534f)
    Reviewed-on: http://review.gluster.org/13571
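
The rule stated in the commit message ("create entry on slave only if lstat(gfid) at aux-mount fails") can be sketched roughly as below. This is an illustration, not the code from the patch: the function names, the `create` callback, and the `.gfid/<gfid>` path layout are assumptions made for the example, and the demo runs against a local temporary directory with hard links standing in for a real aux mount.

```python
import errno
import os
import tempfile


def gfid_exists(aux_mount, gfid):
    """Return True if lstat() on the gfid path at the aux mount succeeds."""
    try:
        # Assumed layout: the aux mount exposes entries as .gfid/<gfid>.
        os.lstat(os.path.join(aux_mount, ".gfid", gfid))
        return True
    except OSError as e:
        # A missing or stale handle means the gfid is not on the slave yet.
        if e.errno in (errno.ENOENT, errno.ESTALE):
            return False
        raise


def replay_create(aux_mount, gfid, name, create):
    """Replay one CREATE record; skip it if the gfid already has an entry."""
    if gfid_exists(aux_mount, gfid):
        return False
    create(name, gfid)
    return True


def demo():
    """Simulate CREATE + RENAME replay on a local directory (not a real mount)."""
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, ".gfid"))

    def fake_create(name, gfid):
        # Hypothetical stand-in for entry creation: a file plus a hard link
        # under .gfid/ so that the gfid lookup can find it after a rename.
        path = os.path.join(root, name)
        open(path, "w").close()
        os.link(path, os.path.join(root, ".gfid", gfid))

    first = replay_create(root, "g1", "app.log", fake_create)    # CREATE
    os.rename(os.path.join(root, "app.log"),
              os.path.join(root, "app.log.1"))                   # RENAME
    second = replay_create(root, "g1", "app.log", fake_create)   # replayed CREATE
    names = sorted(n for n in os.listdir(root) if n != ".gfid")
    return first, second, names
```

In the demo, the replayed CREATE is skipped because the gfid is still reachable under the renamed entry, so no stale `app.log` with the same gfid is left behind. The check runs on every create, since the replay logic cannot tell a first pass from a replay.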

Comment 5 Vijay Bellur 2016-03-08 19:40:00 UTC
Has any performance characterization been done to ascertain the percentage of creates being affected due to the additional stat()?

Comment 6 Milind Changire 2016-03-09 04:42:19 UTC
No performance characterization tests have been done specifically.
However, the lstat() is done for _every_ entry creation, i.e. 100% of the time. There is no way to tell whether the logs are being played for the first time or after a georep restart, so the lstat() cannot be made conditional.

Comment 7 Kaushal 2016-04-19 07:21:07 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.9, please open a new bug report.

glusterfs-3.7.9 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-March/025922.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
