Bug 1296208 - Geo-Replication Session goes "FAULTY" when application logs rolled on master
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.7.7
Hardware: All Linux
Priority: high
Severity: high
Assigned To: Milind Changire
Keywords: ZStream
Depends On: 1264986
Blocks: 1296206 glusterfs-3.7.9
Reported: 2016-01-06 10:31 EST by Milind Changire
Modified: 2016-04-19 03:21 EDT
Fixed In Version: glusterfs-3.7.9
Doc Type: Bug Fix
Clone Of: 1264986
Last Closed: 2016-03-22 04:15:30 EDT
Type: Bug


Comment 2 Vijay Bellur 2016-03-01 23:50:12 EST
REVIEW: http://review.gluster.org/13571 (georep: avoid creating multiple entries with same gfid) posted (#1) for review on release-3.7 by Milind Changire (mchangir@redhat.com)
Comment 3 Kotresh HR 2016-03-02 00:16:51 EST
Description of problem:
When an application rolls its logs on the master, the geo-replication session goes FAULTY.

Investigation revealed an issue with the way geo-rep replays CREATE + RENAME changelogs.
Comment 4 Vijay Bellur 2016-03-08 11:35:03 EST
COMMIT: http://review.gluster.org/13571 committed in release-3.7 by Vijay Bellur (vbellur@redhat.com) 
------
commit 16f42cdef539d5c63784f989af9ae877a94d72e7
Author: Milind Changire <mchangir@redhat.com>
Date:   Fri Jan 29 13:53:07 2016 +0530

    georep: avoid creating multiple entries with same gfid
    
    Problem:
    CREATE + RENAME changelogs replayed by geo-replication cause
    stale old-name entries with same gfid on slave nodes.
    A gfid is a unique key in the file-system and should not be
    assigned to multiple entries.
    
    Solution:
    Create entry on slave only if lstat(gfid) at aux-mount fails.
    This applies to files as well as directories.
    
    Change-Id: Ice3340f4ae1251c2dcef024a2388c4d33b5d4919
    BUG: 1296208
    Signed-off-by: Milind Changire <mchangir@redhat.com>
    Reviewed-on: http://review.gluster.org/13316
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Kotresh HR <khiremat@redhat.com>
    Reviewed-by: Aravinda VK <avishwan@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    (cherry picked from commit 87d93fac9fcc4b258b7eb432ac4151cdd043534f)
    Reviewed-on: http://review.gluster.org/13571
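
The problem and solution described in the commit message can be illustrated with a toy model. This is only a sketch, not the geo-rep code: `ToySlave`, `lstat_gfid`, `replay`, the file names and the gfid value are all hypothetical, and the slave is modelled as a plain name-to-gfid dictionary. The model also assumes that a rename between two names that already refer to the same file is a no-op (as with POSIX rename(2) on hard links); the bug report does not state that this is the mechanism.

```python
import errno


class ToySlave:
    """In-memory stand-in for a slave volume: maps entry names to gfids."""

    def __init__(self):
        self.entries = {}  # name -> gfid

    def lstat_gfid(self, gfid):
        # Stand-in for lstat() on the gfid at the aux-mount:
        # fails with ENOENT when no entry carries this gfid.
        if gfid not in self.entries.values():
            raise OSError(errno.ENOENT, "no entry with this gfid")

    def create(self, name, gfid, guarded):
        if guarded:
            try:
                self.lstat_gfid(gfid)
            except OSError as err:
                if err.errno != errno.ENOENT:
                    raise
            else:
                return  # gfid already exists on the slave: skip the create
        self.entries[name] = gfid

    def rename(self, old, new):
        if old not in self.entries:
            return  # source missing: nothing to rename in this toy model
        if self.entries.get(new) == self.entries[old]:
            return  # assumption: same-file rename is a no-op, as in rename(2)
        self.entries[new] = self.entries.pop(old)


def replay(slave, changelog, guarded):
    for op, *args in changelog:
        if op == "CREATE":
            slave.create(args[0], args[1], guarded)
        elif op == "RENAME":
            slave.rename(args[0], args[1])


# A log roll on the master: the file is created, then renamed.
changelog = [("CREATE", "app.log", "gfid-1"), ("RENAME", "app.log", "app.log.1")]

unguarded, guarded = ToySlave(), ToySlave()
for _ in range(2):  # the second pass models a replay after a geo-rep restart
    replay(unguarded, changelog, guarded=False)
    replay(guarded, changelog, guarded=True)

print(sorted(unguarded.entries))  # ['app.log', 'app.log.1'] (stale old name)
print(sorted(guarded.entries))    # ['app.log.1']
```

In this model the unguarded replay leaves two names carrying the same gfid, while the guarded replay skips the second CREATE because the gfid lookup succeeds.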
Comment 5 Vijay Bellur 2016-03-08 14:40:00 EST
Has any performance characterization been done to ascertain the percentage of creates being affected due to the additional stat()?
Comment 6 Milind Changire 2016-03-08 23:42:19 EST
No performance characterization tests have been done specifically.
However, the lstat() is done for _every_ entry creation, i.e. 100% of the time, because there is no way to tell whether the changelogs are being played for the first time or replayed after a geo-rep restart, so the lstat() cannot be made conditional.
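
A rough way to start the characterization asked about above is to time the failing lstat() on its own, since on a first play every create pays for one lookup that ends in ENOENT. The following is a minimal sketch, not a geo-rep benchmark: `mean_failed_lstat_seconds` is a hypothetical helper, and the temporary local directory used here is only a placeholder. Numbers from a local filesystem say nothing about a Gluster aux-mount; to get a meaningful figure, the directory would have to be a path on the actual slave mount.

```python
import os
import tempfile
import time


def mean_failed_lstat_seconds(directory, samples=1000):
    """Average wall-clock cost of an lstat() that fails with ENOENT."""
    missing = os.path.join(directory, "no-such-entry")
    start = time.perf_counter()
    for _ in range(samples):
        try:
            os.lstat(missing)
        except FileNotFoundError:
            pass  # expected: this is the "entry not yet created" case
    return (time.perf_counter() - start) / samples


# Placeholder target: a local temporary directory, not a Gluster mount.
with tempfile.TemporaryDirectory() as tmp:
    cost = mean_failed_lstat_seconds(tmp)

print("mean failed lstat(): %.2f microseconds" % (cost * 1e6))
```

Multiplying the measured per-call cost by the create rate of the workload gives a first estimate of the added overhead per changelog batch.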
Comment 7 Kaushal 2016-04-19 03:21:07 EDT
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.9, please open a new bug report.

glusterfs-3.7.9 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-March/025922.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
