Bug 1601314 - [geo-rep]: Geo-replication not syncing renamed symlink
Summary: [geo-rep]: Geo-replication not syncing renamed symlink
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Kotresh HR
QA Contact: Rochelle
URL:
Whiteboard:
Depends On: 1599587 1600405 1611113
Blocks: 1503137
Reported: 2018-07-16 04:46 UTC by Kotresh HR
Modified: 2018-09-14 03:54 UTC
CC List: 12 users

Fixed In Version: glusterfs-3.12.2-14
Doc Type: Bug Fix
Doc Text:
Directory operations are processed by multiple geo-replication workers. When a symlink was created and renamed, and a directory was then created with the same name as the original symlink, the operations could be synced out of order: the directory was synced first, and the rename of the symlink then failed to sync. As a result, geo-replication sometimes failed to sync the rename of the symlink. With this fix, while processing an out-of-order RENAME, if the source name already exists, geo-replication uses the gfid to check the identity of the file before syncing.
Clone Of: 1600405
Environment:
Last Closed: 2018-09-04 06:50:24 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2607 0 None None None 2018-09-04 06:51:58 UTC

Description Kotresh HR 2018-07-16 04:46:51 UTC
+++ This bug was initially created as a clone of Bug #1600405 +++

Description of problem:
Geo-rep sometimes fails to sync the rename of a symlink
    if the I/O is as follows:
    
      1. touch file1
      2. ln -s "./file1" sym_400
      3. mv sym_400 renamed_sym_400
      4. mkdir sym_400

The file 'renamed_sym_400' failed to sync to the slave.
    


Version-Release number of selected component (if applicable):
mainline

How reproducible:
Few times, looks like race

Steps to Reproduce:
1. Set up geo-rep and start it.
2. Stop geo-rep.
3. On the master, perform the following I/O:
        1. touch file1
        2. ln -s "./file1" sym_400
        3. mv sym_400 renamed_sym_400
        4. mkdir sym_400
4. Find the brick on which renamed_sym_400 is present on the master
   and kill that brick.
5. Start geo-rep so that the other bricks process their changelogs first.
6. Once the other bricks are in changelog crawl, bring back the brick that was down.
7. It also moves to changelog crawl, but 'renamed_sym_400' doesn't sync.
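
The I/O pattern in step 3 can also be scripted. The following is a minimal Python sketch, not part of the original report: the function name run_io_pattern is made up for illustration, and a temporary directory stands in for the master mount point, so it only reproduces the file operations, not the geo-rep race itself.

```python
import os
import tempfile


def run_io_pattern(root):
    """Apply the four operations from the report inside `root`.

    `root` is assumed to be a directory on the master mount; the name
    and signature of this helper are hypothetical.
    """
    file1 = os.path.join(root, "file1")
    sym = os.path.join(root, "sym_400")
    renamed = os.path.join(root, "renamed_sym_400")

    open(file1, "a").close()      # 1. touch file1
    os.symlink("./file1", sym)    # 2. ln -s "./file1" sym_400
    os.rename(sym, renamed)       # 3. mv sym_400 renamed_sym_400
    os.mkdir(sym)                 # 4. mkdir sym_400


if __name__ == "__main__":
    # A temporary directory stands in for the master mount point here.
    with tempfile.TemporaryDirectory() as root:
        run_io_pattern(root)
        print(sorted(os.listdir(root)))
```

To use it against a real setup, pass the master mount path to run_io_pattern instead of the temporary directory, then continue with step 4.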


Actual results:
'renamed_sym_400' doesn't sync

Expected results:
'renamed_sym_400' should sync

Additional info:

--- Additional comment from Worker Ant on 2018-07-12 04:34:29 EDT ---

REVIEW: https://review.gluster.org/20496 (geo-rep: Fix symlink rename syncing issue) posted (#1) for review on master by Kotresh HR

--- Additional comment from Worker Ant on 2018-07-12 10:46:17 EDT ---

COMMIT: https://review.gluster.org/20496 committed in master by "Kotresh HR" <khiremat> with a commit message- geo-rep: Fix symlink rename syncing issue

Problem:
   Geo-rep sometimes fails to sync the rename of a symlink
if the I/O is as follows:

  1. touch file1
  2. ln -s "./file1" sym_400
  3. mv sym_400 renamed_sym_400
  4. mkdir sym_400

 The file 'renamed_sym_400' failed to sync to the slave.

Cause:
  Assume there are three distribute subvolumes (brick1, brick2, brick3).
  The changelogs are recorded as follows for above I/O pattern.
  Note that the MKDIR is recorded on all bricks.

  1. brick1:
     -------

     CREATE file1
     SYMLINK sym_400
     RENAME sym_400 renamed_sym_400
     MKDIR sym_400

  2. brick2:
     -------

     MKDIR sym_400

  3. brick3:
     -------

     MKDIR sym_400

  The operations on 'brick1' should be processed sequentially. But
  since MKDIR is recorded on all the bricks, 'brick2' or 'brick3'
  processed MKDIR before 'brick1', causing out-of-order syncing:
  the directory sym_400 was created first.

  Now 'brick1' processed its changelog.

     CREATE file1 -> succeeds
     SYMLINK sym_400 -> No longer present in master. Ignored
     RENAME sym_400 renamed_sym_400
            While processing RENAME, if the source ('sym_400') is not
            present, the destination ('renamed_sym_400') is created.
            But geo-rep stats the name 'sym_400' to confirm the source
            file's presence. In this race, since the source name
            'sym_400' is present as a directory, it doesn't create the
            destination. Hence the RENAME is ignored.

Fix:
  The fix is to not rely only on the stat of the source name during
  RENAME. It should stat the name and, if the name is present, also
  check that the gfid is the same. Only then can it conclude that the
  source is present.

fixes: bz#1600405
Change-Id: I9fbec4f13ca6a182798a7f81b356fe2003aff969
Signed-off-by: Kotresh HR <khiremat>
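
The cause and fix described in the commit message above can be illustrated with a toy model. This is a sketch under stated assumptions, not the gsyncd implementation: the slave namespace is modelled as a plain dict of name -> gfid, the gfid strings are made-up placeholders, and every function name is hypothetical. It only shows why a name-only check misjudges the source as present, while a name-plus-gfid check does not.

```python
# Toy model only: this is not gsyncd code. The slave namespace is a dict
# mapping name -> gfid, and the gfid strings are made-up placeholders.

SYMLINK_GFID = "gfid-symlink"   # hypothetical gfid of the symlink
DIR_GFID = "gfid-dir"           # hypothetical gfid of directory sym_400


def source_present_name_only(slave, name, gfid):
    """Check before the fix: any entry with that name counts as the source."""
    return name in slave


def source_present_gfid(slave, name, gfid):
    """Check after the fix: the name must exist and carry the same gfid."""
    return slave.get(name) == gfid


def replay_rename(slave, src, dst, gfid, source_present):
    """Process RENAME src -> dst for a source that was never synced.

    Following the report: the destination is created only when the source
    is judged absent; otherwise nothing is created for this entry.
    """
    if not source_present(slave, src, gfid):
        slave[dst] = gfid
    return slave


def replay(source_present):
    """Replay the changelogs in the racy order described in the report."""
    slave = {}
    slave["sym_400"] = DIR_GFID     # brick2/brick3: MKDIR sym_400 arrives first
    slave["file1"] = "gfid-file1"   # brick1: CREATE file1 succeeds
    # brick1: SYMLINK sym_400 is ignored (no longer present on master)
    replay_rename(slave, "sym_400", "renamed_sym_400",
                  SYMLINK_GFID, source_present)
    return slave


if __name__ == "__main__":
    print("name-only check:",
          "renamed_sym_400" in replay(source_present_name_only))
    print("gfid check:",
          "renamed_sym_400" in replay(source_present_gfid))
```

With the name-only check, the directory 'sym_400' is mistaken for the rename source and 'renamed_sym_400' never appears in the model. With the gfid comparison, the mismatch is detected and the destination is created.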

Comment 12 errata-xmlrpc 2018-09-04 06:50:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

