+++ This bug was initially created as a clone of Bug #1600405 +++

Description of problem:
Geo-rep sometimes fails to sync the rename of a symlink if the I/O is as follows:

    1. touch file1
    2. ln -s "./file1" sym_400
    3. mv sym_400 renamed_sym_400
    4. mkdir sym_400

The file 'renamed_sym_400' fails to sync to the slave.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
A few times; looks like a race.

Steps to Reproduce:
1. Set up geo-rep and start it.
2. Stop geo-rep.
3. On the master, do the following I/O:
    1. touch file1
    2. ln -s "./file1" sym_400
    3. mv sym_400 renamed_sym_400
    4. mkdir sym_400
4. Find the brick on which 'renamed_sym_400' is present on the master and kill that brick.
5. Start geo-rep so that the other bricks process their changelogs first.
6. Once the other bricks are in changelog crawl, bring back the brick that was down.
7. That brick also moves to changelog crawl, but 'renamed_sym_400' doesn't sync.

Actual results:
'renamed_sym_400' doesn't sync.

Expected results:
'renamed_sym_400' should sync.

Additional info:

--- Additional comment from Worker Ant on 2018-07-12 04:34:29 EDT ---

REVIEW: https://review.gluster.org/20496 (geo-rep: Fix symlink rename syncing issue) posted (#1) for review on master by Kotresh HR

--- Additional comment from Worker Ant on 2018-07-12 10:46:17 EDT ---

COMMIT: https://review.gluster.org/20496 committed in master by "Kotresh HR" <khiremat> with a commit message- geo-rep: Fix symlink rename syncing issue

Problem:
Geo-rep sometimes fails to sync the rename of a symlink if the I/O is as follows:

    1. touch file1
    2. ln -s "./file1" sym_400
    3. mv sym_400 renamed_sym_400
    4. mkdir sym_400

The file 'renamed_sym_400' fails to sync to the slave.

Cause:
Assume there are three distribute subvolumes (brick1, brick2, brick3). The changelogs are recorded as follows for the above I/O pattern. Note that the MKDIR is recorded on all bricks.

1. brick1:
   -------
   CREATE file1
   SYMLINK sym_400
   RENAME sym_400 renamed_sym_400
   MKDIR sym_400

2. brick2:
   -------
   MKDIR sym_400

3. brick3:
   -------
   MKDIR sym_400

The operations on 'brick1' should be processed sequentially. But since MKDIR is recorded on all the bricks, 'brick2' or 'brick3' processed MKDIR before 'brick1' did, which caused out-of-order syncing and created the directory sym_400 first.

Now 'brick1' processes its changelog:

   CREATE file1 -> succeeds
   SYMLINK sym_400 -> No longer present in master. Ignored
   RENAME sym_400 renamed_sym_400

While processing RENAME, if the source ('sym_400') is not present, the destination ('renamed_sym_400') is created. But geo-rep stats the name 'sym_400' to confirm the source file's presence. In this race, since the source name 'sym_400' is present as a directory, it doesn't create the destination. Hence the RENAME is ignored.

Fix:
The fix is to not rely only on a stat of the source name during RENAME. Geo-rep should stat the name and, if the name is present, also check that its gfid is the same. Only then can it conclude that the source is present.

fixes: bz#1600405
Change-Id: I9fbec4f13ca6a182798a7f81b356fe2003aff969
Signed-off-by: Kotresh HR <khiremat>
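
For illustration only, the difference between the name-only check and the name-plus-gfid check can be sketched as below. This is not the code from the patch: the slave namespace is modelled as a plain dict of name -> gfid, and the function names (source_present, should_create_destination) and gfid strings are hypothetical, chosen only to mirror the commit message.

```python
# Hypothetical in-memory model of the slave namespace: name -> gfid.
# The gfid strings below are made up for illustration.
SYMLINK_GFID = "gfid-symlink"
DIR_GFID = "gfid-dir"


def source_present(slave, name, gfid, check_gfid=True):
    """Return True if the RENAME source is considered present on the slave.

    check_gfid=False models the old name-only stat; check_gfid=True models
    the fix, where the name must exist and its gfid must also match.
    """
    if name not in slave:
        return False
    if not check_gfid:
        return True
    return slave[name] == gfid


def should_create_destination(slave, src, gfid, check_gfid=True):
    """Per the commit message, the destination is created when the source is absent."""
    return not source_present(slave, src, gfid, check_gfid)


# Slave state after brick2/brick3 replayed MKDIR sym_400 ahead of brick1,
# and brick1 replayed CREATE file1 (its SYMLINK entry was ignored).
slave = {"sym_400": DIR_GFID, "file1": "gfid-file1"}

# Name-only check: the directory sym_400 is mistaken for the rename source.
old = should_create_destination(slave, "sym_400", SYMLINK_GFID, check_gfid=False)
# Name-plus-gfid check: the gfid differs, so the source is treated as absent.
new = should_create_destination(slave, "sym_400", SYMLINK_GFID, check_gfid=True)
print(old, new)  # False True
```

With the name-only check the destination is never created, which matches the reported symptom; with the gfid comparison the directory 'sym_400' is no longer mistaken for the symlink, so 'renamed_sym_400' gets created.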
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607