Bug 1468200

Summary: [Geo-rep]: entry failed to sync to slave with ENOENT error
Product: [Community] GlusterFS Reporter: Kotresh HR <khiremat>
Component: geo-replication Assignee: Kotresh HR <khiremat>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 3.11 CC: bugs, rhinduja
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.11.2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1467718 Environment:
Last Closed: 2017-08-12 13:07:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1467718    
Bug Blocks: 1468186, 1468198    

Description Kotresh HR 2017-07-06 09:44:59 UTC
+++ This bug was initially created as a clone of Bug #1467718 +++

Description of problem:
When running iozone, bonnie, and smallfile workloads on the master, entries failed to sync to the slave with ENOENT (the parent directory does not exist on the slave).

The error looks like the following:

[2017-06-16 14:54:26.1849] E [master(/gluster/brick1/brick):785:log_failures] _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': '4d16fd49-591d-4088-8f87-e75c081ca2f9', 'gid': 0, 'mode': 33152, 'entry': '.gfid/abe8c2f6-210b-4ac3-8c05-a84d44c3b5b1/dovecot.index', 'op': 'MKNOD'}, 2) 
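
For reference, the trailing 2 in the failure tuple is the errno (ENOENT), and mode 33152 decodes to a regular file with 0600 permissions. A minimal Python sketch (illustrative only, not gsyncd code) that unpacks the tuple:

    # Illustrative decoding of the failure tuple above; not gsyncd code.
    import errno
    import os
    import stat

    failure = ({'uid': 0, 'gfid': '4d16fd49-591d-4088-8f87-e75c081ca2f9',
                'gid': 0, 'mode': 33152,
                'entry': '.gfid/abe8c2f6-210b-4ac3-8c05-a84d44c3b5b1/dovecot.index',
                'op': 'MKNOD'}, 2)

    entry, err = failure
    print(errno.errorcode[err], os.strerror(err))  # ENOENT No such file or directory
    print(stat.filemode(entry['mode']))            # -rw------- (33152 == 0o100600)
    print(entry['entry'].split('/')[1])            # parent gfid missing on the slave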

Version-Release number of selected component (if applicable):
mainline

How reproducible:
Seen only once

Steps to Reproduce:
1. Setup geo-rep and run iozone, bonnie, smallfile workload on master

Actual results:
Entry sync to the slave fails with ENOENT

Expected results:
Entry failures should not happen

Additional info:

--- Additional comment from Kotresh HR on 2017-07-04 15:39:32 EDT ---

Analysis:
It was seen that an RMDIR followed by a MKDIR was recorded in the changelog on
one particular subvolume with the same gfid and pargfid/bname, but not on all subvolumes, as below.
    
    E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 RMDIR \
    9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
    E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 MKDIR 16877 0 0 \
    9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
    
While processing this changelog, geo-rep assumes the RMDIR succeeded and performs a recursive rmdir on the slave, even though the directory still exists on the master. Subsequent entry creations under this directory that hashed to that particular subvolume then failed with ENOENT.
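
Each E record carries the entry's gfid, the operation, optional mode/uid/gid fields, and the parent gfid joined to the basename by a URL-escaped '/' (%2F). A rough Python sketch of the decoding (simplified; not the actual gsyncd changelog parser):

    # Simplified decoding of the two records above; not the gsyncd parser.
    from urllib.parse import unquote

    records = [
        "E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 RMDIR "
        "9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2",
        "E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 MKDIR 16877 0 0 "
        "9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2",   # 16877 == 0o40755
    ]

    for rec in records:
        fields = rec.split()
        gfid, op = fields[1], fields[2]                  # entry gfid, operation
        pargfid, bname = unquote(fields[-1]).split('/')  # %2F joins pargfid/name
        print(op, gfid, '->', pargfid + '/' + bname)

    # Both records carry the same gfid and pargfid/bname: on this subvolume the
    # directory was removed and re-created in place, while other subvolumes did
    # not record the pair.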

--- Additional comment from Worker Ant on 2017-07-04 15:43:30 EDT ---

REVIEW: https://review.gluster.org/17695 (geo-rep: Fix entry failure because parent dir doesn't exist) posted (#1) for review on master by Kotresh HR (khiremat)

--- Additional comment from Kotresh HR on 2017-07-04 15:45:16 EDT ---

Cause:
    An RMDIR-MKDIR pair gets recorded this way in the changelog
    when the directory removal succeeds on the cached subvolume
    but fails on one of the hashed subvolumes for some reason
    (e.g., that subvolume is down). In that case, the directory
    is re-created on the cached subvolume, which gets recorded
    as a MKDIR again in the changelog.

Solution:
    While processing an RMDIR, geo-replication should therefore
    stat the gfid on the master and skip the delete if the
    directory is still present.
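
A minimal sketch of that check, assuming the master volume is reachable through a gfid-access style aux mount; the helper name and the ".gfid" path below are illustrative, not the literal patch from review 17695:

    # Sketch of the idea behind the fix; names and paths are assumptions.
    import errno
    import os

    def should_replay_rmdir(master_mount, gfid):
        """Return True only if the directory's gfid is gone on the master."""
        try:
            os.lstat(os.path.join(master_mount, '.gfid', gfid))
        except OSError as e:
            if e.errno == errno.ENOENT:
                return True   # really removed on master; safe to rmdir on slave
            raise             # unexpected error; do not guess
        # The gfid still resolves on the master: the RMDIR in this changelog
        # was undone by a later MKDIR, so skip the delete on the slave.
        return False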

--- Additional comment from Worker Ant on 2017-07-05 11:44:27 EDT ---

COMMIT: https://review.gluster.org/17695 committed in master by Aravinda VK (avishwan) 
------
commit b25bf64f3a3520a96ad557daa4903c0ceba96d72
Author: Kotresh HR <khiremat>
Date:   Tue Jul 4 08:46:06 2017 -0400

    geo-rep: Fix entry failure because parent dir doesn't exist
    
    In a distributed volume on master, it can so happen that
    the RMDIR followed by MKDIR is recorded in changelog on
    a particular subvolume with same gfid and pargfid/bname
    but not on all subvolumes as below.
    
    E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 RMDIR \
    9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
    E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 MKDIR 16877 0 0 \
    9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
    
    While processing this changelog, geo-rep thinks RMDIR is
    successful and does recursive rmdir on slave. But in the
    master the directory still exists. This could lead to
    data discrepancy between master and slave.
    
    Cause:
    RMDIR-MKDIR pair gets recorded so in changelog when the
    directory removal is successful on cached subvolume and
    failed in one of hashed subvol for some reason
    (may be down). In this case, the directory is re-created
    on cached subvol which gets recorded as MKDIR again in
    changelog.
    
    Solution:
    So while processing RMDIR geo-replication should stat on
    master with gfid and should not delete it if it's present.
    
    Change-Id: If5da1d6462eb4d9ebe2e88b3a70cc454411a133e
    BUG: 1467718
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: https://review.gluster.org/17695
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Aravinda VK <avishwan>

--- Additional comment from Rahul Hinduja on 2017-07-06 04:46:25 EDT ---

It's a very narrow corner case: a race between how two changelogs are processed during an rmdir and a mkdir. The following is one such case:

1. Two subvolumes hold dir d1; d1 has files f1, f2 on the first subvolume and f3, f4 on the second.
2. rmdir of d1 is issued, followed by mkdir with the same name (d1). New files f5, f6, f7, f8 are created.

If the rmdir failed on one subvolume (A) for any reason, the recursive rmdir is retried. At the same time, some of the new files hash to a different subvolume (B). Once the rmdir is reprocessed at A, it deletes the newly created files already synced via B, and the slave retains only the files created after the changelog processed the mkdir on A.
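
A toy simulation of that interleaving (illustrative only; it models the slave's copy of d1 as a plain set) shows how the files already synced via B vanish when A's stale rmdir is replayed:

    # Toy model of the slave's copy of d1; illustrative only, not gsyncd code.
    slave_d1 = {'f1', 'f2', 'f3', 'f4'}

    def replay(op, names=()):
        """Apply one simplified changelog op to the slave's view of d1."""
        if op == 'RMDIR':
            slave_d1.clear()       # geo-rep replays RMDIR as a recursive rmdir
        elif op == 'CREATE':
            slave_d1.update(names)
        # MKDIR just re-creates the (empty) directory; nothing to track here

    replay('RMDIR')                              # rmdir d1, processed via B
    replay('MKDIR')                              # mkdir d1, processed via B
    replay('CREATE', ['f5', 'f6', 'f7', 'f8'])   # new files hashed to B, synced
    replay('RMDIR')                              # A's retried rmdir lands late
    replay('MKDIR')                              # A's mkdir replayed afterwards
    print(sorted(slave_d1))                      # [] -- f5..f8 are gone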

Comment 1 Worker Ant 2017-07-06 09:46:48 UTC
REVIEW: https://review.gluster.org/17715 (geo-rep: Fix entry failure because parent dir doesn't exist) posted (#1) for review on release-3.11 by Kotresh HR (khiremat)

Comment 2 Worker Ant 2017-07-10 13:59:10 UTC
COMMIT: https://review.gluster.org/17715 committed in release-3.11 by Shyamsundar Ranganathan (srangana) 
------
commit 0812d960e5a02bd2021233e5ef09a1139705a88f
Author: Kotresh HR <khiremat>
Date:   Tue Jul 4 08:46:06 2017 -0400

    geo-rep: Fix entry failure because parent dir doesn't exist
    
    In a distributed volume on master, it can so happen that
    the RMDIR followed by MKDIR is recorded in changelog on
    a particular subvolume with same gfid and pargfid/bname
    but not on all subvolumes as below.
    
    E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 RMDIR \
    9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
    E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 MKDIR 16877 0 0 \
    9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
    
    While processing this changelog, geo-rep thinks RMDIR is
    successful and does recursive rmdir on slave. But in the
    master the directory still exists. This could lead to
    data discrepancy between master and slave.
    
    Cause:
    RMDIR-MKDIR pair gets recorded so in changelog when the
    directory removal is successful on cached subvolume and
    failed in one of hashed subvol for some reason
    (may be down). In this case, the directory is re-created
    on cached subvol which gets recorded as MKDIR again in
    changelog.
    
    Solution:
    So while processing RMDIR geo-replication should stat on
    master with gfid and should not delete it if it's present.
    
    > Change-Id: If5da1d6462eb4d9ebe2e88b3a70cc454411a133e
    > BUG: 1467718
    > Signed-off-by: Kotresh HR <khiremat>
    > Reviewed-on: https://review.gluster.org/17695
    > Smoke: Gluster Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Aravinda VK <avishwan>
    (cherry picked from commit b25bf64f3a3520a96ad557daa4903c0ceba96d72)
    
    Change-Id: If5da1d6462eb4d9ebe2e88b3a70cc454411a133e
    BUG: 1468200
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: https://review.gluster.org/17715
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>

Comment 3 Shyamsundar 2017-08-12 13:07:33 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.11.2, please open a new bug report.

glusterfs-3.11.2 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-July/031908.html
[2] https://www.gluster.org/pipermail/gluster-users/