Bug 1114501 - Dist-geo-rep : deletion of files on master, geo-rep fails to propagate to slaves.
Summary: Dist-geo-rep : deletion of files on master, geo-rep fails to propagate to sl...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On: 1111587
Blocks: glusterfs-3.5.2 1112531
Reported: 2014-06-30 09:09 UTC by Pranith Kumar K
Modified: 2014-09-01 11:07 UTC (History)

Fixed In Version: glusterfs-3.5.2beta1
Clone Of: 1111587
Environment:
Last Closed: 2014-07-31 11:43:34 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Pranith Kumar K 2014-06-30 10:24:59 UTC
Without the fix:
AFR compares using the virtual gfid: e992cc46-7761-4311-a4cf-8fdde34636a5
[2014-06-30 10:08:23.727572] I [afr-lk-common.c:73:afr_entry_lockee_cmp] 0-CMP: e992cc46-7761-4311-a4cf-8fdde34636a5 - fc5d9f18-eac9-4f49-92ea-9bf27df31a30, -1
[2014-06-30 10:08:23.727631] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: 5980de94-0f80-4eba-b2d4-52a1abf9056a(b)
[2014-06-30 10:08:23.727735] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: 5980de94-0f80-4eba-b2d4-52a1abf9056a(b)
[2014-06-30 10:08:23.727839] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: fc5d9f18-eac9-4f49-92ea-9bf27df31a30()
[2014-06-30 10:08:23.727912] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: fc5d9f18-eac9-4f49-92ea-9bf27df31a30()
[2014-06-30 10:08:23.728775] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: 5980de94-0f80-4eba-b2d4-52a1abf9056a(b)
[2014-06-30 10:08:23.728863] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: 5980de94-0f80-4eba-b2d4-52a1abf9056a(b)
[2014-06-30 10:08:23.728941] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: fc5d9f18-eac9-4f49-92ea-9bf27df31a30()
[2014-06-30 10:08:23.729006] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: fc5d9f18-eac9-4f49-92ea-9bf27df31a30()

With the fix:
Comparisons now use the real gfid.
[2014-06-30 10:14:58.884052] I [afr-lk-common.c:73:afr_entry_lockee_cmp] 0-CMP: 5ea076c7-0373-4914-9484-9423a265b519 - fe5362ee-e06c-4fd2-9108-f14058cdeb81, -1
[2014-06-30 10:14:58.884104] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: 5ea076c7-0373-4914-9484-9423a265b519(b)
[2014-06-30 10:14:58.884199] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: 5ea076c7-0373-4914-9484-9423a265b519(b)
[2014-06-30 10:14:58.884325] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: fe5362ee-e06c-4fd2-9108-f14058cdeb81()
[2014-06-30 10:14:58.884401] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: fe5362ee-e06c-4fd2-9108-f14058cdeb81()
[2014-06-30 10:14:58.885544] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: 5ea076c7-0373-4914-9484-9423a265b519(b)
[2014-06-30 10:14:58.885624] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: 5ea076c7-0373-4914-9484-9423a265b519(b)
[2014-06-30 10:14:58.885702] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-0: fe5362ee-e06c-4fd2-9108-f14058cdeb81()
[2014-06-30 10:14:58.885773] I [client-rpc-fops.c:5501:client3_3_entrylk] 0-r2-client-1: fe5362ee-e06c-4fd2-9108-f14058cdeb81()

Writing some regression tests. Will send the patch out soon.

Pranith

Comment 2 Anand Avati 2014-06-30 15:38:05 UTC
REVIEW: http://review.gluster.org/8204 (features/gfid-access: Fix entry operations) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 3 Shalaka 2014-07-01 06:36:28 UTC
Please add doc text for this Known Issue.

Comment 4 Anand Avati 2014-07-03 01:33:09 UTC
REVIEW: http://review.gluster.org/8204 (features/gfid-access: Fix entry operations) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 5 Anand Avati 2014-07-03 04:54:50 UTC
REVIEW: http://review.gluster.org/8204 (features/gfid-access: Fix entry operations) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 6 Anand Avati 2014-07-07 04:01:26 UTC
COMMIT: http://review.gluster.org/8204 committed in master by Vijay Bellur (vbellur) 
------
commit 8202705f98d139ef7d691587b9f68cf1db2e397a
Author: Pranith Kumar K <pkarampu>
Date:   Thu Jul 3 06:50:56 2014 +0530

    features/gfid-access: Fix entry operations
    
    Problem:
    When more than one aux-mount performs rmdir .gfid/<pargfid>/dir
    simultaneously, a hang is sometimes observed. In the gfid-access xlator, when
    the virtual parent/inode are replaced with the real parent/inode in loc, the
    virtual pargfid/gfid are not replaced with the real pargfid/gfid. AFR uses
    parent_loc->gfid to order the entry locks, but parent_loc->gfid contains the
    random/virtual gfid generated by the gfid-access xlator. Entrylk in the client
    xlator sends loc->inode->gfid, which holds the 'real' gfid. Because the
    ordering is based on random gfids, one mount orders the locks as (L1, L2)
    whereas the other orders them as (L2, L1), leading to a deadlock and thus
    a hang.
    
    Fix:
    Replace virtual pargfid/gfid with real pargfid/gfid when virtual-inodes are
    replaced with real-inodes in loc.
    
    BUG: 1114501
    Change-Id: Ie94e816122ef9e7aad51605adbf49291de60827e
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8204
    Reviewed-by: Kotresh HR <khiremat>
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>
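
The lock-ordering problem described in the commit message can be illustrated with a small sketch. This is not GlusterFS code: the `Lockee` class and the `lock_order` and `mount_order` helpers are hypothetical names invented for the illustration, and the second virtual gfid (`VIRT_B`) is a made-up value. The two real gfids and the first virtual gfid are taken from the "Without the fix" log in comment 1. The sketch assumes each mount sorts its two entry locks by a gfid and then requests them in that order.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Lockee:
    """Hypothetical stand-in for one entry-lock target."""
    real_gfid: uuid.UUID  # gfid sent to the bricks, identical on every mount
    sort_gfid: uuid.UUID  # gfid the client uses to order its lock requests


def lock_order(lockees):
    """Return the real gfids in the order the locks would be requested."""
    ordered = sorted(lockees, key=lambda lk: lk.sort_gfid.bytes)
    return [lk.real_gfid for lk in ordered]


# Real gfids seen in the entrylk lines of the "Without the fix" log.
PARENT = uuid.UUID("5980de94-0f80-4eba-b2d4-52a1abf9056a")
CHILD = uuid.UUID("fc5d9f18-eac9-4f49-92ea-9bf27df31a30")
# Virtual gfid seen in the comparison line of the same log (mount A).
VIRT_A = uuid.UUID("e992cc46-7761-4311-a4cf-8fdde34636a5")
# Assumption: a second mount happened to generate a larger random gfid.
VIRT_B = uuid.UUID("ffffffff-0000-4000-8000-000000000000")


def mount_order(parent_sort_gfid):
    """Lock order on one mount, given the gfid it sorts the parent by."""
    return lock_order([
        Lockee(real_gfid=CHILD, sort_gfid=CHILD),
        Lockee(real_gfid=PARENT, sort_gfid=parent_sort_gfid),
    ])


# Before the fix: each mount sorts the parent by its own virtual gfid.
before_a = mount_order(VIRT_A)
before_b = mount_order(VIRT_B)
# After the fix: both mounts sort by the real gfid.
after_a = mount_order(PARENT)
after_b = mount_order(PARENT)

print("orders differ before fix:", before_a != before_b)
print("orders differ after fix: ", after_a != after_b)
```

When the two mounts disagree on the order, each can hold one lock while waiting for the other, which is the hang reported here. Sorting by the real gfid gives every mount the same order, so that cycle cannot form.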

Comment 7 Anand Avati 2014-07-07 08:30:17 UTC
REVIEW: http://review.gluster.org/8251 (features/gfid-access: Fix entry operations) posted (#1) for review on release-3.5 by Pranith Kumar Karampuri (pkarampu)

Comment 8 Anand Avati 2014-07-08 08:13:25 UTC
COMMIT: http://review.gluster.org/8251 committed in release-3.5 by Niels de Vos (ndevos) 
------
commit 828fe8068de0f1357e5c26097e45d752b3f7f6c4
Author: Pranith Kumar K <pkarampu>
Date:   Thu Jul 3 06:50:56 2014 +0530

    features/gfid-access: Fix entry operations
    
            Backport of http://review.gluster.org/8204
    
    Problem:
    When more than one aux-mount performs rmdir .gfid/<pargfid>/dir
    simultaneously, a hang is sometimes observed. In the gfid-access xlator, when
    the virtual parent/inode are replaced with the real parent/inode in loc, the
    virtual pargfid/gfid are not replaced with the real pargfid/gfid. AFR uses
    parent_loc->gfid to order the entry locks, but parent_loc->gfid contains the
    random/virtual gfid generated by the gfid-access xlator. Entrylk in the client
    xlator sends loc->inode->gfid, which holds the 'real' gfid. Because the
    ordering is based on random gfids, one mount orders the locks as (L1, L2)
    whereas the other orders them as (L2, L1), leading to a deadlock and thus
    a hang.
    
    Fix:
    Replace virtual pargfid/gfid with real pargfid/gfid when virtual-inodes are
    replaced with real-inodes in loc.
    
    BUG: 1114501
    Change-Id: I13016de1da11762e0697792d76e6e946d991c0a4
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8251
    Reviewed-by: Kotresh HR <khiremat>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Niels de Vos <ndevos>

Comment 9 Niels de Vos 2014-07-21 15:41:59 UTC
The first (and last?) Beta for GlusterFS 3.5.2 has been released [1]. Please verify if the release solves this bug report for you. In case the glusterfs-3.5.2beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-July/041636.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 10 Niels de Vos 2014-07-31 11:43:34 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.2, please reopen this bug report.

glusterfs-3.5.2 has been announced on the Gluster Users mailing list [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-July/041217.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

