Bug 1144485 - DHT: two directories have the same gfid
Summary: DHT: two directories have the same gfid
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.6.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Shyamsundar
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: glusterfs-3.6.0
 
Reported: 2014-09-19 14:00 UTC by Shyamsundar
Modified: 2014-11-11 08:39 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.6.0beta1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1105082
Environment:
Last Closed: 2014-11-11 08:39:24 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Shyamsundar 2014-09-19 14:00:10 UTC
+++ This bug was initially created as a clone of Bug #1105082 +++

+++ This bug was initially created as a clone of Bug #1092510 +++

Description of problem:
=======================
Rename a directory (in both cases: the destination exists and the destination does not exist), and take a snapshot while the rename operation has not yet completed on all sub-volumes.

After the snapshot is restored, a lookup heals both directories, and the source and destination directories end up with the same gfid.

How reproducible:
================
Always


Steps to Reproduce:
===================

Case 1: destination does not exist
1. Create a distributed volume, start it, and FUSE-mount it.
2. Create a directory from the mount point.
3. Rename the directory from the mount point (the destination does not exist), and make sure a snapshot of the volume is taken while the directory has been renamed on one or more sub-volumes but not on all of them (mv src dest). A scripted sketch of these steps appears after the xattr output below.
4. Stop the volume and restore the snapshot.
5. Mount the volume again and send a lookup.
6. Verify the gfid of the source and destination directories on the backend bricks.

Output of step 6:
[root@OVM5 ~]# getfattr -d -m . -e hex /brick3/*/dest
getfattr: Removing leading '/' from absolute path names
# file: brick3/1/dest
trusted.gfid=0xba51b0e324fc46198cf909727081d4d5
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

# file: brick3/2/dest
trusted.gfid=0xba51b0e324fc46198cf909727081d4d5
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: brick3/3/dest
trusted.gfid=0xba51b0e324fc46198cf909727081d4d5
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

[root@OVM5 ~]# getfattr -d -m . -e hex /brick3/*/src
getfattr: Removing leading '/' from absolute path names
# file: brick3/1/src
trusted.gfid=0xba51b0e324fc46198cf909727081d4d5
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

# file: brick3/2/src
trusted.gfid=0xba51b0e324fc46198cf909727081d4d5
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: brick3/3/src
trusted.gfid=0xba51b0e324fc46198cf909727081d4d5
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9
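
A rough scripted version of these steps is sketched below. The volume name, server name, brick paths and mount point are assumptions made for illustration (they are not taken from this report), and the plain sleep only stands in for whatever timing or brick instrumentation is actually needed to take the snapshot while the rename has reached some sub-volumes but not all of them.

# Reproduction sketch (assumed names; snapshots require thin-LVM backed bricks).
gluster volume create distvol server1:/brick3/1 server1:/brick3/2 server1:/brick3/3
gluster volume start distvol
mount -t glusterfs server1:/distvol /mnt/distvol

mkdir /mnt/distvol/src                            # step 2
mv /mnt/distvol/src /mnt/distvol/dest &           # step 3: rename in background...
sleep 0.1                                         # ...and snapshot inside the partial-rename window
gluster snapshot create snap1 distvol             # (the timing here is illustrative only)

gluster volume stop distvol                       # step 4
gluster snapshot restore snap1
gluster volume start distvol
mount -t glusterfs server1:/distvol /mnt/distvol  # step 5
stat /mnt/distvol/src /mnt/distvol/dest 2>/dev/null    # lookup triggers directory heal
getfattr -d -m . -e hex /brick3/*/src /brick3/*/dest   # step 6: compare gfids

For Case 2 below, the only difference is creating the destination directory up front (mkdir /mnt/distvol/dest) before the rename.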


Case 2: destination exists
1. Create a distributed volume, start it, and FUSE-mount it.
2. Create a directory from the mount point.
3. Rename the directory from the mount point (the destination already exists), and make sure a snapshot of the volume is taken while the directory has been renamed on one or more sub-volumes but not on all of them (mv src dest).
4. Stop the volume and restore the snapshot.
5. Mount the volume again and send a lookup.
6. Verify the gfid of the source and destination directories on the backend bricks.

Output of step 6:
[root@OVM5 ~]# getfattr -d -m . -e hex /brick3/*/src
getfattr: Removing leading '/' from absolute path names
# file: brick3/1/src
trusted.gfid=0x2ceac0f928c94437b7b7b985739dc74e
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

# file: brick3/2/src
trusted.gfid=0x2ceac0f928c94437b7b7b985739dc74e
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: brick3/3/src
trusted.gfid=0x2ceac0f928c94437b7b7b985739dc74e
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

[root@OVM5 ~]# getfattr -d -m . -e hex /brick3/*/dest/src
getfattr: Removing leading '/' from absolute path names
# file: brick3/1/dest/src
trusted.gfid=0x2ceac0f928c94437b7b7b985739dc74e
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

# file: brick3/2/dest/src
trusted.gfid=0x2ceac0f928c94437b7b7b985739dc74e
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: brick3/3/dest/src
trusted.gfid=0x2ceac0f928c94437b7b7b985739dc74e
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9
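
For step 6, a small loop like the one below (illustrative only; it assumes the brick paths shown above) prints the trusted.gfid of both paths on every brick, which makes the collision easy to spot:

for b in /brick3/1 /brick3/2 /brick3/3; do
    for d in src dest/src; do    # for Case 1 the paths are src and dest
        [ -d "$b/$d" ] && getfattr --absolute-names -n trusted.gfid -e hex "$b/$d"
    done
done
# On a healthy volume each directory reports its own gfid; after this bug
# both paths return the identical trusted.gfid value.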



Actual results:
===============
Two directories end up with the same gfid.


Expected results:
=================
The gfid should be unique for every directory.

--- Additional comment from Anand Avati on 2014-06-07 15:39:24 EDT ---

REVIEW: http://review.gluster.org/8008 (storage/posix: Treat mkdir on an existing gfid as rename.) posted (#1) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Anand Avati on 2014-09-14 14:58:32 EDT ---

REVIEW: http://review.gluster.org/8008 (storage/posix: Log when mkdir is on an existing gfid but non-existent path.) posted (#2) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Anand Avati on 2014-09-17 02:39:18 EDT ---

REVIEW: http://review.gluster.org/8008 (storage/posix: Log when mkdir is on an existing gfid but non-existent path.) posted (#3) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Anand Avati on 2014-09-18 07:11:07 EDT ---

REVIEW: http://review.gluster.org/8008 (storage/posix: Log when mkdir is on an existing gfid but non-existent path.) posted (#4) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Anand Avati on 2014-09-18 07:49:14 EDT ---

REVIEW: http://review.gluster.org/8008 (storage/posix: Log when mkdir is on an existing gfid but non-existent path.) posted (#5) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Anand Avati on 2014-09-18 13:16:04 EDT ---

COMMIT: http://review.gluster.org/8008 committed in master by Vijay Bellur (vbellur) 
------
commit 1e1b709a4b438dfa768fd4c645e081ede06e7e14
Author: Raghavendra G <rgowdapp>
Date:   Sun Jun 8 00:46:29 2014 +0530

    storage/posix: Log when mkdir is on an existing gfid but non-existent
    path.
    
    consider following steps on a distribute volume
    
    1. rename (src, dst) on hashed subvolume
    2. snapshot taken
    3. restore snapshots and do stat on src and dst
    
    Now, we end up with two directories src and dst having same gfid,
    because of distribute creating directories on non-existent subvolumes
    as part of directory healing.
    
    This can happen even with race between rename and directory healing in
    dht-lookup. This can lead to undefined behaviour while accessing any
    of both directories. Hence, we are logging paths of both
    directories, so that a sysadmin can take some corrective action when
    (s)he sees this log. One of the corrective action can be to copy
    contents of both directories from backend into a new directory and
    delete both directories.
    
    Since effort involved to fix this issue is non-trivial, giving this
    workaround till we come up with a fix.
    
    Change-Id: I38f4520e6787ee33180a9cd1bf2f36f46daea1ea
    BUG: 1105082
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on: http://review.gluster.org/8008
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>
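
The corrective action suggested in the commit message could, very roughly, look like the sketch below. The staging directory and brick paths are assumptions for illustration; this is not a documented recovery procedure and would need to be adapted and verified (with the volume stopped) before touching real data.

mkdir /tmp/merged
for b in /brick3/1 /brick3/2 /brick3/3; do
    [ -d "$b/src"  ] && cp -a "$b/src/."  /tmp/merged/
    [ -d "$b/dest" ] && cp -a "$b/dest/." /tmp/merged/
done
# Then remove both colliding directories from every brick, and recreate a
# single directory with the merged contents through a normal client mount
# so it is assigned a fresh, unique gfid.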

Comment 1 Anand Avati 2014-09-19 14:09:00 UTC
REVIEW: http://review.gluster.org/8783 (storage/posix: Log when mkdir is on an existing gfid but non-existent path.) posted (#1) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 2 Anand Avati 2014-09-19 16:44:14 UTC
REVIEW: http://review.gluster.org/8781 (protocol: Log ENODATA & ENOATTR logs at DEBUG loglevel in removexattr_cbk.) posted (#3) for review on master by Vijay Bellur (vbellur)

Comment 3 Anand Avati 2014-09-19 16:46:15 UTC
COMMIT: http://review.gluster.org/8783 committed in release-3.6 by Vijay Bellur (vbellur) 
------
commit a2e0602c0910ee448b4e8badeb00eed2a78ea452
Author: Raghavendra G <rgowdapp>
Date:   Fri Sep 19 10:07:53 2014 -0400

    storage/posix: Log when mkdir is on an existing gfid but non-existent
    path.
    
    consider following steps on a distribute volume
    
    1. rename (src, dst) on hashed subvolume
    2. snapshot taken
    3. restore snapshots and do stat on src and dst
    
    Now, we end up with two directories src and dst having same gfid,
    because of distribute creating directories on non-existent subvolumes
    as part of directory healing.
    
    This can happen even with race between rename and directory healing in
    dht-lookup. This can lead to undefined behaviour while accessing any
    of both directories. Hence, we are logging paths of both
    directories, so that a sysadmin can take some corrective action when
    (s)he sees this log. One of the corrective action can be to copy
    contents of both directories from backend into a new directory and
    delete both directories.
    
    Since effort involved to fix this issue is non-trivial, giving this
    workaround till we come up with a fix.
    
    Change-Id: I38f4520e6787ee33180a9cd1bf2f36f46daea1ea
    BUG: 1144485
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on-master: http://review.gluster.org/8008
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>
    Reviewed-on: http://review.gluster.org/8783
    Tested-by: Gluster Build System <jenkins.com>

Comment 4 Niels de Vos 2014-09-22 12:46:39 UTC
A beta release for GlusterFS 3.6.0 has been made available [1]. Please verify whether this release resolves the issue for you. If the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 5 Niels de Vos 2014-11-11 08:39:24 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

