Bug 1324381

Summary: DHT: If a directory is renamed from another mount point while its creation is still in progress, then after both operations complete some files are not accessible or listed on the mount, and more than one directory has the same gfid
Product: [Community] GlusterFS Reporter: Sakshi <sabansal>
Component: distribute Assignee: Sakshi <sabansal>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.7.10 CC: bugs, mzywusko, nbalacha, racpatel, smohan
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.7.11 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1252244 Environment:
Last Closed: 2016-04-19 07:13:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1092510, 1118770, 1118780, 1139172, 1252244, 1328473    
Bug Blocks:    

Description Sakshi 2016-04-06 08:39:11 UTC
+++ This bug was initially created as a clone of Bug #1252244 +++

+++ This bug was initially created as a clone of Bug #1118770 +++

Description of problem:
=======================
Create a directory from a mount point and, while the creation is still in progress (the directory has been created only on the hashed sub-volume), rename that directory from another mount point. Here the destination directory does not exist, and both source and destination hash to the same sub-volume.

i.e. from one mount point :- mkdir dir1
from another mount point :- mv dir1 dir2

After both operations have finished:-
- same gfid for different Directories (at same level)
- sometimes few files inside those directories are not listed on mount and not accessible


Version-Release number :
=========================
3.6.0.24-1.el6rhs.x86_64


How reproducible:
=================
always


Steps to Reproduce:
====================
1. Create and mount a distributed volume (mount it on multiple clients).
2. [To reproduce the race, put breakpoints at dht_mkdir_hashed_dir_cbk and dht_rename_hashed_dir_cbk.]

3. from one mount point execute 

[root@OVM1 race]# mkdir inprogress

bricks:-
[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
/brick2/race/ 
└── inprogress 
/brick3/race/ 

1 directory, 0 files 

from another mount point:-
[root@OVM1 race1]# mv inprogress rename

bricks:-
[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
└── rename 
/brick2/race/ 
└── inprogress 
/brick3/race/ 
└── inprogress 

3 directories, 0 files 

4. now continue both operations

5. verify the data from another mount and on the bricks as well

mount:-
[root@OVM5 race]# ls -lR 
.: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 12:50 rename 

./rename: 
total 0 
[root@OVM5 race]# mkdir inprogress 
mkdir: cannot create directory `inprogress': File exists 
[root@OVM5 race]# ls -lR 
.: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 12:50 inprogress 
drwxr-xr-x 2 root root 18 Jul 10 12:50 rename 

./inprogress: 
total 0 

./rename: 
total 0 

bricks:-
same gfid:-
[root@OVM5 race]# getfattr -d -m . /brick3/race/* -e hex 
getfattr: Removing leading '/' from absolute path names 
# file: brick3/race/inprogress 
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7 
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9 
 
# file: brick3/race/rename 
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7 
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9 

[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
├── inprogress 
└── rename 
/brick2/race/ 
├── inprogress 
└── rename 
/brick3/race/ 
├── inprogress 
└── rename 
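
The duplicate-gfid check done by eye above can be sketched as a small script. This is only an illustration: the helper names (parse_getfattr, duplicate_gfids) are hypothetical, and the sample text is copied from the getfattr output in this report.

```python
from collections import defaultdict

def parse_getfattr(text):
    """Map each '# file:' path in getfattr output to its trusted.gfid value."""
    gfids = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# file:"):
            current = line[len("# file:"):].strip()
        elif line.startswith("trusted.gfid=") and current is not None:
            gfids[current] = line.split("=", 1)[1]
    return gfids

def duplicate_gfids(gfids):
    """Return {gfid: [paths]} for every gfid carried by more than one path."""
    by_gfid = defaultdict(list)
    for path, gfid in gfids.items():
        by_gfid[gfid].append(path)
    return {g: sorted(p) for g, p in by_gfid.items() if len(p) > 1}

# Sample copied from the brick3 output in this report.
SAMPLE = """\
# file: brick3/race/inprogress
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

# file: brick3/race/rename
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9
"""

dups = duplicate_gfids(parse_getfattr(SAMPLE))
for gfid, paths in dups.items():
    print(gfid, "is shared by", ", ".join(paths))
```

Run against the directory entries of a single brick, any non-empty result indicates the condition reported here.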

Actual results:
===============
- same gfid for different Directories 
- sometimes files inside those directories are not listed on mount and not accessible

Expected results:
=================
- no two directories should have the same gfid
- all files inside those directories should be accessible from the mount point



If the destination directory already exists, the output is as follows.

rename1 already exists
and the race is :-
[root@OVM1 race]# mkdir rename
[root@OVM1 race1]# mv rename rename1


output on mount:-

[root@OVM5 race]# ls -lR 
.: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 15:00 rename 
drwxr-xr-x 3 root root 57 Jul 10 15:00 rename1 

./rename: 
total 0 

./rename1: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 15:00 rename 
 
./rename1/rename: 
total 0 


bricks:-
[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
├── rename 
└── rename1 
    └── rename 
/brick2/race/ 
├── rename 
└── rename1 
    └── rename 
/brick3/race/ 
├── rename 
└── rename1 
    └── rename 

9 directories, 0 files 

[root@OVM5 race]# getfattr -d -m . -e hex /brick3/race/* -R 
getfattr: Removing leading '/' from absolute path names 
# file: brick3/race/rename 
trusted.gfid=0xac6b95cb620c400d91a55f3ce66ee005 
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff 

# file: brick3/race/rename1 
trusted.gfid=0x9482dd3bf0834596bb74d6ffeffa40d2 
trusted.glusterfs.dht=0x00000001000000000000000055555554 

# file: brick3/race/rename1/rename 
trusted.gfid=0xac6b95cb620c400d91a55f3ce66ee005 
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff
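
To read the trusted.glusterfs.dht values above more easily, the hex string can be split into 32-bit words. The sketch below assumes the value is four big-endian 32-bit words and that the last two are the inclusive start and end of a hash range; that reading matches the values in this report (each range covers about a third of the 32-bit space), but it is an assumption, and the first two words are left uninterpreted. The function name decode_dht_xattr is hypothetical.

```python
import struct

def decode_dht_xattr(hex_value):
    """Split a 16-byte trusted.glusterfs.dht hex value into four 32-bit words.

    Assumption: words are big-endian and the last two are the inclusive
    start and stop of the hash range. word0 and word1 are not interpreted.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    word0, word1, start, stop = struct.unpack(">4I", raw)
    return {"word0": word0, "word1": word1, "start": start, "stop": stop}

# Values taken from the brick3 output in this report; the second one is
# written word by word so that the four fields are visible.
rename_layout = decode_dht_xattr("0x0000000100000000aaaaaaaaffffffff")
rename1_layout = decode_dht_xattr(
    "0x" + "00000001" + "00000000" + "00000000" + "55555554"
)

for name, layout in (("rename", rename_layout), ("rename1", rename1_layout)):
    print("%-8s range 0x%08x - 0x%08x" % (name, layout["start"], layout["stop"]))
```

Under this reading, rename and rename1/rename carry the same range on brick3 as well as the same gfid, while rename1 carries a different one.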

--- Additional comment from Anand Avati on 2015-08-11 03:28:07 EDT ---

REVIEW: http://review.gluster.org/11880 (dht : locks in rename to avoid layout change by lookup selfheal) posted (#1) for review on master by Sakshi Bansal (sabansal)

--- Additional comment from Anand Avati on 2015-08-28 23:30:35 EDT ---

REVIEW: http://review.gluster.org/11880 (dht: locks in rename to avoid layout change by lookup selfheal) posted (#2) for review on master by Sakshi Bansal (sabansal)

--- Additional comment from Vijay Bellur on 2015-09-04 02:10:39 EDT ---

REVIEW: http://review.gluster.org/11880 (dht : locks in rename to avoid layout change by lookup selfheal) posted (#3) for review on master by Sakshi Bansal (sabansal)

--- Additional comment from Vijay Bellur on 2016-03-07 22:58:44 EST ---

REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#4) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2016-03-16 23:24:05 EDT ---

REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#5) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2016-03-19 22:41:31 EDT ---

REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#6) for review on master by Sakshi Bansal

--- Additional comment from Mike McCune on 2016-03-28 19:31:34 EDT ---

This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

--- Additional comment from Vijay Bellur on 2016-04-06 01:16:26 EDT ---

REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#7) for review on master by Sakshi Bansal

Comment 1 Vijay Bellur 2016-04-06 10:19:58 UTC
REVIEW: http://review.gluster.org/13917 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#1) for review on release-3.7 by Sakshi Bansal

Comment 2 Vijay Bellur 2016-04-11 05:23:56 UTC
COMMIT: http://review.gluster.org/13917 committed in release-3.7 by Raghavendra G (rgowdapp) 
------
commit 0a01154c68cb5eb884096fc67288a71c391d9160
Author: Sakshi <sabansal>
Date:   Wed Aug 5 16:05:22 2015 +0530

    dht: lock on subvols to prevent rename and lookup selfheal race
    
    This patch addresses two races while renaming directories:
    1) While renaming src to dst, if a lookup selfheal is triggered
    it can recreate src on those subvols where rename was successful.
    This leads to multiple directories (src and dst) having same gfid.
    To avoid this we must take locks on all subvols with src.
    
    2) While renaming if the dst exists and a lookup selfheal is
    triggered it will find anomalies in the dst layout and try to
    heal the stale layout. To avoid this we must take lock on any
    one subvol with dst.
    
    Backport of http://review.gluster.org/#/c/11880/
    
    > Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53
    > BUG: 1252244
    > Signed-off-by: Sakshi <sabansal>
    
    Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53
    BUG: 1324381
    Signed-off-by: Sakshi <sabansal>
    Reviewed-on: http://review.gluster.org/13917
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
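
The first race in the commit message above can be illustrated with a toy model. This is not the GlusterFS implementation: the Subvol class, the lock set, and the way selfheal backs off when it finds a lock are all simplifications invented for this sketch. It only shows why locking src on every subvol before renaming keeps a racing lookup selfheal from recreating src.

```python
class Subvol:
    """Toy stand-in for one DHT subvolume (hypothetical, for illustration)."""
    def __init__(self, name):
        self.name = name
        self.entries = {}    # directory name -> gfid
        self.locked = set()  # directory names currently locked

def selfheal(subvols, dirname, gfid, use_locks):
    """Toy lookup selfheal: recreate dirname on subvols where it is missing."""
    if use_locks and any(dirname in sv.locked for sv in subvols):
        return False  # a rename holds the lock, so the heal backs off
    for sv in subvols:
        sv.entries.setdefault(dirname, gfid)
    return True

def rename_with_racing_heal(use_locks):
    """Rename src to dst on three subvols while a selfheal of src interleaves."""
    subvols = [Subvol("brick%d" % i) for i in (1, 2, 3)]
    gfid = "5b3c1a8c"
    for sv in subvols:
        sv.entries["src"] = gfid
    if use_locks:
        for sv in subvols:
            sv.locked.add("src")  # lock src on all subvols that have it
    for i, sv in enumerate(subvols):
        sv.entries["dst"] = sv.entries.pop("src")
        if i == 0:
            # A lookup from another client arrives after the first subvol
            # has been renamed but before the others.
            selfheal(subvols, "src", gfid, use_locks)
    for sv in subvols:
        sv.locked.discard("src")
    return [sorted(sv.entries) for sv in subvols]

print("without locks:", rename_with_racing_heal(use_locks=False))
print("with locks:   ", rename_with_racing_heal(use_locks=True))
```

Without the lock, the first subvol ends up with both src and dst under one gfid, which is the state shown in the brick listings of this report; with the lock, every subvol holds only dst.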

Comment 3 Kaushal 2016-04-19 07:13:37 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.11, please open a new bug report.

glusterfs-3.7.11 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-April/026321.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user