Bug 1328473 - DHT: If directory creation is in progress and a rename of that directory comes from another mount point, then after both operations some files are not accessible or listed on the mount, and more than one directory has the same gfid
Summary: DHT : If Directory creation is in progress and rename of that Directory comes...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.7.11
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Sakshi
QA Contact:
URL:
Whiteboard: triaged,hotfix
Depends On:
Blocks: 1252244 1311817 1324381
 
Reported: 2016-04-19 13:26 UTC by Sakshi
Modified: 2016-08-01 01:22 UTC
CC: 14 users

Fixed In Version: glusterfs-3.7.12
Doc Type: Bug Fix
Doc Text:
Clone Of: 1118770
Environment:
Last Closed: 2016-06-28 12:14:39 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Sakshi 2016-04-19 13:26:31 UTC
+++ This bug was initially created as a clone of Bug #1118770 +++

Description of problem:
=======================
Create a directory from one mount point and, while the creation is in progress (the directory exists only on the hashed sub-volume), rename that directory from another mount point (the destination directory does not exist, and both source and destination hash to the same sub-volume here).

i.e. from one mount point: mkdir dir1
from another mount point: mv dir1 dir2

After both operations finish:
- the same gfid is assigned to different directories (at the same level)
- sometimes some files inside those directories are not listed on the mount and are not accessible
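
For illustration, a minimal sketch of the race, assuming the volume is mounted on two clients at the hypothetical paths /mnt/c1 and /mnt/c2 (in practice the window is widened with breakpoints, as in the reproduction steps below):

# client 1: start the directory creation; in the race window it has completed
# only on the hashed sub-volume
mkdir /mnt/c1/dir1 &

# client 2: rename the same directory while the mkdir above is still in progress
mv /mnt/c2/dir1 /mnt/c2/dir2

# afterwards, compare gfids on the bricks; dir1 and dir2 may end up identical
getfattr -d -m trusted.gfid -e hex /brick*/race/dir* 2>/dev/null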


Version-Release number :
=========================
3.6.0.24-1.el6rhs.x86_64


How reproducible:
=================
always


Steps to Reproduce:
====================
1. Create and mount a distributed volume (mount it on multiple clients).
2. [To reproduce the race, breakpoints are placed at dht_mkdir_hashed_dir_cbk and dht_rename_hashed_dir_cbk; a gdb sketch follows the brick listings below.]

3. From one mount point execute:

[root@OVM1 race]# mkdir inprogress

bricks:-
[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
/brick2/race/ 
└── inprogress 
/brick3/race/ 

1 directory, 0 files 

from another mount point:-
[root@OVM1 race1]# mv inprogress rename

bricks:-
[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
└── rename 
/brick2/race/ 
└── inprogress 
/brick3/race/ 
└── inprogress 

3 directories, 0 files 

4. Now continue both operations.

5. Verify the data from another mount point and on the bricks as well.

mount:-
[root@OVM5 race]# ls -lR 
.: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 12:50 rename 

./rename: 
total 0 
[root@OVM5 race]# mkdir inprogress 
mkdir: cannot create directory `inprogress': File exists 
[root@OVM5 race]# ls -lR 
.: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 12:50 inprogress 
drwxr-xr-x 2 root root 18 Jul 10 12:50 rename 

./inprogress: 
total 0 

./rename: 
total 0 

bricks:-
same gfid:-
[root@OVM5 race]# getfattr -d -m . /brick3/race/* -e hex 
getfattr: Removing leading '/' from absolute path names 
# file: brick3/race/inprogress 
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7 
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9 
 
# file: brick3/race/rename 
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7 
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9 

[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
├── inprogress 
└── rename 
/brick2/race/ 
├── inprogress 
└── rename 
/brick3/race/ 
├── inprogress 
└── rename 

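As mentioned in step 2, one way to place the breakpoints is to attach gdb to the glusterfs client processes that back the two mounts (the DHT translator runs in the client graph). A sketch, assuming debug symbols are available and with the PIDs as placeholders:

# attach one gdb to each glusterfs client process backing a mount;
# dht_mkdir_hashed_dir_cbk fires in the client issuing the mkdir,
# dht_rename_hashed_dir_cbk in the client issuing the mv
gdb -p <pid-of-mkdir-client> -ex 'break dht_mkdir_hashed_dir_cbk' -ex 'continue'
gdb -p <pid-of-rename-client> -ex 'break dht_rename_hashed_dir_cbk' -ex 'continue'

# issue the mkdir from the first mount; when its breakpoint hits, issue the mv
# from the second mount, then 'continue' in both gdb sessions (step 4)
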
Actual results:
===============
- same gfid for different directories
- sometimes files inside those directories are not listed on the mount and are not accessible

Expected results:
=================
- no two directories should have the same gfid
- all files inside those directories should be accessible from the mount point
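
Both expectations could be checked with something like the following sketch (brick paths as in this report; /mnt/race is a placeholder for the client mount point):

# 1) no gfid should be shared by two differently named directories:
#    collect (gfid, path-relative-to-brick) pairs, dedupe the copies of the
#    same entry across bricks, and report any gfid left with more than one name
getfattr -d -m trusted.gfid -e hex -R /brick*/race 2>/dev/null \
  | awk '/^# file:/ {p = $3; sub(/^brick[0-9]+\/race\/?/, "", p); name = p}
         /^trusted.gfid=/ {split($0, a, "="); print a[2], name}' \
  | sort -u \
  | awk '{seen[$1]++; if (seen[$1] == 2) print "gfid reused by multiple names:", $1}'

# 2) every entry under the mount point should be listable and accessible
find /mnt/race -exec stat {} + > /dev/null && echo "all entries accessible"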

--- Additional comment from Rachana Patel on 2014-07-11 09:43:41 EDT ---

[root@OVM3 ~]# gluster v info race
 
Volume Name: race
Type: Distribute
Volume ID: 30f7ff59-a90b-44e3-a991-223c81c15d67
Status: Started
Snap Volume: no
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 10.70.35.172:/brick1/race
Brick2: 10.70.35.172:/brick2/race
Brick3: 10.70.35.172:/brick3/race
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable


sosreport @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1118762/

--- Additional comment from Rachana Patel on 2014-07-11 09:47:33 EDT ---

In case the destination directory already exists, the output is as follows.

Here rename1 already exists, and the race is:
[root@OVM1 race]# mkdir rename
[root@OVM1 race1]# mv rename rename1


output on mount:-

[root@OVM5 race]# ls -lR 
.: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 15:00 rename 
drwxr-xr-x 3 root root 57 Jul 10 15:00 rename1 

./rename: 
total 0 

./rename1: 
total 0 
drwxr-xr-x 2 root root 18 Jul 10 15:00 rename 
 
./rename1/rename: 
total 0 


bricks:-
[root@OVM5 race]# tree /brick*/race/ 
/brick1/race/ 
├── rename 
└── rename1 
    └── rename 
/brick2/race/ 
├── rename 
└── rename1 
    └── rename 
/brick3/race/ 
├── rename 
└── rename1 
    └── rename 

9 directories, 0 files 

[root@OVM5 race]# getfattr -d -m . -e hex /brick3/race/* -R 
getfattr: Removing leading '/' from absolute path names 
# file: brick3/race/rename 
trusted.gfid=0xac6b95cb620c400d91a55f3ce66ee005 
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff 

# file: brick3/race/rename1 
trusted.gfid=0x9482dd3bf0834596bb74d6ffeffa40d2 
trusted.glusterfs.dht=0x00000001000000000000000055555554 

# file: brick3/race/rename1/rename 
trusted.gfid=0xac6b95cb620c400d91a55f3ce66ee005 
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

--- Additional comment from RHEL Product and Program Management on 2014-07-11 10:03:51 EDT ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Vivek Agarwal on 2015-02-04 07:52:07 EST ---

Removing Denali from the internal whiteboard.

--- Additional comment from John Skeoch on 2015-04-19 20:24:20 EDT ---

User racpatel's account has been closed

--- Additional comment from John Skeoch on 2015-04-19 20:25:48 EDT ---

User racpatel's account has been closed

--- Additional comment from Susant Kumar Palai on 2015-12-28 01:25:45 EST ---

Triage Update: RCA is known. Design and fix need to be done.

--- Additional comment from Sankarshan Mukhopadhyay on 2016-04-05 06:04:19 EDT ---

The fix content tracked by this RHBZ is being made available as part of a response to a customer in the form of an accelerated fix. The RHBZ would need to be part of the release content targeted for 3.1.3.

--- Additional comment from Raghavendra G on 2016-04-11 01:25:59 EDT ---

https://code.engineering.redhat.com/gerrit/#/c/71596/

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-04-11 03:20:44 EDT ---

Since this bug has been approved for the z-stream release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.1.z+', and has been marked for RHGS 3.1 Update 3 release through the Internal Whiteboard entry of '3.1.3', the Target Release is being automatically set to 'RHGS 3.1.3'

--- Additional comment from Rejy M Cyriac on 2016-04-11 03:36:05 EDT ---

Is this a potential Accelerated-Fix (hot-fix) candidate? As per Comment 8, it looks to be one.

But I see

1) NO request for Accelerated-Fix (hot-fix) from GSS in a BZ comment
2) Comment 8 can be construed as approval for Accelerated-Fix (hot-fix) from RHGS Development Management, but it appears strange to have an approval without a request from GSS for an Accelerated-Fix (hot-fix)
3) NO approval for Accelerated-Fix (hot-fix) from RHGS PM

Comment 1 Vijay Bellur 2016-04-19 13:31:18 UTC
REVIEW: http://review.gluster.org/14031 (quota: setting 'read-only' option in xdata to instruct DHT to not heal) posted (#1) for review on release-3.7 by Sakshi Bansal

Comment 2 Vijay Bellur 2016-04-27 03:55:05 UTC
COMMIT: http://review.gluster.org/14031 committed in release-3.7 by Raghavendra G (rgowdapp) 
------
commit 2fff1c41bbe1a355fe398df08f2a27844b925b47
Author: Sakshi Bansal <sabansal>
Date:   Wed Apr 13 16:40:40 2016 +0530

    quota: setting 'read-only' option in xdata to instruct DHT to not heal
    
    When quota is enabled the quota enforcer tries to get the size of the
    source directory by sending nameless lookup to quotad. But if the rename
    is successful even on one subvol or the source layout has anomalies then
    this nameless lookup in quotad tries to heal the directory which requires
    a lock on as many subvols as it can. But src is already locked as part of
    rename. For rename to proceed in brick it needs to complete a cluster-wide
    lookup. But cluster-wide lookup in quotad is blocked on locks held by rename,
    hence a deadlock. To avoid this quota sends an option in xdata which instructs
    DHT not to heal.
    
    Backport of http://review.gluster.org/#/c/13988/
    
    > Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
    > BUG: 1252244
    > Signed-off-by: Sakshi Bansal <sabansal>
    > Reviewed-on: http://review.gluster.org/13988
    > Smoke: Gluster Build System <jenkins.com>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.com>
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: Raghavendra G <rgowdapp>
    
    Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
    BUG: 1328473
    Signed-off-by: Sakshi Bansal <sabansal>
    Reviewed-on: http://review.gluster.org/14031
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
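
For context, the deadlock described in the commit message only arises when quota is enabled on the volume. A sketch of putting the test volume from this report into that state before re-running the race (the limit value is arbitrary):

# enable the quota enforcer on the volume and set a limit so quotad is consulted
gluster volume quota race enable
gluster volume quota race limit-usage / 10GB

# with the fix, the nameless lookup handled in quotad during the rename carries
# an xdata option telling DHT not to attempt a directory heal, so the rename no
# longer deadlocks on the locks it already holds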

Comment 3 Kaushal 2016-06-28 12:14:39 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.12, please open a new bug report.

glusterfs-3.7.12 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-devel/2016-June/049918.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

