Bug 1411352 - rename of the same file from multiple clients with caching enabled may result in duplicate files
Summary: rename of the same file from multiple clients with caching enabled may result...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Susant Kumar Palai
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On: 1412135
Blocks: 1417147 1438820
 
Reported: 2017-01-09 14:34 UTC by Nag Pavan Chilakam
Modified: 2018-11-30 05:39 UTC
CC List: 13 users

Fixed In Version: glusterfs-3.8.4-25
Doc Type: Bug Fix
Doc Text:
Renaming a file could result in duplicate files if multiple clients had caching enabled. This occurred because the lookup operation to verify file existence on the volume was based on the file's GFID and therefore always succeeded. The lookup is now based on the file's name instead of GFID, and duplicates no longer occur in this situation.
Clone Of:
Clones: 1412135
Environment:
Last Closed: 2017-09-21 04:30:55 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:2774 0 normal SHIPPED_LIVE glusterfs bug fix and enhancement update 2017-09-21 08:16:29 UTC

Description Nag Pavan Chilakam 2017-01-09 14:34:41 UTC
Description of problem:
==========================
I was verifying 1408836 - [ganesha+ec]: Contents of original file are not seen when hardlink is created

I am seeing this problem with mdcache + distributed-ec volume + nfs-ganesha.
I took nfs-ganesha out of the equation and didn't seem to hit it (though I would like to give it another try).

With a dist-ec volume mounted on two nfs-ganesha clients (different VIPs):

from c1 ==> created a file f1, which hashes to the 1st dht subvol
c1, c2 ==> did a stat of f1
c2 ==> renamed file f1 to f2, which hashes to the same dht subvol
c1 ==> immediately (preferably within a few seconds, without doing any lookups), renamed f1 to f3, which must hash to a different dht subvol

We expect the second rename to error out, but instead it passes, and on both the clients and the bricks you can see two files, f2 and f3.

In this case the files are duplicates.
If the files are large, we may end up consuming double the space unnecessarily.

I am able to reproduce with about 80% consistency


Version-Release number of selected component (if applicable):
==================================
Volume Name: ecvol
Type: Distributed-Disperse
Volume ID: 55f976ee-0c5e-49dc-9616-1e9abed0c7ec
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (4 + 2) = 18
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick2/ecvol
Brick2: 10.70.35.116:/rhs/brick2/ecvol
Brick3: 10.70.35.239:/rhs/brick2/ecvol
Brick4: 10.70.35.135:/rhs/brick2/ecvol
Brick5: 10.70.35.8:/rhs/brick2/ecvol
Brick6: 10.70.35.196:/rhs/brick2/ecvol
Brick7: 10.70.35.37:/rhs/brick3/ecvol
Brick8: 10.70.35.116:/rhs/brick3/ecvol
Brick9: 10.70.35.239:/rhs/brick3/ecvol
Brick10: 10.70.35.135:/rhs/brick3/ecvol
Brick11: 10.70.35.8:/rhs/brick3/ecvol
Brick12: 10.70.35.196:/rhs/brick3/ecvol
Brick13: 10.70.35.37:/rhs/brick4/ecvol
Brick14: 10.70.35.116:/rhs/brick4/ecvol
Brick15: 10.70.35.239:/rhs/brick4/ecvol
Brick16: 10.70.35.135:/rhs/brick4/ecvol
Brick17: 10.70.35.8:/rhs/brick4/ecvol
Brick18: 10.70.35.196:/rhs/brick4/ecvol
Options Reconfigured:
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
ganesha.enable: on
features.cache-invalidation: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
[root@dhcp35-37 ~]# rpm -qa|grep gluster
glusterfs-events-3.8.4-11.el7rhgs.x86_64
glusterfs-rdma-3.8.4-11.el7rhgs.x86_64
glusterfs-api-3.8.4-11.el7rhgs.x86_64
glusterfs-server-3.8.4-11.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64
glusterfs-libs-3.8.4-11.el7rhgs.x86_64
glusterfs-cli-3.8.4-11.el7rhgs.x86_64
glusterfs-3.8.4-11.el7rhgs.x86_64
glusterfs-fuse-3.8.4-11.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-11.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-11.el7rhgs.x86_64
python-gluster-3.8.4-11.el7rhgs.noarch
glusterfs-ganesha-3.8.4-11.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-11.el7rhgs.x86_64
[root@dhcp35-37 ~]# rpm -qa|grep ganesha
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.4.1-3.el7rhgs.x86_64
nfs-ganesha-2.4.1-3.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-11.el7rhgs.x86_64
[root@dhcp35-37 ~]# 




How reproducible:
===================
most of the time

Comment 2 Nag Pavan Chilakam 2017-01-09 14:37:11 UTC
client side:
client1:
[root@rhs-client45 dir2]# stat x1
  File: ‘x1’
  Size: 0         	Blocks: 0          IO Block: 1048576 regular empty file
Device: 2ah/42d	Inode: 12109031449775892209  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:nfs_t:s0
Access: 2017-01-09 20:00:16.000000000 +0530
Modify: 2017-01-09 20:00:16.000000000 +0530
Change: 2017-01-09 20:00:16.000000000 +0530
 Birth: -
[root@rhs-client45 dir2]# mv x1 x4
[root@rhs-client45 dir2]# ls
x2  x4
[root@rhs-client45 dir2]# 
[root@rhs-client45 dir2]# 
[root@rhs-client45 dir2]# ll
total 0
-rw-r--r--. 2 root root 0 Jan  9  2017 x2
-rw-r--r--. 2 root root 0 Jan  9  2017 x4
[root@rhs-client45 dir2]# 



client2:
root@dhcp35-107 ganesha]# cd dir2
[root@dhcp35-107 dir2]# stat x1
  File: `x1'
  Size: 0         	Blocks: 0          IO Block: 1048576 regular empty file
Device: 1fh/31d	Inode: 12109031449775892209  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-01-09 20:00:16.493823000 +0530
Modify: 2017-01-09 20:00:16.493823000 +0530
Change: 2017-01-09 20:00:18.036401827 +0530
[root@dhcp35-107 dir2]# mv x1 x2
[root@dhcp35-107 dir2]# ll
total 0
-rw-r--r--. 2 root root 0 Jan  9 20:00 x2
-rw-r--r--. 2 root root 0 Jan  9 20:00 x4
[root@dhcp35-107 dir2]# 



one of the ec node bricks:

[root@dhcp35-37 ~]# ll /rhs/brick*/ecvol/ganesha/dir2
/rhs/brick2/ecvol/ganesha/dir2:
total 0

/rhs/brick3/ecvol/ganesha/dir2:
total 0
---------T. 2 root root 0 Jan  9 20:00 x4

/rhs/brick4/ecvol/ganesha/dir2:
total 0
-rw-r--r--. 3 root root 0 Jan  9 20:00 x2
-rw-r--r--. 3 root root 0 Jan  9 20:00 x4
[root@dhcp35-37 ~]# 



In this case it was a zero-byte file, but I saw the problem even with non-zero-byte files, which consume duplicate space.

Comment 3 Nag Pavan Chilakam 2017-01-10 07:22:45 UTC
I am able to reproduce even on a distrep volume with a ganesha mount and mdcache settings.
Note: the two clients use different ganesha VIPs to mount.
Simple steps:
1) from c1: create files x1..x10 and note down which file hashes to which dht subvol
2) delete all the files
3) from c1: create a file x1 again
4) stat and ls x1 on both c1 and c2
5) from c2: rename x1 to another filename which hashes to the same subvol
6) from c1: immediately (within a few seconds, without doing any kind of lookup), rename x1 to a filename which hashes to a different subvol
Both 5 and 6 pass, hence we have duplicate files as below:

dht-subvol1:
[root@dhcp35-37 ~]# ll /rhs/brick*/distrep/ganesha/dir1/
total 0
-rw-r--r--. 3 root root 0 Jan 10 12:47 x10
-rw-r--r--. 3 root root 0 Jan 10 12:47 x3

dht-subvol2:
[root@dhcp35-239 ~]# ll /rhs/brick*/distrep/ganesha/dir1/
total 0
---------T. 2 root root 0 Jan 10 12:47 x3
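The steps above rely on knowing which name hashes to which dht subvolume. DHT places a file on a subvolume by hashing its name (the gfid never changes across renames), which is why a rename can target a different subvolume than the original name did. A toy sketch of the idea (illustrative only; real DHT uses a Davies-Meyer hash of the name mapped onto per-directory layout ranges, so actual placement differs):

```python
import hashlib

def hashed_subvol(name, n_subvols):
    # Toy stand-in for DHT's name hash; real DHT uses a Davies-Meyer
    # hash of the name mapped onto per-directory layout ranges.
    h = int(hashlib.md5(name.encode()).hexdigest(), 16)
    return h % n_subvols

# Placement depends only on the NAME, so a rename can move the file's
# hashed location while its gfid stays the same.
for name in ("x1", "x2", "x3"):
    print(name, "-> dht-subvol-%d" % hashed_subvol(name, 2))
```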

Comment 4 Nag Pavan Chilakam 2017-01-10 07:53:39 UTC
Renaming the title to reflect that the problem occurs with ganesha and not with mdcache.
I have retried the same scenario without the mdcache settings.
I am able to reproduce the issue with ganesha + distributed-(ec/replicate) volumes.
Also, changing the component to ganesha.

Comment 5 Nag Pavan Chilakam 2017-01-10 07:55:00 UTC
Proposing as a blocker, as this can lead to duplicate files and unnecessary disk space consumption.
Also, why is ganesha not re-validating the cache when a write operation (in this case a rename) is happening?

Comment 6 Soumya Koduri 2017-01-10 08:08:55 UTC
Looks like an issue with dht/bricks. Even if ganesha server1 had done the rename from f1 to f3, the brick processes should have returned ENOENT for f1 instead of creating f3, or at least renamed f2 to f3 if only the gfid is considered instead of the filename. Could you please try the test with a pure replicate / disperse volume?

Also CCing the dht team to have a look.

Comment 7 Soumya Koduri 2017-01-10 08:19:27 UTC
Also, could you please try the test using a gNFS/fuse mount with md-cache timeout set to 600 sec as well? Thanks!

Comment 8 Nag Pavan Chilakam 2017-01-10 09:15:45 UTC
(In reply to Soumya Koduri from comment #7)
> Also, could you please try the test using a gNFS/fuse mount with md-cache
> timeout set to 600 sec as well? Thanks!

This problem has nothing to do with mdcache.
Also, I see this only on distributed volumes (pure distribute / dist-rep / dist-ec) with an nfs-ganesha mount.
I have tried with fuse but was not able to hit the issue.

Comment 9 Nag Pavan Chilakam 2017-01-10 11:19:00 UTC
I even tried fuse + mdcache, but I couldn't recreate the problem on that setup.

Comment 10 Soumya Koduri 2017-01-10 11:45:19 UTC
I could reproduce this issue using two fuse mounts as well. Unlike the nfs-ganesha server, fuse mounts don't have caching enabled by default, so to reproduce this issue on fuse I made use of md-cache.

The following settings are enabled on the volume:

[root@dhcp35-197 ~]# gluster v info brick_vol
 
Volume Name: brick_vol
Type: Distribute
Volume ID: d8f43c74-54c2-4349-a195-5eff2685b707
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.122.201:/bricks/brick_vol
Brick2: 192.168.122.201:/bricks/brick_vol1
Options Reconfigured:
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off
[root@dhcp35-197 ~]# 
[root@dhcp35-197 ~]# 


Note here I turned off 'features.cache-invalidation' so that the requests are cached in md-cache and not invalidated by upcall. 


Have 2 fuse mounts of the same volume - 

[root@dhcp35-197 ~]# mount | grep gluster
localhost:/brick_vol on /fuse-mnt type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)
localhost:/brick_vol on /fuse-mnt1 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)
[root@dhcp35-197 ~]# 

mount1:
[root@dhcp35-197 fuse-mnt]# touch abc

bricks:
[root@dhcp35-197 ~]# ls -ltr /bricks/brick_vol*
/bricks/brick_vol:
total 0

/bricks/brick_vol1:
total 4
-rw-r--r--. 2 root root 0 Jan 10 17:04 abc
[root@dhcp35-197 ~]# 

mount2:
[root@dhcp35-197 fuse-mnt1]# stat abc
  File: ‘abc’
  Size: 0         	Blocks: 0          IO Block: 131072 regular empty file
Device: 2bh/43d	Inode: 13211023334685081894  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:fusefs_t:s0
Access: 2017-01-10 17:04:54.334845000 +0530
Modify: 2017-01-10 17:04:54.334845000 +0530
Change: 2017-01-10 17:04:54.334845640 +0530
 Birth: -
[root@dhcp35-197 fuse-mnt1]# 
[root@dhcp35-197 fuse-mnt1]# mv abc abcd
[root@dhcp35-197 fuse-mnt1]# 


bricks:
[root@dhcp35-197 ~]# ls -ltr /bricks/brick_vol*
/bricks/brick_vol:
total 0

/bricks/brick_vol1:
total 4
-rw-r--r--. 2 root root 0 Jan 10 17:04 abcd
[root@dhcp35-197 ~]# 

mount1:
[root@dhcp35-197 fuse-mnt]# mv abc abdcde
[root@dhcp35-197 fuse-mnt]# 

bricks:
[root@dhcp35-197 ~]# ls -ltr /bricks/brick_vol*
/bricks/brick_vol1:
total 8
-rw-r--r--. 3 root root 0 Jan 10 17:04 abdcde
-rw-r--r--. 3 root root 0 Jan 10 17:04 abcd

/bricks/brick_vol:
total 4
---------T. 2 root root 0 Jan 10 17:05 abdcde
[root@dhcp35-197 ~]# 

[root@dhcp35-197 ~]# getfattr -m . -de hex /bricks/brick_vol*/*
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick_vol1/abcd
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a64656661756c745f743a733000
trusted.gfid=0x558981d7d14243c0b756fbba574d0126

# file: bricks/brick_vol1/abdcde
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a64656661756c745f743a733000
trusted.gfid=0x558981d7d14243c0b756fbba574d0126

# file: bricks/brick_vol/abdcde
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a64656661756c745f743a733000
trusted.gfid=0x558981d7d14243c0b756fbba574d0126
trusted.glusterfs.dht.linkto=0x627269636b5f766f6c2d636c69656e742d3100

[root@dhcp35-197 ~]# 
[root@dhcp35-197 ~]# ls -ltr /bricks/brick_vol*
/bricks/brick_vol1:
total 8
-rw-r--r--. 3 root root 0 Jan 10 17:04 abdcde
-rw-r--r--. 3 root root 0 Jan 10 17:04 abcd

/bricks/brick_vol:
total 4
---------T. 2 root root 0 Jan 10 17:05 abdcde
[root@dhcp35-197 ~]# 


This may not be a supported configuration, but these steps may help with debugging and with further narrowing down the issue. As suspected, the issue doesn't seem to be specific to ganesha; it could be in the dht or posix layer. Susant is currently taking a look at it. Please correct the component as needed.

Comment 14 Susant Kumar Palai 2017-01-11 05:37:09 UTC
Yesterday, when I saw the issue reproduced on Nag's setup, I did not see a dht_rename log for the 2nd rename; why that is so is yet to be RCA'd. Will keep posting after trying the above reproducer.


-Susant

Comment 15 Soumya Koduri 2017-01-11 05:55:59 UTC
Thanks Susant. To add, this issue seems to happen on a fuse mount even with server-side cache_invalidation on (a supported configuration); we just need to issue the commands very quickly (as done in comment #13) to trigger it.

Comment 16 Susant Kumar Palai 2017-01-11 06:13:31 UTC
Yes Soumya, I am able to reproduce with cache_invalidation on. Thanks for the reproducer.

Comment 18 Susant Kumar Palai 2017-01-11 09:26:11 UTC
RCA (dht part)

Tried to reproduce the issue with the reproducer given by Soumya on two fuse mounts.

command: touch /fuse-mnt/abc; ls /fuse-mnt2/abc; stat /fuse-mnt2/abc ; mv /fuse-mnt2/abc /fuse-mnt2/abcd ; mv /fuse-mnt/abc /fuse-mnt/abcde 


logs from: fuse-mnt2 (1st rename)
[2017-01-11 07:39:24.565003] I [MSGID: 109066] [dht-rename.c:1576:dht_rename] 0-test1-dht: renaming /abc (hash=test1-client-1/cache=test1-client-1) => /abcd (hash=test1-client-1/cache=<nul>)
[2017-01-11 07:39:24.565022] I [dht-rename.c:1471:dht_rename_lock] 0-DHT: Going rename for path:/abc
[2017-01-11 07:39:24.565205] I [dht-rename.c:1378:dht_rename_lock_cbk] 0-test1-dht: Lock taken
[2017-01-11 07:39:24.565375] I [dht-rename.c:1345:dht_rename_lookup_cbk] 0-test1-dht: lookup done on path:(null)

So hashed and cached are the same for both the old and the new name, hence the rename was issued directly and all was done.


logs from: fuse-mnt (2nd rename)
[2017-01-11 07:39:24.578832] I [MSGID: 109066] [dht-rename.c:1576:dht_rename] 0-test1-dht: renaming /abc (hash=test1-client-1/cache=test1-client-1) => /abcde (hash=test1-client-0/cache=<nul>)
[2017-01-11 07:39:24.578857] I [dht-rename.c:1471:dht_rename_lock] 0-DHT: Going rename for path:/abc
[2017-01-11 07:39:24.579364] I [dht-rename.c:1378:dht_rename_lock_cbk] 0-test1-dht: Lock taken
[2017-01-11 07:39:24.579943] I [dht-rename.c:1345:dht_rename_lookup_cbk] 0-test1-dht: lookup done on path:(null)
[2017-01-11 07:39:24.579967] I [MSGID: 0] [dht-rename.c:1274:dht_rename_create_links] 0-test1-dht: will create link files on test1-client-1 for path:/abc
[2017-01-11 07:39:24.579988] I [dht-linkfile.c:126:dht_linkfile_create] 0-test1-dht: linkfile_Creation invoked path:/abc subvol:test1-client-0
[2017-01-11 07:39:24.582420] I [dht-rename.c:1142:dht_rename_linkto_cbk] 0-test1-dht: hard link create, subvol:test1-client-1 old:/abc new:/abcde
[2017-01-11 07:39:24.584885] W [MSGID: 109034] [dht-rename.c:647:dht_rename_unlink_cbk] 0-test1-dht: /abc: Rename: unlink on test1-client-1 failed  [No such file or directory]
[2017-01-11 07:39:29.057840] I [xlator.c:831:loc_touchup] 0-server: Doing gfid based resolve for path:/

In this case the destination file hashes to a different brick. So the steps are to create a linkto file with the source name on the destination hashed brick, create a hardlink of the destination name on the source cached brick, and unlink the source name on the source hashed brick.
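The cross-subvolume rename steps just described can be sketched as a toy model, treating each brick as a dict of name -> entry (a hypothetical simplification, not glusterfs code; "linkto:" marks a dht linkto entry):

```python
# Toy model of the cross-subvol rename steps described above. Each brick
# is a dict of name -> entry; "linkto:" marks a dht linkto file. This is
# a hypothetical simplification, not glusterfs code.
def rename_across_subvols(src_cached, src_hashed, dst_hashed, old, new, gfid):
    dst_hashed[old] = "linkto:" + gfid  # linkto with the source name on the
                                        # destination hashed brick
    src_cached[new] = gfid              # hardlink of the destination name on
                                        # the source cached brick
    src_hashed.pop(old, None)           # unlink the source name on the
                                        # source hashed brick

brick1 = {"abc": "gfid-1"}   # source cached == source hashed in this case
brick0 = {}                  # where the new name hashes
rename_across_subvols(brick1, brick1, brick0, "abc", "abcde", "gfid-1")
print(brick0, brick1)
```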

The problem: after taking the inodelk on the source cached brick on the source file, the rename op sends a lookup on the source name to make sure the file exists. But it does a gfid-based lookup, which will succeed: even though the 1st rename has renamed the file from /abc to /abcd, the gfid of both is the same, so this lookup passes.

The solution would be to do a name-based lookup. Will run this solution to check whether it fixes the issue.
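The difference between the two lookups can be illustrated with a toy namespace (hypothetical Python, not glusterfs code): after the 1st rename of /abc to /abcd, a gfid-based existence check for the old name still succeeds because the inode is alive under its new name, while a name-based lookup correctly fails:

```python
# Toy brick state after the 1st rename (/abc -> /abcd): the name table
# no longer has "abc", but the gfid is still live under "abcd".
names = {"abcd": "gfid-1"}
live_gfids = set(names.values())

def lookup_by_gfid(gfid):
    # The buggy pre-rename check: resolves purely by gfid, so it cannot
    # notice that the NAME it is validating has gone away.
    return gfid in live_gfids

def lookup_by_name(name):
    # The proposed fix: validate the actual source name.
    return name in names

print(lookup_by_gfid("gfid-1"))   # True  -> the 2nd rename proceeds, duplicates
print(lookup_by_name("abc"))      # False -> the 2nd rename would fail (ENOENT)
```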


-Susant

Comment 19 Susant Kumar Palai 2017-01-11 09:42:51 UTC
Forgot to add a little more detail.

(In reply to Susant Kumar Palai from comment #18)
> RCA(dht-part)
> 
> Tried to reproduce the issue with the reproducer given by Soumya on a two
> fuse mounts.
> 
> command: touch /fuse-mnt/abc; ls /fuse-mnt2/abc; stat /fuse-mnt2/abc ; mv
> /fuse-mnt2/abc /fuse-mnt2/abcd ; mv /fuse-mnt/abc /fuse-mnt/abcde 
> 
> 
> logs from: fuse-mnt2 (1st rename)
> [2017-01-11 07:39:24.565003] I [MSGID: 109066]
> [dht-rename.c:1576:dht_rename] 0-test1-dht: renaming /abc
> (hash=test1-client-1/cache=test1-client-1) => /abcd
> (hash=test1-client-1/cache=<nul>)
> [2017-01-11 07:39:24.565022] I [dht-rename.c:1471:dht_rename_lock] 0-DHT:
> Going rename for path:/abc
> [2017-01-11 07:39:24.565205] I [dht-rename.c:1378:dht_rename_lock_cbk]
> 0-test1-dht: Lock taken
> [2017-01-11 07:39:24.565375] I [dht-rename.c:1345:dht_rename_lookup_cbk]
> 0-test1-dht: lookup done on path:(null)
> 
> So hashed and cached are same for both old and new, hence rename was issued
> and all done.
> 
> 
> logs from: fuse-mnt (2nd rename)
> [2017-01-11 07:39:24.578832] I [MSGID: 109066]
> [dht-rename.c:1576:dht_rename] 0-test1-dht: renaming /abc
> (hash=test1-client-1/cache=test1-client-1) => /abcde
> (hash=test1-client-0/cache=<nul>)
> [2017-01-11 07:39:24.578857] I [dht-rename.c:1471:dht_rename_lock] 0-DHT:
> Going rename for path:/abc
> [2017-01-11 07:39:24.579364] I [dht-rename.c:1378:dht_rename_lock_cbk]
> 0-test1-dht: Lock taken
> [2017-01-11 07:39:24.579943] I [dht-rename.c:1345:dht_rename_lookup_cbk]
> 0-test1-dht: lookup done on path:(null)
> [2017-01-11 07:39:24.579967] I [MSGID: 0]
> [dht-rename.c:1274:dht_rename_create_links] 0-test1-dht: will create link
> files on test1-client-1 for path:/abc
> [2017-01-11 07:39:24.579988] I [dht-linkfile.c:126:dht_linkfile_create]
> 0-test1-dht: linkfile_Creation invoked path:/abc subvol:test1-client-0
> [2017-01-11 07:39:24.582420] I [dht-rename.c:1142:dht_rename_linkto_cbk]
> 0-test1-dht: hard link create, subvol:test1-client-1 old:/abc new:/abcde
> [2017-01-11 07:39:24.584885] W [MSGID: 109034]
> [dht-rename.c:647:dht_rename_unlink_cbk] 0-test1-dht: /abc: Rename: unlink
> on test1-client-1 failed  [No such file or directory]
> [2017-01-11 07:39:29.057840] I [xlator.c:831:loc_touchup] 0-server: Doing
> gfid based resolve for path:/
> 
> In this case the destination file hashes to a new brick. So the step is to
> create a linkto file with the source name on destination hashed brick,
> create hardlink of the destination name on the source cached brick and
> unlink the source name on the source hashed brick.
> 
> The problem: After taking the inodelk on source cached brick on the source
> file, rename op sends a lookup on the source name to make sure the file
> exist. But it does a gfid based lookup, which will be successful as even
> though the 1st rename  has renamed the file from /abc to /abcd, the gfid for
> the both is same and this lookup will be successful. 

Post this lookup, dht creates a hardlink of the destination file.
Since the server xlator does gfid-based resolution, even though dht passed "abc", which should no longer exist, the resolution succeeds and fetches the new path, which is /abcd. At the end dht unlinks the old name, which failed anyway, as
you can see in the log above.

> 
> The solution would be to do a name based lookup. Will run this solution to
> check whether it solves the issue.
> 
> 
> -Susant
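The gfid-based server resolution described above can be sketched the same way (hypothetical, simplified): resolving by gfid still finds the inode, so the hardlink for the destination name lands on the already-renamed file:

```python
# Toy inode table: gfid -> set of names (hardlinks). After the 1st
# rename, gfid-1 is linked only as "abcd". Hypothetical simplification,
# not glusterfs code.
inode_names = {"gfid-1": {"abcd"}}

def server_resolve(gfid):
    # Server-side resolution by gfid succeeds as long as the gfid is
    # alive, regardless of which name the client thinks it is acting on.
    return gfid in inode_names

def link(gfid, new_name):
    # dht's hardlink of the destination name: resolution by gfid finds
    # the already-renamed file, so the link succeeds and a duplicate appears.
    if server_resolve(gfid):
        inode_names[gfid].add(new_name)
        return True
    return False

link("gfid-1", "abcde")
print(sorted(inode_names["gfid-1"]))   # ['abcd', 'abcde'] -> duplicates
```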

Comment 20 Susant Kumar Palai 2017-01-11 10:01:37 UTC
After applying the patch, I am no longer seeing the issue.

<patch>
diff --git a/xlators/cluster/dht/src/dht-helper.c b/xlators/cluster/dht/src/dht-helper.c
index ad031b6..f2b946b 100644
--- a/xlators/cluster/dht/src/dht-helper.c
+++ b/xlators/cluster/dht/src/dht-helper.c
@@ -530,7 +530,10 @@ dht_lock_new (xlator_t *this, xlator_t *xl, loc_t *loc, short type,
         */
         lock->loc.inode = inode_ref (loc->inode);
         loc_gfid (loc, lock->loc.gfid);
-
+        lock->loc.path = gf_strdup (loc->path);
+        lock->loc.parent = inode_ref (loc->parent);
+        lock->loc.name = strrchr (loc->path, '/');
+        gf_uuid_copy (lock->loc.pargfid, loc->pargfid);;
 out:
         return lock;
 }
diff --git a/xlators/cluster/dht/src/dht-rename.c b/xlators/cluster/dht/src/dht-rename.c
index 4dfcec7..d5104bc 100644
--- a/xlators/cluster/dht/src/dht-rename.c
+++ b/xlators/cluster/dht/src/dht-rename.c
@@ -1341,8 +1341,9 @@ dht_rename_lookup_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                 local->is_linkfile = _gf_true;
         }
 
-        gf_log (this->name, GF_LOG_INFO, "lookup done on path:%s",
-                local->lock.locks[0]->loc.path);
+        gf_log (this->name, GF_LOG_INFO, "lookup done on path:%s op_ret:%d "
+                "is_linkfile:%d", local->lock.locks[0]->loc.path, op_ret, 
+                local->is_linkfile);
 


</patch>

logs:
t-1) => /abcde (hash=test1-client-0/cache=<nul>)
[2017-01-11 09:58:28.062472] I [dht-rename.c:1472:dht_rename_lock] 0-DHT: Going rename for path:/abc
[2017-01-11 09:58:28.062851] I [dht-rename.c:1379:dht_rename_lock_cbk] 0-test1-dht: Lock taken
[2017-01-11 09:58:28.063334] I [dht-rename.c:1346:dht_rename_lookup_cbk] 0-test1-dht: lookup done on path:/abc op_ret:-1 is_linkfile:1

The lookup failed, where it used to succeed all this time.

From the mount point.
-------------------------
[root@vm1 ~]# touch /fuse-mnt/abc; ls /fuse-mnt2/abc; stat /fuse-mnt2/abc ; mv /fuse-mnt2/abc /fuse-mnt2/abcd ; mv /fuse-mnt/abc /fuse-mnt/abcde
/fuse-mnt2/abc
  File: '/fuse-mnt2/abc'
  Size: 0         	Blocks: 0          IO Block: 131072 regular empty file
Device: 2ch/44d	Inode: 9319321045353352602  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:fusefs_t:s0
Access: 2017-01-11 15:28:28.029401000 +0530
Modify: 2017-01-11 15:28:28.029401000 +0530
Change: 2017-01-11 15:28:28.030401207 +0530
 Birth: -
mv: cannot move '/fuse-mnt/abc' to '/fuse-mnt/abcde': No such file or directory


Will send a patch soon.

Comment 21 Soumya Koduri 2017-01-11 10:04:22 UTC
Thanks Susant. Correcting the component, as the fix will be available as part of the glusterfs builds.

Comment 22 Susant Kumar Palai 2017-01-11 10:42:42 UTC
upstream patch: http://review.gluster.org/#/c/16375/

Comment 23 Susant Kumar Palai 2017-01-11 10:43:21 UTC
Would like the nfs-ganesha team to test the patch once on an nfs-ganesha setup as mentioned by QA.

Comment 29 Prasad Desala 2017-06-12 09:23:15 UTC
Reproduced this issue on 3.2 by following the steps mentioned in Comment 13, then followed the same steps to verify this BZ against 3.8.4-27.el7rhgs.x86_64.

After the fix, we are not seeing any duplicate files when renaming the same file from multiple mounts with caching enabled.

Moving this BZ to Verified.

Comment 33 errata-xmlrpc 2017-09-21 04:30:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774


Comment 35 tusj 2017-11-27 07:05:58 UTC
Reproduced this issue on 3.10.7.
Is there any plan to fix it?

Comment 36 Nithya Balachandran 2017-11-27 08:25:14 UTC
(In reply to tusj from comment #35)
> Reproduced this issue on 3.10.7
> Is there any plan to fix it?

This is a downstream RHGS BZ. From your comment, it looks like you have reproduced this on an upstream version. Please update the upstream community BZ instead.

