+++ This bug was initially created as a clone of Bug #1117851 +++

Description of problem:
=======================
On a distributed volume, the same file was renamed at the same time from more than one client, and afterwards the file was missing. Neither the source nor the destination file is present on the mount point or on the bricks.

How reproducible:
=================
Intermittent (hit twice in four attempts)

Steps to Reproduce:
===================
1. Create and mount a distributed volume (mount it on multiple clients).
2. Create a few files and verify them from the mount point.

mount:-
[root@OVM1 ren]# ls
b1  b10  b2  b3  b4  b5  b6  b7  b8  b9

3. Rename the files from more than one mount at the same time.

mount 1:-
[root@OVM3 ren]# for i in {1..10} ; do mv b$i c$i; done
mv: cannot move `b2' to `c2': File exists
mv: cannot move `b3' to `c3': File exists
mv: cannot move `b9' to `c9': No such file or directory

mount 2:-
[root@OVM1 ren]# for i in {1..10} ; do mv b$i c$i; done
mv: cannot move `b3' to `c3': No such file or directory
mv: cannot move `b4' to `c4': No such file or directory
mv: cannot move `b5' to `c5': No such file or directory
mv: cannot move `b7' to `c7': No such file or directory
mv: cannot move `b10' to `c10': No such file or directory

4. Verify the data on the mount point.

mount:-
[root@OVM1 ren]# ls
b3  c1  c10  c4  c5  c6  c7  c8  c9

5. File b2 and/or c2 is missing; either the source or the destination file should still be present on the mount. Verified on the bricks as well: the file is not present there either.

brick:-
[root@OVM3 ren]# ls -l /brick2/*
/brick2/r1:
total 0
---------T 3 root root 0 Jul 7 22:13 b1
---------T 3 root root 0 Jul 7 22:13 c1
-rw-r--r-- 2 root root 0 Jul 7 22:09 c6
-rw-r--r-- 2 root root 0 Jul 7 22:09 c7
---------T 2 root root 0 Jul 7 22:10 c8
-rw-r--r-- 2 root root 0 Jul 7 22:09 c9

/brick2/r2:
total 0
-rw-r--r-- 2 root root 0 Jul 7 22:09 b3
-rw-r--r-- 2 root root 0 Jul 7 22:09 c1
---------T 2 root root 0 Jul 7 22:10 c2
-rw-r--r-- 2 root root 0 Jul 7 22:09 c4
-rw-r--r-- 2 root root 0 Jul 7 22:09 c8

/brick2/r3:
total 0
---------T 2 root root 0 Jul 7 22:13 b4
-rw-r--r-- 2 root root 0 Jul 7 22:09 c10
-rw-r--r-- 2 root root 0 Jul 7 22:09 c5

Actual results:
===============
A file is missing - data loss when the same file is renamed from multiple mounts at the same time.

Expected results:
=================
If the same rename operation is executed from multiple mounts, it should not end in data loss. Either the source or the destination file should exist (depending on whether the rename succeeded or failed).
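The steps above can be driven end to end with something like the following sketch. It is illustrative only: it assumes the 3-brick layout, volume name and hostnames from this report, that the clients already have the glusterfs FUSE client installed, and that the brick directories exist (append "force" to the volume create if the bricks sit on the root filesystem).

# server side: create and start a plain 3-brick distribute volume, as in this report
gluster volume create ren 10.70.35.198:/brick2/r1 10.70.35.198:/brick2/r2 10.70.35.198:/brick2/r3
gluster volume start ren

# on each client: mount the volume
mount -t glusterfs 10.70.35.198:/ren /mnt/ren

# on one client: create the test files
cd /mnt/ren && touch b{1..10}

# on both clients at (roughly) the same time: rename the same set of files
cd /mnt/ren && for i in {1..10}; do mv b$i c$i; done

# afterwards, from any mount: for every i, either b$i or c$i must still exist
cd /mnt/ren && for i in {1..10}; do [ -e b$i ] || [ -e c$i ] || echo "data loss: neither b$i nor c$i exists"; done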
Additional info:-

mount 1 log:-
[root@OVM3 ren]# grep '/c2' /var/log/glusterfs/mnt-ren.log
[2014-07-07 16:43:54.489608] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)
[2014-07-07 16:43:54.493494] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)
[2014-07-07 16:43:54.497569] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)
[2014-07-07 16:43:54.501028] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)
[2014-07-07 16:43:54.501445] W [client-rpc-fops.c:2604:client3_3_link_cbk] 0-ren-client-2: remote operation failed: File exists (/b2 -> /c2)
[2014-07-07 16:43:54.501878] W [fuse-bridge.c:1727:fuse_rename_cbk] 0-glusterfs-fuse: 149: /b2 -> /c2 => -1 (File exists)

[root@OVM3 ren]# grep '/b2' /var/log/glusterfs/mnt-ren.log
[2014-07-07 16:40:25.735551] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/b2: failed to resolve (No such file or directory)
[2014-07-07 16:40:25.738732] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/b2: failed to resolve (No such file or directory)
[2014-07-07 16:40:25.741072] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/b2: failed to resolve (No such file or directory)
[2014-07-07 16:40:25.742584] D [MSGID: 0] [dht-common.c:1087:dht_lookup_everywhere_cbk] 0-ren-dht: found on ren-client-2 file /b2
[2014-07-07 16:40:25.742615] D [MSGID: 0] [dht-common.c:972:dht_lookup_everywhere_done] 0-ren-dht: Linking file /b2 on ren-client-2 to ren-client-1 (hash)(gfid = 00000000-0000-0000-0000-000000000000)
[2014-07-07 16:40:25.742865] W [client-rpc-fops.c:240:client3_3_mknod_cbk] 0-ren-client-1: remote operation failed: File exists. Path: /b2
[2014-07-07 16:43:54.501445] W [client-rpc-fops.c:2604:client3_3_link_cbk] 0-ren-client-2: remote operation failed: File exists (/b2 -> /c2)
[2014-07-07 16:43:54.501878] W [fuse-bridge.c:1727:fuse_rename_cbk] 0-glusterfs-fuse: 149: /b2 -> /c2 => -1 (File exists)

mount 2 log:-
[root@OVM1 ren]# grep '/c2' /var/log/glusterfs/mnt-ren.log
[2014-07-07 16:43:54.816414] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)
[2014-07-07 16:43:54.820530] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)
[2014-07-07 16:43:54.824191] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)
[2014-07-07 16:43:54.827681] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/c2: failed to resolve (No such file or directory)

[root@OVM1 ren]# grep '/b2' /var/log/glusterfs/mnt-ren.log
[2014-07-07 16:40:26.058748] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/b2: failed to resolve (No such file or directory)
[2014-07-07 16:40:26.062550] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/b2: failed to resolve (No such file or directory)
[2014-07-07 16:40:26.065694] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/b2: failed to resolve (No such file or directory)
[2014-07-07 16:40:26.068620] D [fuse-resolve.c:83:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/b2: failed to resolve (No such file or directory)

brick log:-
[root@OVM3 ren]# grep '/b2' /var/log/glusterfs/bricks/brick2-r*
/var/log/glusterfs/bricks/brick2-r2.log:[2014-07-07 16:40:25.742744] I [server-rpc-fops.c:557:server_mknod_cbk] 0-ren-server: 229: MKNOD (null) (00000000-0000-0000-0000-000000000001/b2) ==> (File exists)

[root@OVM3 ren]# grep '/c2' /var/log/glusterfs/bricks/brick2-r*
/var/log/glusterfs/bricks/brick2-r3.log:[2014-07-07 16:43:54.501210] I [server-rpc-fops.c:1185:server_link_cbk] 0-ren-server: 421: LINK /c2 (b4bc8b38-e17b-449d-8669-ff33a836edd6) -> 00000000-0000-0000-0000-000000000001/c2 ==> (File exists)

--- Additional comment from Rachana Patel on 2014-07-08 02:50:10 EDT ---

volume info:-
Volume Name: ren
Type: Distribute
Volume ID: ea9b5c23-6de6-4863-bb15-37e3ad57c226
Status: Started
Snap Volume: no
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 10.70.35.198:/brick2/r1
Brick2: 10.70.35.198:/brick2/r2
Brick3: 10.70.35.198:/brick2/r3
Options Reconfigured:
diagnostics.client-log-level: DEBUG
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

mount info:-

mount 1:-
[root@OVM3 ~]# mount | grep ren
10.70.35.198:/ren on /mnt/ren type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

mount 2:-
[root@OVM1 ~]# mount | grep ren
10.70.35.198:/ren on /mnt/ren type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
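For context, the "---------T" entries in the brick listings are DHT linkto files: zero-byte placeholders with mode 000 plus the sticky bit that point to the brick holding the real data. Note that /brick2/r2 still shows such a placeholder for c2 even though no data file exists anywhere. One way to confirm from the brick server whether a lost name survives on the backend only as a stale linkto, rather than as real data, is sketched below (paths as in this setup; run as root on the brick node):

# search every brick for any trace of the lost names
find /brick2/r1 /brick2/r2 /brick2/r3 -name 'b2' -o -name 'c2'

# for anything found, dump the DHT linkto xattr; if it is set and the file is
# a 0-byte ---------T entry, it is only a pointer, not the data
getfattr -n trusted.glusterfs.dht.linkto -e text /brick2/r2/c2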
REVIEW: http://review.gluster.org/8605 (dht: fix rename race) posted (#1) for review on release-3.6 by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/8604 (storage/posix: removing deleting entries in case of creation failures) posted (#2) for review on release-3.6 by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/8605 (dht: fix rename race) posted (#2) for review on release-3.6 by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/8604 (storage/posix: removing deleting entries in case of creation failures) posted (#3) for review on release-3.6 by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/8605 (dht: fix rename race) posted (#3) for review on release-3.6 by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/8604 (storage/posix: removing deleting entries in case of creation failures) posted (#4) for review on release-3.6 by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/8605 (dht: fix rename race) posted (#4) for review on release-3.6 by Shyamsundar Ranganathan (srangana)
COMMIT: http://review.gluster.org/8604 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 71c1079fb8e73cd27aa3418b1be21c2507dc24c9
Author: Raghavendra G <rgowdapp>
Date:   Thu Sep 4 14:03:04 2014 -0400

    storage/posix: removing deleting entries in case of creation failures

    The code is not atomic enough to avoid deleting a dentry created by a
    parallel dentry creation operation.

    Change-Id: I9bd6d2aa9e7a1c0688c0a937b02a4b4f56d7aa3d
    BUG: 1138387
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on-master: http://review.gluster.org/8327
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
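To make the pattern described in that commit message concrete: the problem is a create path that unconditionally deletes the entry when its own create fails, which can remove an entry a parallel operation just created. The following is a plain-shell analogy of that race; it is purely illustrative, not the posix translator code, and the /tmp paths are made up.

touch /tmp/srcfile; mkdir -p /tmp/racedir

# two racing creators, each of which cleans up the entry when its own create fails
( ln /tmp/srcfile /tmp/racedir/entry 2>/dev/null || rm -f /tmp/racedir/entry ) &   # creator A
( ln /tmp/srcfile /tmp/racedir/entry 2>/dev/null || rm -f /tmp/racedir/entry ) &   # creator B
wait

# if A's link wins and B's fails with EEXIST, B's unconditional cleanup deletes
# the entry A just created, so the directory can end up empty despite one success
ls /tmp/racedir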
COMMIT: http://review.gluster.org/8605 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit d34029cceb290bda27357ca38238a714a0cba286
Author: Nithya Balachandran <nbalacha>
Date:   Thu Sep 4 14:05:04 2014 -0400

    dht: fix rename race

    Additional check to verify that we created the linkto file before
    deleting it in the rename cleanup function.

    Change-Id: I919cd7cb24f948ba4917eb9cf50d5169bb730a67
    BUG: 1138387
    Signed-off-by: Nithya Balachandran <nbalacha>
    Reviewed-on-master: http://review.gluster.org/8338
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Vijay Bellur <vbellur>
    Reviewed-on: http://review.gluster.org/8605
    Reviewed-by: Jeff Darcy <jdarcy>
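Continuing the same plain-shell analogy (again illustrative only, not the actual dht code): the guard this commit describes amounts to remembering whether this client's own create of the link succeeded, and only deleting it during cleanup in that case.

created=0
ln /tmp/srcfile /tmp/racedir/entry 2>/dev/null && created=1

# ... rename work that may still fail ...

# cleanup: only remove the entry if we were the ones who created it
[ "$created" -eq 1 ] && rm -f /tmp/racedir/entry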
REVIEW: http://review.gluster.org/8691 (storage/posix: removing deleting entries in case of creation failures) posted (#1) for review on release-3.6 by Vijay Bellur (vbellur)
REVIEW: http://review.gluster.org/8693 (storage/posix: removing deleting entries in case of creation failures) posted (#1) for review on release-3.6 by Vijay Bellur (vbellur)
COMMIT: http://review.gluster.org/8693 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 2460b9175b60981096203a4331e434302da14d4b
Author: Raghavendra G <rgowdapp>
Date:   Thu Sep 4 14:03:04 2014 -0400

    storage/posix: removing deleting entries in case of creation failures

    The code is not atomic enough to avoid deleting a dentry created by a
    parallel dentry creation operation.

    Change-Id: I9bd6d2aa9e7a1c0688c0a937b02a4b4f56d7aa2e
    BUG: 1138387
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on-master: http://review.gluster.org/8327
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
    Reviewed-on: http://review.gluster.org/8693
    Tested-by: Vijay Bellur <vbellur>
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether this release solves this bug report for you.

In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
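Since the original reproducer was intermittent (seen in two of four attempts), it is worth repeating the concurrent-rename test from the description several times before setting VERIFIED, and confirming beforehand that every node is actually running the beta build, e.g.:

# on each client and on the server: confirm the installed version is the 3.6.0 beta
glusterfs --version
glusterd --version

# then rerun the reproduction sketch from the description a number of times and
# repeat its final existence check after each run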
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users