+++ This bug was initially created as a clone of Bug #1115367 +++

Description of problem:
=======================
In a distribute-replicate volume, when 'rm -rf *' is run from multiple mounts at the same time, directories are removed from some sub-volumes but not from others. As a result, "rm -rf <directory>" fails with "Directory not empty" even though "ls -l <directory>" from the mount shows the directory as empty.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs 3.6.0.22 built on Jun 23 2014 10:33:07

How reproducible:
=================
Often

Steps to Reproduce:
===================
1. Create a distribute-replicate volume and start it.
2. Create 2 fuse mounts and 2 NFS mounts, or 4 fuse mounts.
3. Create directories: mkdir -p A{1..1000}/B{1..20}/C{1..20}
4. From all the mount points, execute "rm -rf *" at the same time (see the driver sketch after the additional info below).

Actual results:
===============
root@dj [Jul-02-2014- 1:14:38] >rm -rf *
rm: cannot remove `A11': Directory not empty
rm: cannot remove `A111': Directory not empty
rm: cannot remove `A137': Directory not empty
rm: cannot remove `A151/B18': Directory not empty
rm: cannot remove `A153': Directory not empty
rm: cannot remove `A163': Directory not empty
rm: cannot remove `A204': Directory not empty
rm: cannot remove `A480/B16': Directory not empty

On sub-volume1:
===============
brick1:
~~~~~~~
root@rhs-client11 [Jul-02-2014-14:40:48] >ls -l /rhs/device0/rep_brick1/A11
total 0
drwxr-xr-x 3 root root 15 Jul  2 12:32 B19

brick2:
~~~~~~~
root@rhs-client12 [Jul-02-2014-14:40:48] >ls -l /rhs/device0/rep_brick2/A11
total 0
drwxr-xr-x 3 root root 15 Jul  2 12:32 B19

On sub-volume2:
===============
brick3:
~~~~~~~
root@rhs-client13 [Jul-02-2014-14:40:48] >ls -l /rhs/device0/rep_brick3/A11
total 0

brick4:
~~~~~~~
root@rhs-client14 [Jul-02-2014-14:40:48] >ls -l /rhs/device0/rep_brick4/A11
total 0

Expected results:
=================
The directories should be removed from all the sub-volumes.
Additional info:
================
root@mia [Jul-02-2014-14:42:57] >gluster v info rep

Volume Name: rep
Type: Distributed-Replicate
Volume ID: d8d69cec-8bdd-4c9d-b5f5-972b36716b0b
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/device0/rep_brick1
Brick2: rhs-client12:/rhs/device0/rep_brick2
Brick3: rhs-client13:/rhs/device0/rep_brick3
Brick4: rhs-client14:/rhs/device0/rep_brick4
Options Reconfigured:
features.uss: disable
server.statedump-path: /var/run/gluster/statedumps
features.barrier: disable
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

root@mia [Jul-02-2014-14:43:02] >gluster v status rep

Status of volume: rep
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/device0/rep_brick1      49154   Y       2890
Brick rhs-client12:/rhs/device0/rep_brick2      49154   Y       5472
Brick rhs-client13:/rhs/device0/rep_brick3      49153   Y       2869
Brick rhs-client14:/rhs/device0/rep_brick4      49153   Y       5433
NFS Server on localhost                         2049    Y       32441
Self-heal Daemon on localhost                   N/A     Y       27961
NFS Server on rhs-client13                      2049    Y       20245
Self-heal Daemon on rhs-client13                N/A     Y       2858
NFS Server on 10.70.36.35                       2049    Y       20399
Self-heal Daemon on 10.70.36.35                 N/A     Y       2885
NFS Server on rhs-client12                      2049    Y       11226
Self-heal Daemon on rhs-client12                N/A     Y       5494
NFS Server on rhs-client14                      2049    Y       11211
Self-heal Daemon on rhs-client14                N/A     Y       5455

Task Status of Volume rep
------------------------------------------------------------------------------
There are no active volume tasks

--- Additional comment from RHEL Product and Program Management on 2014-07-02 05:25:23 EDT ---

Since this issue was entered in bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from on 2014-07-02 06:26:32 EDT ---

SOS Reports: http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1115367/

--- Additional comment from Vivek Agarwal on 2014-07-03 02:21:05 EDT ---

Per discussion, not a blocker.
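For reference, a minimal reproduction driver for step 4 above: it runs "rm -rf *" concurrently from every client mount. This is a sketch only, not part of the reported setup; the paths in MOUNTS are assumptions and must be replaced with the actual fuse/NFS mount points of the volume.

#!/usr/bin/env python3
# Hypothetical reproduction driver: run "rm -rf *" at the same time from
# every client mount of the volume. Mount paths below are assumptions.
import subprocess
from concurrent.futures import ThreadPoolExecutor

MOUNTS = ["/mnt/rep1", "/mnt/rep2", "/mnt/rep3", "/mnt/rep4"]

def wipe(mount):
    # shell=True so the "*" glob expands inside the mount directory
    return subprocess.run("rm -rf *", shell=True, cwd=mount,
                          capture_output=True, text=True)

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=len(MOUNTS)) as pool:
        for mount, result in zip(MOUNTS, pool.map(wipe, MOUNTS)):
            if result.returncode != 0:
                # When the race is hit, this prints the same
                # "Directory not empty" errors shown in the actual results.
                print(mount, result.stderr.strip())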
REVIEW: http://review.gluster.org/11725 (dht : lock on hashed subvol to prevent lookup vs rmdir race) posted (#1) for review on master by Sakshi Bansal (sabansal)
Please provide a public facing description of the issue.
REVIEW: http://review.gluster.org/11725 (dht:lock on hashed subvol to prevent lookup vs rmdir race) posted (#2) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11725 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#5) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11725 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#6) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11725 (dht : lock on subvols to prevent lookup vs rmdir race) posted (#7) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11916 (lock : check if inode exists before granting blocked locks) posted (#1) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11725 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#8) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11725 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#9) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/11725 (dht : lock on subvols to prevent lookup vs rmdir race) posted (#10) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11916 (lock : check if inode exists before granting blocked locks) posted (#3) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11725 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#11) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11725 (dht : lock on subvols to prevent lookup vs rmdir race) posted (#12) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/12035 (dht: lookup after selfheal acquires lock in the mkdir phase) posted (#1) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/12035 (dht: lookup after selfheal acquires lock in the mkdir phase) posted (#2) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/12125 (dht : lock on all subvols to prevent rmdir vs lookup selfheal race) posted (#1) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/12125 (dht : reverting changes that takes lock on all subvols to prevent rmdir vs lookup selfheal race) posted (#2) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/12125 (dht : reverting changes that takes lock on all subvols to prevent rmdir vs lookup selfheal race) posted (#3) for review on master by Dan Lambright (dlambrig)
REVIEW: http://review.gluster.org/12125 (dht: reverting changes that takes lock on all subvols to prevent rmdir vs lookup selfheal race) posted (#4) for review on master by Sakshi Bansal (sabansal)
COMMIT: http://review.gluster.org/12125 committed in master by Raghavendra G (rgowdapp)
------
commit 7b9135045685125d7c94d75f06d762fa1c5ba4b9
Author: Sakshi <sabansal>
Date:   Mon Aug 31 16:06:35 2015 +0530

    dht: reverting changes that takes lock on all subvols to prevent rmdir vs lookup selfheal race

    Locking on all subvols before an rmdir is unable to remove all directory entries. Hence reverting the patch for now.

    Change-Id: I31baf2b2fa2f62c57429cd44f3f229c35eff1939
    BUG: 1245065
    Signed-off-by: Sakshi <sabansal>
    Reviewed-on: http://review.gluster.org/12125
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
REVIEW: http://review.gluster.org/13528 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#1) for review on master by Sakshi Bansal
REVIEW: http://review.gluster.org/13528 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#2) for review on master by Sakshi Bansal
REVIEW: http://review.gluster.org/13528 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#3) for review on master by Sakshi Bansal
REVIEW: http://review.gluster.org/13816 (mount/fuse: report ESTALE as ENOENT) posted (#1) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/13818 (gluster-NFS: For remove fop(), report ENOENT for ESTALE) posted (#2) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/13816 (mount/fuse: report ESTALE as ENOENT) posted (#2) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/13816 (mount/fuse: report ESTALE as ENOENT) posted (#3) for review on master by Raghavendra G (rgowdapp)
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions
REVIEW: http://review.gluster.org/13528 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#4) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/13528 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#5) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/13528 (dht: lock on subvols to prevent lookup vs rmdir race) posted (#6) for review on master by Raghavendra G (rgowdapp)
COMMIT: http://review.gluster.org/13528 committed in master by Raghavendra G (rgowdapp)
------
commit c25f88c953215b1bfc135aeafc43dc00a663206d
Author: Sakshi <sabansal>
Date:   Thu Jul 16 14:31:03 2015 +0530

    dht: lock on subvols to prevent lookup vs rmdir race

    There is a possibility that while an rmdir is completed on some non-hashed subvol and proceeding to others, a lookup selfheal can recreate the same directory on those subvols for which the rmdir had succeeded. Now the deletion of the parent directory will fail with an ENOTEMPTY. To fix this take blocking inodelk on the subvols before starting rmdir. Selfheal must also take blocking inodelk before creating the entry.

    Change-Id: I168a195c35ac1230ba7124d3b0ca157755b3df96
    BUG: 1245065
    Signed-off-by: Sakshi <sabansal>
    Reviewed-on: http://review.gluster.org/13528
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
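As a rough illustration of the ordering this commit describes (a toy model only; the class and function names below are made up and this is not the dht C code): rmdir and the lookup selfheal serialize on the same per-subvolume lock, taken in a fixed order, so a selfheal that loses the race sees the directory already gone and does not recreate it.

# Toy model of "lock on subvols to prevent lookup vs rmdir race".
# Hypothetical names; only the locking order mirrors the commit message.
import threading

class Subvol:
    """Stand-in for one replica pair of the 2x2 volume."""
    def __init__(self, name):
        self.name = name
        self.inodelk = threading.Lock()   # models the blocking inodelk
        self.dirs = {"A11"}               # directories present on this subvol

def rmdir_all(subvols, name):
    # Take the lock on every subvol before removing anything, so a
    # concurrent selfheal cannot slip in between per-subvol removals.
    for sv in subvols:
        sv.inodelk.acquire()
    try:
        for sv in subvols:
            sv.dirs.discard(name)
    finally:
        for sv in reversed(subvols):
            sv.inodelk.release()

def lookup_selfheal(subvols, name):
    # Selfheal takes the same locks, in the same order, before recreating
    # a missing directory entry.
    for sv in subvols:
        sv.inodelk.acquire()
    try:
        # Heal only if the directory still exists on some subvol; if rmdir
        # already won the race nothing is recreated, so the later rmdir of
        # the parent no longer fails with ENOTEMPTY.
        if any(name in sv.dirs for sv in subvols):
            for sv in subvols:
                sv.dirs.add(name)
    finally:
        for sv in reversed(subvols):
            sv.inodelk.release()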
COMMIT: http://review.gluster.org/13816 committed in master by Raghavendra G (rgowdapp)
------
commit 26d16b90ec7f8acbe07e56e8fe1baf9c9fa1519e
Author: Raghavendra G <rgowdapp>
Date:   Wed Mar 23 13:47:27 2016 +0530

    mount/fuse: report ESTALE as ENOENT

    When the inode/gfid is missing, brick report back as an ESTALE error. However, most of the applications don't accept ESTALE as an error for a file-system object missing, changing their behaviour. For eg., rm -rf ignores ENOENT errors during unlink of files/directories. But with ESTALE error it doesn't send rmdir on a directory if unlink had failed with ESTALE for any of the files or directories within it.

    Thanks to Ravishankar N <ravishankar>, here is a link as to why we split up ENOENT into ESTALE and ENOENT: http://review.gluster.org/#/c/6318/

    Change-Id: I467df0fdf22734a8ef20c79ac52606410fad04d1
    BUG: 1245065
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on: http://review.gluster.org/13816
    Smoke: Gluster Build System <jenkins.com>
    Reviewed-by: Jeff Darcy <jdarcy>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Niels de Vos <ndevos>
    Tested-by: N Balachandran <nbalacha>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
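A schematic of the errno translation this commit describes, as a standalone sketch (the helper name is hypothetical, not the fuse xlator code):

import errno

def map_remove_errno(err: int) -> int:
    """Sketch of the idea behind the fuse/NFS patches: for remove-class
    operations, report ESTALE back to the application as ENOENT, so that
    tools like 'rm -rf' treat an already-vanished entry as gone and still
    issue the rmdir on the parent directory."""
    return errno.ENOENT if err == errno.ESTALE else err

With this mapping, an unlink that races with a removal from another mount is reported as ENOENT, which rm ignores, rather than ESTALE, which would make it skip the rmdir of the parent.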
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user