Bug 1444540 - rm -rf <dir> returns ENOTEMPTY even though ls on the mount point returns no files
Summary: rm -rf <dir> returns ENOTEMPTY even though ls on the mount point returns no f...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard:
Depends On: 1442724
Blocks: glusterfs-3.10.2
Reported: 2017-04-22 03:34 UTC by Nithya Balachandran
Modified: 2017-05-31 20:44 UTC (History)

Fixed In Version: glusterfs-3.10.2
Clone Of: 1442724
Environment:
Last Closed: 2017-05-31 20:44:48 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nithya Balachandran 2017-04-22 03:34:24 UTC
+++ This bug was initially created as a clone of Bug #1442724 +++

Description of problem:
rm -rf <dir> on a directory that contains a large number of stale linkto files fails with ENOTEMPTY.



Version-Release number of selected component (if applicable):


How reproducible:

Consistently

Steps to Reproduce:
1. Create a 3x1 pure distribute volume and FUSE mount it
2. Run the attached script (test.sh) from the mount point. This creates a directory dir1, creates 1000 files inside it, and renames the files so as to create lots of linkto files on the bricks.
3. Run the remove-brick command to start removing the last brick (vol1-client-2). Do not commit the remove-brick operation.
4. From the mount point, run `rm -rf dir1` 
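
The attached test.sh is not reproduced in this report. As a rough, hypothetical equivalent of step 2, the sketch below creates a directory, fills it with empty files, and renames each one. The function name, the file-name pattern, and the defaults are assumptions made for illustration only. On a distribute volume, a rename can leave the data on the original brick and create a linkto file on the brick the new name hashes to, which is how the renames produce linkto files.

```python
import os


def make_and_rename(mount_point, dirname="dir1", count=1000):
    """Create `count` empty files under mount_point/dirname, then rename each.

    Hypothetical stand-in for the attached test.sh; the names "file-N" and
    "renamed-N" are illustrative assumptions, not taken from the script.
    """
    target = os.path.join(mount_point, dirname)
    os.makedirs(target, exist_ok=True)
    for i in range(count):
        old = os.path.join(target, f"file-{i}")
        new = os.path.join(target, f"renamed-{i}")
        # Create an empty file, then rename it so the new name may hash
        # to a different brick than the one holding the data.
        with open(old, "w"):
            pass
        os.rename(old, new)
    return target
```

It would be called with the path of the FUSE mount, for example make_and_rename("/mnt/vol1"), where the mount path is again an assumption.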

Actual results:
rm -rf fails with ENOTEMPTY

Expected results:

rm -rf should succeed

Additional info:

--- Additional comment from Worker Ant on 2017-04-17 05:58:14 EDT ---

REVIEW: https://review.gluster.org/17065 (cluster/dht: rm -rf fails if dir has stale linkto files) posted (#1) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-04-18 13:14:54 EDT ---

REVIEW: https://review.gluster.org/17065 (cluster/dht: rm -rf fails if dir has stale linkto files) posted (#2) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-04-19 05:49:20 EDT ---

REVIEW: https://review.gluster.org/17065 (cluster/dht: rm -rf fails if dir has stale linkto files) posted (#3) for review on master by N Balachandran (nbalacha)

--- Additional comment from Nithya Balachandran on 2017-04-20 00:54:38 EDT ---

RCA:

The attached script creates a lot of files and renames them so we end up with multiple linkto files on the bricks.
Start a remove brick and wait until the rebalance completes (do not commit the remove brick). This ensures that the brick is still part of the volume so rmdir commands will be sent to it as well.

rm -rf <dir> will send an unlink for each entry returned, followed by an rmdir on the directory.

dht_rmdir first performs a readdirp on each subvol to make sure there are no entries in the directories on the bricks. Any stale linkto files found will be deleted. This is particularly useful as they would not be cleaned up otherwise.

The issue is that dht sends the readdirp only once to each subvol. If there are more stale linkto files on the brick than can be returned in a single readdirp call, they will not be cleaned up and the rmdir will fail with ENOTEMPTY.

The fix involves sending readdirp to each brick until no more entries are returned.
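
The difference between the two behaviours can be illustrated with a toy model. This is not the dht code: the Brick class, the BATCH size, and both function names are assumptions made up for this sketch, and the model treats every entry on the brick as a stale linkto file that is safe to delete.

```python
import errno

BATCH = 4  # hypothetical number of entries that fit in one readdirp reply


class Brick:
    """Toy model of one subvol's copy of a directory."""

    def __init__(self, names):
        # Map a stable offset to each entry so that deleting entries does
        # not shift the position of the ones that remain.
        self.entries = {i + 1: name for i, name in enumerate(names)}

    def readdirp(self, offset, limit=BATCH):
        """Return up to `limit` (offset, name) pairs located after `offset`."""
        later = [(off, name) for off, name in sorted(self.entries.items())
                 if off > offset]
        return later[:limit]

    def unlink(self, off):
        del self.entries[off]

    def rmdir(self):
        return errno.ENOTEMPTY if self.entries else 0


def rmdir_single_pass(brick):
    """Old behaviour: one readdirp, delete what it returned, then rmdir."""
    for off, _name in brick.readdirp(0):
        brick.unlink(off)
    return brick.rmdir()


def rmdir_until_empty(brick):
    """Fixed behaviour: keep sending readdirp until no entries come back."""
    offset = 0
    while True:
        batch = brick.readdirp(offset)
        if not batch:
            break
        for off, _name in batch:
            brick.unlink(off)
        offset = batch[-1][0]  # continue after the last entry seen
    return brick.rmdir()
```

With ten entries and a batch size of four, rmdir_single_pass leaves six entries behind and returns ENOTEMPTY, while rmdir_until_empty removes all of them and returns 0. Calling rmdir_single_pass repeatedly on the same Brick succeeds on the third attempt, which mirrors the observation that running rm -rf several times eventually clears the directory.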

--- Additional comment from Worker Ant on 2017-04-21 01:49:44 EDT ---

COMMIT: https://review.gluster.org/17065 committed in master by Raghavendra G (rgowdapp) 
------
commit e5f9ba138571bd18226462c49ff6a55f5c3ed3a4
Author: N Balachandran <nbalacha>
Date:   Mon Apr 17 15:21:20 2017 +0530

    cluster/dht: rm -rf fails if dir has stale linkto files
    
    rm -rf <dir> fails with ENOENT if dir contains a lot of
    stale linkto files. This is because a single
    readdirp is sent as part of the rmdir which would return
    and delete only as many linkto files on the bricks as would fit
    in one readdirp buffer. Running rm -rf <dir> multiple times
    will eventually delete all the files. The fix sends readdirp
    on each subvol until no more entries are returned.
    
    Change-Id: I447f2d193de4bd8ac16e4541c6b919d22250e39e
    BUG: 1442724
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17065
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 1 Worker Ant 2017-04-24 03:31:03 UTC
REVIEW: https://review.gluster.org/17102 (cluster/dht: rm -rf fails if dir has stale linkto files) posted (#1) for review on release-3.10 by N Balachandran (nbalacha)

Comment 2 Worker Ant 2017-04-25 05:17:26 UTC
REVIEW: https://review.gluster.org/17102 (cluster/dht: rm -rf fails if dir has stale linkto files) posted (#2) for review on release-3.10 by N Balachandran (nbalacha)

Comment 3 Worker Ant 2017-04-27 10:50:00 UTC
COMMIT: https://review.gluster.org/17102 committed in release-3.10 by Raghavendra Talur (rtalur) 
------
commit 6577b84b0540c2210978e2d250c30f52914b5401
Author: N Balachandran <nbalacha>
Date:   Mon Apr 17 15:21:20 2017 +0530

    cluster/dht: rm -rf fails if dir has stale linkto files
    
    rm -rf <dir> fails with ENOENT if dir contains a lot of
    stale linkto files. This is because a single
    readdirp is sent as part of the rmdir which would return
    and delete only as many linkto files on the bricks as would fit
    in one readdirp buffer. Running rm -rf <dir> multiple times
    will eventually delete all the files. The fix sends readdirp
    on each subvol until no more entries are returned.
    
    > BUG: 1442724
    > Signed-off-by: N Balachandran <nbalacha>
    > Reviewed-on: https://review.gluster.org/17065
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Raghavendra G <rgowdapp>
    (cherry picked from commit e5f9ba138571bd18226462c49ff6a55f5c3ed3a4)
    
    Change-Id: I447f2d193de4bd8ac16e4541c6b919d22250e39e
    BUG: 1444540
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17102
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 4 Raghavendra Talur 2017-05-31 20:44:48 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.2, please open a new bug report.

