Bug 1329514

Summary: rm -rf to a dir gives directory not empty(ENOTEMPTY) error
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Bipin Kunal <bkunal>
Component: distribute Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED ERRATA QA Contact: krishnaram Karthick <kramdoss>
Severity: high Docs Contact:
Priority: high    
Version: rhgs-3.1 CC: asrivast, bkunal, nbalacha, rabhat, rcyriac, rgowdapp, rhinduja, sabansal, sankarshan, smohan
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.1.3   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.7.9-4 Doc Type: Bug Fix
Doc Text:
Cause: DHT incorrectly deleted directories from the hashed subvolume during an rmdir operation.
Consequence: rmdir fails with ENOTEMPTY even though listing the contents does not return any entries.
Fix: DHT now deletes the directories from the hashed subvolumes only if the operation succeeds on all other subvolumes.
Result:
Story Points: ---
Clone Of:
: 1330032 Environment:
Last Closed: 2016-06-23 05:19:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1311817, 1330032, 1331933, 1347529    

Description Bipin Kunal 2016-04-22 07:02:17 UTC
Description of problem:
2x2 distributed replicate volume.

1) The customer was using the FUSE client. When deleting a directory (which looked empty when "ls" was run on the mount point), they got ENOTEMPTY errors.

2) We installed the glusterfs-debuginfo package and then attached gdb to the glusterfs client process.

3) With gdb attached and breakpoints set, we found that the directory being removed was empty on some nodes but not on others.

The directory being removed (named "Dir45") was empty on one distribute subvolume. The other subvolume had an empty subdirectory inside it (named "bucket7").

An "ls" of Dir45 did not show bucket7, although the subdirectory existed on the backend. When "ls" was run explicitly on the subdirectory, distribute healed it, and "ls" on its parent directory (i.e. Dir45) started showing the entry.

Version-Release number of selected component (if applicable):
3.1.1
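
The mismatch described above (an entry present on a brick backend but absent from the listing on the mount point) can be checked with a small script. This is only a sketch: the function name hidden_entries and the example paths are hypothetical, and it assumes the mount point and the brick directories are all readable from the machine running it. On a real cluster the bricks usually live on different servers, so the brick listings would have to be gathered per server.

```python
import os

def hidden_entries(mount_dir, brick_dirs):
    """Return {entry: [brick dirs holding it]} for entries that exist on a
    brick backend but are missing from the listing seen through the mount."""
    visible = set(os.listdir(mount_dir))
    hidden = {}
    for brick_dir in brick_dirs:
        if not os.path.isdir(brick_dir):
            continue  # the directory does not exist on this brick
        for name in os.listdir(brick_dir):
            if name not in visible:
                hidden.setdefault(name, []).append(brick_dir)
    return hidden

if __name__ == "__main__":
    # Hypothetical paths: the directory as seen on the mount, and the same
    # directory on two brick backends.
    print(hidden_entries("/mnt/vol/Dir45",
                         ["/bricks/b1/Dir45", "/bricks/b2/Dir45"]))
```

For the case in this report, such a check would be expected to flag bucket7 as present on one subvolume's bricks while the mount listing of Dir45 is empty.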

Comment 8 Alok 2016-04-28 17:42:07 UTC
Approved for accelerated fix.

Comment 11 Satish Mohan 2016-04-29 08:39:26 UTC
Approved for accelerated fix.

Comment 14 krishnaram Karthick 2016-05-11 11:46:55 UTC
Verified the fix in build glusterfs-server-3.7.9-4.el7rhgs.x86_64.

Steps followed to verify:

Test1:

1. Create a 2x2 dist-rep volume.
2. Set cluster.quorum-type to auto.
3. NFS mount the volume
4. mkdir -p dir1/dir2/dir3/dir4
5. Kill the first brick process for the non-hashed subvol for dir4
6. Try to delete dir4 : rmdir dir1/dir2/dir3/dir4

Result: the directory was not deleted from any of the sub-vols.

Test2:
1. Create a 2x2 dist-rep volume.
2. Set cluster.quorum-type to auto.
3. NFS mount the volume
4. mkdir -p dir1/dir2/dir3/dir4
5. Kill the first brick process for the hashed subvol for dir4
6. Try to delete dir4 : rmdir dir1/dir2/dir3/dir4

Result: without the fix, the directory's permissions were messed up. With the fix, the permissions were intact.

Moving the bug to verified.
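
The ordering described in the Doc Text (remove the directory from the hashed subvolume only if the rmdir succeeded on all other subvolumes) can be illustrated with a toy model. This is not GlusterFS code: the Subvol class, the dht_rmdir function, and the use of ENOTCONN for an unavailable subvolume are assumptions made purely for illustration.

```python
import errno

class Subvol:
    """Toy stand-in for one DHT subvolume (hypothetical, not GlusterFS code)."""
    def __init__(self, name, entries=(), up=True):
        self.name = name
        self.entries = set(entries)  # children of the directory on this subvolume
        self.up = up                 # False models a subvolume that rejects the op
        self.has_dir = True          # whether the directory still exists here

    def rmdir(self):
        if not self.up:
            return errno.ENOTCONN
        if self.entries:
            return errno.ENOTEMPTY
        self.has_dir = False
        return 0

def dht_rmdir(hashed, others):
    """Try the non-hashed subvolumes first; touch the hashed subvolume
    only if every one of them succeeded."""
    errors = [err for err in (sv.rmdir() for sv in others) if err]
    if errors:
        return errors[0]  # hashed copy is left untouched
    return hashed.rmdir()

if __name__ == "__main__":
    # Roughly Test1: the non-hashed subvolume cannot serve the rmdir.
    hashed, other = Subvol("hashed"), Subvol("non-hashed", up=False)
    print(dht_rmdir(hashed, [other]), hashed.has_dir, other.has_dir)
```

In this model, a failure on the non-hashed subvolume leaves the directory in place everywhere, which matches the Test1 result above.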

Comment 17 errata-xmlrpc 2016-06-23 05:19:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240