Description of problem:
I have a few directories that were created by the ltp test. I tried to remove these directories using "rm -rf *". Instead of the directories being removed, I get warning messages in gfapi.log.

The messages are like this one:
[2015-04-07 06:38:17.338845] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-1: remote operation failed: Operation not permitted. Path: /run30461/pa/c1 (03862d41-ffe0-4d5f-9fc4-86924c458fa0). Key: user.nfsv4_acls

So, the point to notice is that I had the volume mounted with version=3, not with version=4.

Version-Release number of selected component (if applicable):
glusterfs-3.7dev-0.770.git2035599.el6.x86_64
nfs-ganesha-2.2-0.rc7.el6.x86_64

How reproducible:
Executed the ltp case once and "rm -rf" several times; the result is the same each time.

Steps to Reproduce:
1. have a 6x2 glusterfs volume
2. have nfs-ganesha setup done
3. execute ltp tests
4. execute "rm -rf *" to remove the directories created by ltp test.

Actual results:
[root@rhsauto005 ~]# ls /mnt/run30461/
p0 p10 p11 p12 p13 p15 p2 p3 p4 p6 p8 p9 pa pb pc pd pe
[root@rhsauto005 ~]# cd /mnt/run
-bash: cd: /mnt/run: No such file or directory
[root@rhsauto005 ~]# cd /mnt/run30461/
[root@rhsauto005 run30461]# rm -rf *
rm: cannot remove `p0/d7': Directory not empty
rm: cannot remove `p0/d6/d7': Directory not empty
rm: cannot remove `p10': Directory not empty
rm: cannot remove `p11/d2': Directory not empty
rm: cannot remove `p12': Directory not empty
rm: cannot remove `p13': Directory not empty
rm: cannot remove `p15/d1': Directory not empty
rm: cannot remove `p2': Directory not empty
rm: cannot remove `p3': Directory not empty
rm: cannot remove `p4/d4': Directory not empty
rm: cannot remove `p4/d5': Directory not empty
rm: cannot remove `p6': Directory not empty
rm: cannot remove `p8': Directory not empty
rm: cannot remove `p9': Directory not empty
rm: cannot remove `pa': Directory not empty
rm: cannot remove `pb': Directory not empty
rm: cannot remove `pc': Directory not empty
rm: cannot remove `pd/d4': Directory not empty
rm: cannot remove `pe/d1': Directory not empty
[root@rhsauto005 run30461]# ls
p0 p10 p11 p12 p13 p15 p2 p3 p4 p6 p8 p9 pa pb pc pd pe

[2015-04-07 06:38:17.338845] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-1: remote operation failed: Operation not permitted. Path: /run30461/pa/c1 (03862d41-ffe0-4d5f-9fc4-86924c458fa0). Key: user.nfsv4_acls
[2015-04-07 06:38:17.348953] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-2: remote operation failed: Operation not permitted. Path: /run30461/pb/c2 (01031bd7-d911-44f2-ad5b-e4dcb742f630). Key: user.nfsv4_acls
[2015-04-07 06:38:17.349691] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-3: remote operation failed: Operation not permitted. Path: /run30461/pb/c2 (01031bd7-d911-44f2-ad5b-e4dcb742f630). Key: user.nfsv4_acls
[2015-04-07 06:38:17.358784] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-0: remote operation failed: Operation not permitted. Path: /run30461/pc/l2 (3f65896e-8063-443c-bf6f-4a07a68930cd). Key: user.nfsv4_acls
[2015-04-07 06:38:17.361489] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-1: remote operation failed: Operation not permitted. Path: /run30461/pc/l2 (3f65896e-8063-443c-bf6f-4a07a68930cd). Key: user.nfsv4_acls
[2015-04-07 06:38:17.377130] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-2: remote operation failed: Operation not permitted. Path: /run30461/pd/d4/c5 (e6bead39-80ed-434b-9e1d-91d31b9aa5f5). Key: user.nfsv4_acls
[2015-04-07 06:38:17.377847] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-3: remote operation failed: Operation not permitted. Path: /run30461/pd/d4/c5 (e6bead39-80ed-434b-9e1d-91d31b9aa5f5). Key: user.nfsv4_acls
[2015-04-07 06:38:17.390021] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-0: remote operation failed: Operation not permitted. Path: /run30461/pe/d1/c2 (dc542879-479f-4990-a5ff-ed2e0ed5b4a9). Key: user.nfsv4_acls
[2015-04-07 06:38:17.390754] W [client-rpc-fops.c:1090:client3_3_getxattr_cbk] 0-vol0-client-1: remote operation failed: Operation not permitted. Path: /run30461/pe/d1/c2 (dc542879-479f-4990-a5ff-ed2e0ed5b4a9). Key: user.nfsv4_acls

Expected results:
nfsv4_acls should not create confusion for operations like removal of files and directories.

Additional info:
Symlinks are not identified correctly when ACLs are enabled, so removal of a directory fails with a "Directory not empty" error.
The entry representing a symlink in .glusterfs points to an invalid target. So the ACL fetch for symlinks fails, and consequently all fops (rm, ls) that require an ACL fetch fail as well.
More detailed explanation of the above comment.

The .glusterfs folder holds a link for every entry inside the brick, with the gfid as its path: a symlink for directories and a hardlink for regular files. For example, consider a volume "test" mounted at "/mnt":

# gluster v i
Volume Name: test
Type: Distribute
Volume ID: 5cc15dca-9e4c-45ea-913e-f5c26a0c1e7c
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.0.0.7:/brick/b1
Options Reconfigured:
performance.readdir-ahead: on

#mount
10.0.0.7:/test on /mnt type fuse.glusterfs
#cd /mnt

Create a directory "dir" in the mount:
#mkdir dir

So at the backend, in the .glusterfs folder, a symlink is created for "dir":
lrwxrwxrwx 1 root root 52 Jun 24 22:47 f9/ad/f9ad702b-5ef3-40ba-9d04-54b29b266255 -> ../../00/00/00000000-0000-0000-0000-000000000001/dir

Similarly, if a file named "foo" is created
#touch foo
a hardlink is created in .glusterfs with path "93/8d/938d5b50-1457-4abb-a425-fbcbfed1c442".

But for a symlink
#ln -s dir/ link
a symlink is created in .glusterfs like this:
lrwxrwxrwx 2 root root 4 Jun 24 23:05 c7/ce/c7cea018-84c1-4c0b-954d-418c26a3a08d -> dir/
which points to an invalid target within .glusterfs (dead link).

These links are created in .glusterfs using the standard C library functions symlink() for directories and linkat() / link() for other files. The current ACL implementation in the posix xlator uses standard libacl functions such as acl_get_file() / acl_set_file(). This link is given to them as input, and when those functions try to resolve it, the result is "ENOENT". So all fops (like ls, rm etc.) that require fetching the ACL for that symlink fail.
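The dead-link behaviour described above can be reproduced without a Gluster brick. The following is a minimal sketch in Python that only imitates the directory layout from the example. The helper names make_fake_brick() and stat_errno() are made up for illustration, and os.stat() stands in for any call that follows the link, which is what the comment above says acl_get_file() ends up doing.

```python
import errno
import os
import tempfile


def make_fake_brick(root):
    """Imitate the brick layout from the example (not real Gluster code)."""
    brick = os.path.join(root, "brick")
    os.makedirs(os.path.join(brick, "dir"))
    # User-visible symlink, as created by `ln -s dir/ link` on the mount.
    os.symlink("dir/", os.path.join(brick, "link"))
    # Handle entry: same relative target, but stored below .glusterfs/c7/ce.
    handle_dir = os.path.join(brick, ".glusterfs", "c7", "ce")
    os.makedirs(handle_dir)
    handle = os.path.join(handle_dir, "c7cea018-84c1-4c0b-954d-418c26a3a08d")
    os.symlink("dir/", handle)
    return brick, handle


def stat_errno(path):
    """Follow the link at path; return None on success, else the errno."""
    os.lstat(path)  # does not follow the link, so it works for a dead link
    try:
        os.stat(path)  # follows the link
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    brick, handle = make_fake_brick(tempfile.mkdtemp())
    print("link  :", stat_errno(os.path.join(brick, "link")))
    print("handle:", stat_errno(handle))
```

The relative target "dir/" is valid next to the real directory but not two levels below .glusterfs, which is why only the handle lookup fails.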
Thanks for the very clear explanation of the issue!

So, we need to get the ACL on a symlink, but we only have the parent gfid and the basename of the symlink (gotten through the handle). The symlink itself can not be traversed (otherwise it would probably be handled as a directory, which does not need to be correct for symlink'd files and such).

We have these components:
- <parent-gfid> (from the loc_t structure)
- <basename> used to build the "real_path" in the posix xlator
- real_path (in posix xlator) being a symlink pointing to a non-existing target (readlink returns a string that does not start with "../../")

What I think we need to do:
- a readlink() on the "real_path"
- get the parent-dir through the parent-gfid (with posix_make_ancestryfromgfid?)
- construct the <complete-path> with <parent-dir>/<readlink-result>
- possibly repeat the above steps in case the <complete-path> is a symlink?

This seems to be a difference between lgetxattr() and acl_get_file(). For ACLs, it is not possible to set them on a symlink; they will always be set on the target file. Reading the ACL expects the symlink to point to the target.

This problem now got identified by fetching the ACLs, but I think setting an ACL on a symlink will result in very similar problems.

Please have one of the posix-xlator maintainers (added on CC of this bug) check your analysis and my proposed approach. If we are lucky, there is a function in the posix xlator already that can convert a "real_path" broken symlink to a correctly traversed pathname. I did not spot one that quickly, but I also did not search long for it.
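The steps proposed above can be sketched as follows. This is only an illustration in Python of the readlink / join-with-parent / repeat loop, not posix xlator code. The function name resolve_handle() and the max_hops limit are assumptions, and the parent directory is passed in directly instead of being looked up from the parent gfid.

```python
import errno
import os
import tempfile


def resolve_handle(real_path, parent_dir, max_hops=8):
    """Resolve a handle symlink relative to the directory of the real entry.

    real_path  -- the handle symlink under .glusterfs
    parent_dir -- directory that contains the user-visible symlink
    max_hops   -- hypothetical guard against symlink loops
    """
    path = real_path
    for _ in range(max_hops):
        if not os.path.islink(path):
            return path
        target = os.readlink(path)
        # Interpret the target relative to the parent, not to .glusterfs.
        path = os.path.normpath(os.path.join(parent_dir, target))
        # A further hop is relative to wherever this link lives.
        parent_dir = os.path.dirname(path)
    raise OSError(errno.ELOOP, "too many levels of symbolic links", real_path)


if __name__ == "__main__":
    brick = os.path.join(tempfile.mkdtemp(), "brick")
    os.makedirs(os.path.join(brick, "dir"))
    handle_dir = os.path.join(brick, ".glusterfs", "c7", "ce")
    os.makedirs(handle_dir)
    handle = os.path.join(handle_dir, "c7cea018-84c1-4c0b-954d-418c26a3a08d")
    os.symlink("dir/", handle)
    print(resolve_handle(handle, brick))
```

For a link whose target does not exist anywhere, this sketch returns a path that is not present on disk, so the caller still has to handle ENOENT for such links.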
REVIEW: http://review.gluster.org/11410 (gfapi : symlink resolution for glfs_object) posted (#1) for review on master by jiffin tony Thottan (jthottan)
In the suggested approach, resolution works only if the symlink is created for a directory. Otherwise the link file and the target of the link can exist on different bricks, so brick-side resolution is not a reliable method. For the fuse client and samba (gfapi), the client usually resolves symlinks and sends the result to the server. In my opinion, a similar approach can be taken here too.
With the approach used in the patch, I face two issues:
* It is very time consuming (if there are a lot of symlinks, it takes time to resolve them)
* Operations (like rm, ls) still fail for dead links in the mount
REVIEW: http://review.gluster.org/11410 (gfapi : symlink resolution for glfs_object) posted (#3) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/11410 (gfapi : symlink resolution for glfs_object) posted (#4) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/11410 (gfapi : symlink resolution for glfs_object) posted (#5) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/11410 (gfapi : symlink resolution for glfs_object) posted (#6) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/11410 (gfapi : symlink resolution for glfs_object) posted (#7) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/11410 (gfapi : symlink resolution for glfs_object) posted (#8) for review on master by jiffin tony Thottan (jthottan)
COMMIT: http://review.gluster.org/11410 committed in master by Niels de Vos (ndevos)
------
commit 049c8eec304d9548fccb127ee8ce82f179bc41b0
Author: Jiffin Tony Thottan <jthottan>
Date: Thu Jun 25 15:04:18 2015 +0530

gfapi : symlink resolution for glfs_object

Generally posix expects symlink should be resolved, before performing an acl related operation. This patch introduces a new api glfs_h_resolve_symlink() which will do the same.

Change-Id: Ieee645154455a732edfb2c28834021bab4248810
BUG: 1209735
Signed-off-by: Jiffin Tony Thottan <jthottan>
Reviewed-on: http://review.gluster.org/11410
Reviewed-by: Niels de Vos <ndevos>
Reviewed-by: Raghavendra Talur <rtalur>
Tested-by: NetBSD Build System <jenkins.org>
Tested-by: Gluster Build System <jenkins.com>
Fix for this BZ is already present in a GlusterFS release. You can find clone of this BZ, fixed in a GlusterFS release and closed. Hence closing this mainline BZ as well.
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report. glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/ [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user