Description of problem:
setfacl is the command used to set ACLs on files. glusterfs-NFS also supports ACLs via the option "nfs.acl", which is enabled by default. Consider a scenario where nfs.acl is set to "disabled"; setfacl then fails, as expected. The problem appears once nfs.acl is set back to "enabled": the setfacl operation still fails.

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.24-1.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create a 6x2 volume and start it.
2. Mount it over NFS.
3. Create a file on the mount point.
4. Use the setfacl command to set an ACL.
5. gluster volume set <volume-name> nfs.acl off
6. Try the setfacl command on a file --- the operation fails, as expected.
7. gluster volume set <volume-name> nfs.acl on
8. Try the setfacl command on a file -- FAILS, whereas it should pass.

Actual results:
Step 8 fails.

From the client:
[root@rhsauto034 ~]# setfacl -m u:acltest_user2:rw /mnt/nfs-test1/aclrun/testfile
setfacl: /mnt/nfs-test1/aclrun/testfile: Operation not supported

From the server:
[root@nfs1 ~]# gluster volume set dist-rep1 nfs.acl on
volume set: success

[root@nfs1 ~]# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   tcp  38465  mountd
    100005    1   tcp  38466  mountd
    100003    3   tcp   2049  nfs
    100024    1   udp  51048  status
    100024    1   tcp  55941  status
    100021    4   tcp  38468  nlockmgr
    100021    1   udp    870  nlockmgr
    100021    1   tcp    872  nlockmgr
    100227    3   tcp   2049  nfs_acl

[root@nfs1 ~]# gluster volume info dist-rep1

Volume Name: dist-rep1
Type: Distributed-Replicate
Volume ID: dab9f592-39b4-428d-bc6d-01a0b7185743
Status: Started
Snap Volume: no
Number of Bricks: 7 x 2 = 14
Transport-type: tcp
Bricks:
Brick1: 10.70.37.62:/bricks/d1r11
Brick2: 10.70.37.215:/bricks/d1r21
Brick3: 10.70.37.44:/bricks/d2r11
Brick4: 10.70.37.201:/bricks/dr2r21
Brick5: 10.70.37.62:/bricks/d3r11
Brick6: 10.70.37.215:/bricks/d3r21
Brick7: 10.70.37.44:/bricks/d4r11
Brick8: 10.70.37.201:/bricks/dr4r21
Brick9: 10.70.37.62:/bricks/d5r11
Brick10: 10.70.37.215:/bricks/d5r21
Brick11: 10.70.37.44:/bricks/d6r11
Brick12: 10.70.37.201:/bricks/dr6r21
Brick13: 10.70.37.62:/bricks/d1r12-add-n
Brick14: 10.70.37.215:/bricks/d1r22-add-n
Options Reconfigured:
nfs.acl: on
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

Expected results:
Step 8 should pass, since nfs.acl has been re-enabled.

Additional info:
Workaround: unmount and remount the volume; the setfacl operation then passes.
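For reference, a minimal sketch of the reproduction and the workaround, assuming the volume dist-rep1 exported by server nfs1 and mounted at /mnt/nfs-test1 (names taken from the output above; the exact mount options are an assumption):

# on the server: disable and re-enable ACL support
gluster volume set dist-rep1 nfs.acl off
gluster volume set dist-rep1 nfs.acl on

# on the client: setfacl still fails even though nfs.acl is back on
setfacl -m u:acltest_user2:rw /mnt/nfs-test1/aclrun/testfile   # -> Operation not supported

# workaround: unmount and remount the export, then retry
umount /mnt/nfs-test1
mount -t nfs -o vers=3 nfs1:/dist-rep1 /mnt/nfs-test1
setfacl -m u:acltest_user2:rw /mnt/nfs-test1/aclrun/testfile   # passes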
(1) This should not be high priority, because I don't think anybody wants to flip-flop the ACL feature on the server side, which registers/deregisters the service with the portmapper on the system. It is fine to test this way, but it does not sound like a very common use case.

(2) It could also be an NFS client issue, since the client generally caches the NFS ACL; a packet capture would clarify things. But as you already mentioned, if you unmount and remount the volume, setfacl works fine, which means Gluster NFS did not go through any change here.

(3) You also mentioned that rpcinfo -p correctly shows the NFS ACL service in both cases (the register/de-register part).

Will work with Saurabh tomorrow.

Thanks,
Santosh
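For what it is worth, the register/de-register part mentioned in (3) can also be checked from the client side; a possible check, assuming the server hostname is nfs1 (program 100227 is the nfs_acl service shown in the rpcinfo output above):

# list the services registered on the server and look for nfs_acl
rpcinfo -p nfs1 | grep nfs_acl

# or probe the NFS ACL program (100227), version 3, over TCP directly
rpcinfo -t nfs1 100227 3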
I have done a bit more investigation, i.e. a packet capture.

1. Wireshark on the packet capture data does not show anything wrong on the Gluster NFS server side: setfacl fails on the client side without even generating a SETACL call on the wire (the Gluster NFS server never receives a SETACL call).

2. The NFS client caches the NFSv3 ACL, and the setfacl call is answered from that cache.

As there is no issue on the Gluster NFS server side, this defect should be closed. If you want more information, the nfs-utils (NFS client) team should be contacted.

I am lowering the severity/priority of the defect as per the above investigation. Please let me know your view.

Thanks,
Santosh
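For anyone re-running this check, one way to confirm that no SETACL request reaches the wire, assuming the client capture interface is eth0 and the server is nfs1 (interface name and filter are assumptions; NFS ACL traffic shares port 2049 per the rpcinfo output above):

# capture NFS traffic on the client while running setfacl
tcpdump -i eth0 -s 0 -w /tmp/setfacl.pcap host nfs1 and port 2049

# then inspect the capture for NFS ACL (GETACL/SETACL) calls
tshark -r /tmp/setfacl.pcap -Y nfsacl   # older tshark builds use -R instead of -Y

If the setfacl failure is served from the client-side ACL cache, the capture shows no SETACL call at all.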
As per your investigation, it may be that the client cache is not updated. Given that a user may not be aware of these changes when this kind of scenario comes up, we need the nfs-utils team to clarify the intended behaviour of the cache in this scenario. Therefore, I don't think we should close it; rather, we should ask the nfs-utils team to clarify the cache behaviour. Does that sound like a useful way to reach a conclusion?