Bug 1566352

Summary: posix-acl not synchronized between glusterfs clients
Product: [Community] GlusterFS Reporter: zhou lin <zz.sh.cynthia>
Component: posix-acl Assignee: Vijay Bellur <vbellur>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: mainline CC: atumball, bugs, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-6.x Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-18 07:26:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description zhou lin 2018-04-12 07:04:20 UTC
Description of problem:

I find that the ACL on the gluster volume root directory is not synchronized between clients. This can cause problems: when one glusterfs client runs setfacl on the volume root directory (e.g. /mnt/mstate), the other client does not have the new ACL in its local buffer. However, when that client checks the ACL with getfacl, the ACL is retrieved from the glusterfs server side, so getfacl shows the new ACL even though the client's local buffer still holds the old one. This leaves the client “very confused”.
Version-Release number of selected component (if applicable):

3.12.3
How reproducible:


Steps to Reproduce:
Before test 

On both mn-0 and mn-1, the ACL entry for group _nokrcpsyshealthcheck is “r-x”:
[root@mn-0:/root]
# getfacl /mnt/mstate
getfacl: Removing leading '/' from absolute path names
# file: mnt/mstate
# owner: root
# group: root
user::rwx
group::r-x
group:_nokrcpsyshealthcheck:r-x
mask::r-x
other::r-x 

Step 1
Creating files under /mnt/mstate with touch fails with “Permission denied” on both clients:
[client mn-1]:
[_nokrcpsyshealthcheck@mn-1:/mnt/mstate]
$ touch test
touch: cannot touch 'test': Permission denied
[client mn-0]
[_nokrcpsyshealthcheck@mn-0:/root]
$ touch /mnt/mstate/test2
touch: cannot touch '/mnt/mstate/test2': Permission denied

Step 2
On client mn-1, run setfacl -m g:736:rwx /mnt/mstate as root:
[_nokrcpsyshealthcheck@mn-1:/mnt/mstate]
$ exit
exit
[root@mn-1:/mnt/log]
# setfacl -m g:736:rwx /mnt/mstate

Step 3
Run getfacl on both clients:
[_nokrcpsyshealthcheck@mn-1:/mnt/log]
$ getfacl /mnt/mstate
getfacl: Removing leading '/' from absolute path names
# file: mnt/mstate
# owner: root
# group: root
user::rwx
group::r-x
group:_nokrcpsyshealthcheck:rwx
mask::rwx
other::r-x
    
[root@mn-0:/root]
# getfacl /mnt/mstate
getfacl: Removing leading '/' from absolute path names
# file: mnt/mstate
# owner: root
# group: root
user::rwx
group::r-x
group:_nokrcpsyshealthcheck:rwx
mask::rwx
other::r-x

Step 4
Creating a file in /mnt/mstate with touch now succeeds on client mn-1 but still fails on client mn-0:
[_nokrcpsyshealthcheck@mn-0:/root]
$ touch /mnt/mstate/test2
touch: cannot touch '/mnt/mstate/test2': Permission denied

[root@mn-1:/mnt/log]
# su _nokrcpsyshealthcheck
bash: /dev/null/.bashrc: Not a directory
[_nokrcpsyshealthcheck@mn-1:/mnt/log]
$ cd /mnt/mstate
[_nokrcpsyshealthcheck@mn-1:/mnt/mstate]
$ touch test
[_nokrcpsyshealthcheck@mn-1:/mnt/mstate]
$ ls
as-0  as-1  as-2  cp-0    cp-1  _global  mn-0  mn-1  sn-0  sn-1  sn-2  test
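
The contradiction on mn-0 can be made explicit by parsing the getfacl output shown in the steps above. The following is a minimal Python sketch, not part of the original report: the helper names parse_getfacl and named_group_allows are hypothetical, and the check is deliberately simplified to a single named-group entry combined with the mask (a real POSIX ACL access check also considers the owner, the owning group, and every group the process belongs to).

```python
# Hypothetical helpers for illustration only; they are not GlusterFS code.

def parse_getfacl(text):
    """Parse getfacl text output into {(tag, qualifier): perms}.

    Comment lines ("# file: ...") and anything that is not a three-field
    user/group/mask/other entry are skipped, so default ACL entries
    ("default:...") are ignored by this simplified parser.
    """
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        parts = line.split(":")
        if len(parts) != 3 or parts[0] not in ("user", "group", "mask", "other"):
            continue
        tag, qualifier, perms = parts
        entries[(tag, qualifier)] = perms.strip()
    return entries


def named_group_allows(entries, group, perm="w"):
    """Return True if the named-group entry, limited by the mask, grants perm."""
    perms = entries.get(("group", group))
    if perms is None:
        return False
    mask = entries.get(("mask", ""), "rwx")
    return perm in perms and perm in mask


# getfacl output copied from the report: before the test and after step 2.
BEFORE = """\
# file: mnt/mstate
# owner: root
# group: root
user::rwx
group::r-x
group:_nokrcpsyshealthcheck:r-x
mask::r-x
other::r-x
"""

AFTER = """\
# file: mnt/mstate
# owner: root
# group: root
user::rwx
group::r-x
group:_nokrcpsyshealthcheck:rwx
mask::rwx
other::r-x
"""

GROUP = "_nokrcpsyshealthcheck"
print("write allowed before:", named_group_allows(parse_getfacl(BEFORE), GROUP))
print("write allowed after: ", named_group_allows(parse_getfacl(AFTER), GROUP))
```

According to the ACL that mn-0 reports in step 3, members of the group should be able to write to /mnt/mstate, yet touch on mn-0 still fails in step 4.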


Actual results:

On mn-0, files under /mnt/mstate still cannot be created, even after a long time.
Expected results:
Files under /mnt/mstate can be created on all clients shortly after the ACL change.

Additional info:
This issue does not occur for directories other than the volume root.

Comment 1 zhou lin 2018-04-12 07:08:58 UTC
I find that after I revert the change https://review.gluster.org/16945, the issue no longer occurs.
Following is my analysis of this issue.
When one client runs setfacl on the volume root (e.g. /mnt/mstate), it updates its local ACL buffer and sends a setxattr request to the remote glusterfs server side. Other gluster clients depend on lookup to update their local ACL buffers, but for the volume root I do not find any lookup operation except the first one, when the client starts up. So why is there no lookup for the volume root directory? If the root directory were looked up like any other directory, there would be no such issue, because the lookup callback in posix-acl would update the local ACL buffer.
When other clients check the ACL, they send a getxattr request to the local posix-acl translator. However, this translator retrieves the ACL from the remote glusterfs server side instead of from its local buffer. This is very confusing for users, because what they see is not the content of the local ACL buffer! Is there any other way to update the local ACL buffers of all other clients?
In the current design, a client seems to depend on certain fops (e.g. lookup) to update its local ACL buffer. I think it would be better to have some synchronization mechanism between clients.
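
The behaviour described in this analysis can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification, not GlusterFS code: the Server and Client classes and all method names are hypothetical, the ACL is reduced to a mapping from group name to permission string, and the only refresh path modelled is an explicit lookup.

```python
# Toy model of the caching behaviour described above.
# All class and method names are hypothetical; this is not GlusterFS code.

class Server:
    """Holds the authoritative ACL for the volume root."""

    def __init__(self, acl):
        self.acl = dict(acl)


class Client:
    """A client with a local ACL buffer that is refreshed only on lookup."""

    def __init__(self, server):
        self.server = server
        # Models the single lookup that happens when the client starts up.
        self.cached_acl = dict(server.acl)

    def lookup(self):
        # Models the lookup callback refreshing the local buffer.
        self.cached_acl = dict(self.server.acl)

    def setfacl(self, group, perms):
        # The writing client updates the server and its own buffer.
        self.server.acl[group] = perms
        self.cached_acl[group] = perms

    def getfacl(self):
        # Reads go to the server, bypassing the local buffer.
        return dict(self.server.acl)

    def can_write(self, group):
        # Permission checks use the local buffer only.
        return "w" in self.cached_acl.get(group, "")


GROUP = "_nokrcpsyshealthcheck"
server = Server({GROUP: "r-x"})
mn0, mn1 = Client(server), Client(server)

mn1.setfacl(GROUP, "rwx")  # step 2 in the report

print("mn-0 getfacl:", mn0.getfacl()[GROUP])      # reports the new ACL
print("mn-0 can write:", mn0.can_write(GROUP))    # still uses the stale buffer
print("mn-1 can write:", mn1.can_write(GROUP))

mn0.lookup()  # a lookup on the root would refresh the buffer
print("mn-0 can write after lookup:", mn0.can_write(GROUP))
```

In this model mn-0 reports the new ACL through getfacl while still denying writes, and a single lookup on the root is enough to bring its buffer back in line with the server.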

Comment 2 Shyamsundar 2018-10-23 14:55:31 UTC
Release 3.12 has been EOLd and this bug was still found to be in the NEW state, hence moving the version to mainline, to triage the same and take appropriate actions.

Comment 3 Amar Tumballi 2019-06-18 07:26:23 UTC
https://review.gluster.org/#/c/glusterfs/+/19867/ fixes the part which caused this issue.