Bug 1392772 - [setxattr_cbk] "Permission denied" warning messages are seen in logs while running pjd-fstest suite
Summary: [setxattr_cbk] "Permission denied" warning messages are seen in logs while ru...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1391808 1397252
TreeView+ depends on / blocked
 
Reported: 2016-11-08 08:38 UTC by Nithya Balachandran
Modified: 2017-03-06 17:32 UTC (History)
6 users (show)

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1391808
: 1397252 (view as bug list)
Environment:
Last Closed: 2017-03-06 17:32:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)

Description Nithya Balachandran 2016-11-08 08:38:36 UTC
+++ This bug was initially created as a clone of Bug #1391808 +++

Description of problem:
=======================
Using pjdfstest filesystem test suite when created a directory with its access and group permissions below warning messages are seen in FUSE and brick logs,

FUSE logs:
=========
[2016-11-03 10:27:10.948703] W [MSGID: 114031] [client-rpc-fops.c:1031:client3_3_setxattr_cbk] 2-distrep-client-7: remote operation failed [Permission denied]
Brick logs:
===========
[2016-11-03 10:26:56.494850] I [MSGID: 115060] [server-rpc-fops.c:865:_gf_server_log_setxattr_failure] 0-distrep-server: 2505: SETXATTR /dir/dir (ef02bc07-ba92-4e40-bd81-1661b0d41582) ==> trusted.glusterfs.dht
[2016-11-03 10:26:56.495084] I [MSGID: 115060] [server-rpc-fops.c:893:server_setxattr_cbk] 0-distrep-server: Permission denied

Version-Release number of selected component (if applicable):
3.8.4-3.el7rhgs.x86_64

How reproducible:
=================
Always

Steps to Reproduce:
===================
1) Create  a distributed replica volume and start it.
2) FUSE mount the volume on a client.
3) As a root user run the below command from mount point to clone pjd-fstest
	git://git.code.sf.net/p/ntfs-3g/pjd-fstest
4) Run the below commands
   a) ./pjd-fstest/fstest mkdir dir 0777
   b) ./pjd-fstest/fstest chown dir 65534 65534
   c) ./pjd-fstest/fstest -u 65534 -g 65534 create dir/file 0464
   d) ./pjd-fstest/fstest -u 65534 -g 65534 mkdir dir/dir 0464

When executed 4d the permission denied messages are seen in FUSE and brick logs.

Also, after issuing 'ls' on the parent directory created in 4a, we are seeing the below info message in FUSE logs,

[2016-11-03 10:30:54.284974] I [MSGID: 109063] [dht-layout.c:713:dht_layout_normalize] 2-distrep-dht: Found anomalies in /dir/dir (gfid = ef02bc07-ba92-4e40-bd81-1661b0d41582). Holes=1 overlaps=0

But, I did not see any holes on the directory created in 4d,

[root@dhcp42-7 ~]# getfattr -d -e hex -m trusted.glusterfs.dht /bricks/brick*/*/dir/dir
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/b0/dir/dir
trusted.glusterfs.dht=0x0000000100000000d5555552ffffffff

# file: bricks/brick1/b1/dir/dir
trusted.glusterfs.dht=0x00000001000000002aaaaaaa55555553

# file: bricks/brick2/b2/dir/dir
trusted.glusterfs.dht=0x00000001000000007ffffffeaaaaaaa7

[root@dhcp43-141 ~]# getfattr -d -e hex -m trusted.glusterfs.dht /bricks/brick*/*/dir/dir
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/b0/dir/dir
trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9

# file: bricks/brick1/b1/dir/dir
trusted.glusterfs.dht=0x0000000100000000555555547ffffffd

# file: bricks/brick2/b2/dir/dir
trusted.glusterfs.dht=0x0000000100000000aaaaaaa8d5555551


Actual results:
===============
[setxattr_cbk] "Permission denied" warning messages are seen in logs while running pjd-fstest suite

Expected results:
=================
Permission denied messages should not be seen.

Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-11-04 01:29:36 EDT ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs‑3.2.0' to '?'. 

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Prasad Desala on 2016-11-04 01:38:32 EDT ---

sosreports@ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1391808/

Volname: distrep
mounted on: 10.70.42.156 --> mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse

Volume Name: distrep
Type: Distributed-Replicate
Volume ID: 1e411efc-9f16-41cf-99ad-8b28f1c7d935
Status: Started
Snapshot Count: 0
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.42.7:/bricks/brick0/b0
Brick2: 10.70.41.211:/bricks/brick0/b0
Brick3: 10.70.43.141:/bricks/brick0/b0
Brick4: 10.70.43.156:/bricks/brick0/b0
Brick5: 10.70.42.7:/bricks/brick1/b1
Brick6: 10.70.41.211:/bricks/brick1/b1
Brick7: 10.70.43.141:/bricks/brick1/b1
Brick8: 10.70.43.156:/bricks/brick1/b1
Brick9: 10.70.42.7:/bricks/brick2/b2
Brick10: 10.70.41.211:/bricks/brick2/b2
Brick11: 10.70.43.141:/bricks/brick2/b2
Brick12: 10.70.43.156:/bricks/brick2/b2
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on


[root@dhcp42-156 fuse]# ll
total 8
drwxrwxrwx. 4 nfsnobody nfsnobody 4096 Nov  4 10:40 dir
drwxr-xr-x. 4 root      root      4096 Nov  3 15:54 pjd-fstest
[root@dhcp42-156 fuse]# ll *
dir:
total 8
dr--rw-r--. 2 nfsnobody nfsnobody 4096 Nov  3 15:56 dir
-r--rw-r--. 1 nfsnobody nfsnobody    0 Nov  3 15:56 file

[root@dhcp42-156 fuse]# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
systemd-bus-proxy:x:999:997:systemd Bus Proxy:/:/sbin/nologin
systemd-network:x:998:996:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
polkitd:x:997:995:User for polkitd:/:/sbin/nologin
tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
rpc:x:32:32:Rpcbind Daemon:/var/lib/rpcbind:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
testuser1:x:1000:1000::/home/testuser1:/bin/bash
prasad:x:1001:1001::/home/prasad:/bin/bash
user1:x:1002:1002::/home/user1:/bin/bash

--- Additional comment from Shyamsundar on 2016-11-07 13:44:52 EST ---

(In reply to Prasad Desala from comment #0)
> Also, after issuing 'ls' on the parent directory created in 4a, we are
> seeing the below info message in FUSE logs,
> 
> [2016-11-03 10:30:54.284974] I [MSGID: 109063]
> [dht-layout.c:713:dht_layout_normalize] 2-distrep-dht: Found anomalies in
> /dir/dir (gfid = ef02bc07-ba92-4e40-bd81-1661b0d41582). Holes=1 overlaps=0
> 
> But, I did not see any holes on the directory created in 4d,
> 
> [root@dhcp42-7 ~]# getfattr -d -e hex -m trusted.glusterfs.dht
> /bricks/brick*/*/dir/dir

This is because the directory was created without the DHT xattrs and as the client detected an anomaly in the directory, it healed it as root, and so the layout *finally* on disk was not empty.

Before doing the ls on ./dir/ if you would take a look at the xattrs on the brick, you would find them missing.

--- Additional comment from Shyamsundar on 2016-11-07 14:07:13 EST ---

I was able to replicate the problem based on the steps given (thanks).

What occurs here is as follows,
- DHT sends the mkdir call to the bricks
- Post the mkdir, it sends the setxattr call to set the layout xattrs for these directories

The failure is noted in the bricks and on the FUSE client logs, as the setxattr fails.

The setxattr fails, due to the permissions on the directory being 0464 (set during the mkdir as requested by the call), where user has only READ permissions, and posix_acl xlator checks that the UID of the setxattr caller (internal DHT request above) is the same as the UID of the directory inode, and hence evaluates permissions based on this, which evaluates to READ ONLY and denies the setxattr from proceeding.

Code reference:
  - https://github.com/gluster/glusterfs/blob/master/xlators/system/posix-acl/src/posix-acl.c#L2088 posix_acl_setxattr -> calls setxattr_scrutiny
  - https://github.com/gluster/glusterfs/blob/master/xlators/system/posix-acl/src/posix-acl.c#L1920 setxattr_scrutiny -> acl_permits calls
  - acl_permits: Has the POSIX_ACL_USER_OBJ in its list first, and checks the UID equality of the caller to the inode (which matches) and hence uses permissions of the user to proceed doing a perm_check

The downside of all this, is that the DHT layout on this directory is not set, only on a subsequent layout anomaly detection by DHT is this corrected, as anomaly correction is an internal operation that uses UID/GID as root/root and the ACL xlator passes the request without the checks.

General discussion:
The manner in which gluster ACL xlator checks the permissions (user first and then group etc.) is correct as per standards, the reference would be, https://www.gnu.org/software/libc/manual/html_node/Access-Permission.html#Access-Permission

Further on checking the kernel code, the same rules apply there as well, and a directory created with READ for user, subsequently fails with EPERM for any setfattr calls. Kernel code reference: http://lxr.free-electrons.com/source/fs/posix_acl.c#L346

So basically, the implementation in out ACL xlator seems to be right. I.e just because group bits provide WRITE permissions, it is not allowed as the UID of the directory to perform any WRITE operations.

What this means:
So currently other than the logs, there is no real bug to start with. But, the finding is interesting, and we could make the setxattr of DHT xattrs post mkdir an internal operation to be done as root/root (just like healing an anomaly). That would fix the logging problem as well. Is it a security violation is something that is needed to be considered.

For example xattrop that is driven by the client (and is actually a setxattr call in the end) has no ACL xlator checks against that FOP (and as a result does not suffer this problem).

Regression notes:
This should not be a regression, and should be present in the prior releases as well, just a heads up!

Comment 1 Worker Ant 2016-11-08 08:58:55 UTC
REVIEW: http://review.gluster.org/15794 (cluster/dht Set layout after mkdir as root) posted (#1) for review on master by N Balachandran (nbalacha)

Comment 2 Worker Ant 2016-11-21 14:40:19 UTC
COMMIT: http://review.gluster.org/15794 committed in master by Jeff Darcy (jdarcy) 
------
commit 3e405b546e8b9fe15ae477613474e9cd2d2df4e7
Author: N Balachandran <nbalacha>
Date:   Tue Nov 8 14:10:49 2016 +0530

    cluster/dht Set layout after mkdir as root
    
    DHT does not set the layout for newly created
    directories as root. This causes EPERM failures
    when a non-root user with insufficient permissions
    creates directories.
    
    credit: srangana for RCA
    
    Change-Id: Ia646e41665ce172c43c5f01d2707455e8eb374ed
    BUG: 1392772
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: http://review.gluster.org/15794
    Reviewed-by: Susant Palai <spalai>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Jeff Darcy <jdarcy>

Comment 3 Raghavendra G 2016-11-22 11:04:13 UTC
Seems like more investigation is needed on the RCA. Posting the email conversation we had.


> Sent: Tuesday, November 22, 2016 9:22:09 AM
> Subject: Re: On patch #15794
> 
> On 22 November 2016 at 09:19, Raghavendra Gowdappa <rgowdapp>
> wrote:
> 
> >
> >
> > > Sent: Monday, November 21, 2016 8:36:49 PM
> > > Subject: On patch #15794
> > >
> > > Hi Nithya/Shyam,
> > >
> > > While thinking about patch [1], it occurred to me that if a user has
> > > permission to create directory, same credentials should be good enough to
> > > set layout xattrs too. However, I realized that layout xattrs start with
> > > "trusted." and only superuser can read/set them.
> >
> > Going by the above logic, we should've superuser access for setting all
> > other gluster metadata too (like gfid, afr changelog etc). But we don't
> > sudo to superuser while doing so. Also, from man 7 xattr,
> >
> > <man 7 xattr>
> >  Trusted extended attributes
> >        Trusted extended attributes are visible and accessible only to
> >        processes that have the CAP_SYS_ADMIN capability.  Attributes in
> > this
> >        class are used to implement mechanisms in user space (i.e., outside
> >        the kernel) which keep information in extended attributes to which
> >        ordinary processes should not have access.
> > </man>
> >
> > So, it seems that any process with CAP_SYS_ADMIN capability can set the
> > xattrs. It _seems_ (I am not confirming) that brick process retains that
> > capability even though we do setfsuid/gid (frame->root->uid/gid) (uid/gid
> > of process doing fops like mkdir etc, not the uid/gid of brick process). If
> > this indeed is true, we are missing something as to why mkdir failed with
> > EPERM.
> >
> >
> mkdir is created with the permissions specified in the command. So I guess
> the user would no longer have permissions to modify it (say set an xattr)
> if the op comes in as the same uid.

If that is indeed the case, its puzzling that the test I posted in my first mail passed. I've a feeling that we are missing out something and more investigation is needed here.

> 
> Ideally, all internal ops should be marked internal_fop and no security
> checks should hold in that case. The security implications of this need to
> be determined.
> 
> 
> 
> > As to why lookup path in dht does SUDO but not ops like mkdir/rmdir, it
> > might be because lookup can be executed by a process with just read
> > permission or process with no permission to access the directory. However,
> > that might not be the case with mkdir/rmdir codepath.
> >
> > > However on testing I was
> > > able to create a directory as non-root user. Note that the exported
> > > directory is owned by "raghu:users".
> > >
> > >
> > > [raghu@unused glusterfs]$ mkdir dir
> > >
> > > [raghu@unused glusterfs]$ whoami
> > > raghu
> > >
> > > [raghu@unused glusterfs]$ getfattr -e hex -d -m. /home/export/newptop
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: home/export/newptop
> > > security.selinux=0x756e636f6e66696e65645f753a6f
> > 626a6563745f723a686f6d655f726f6f745f743a733000
> > >
> > > [raghu@unused glusterfs]$ getfattr -e hex -d -m.
> > /home/export/newptop/dir
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: home/export/newptop/dir
> > > security.selinux=0x756e636f6e66696e65645f753a6f
> > 626a6563745f723a686f6d655f726f6f745f743a733000
> > >
> > > [raghu@unused glusterfs]$ sudo getfattr -e hex -d -m.
> > > /home/export/newptop/dir
> > > [sudo] password for raghu:
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: home/export/newptop/dir
> > > security.selinux=0x756e636f6e66696e65645f753a6f
> > 626a6563745f723a686f6d655f726f6f745f743a733000
> > > trusted.gfid=0x17f753292e4b46c58b736d3a975af0dc
> > > trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> > >
> > > [raghu@unused glusterfs]$ cd ..
> > >
> > > [raghu@unused mnt]$ cd /home/
> > >
> > > [raghu@unused home]$ cd
> > >
> > > [raghu@unused ~]$ mkdir tmp
> > >
> > > [raghu@unused ~]$ setfattr -n trusted.glusterfs.dht -v
> > > 0x000000010000000000000000ffffffff tmp
> > > setfattr: tmp: Operation not permitted
> > >
> > > [raghu@unused ~]$ ls -ld tmp
> > > drwxrwxr-x. 2 raghu raghu 4096 Nov 21 20:25 tmp
> > >
> > > Note that last setfattr on backend fs failed with EPERM.
> > >
> > > Now the puzzling thing is how mkdir by a non-root user succeeded on
> > glusterfs
> > > with appropriate xattrs set. Any explanations?
> > >
> > > [1] http://review.gluster.org/#/c/15794/
> > >
> > > regards,
> > > Raghavendra
> >
>

Comment 4 Raghavendra G 2016-11-23 09:32:49 UTC
Except for how mkdir by non-root user succeeded in my test above, other things can be answered.

* gfid is never set as a separate setxattr call sent through posix-acl. All posix-acl sees is a create/mkdir/symlink/mknod operation.
* xattrop is not implemented by posix-acl (rightly so, as it is an internal operation). So, doesn't affect afr changelog etc.
* however it will affect any metadata setting done through setxattr (dht layouts, marker dirty flags etc).

Based on Nithya's analysis, I agree this patch is needed and solves the issue present here (though there are some open questions which are not directly related to this bz).

Comment 5 Raghavendra G 2016-11-23 09:52:21 UTC
(In reply to Raghavendra G from comment #4)
> Except for how mkdir by non-root user succeeded in my test above, other
> things can be answered.

<raghug> I think the bug was a special case
<raghug> look at this comment
<raghug> The setxattr fails, due to the permissions on the directory being 0464 (set during the mkdir as requested by the call), where user has only READ permissions, and posix_acl xlator checks that the UID of the setxattr caller (internal DHT request above) is the same as the UID of the directory inode, and hence evaluates permissions based on this, which evaluates to READ ONLY and denies the setxattr from proceeding.
<nbalacha> the bug was using the mode to create the dir
<nbalacha> yes
<nbalacha> the user creates a dir with read perms
<raghug> that explains
<raghug> thanks
<nbalacha> which is why I asked if you had used the mode :)
<raghug> I didn't get that. Sorry :)
<nbalacha> yesterday - I asked if you had used -m in the mkdir
<nbalacha> because the test on which the bug was reported does
<raghug> no.. I meant I didn't realize the importance of mode :)
<nbalacha> so the mkdir ops in posix uses that and the perms are changed immediately
<nbalacha> ah ok
<nbalacha> so posix acl check fails because the layout set is a write 
<nbalacha> and there is no write perm
<nbalacha> ideally we need to relook at the whole thing
<nbalacha> this is a hacky way to fix things
<raghug> the whole root of confusion was, I was wondering if the user can create a directory he'll have write permissions
<nbalacha> we need special internal ops and a way to bypass checks for those
<raghug> didn't realize that mode alters that

Comment 6 Nithya Balachandran 2017-01-02 06:44:41 UTC
Patch has been merged in master. Moving this to Modified.

Comment 7 Shyamsundar 2017-03-06 17:32:36 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/


Note You need to log in before you can comment on or make changes to this bug.