+++ This bug was initially created as a clone of Bug #1088649 +++

Description of problem:
At a random rate, folders created on a Gluster fs through a fuse mount by a process with uid=1000 end up with 0:0 uid:gid (as seen from the fuse mount and from the replicas on the bricks).

Version-Release number of selected component (if applicable):
GlusterFS 3.4.2 on Debian 7.0.

How reproducible:
An estimated 1 in 1000 newly created folders ends up in this situation. Even one occurrence is a nasty security issue for us, and also a blocking bug for the application (it cannot write in the folder it just created).

Steps to Reproduce:
None yet. The application simply creates a <mntpoint>/folder/xxx folder.

Actual results:
Sometimes this folder ends up with an unexpected 0:0 uid:gid.

Expected results:
The folder should be owned by 1000:1000, the effective uid/gid of the creating process. We have extensively checked that no root process is interfering.

Additional info:
This is a fresh install: the volume has been in production for 2 months, its configuration has never changed, and it has never had a server, brick or network failure. The closest match I could find to my problem seems to be this thread: http://comments.gmane.org/gmane.comp.file-systems.gluster.user/3819 with the remark "It looks like replicate's self-heal reverted the directory's permissions" from Vikas. It might be related to #1016482, but since I use a different version and have different log outputs, I created a distinct ticket.
From one of the cluster's servers:

storage1:~# gluster volume info

Volume Name: map-prod
Type: Distributed-Replicate
Volume ID: 193b461b-b018-4794-bce7-114076ed634f
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: map-storage1:/srv/brick1/map-prod
Brick2: map-storage2:/srv/brick1/map-prod
Brick3: map-storage1:/srv/brick2/map-prod
Brick4: map-storage2:/srv/brick2/map-prod
Brick5: map-storage1:/srv/brick3/map-prod
Brick6: map-storage2:/srv/brick3/map-prod
Brick7: map-storage1:/srv/brick4/map-prod
Brick8: map-storage2:/srv/brick4/map-prod
Brick9: map-storage1:/srv/brick5/map-prod
Brick10: map-storage2:/srv/brick5/map-prod
Brick11: map-storage1:/srv/brick6/map-prod
Brick12: map-storage2:/srv/brick6/map-prod
Brick13: map-storage1:/srv/brick7/map-prod
Brick14: map-storage2:/srv/brick7/map-prod
Brick15: map-storage1:/srv/brick8/map-prod
Brick16: map-storage2:/srv/brick8/map-prod
Options Reconfigured:
cluster.eager-lock: on

One of the last faulty folders, as seen from the fuse mount:

# stat /mnt/map-prod/invoices/2B13
  File: `/mnt/map-prod/invoices/2B13'
  Size: 48          Blocks: 0          IO Block: 131072 directory
Device: 12h/18d     Inode: 10147438021336887025  Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2014-04-16 23:47:45.772550000 +0200
Modify: 2014-04-16 23:47:45.772550000 +0200
Change: 2014-04-16 23:47:45.792550250 +0200

The cluster has 7 clients, all using fuse: 5 are read/write, 2 are read-only. I correlated the file name and date with the logs and got, on client 1:

[2014-04-16 21:47:45.772824] I [dht-layout.c:638:dht_layout_normalize] 0-map-prod-dht: found anomalies in /invoices/2B13. holes=1 overlaps=0
[2014-04-16 21:47:45.787346] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 88962600: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)
[2014-04-16 21:48:30.765157] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-3: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765525] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-4: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765655] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-5: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765836] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-0: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765953] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-1: /invoices: Performing conservative merge
[2014-04-16 21:48:30.766219] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-6: /invoices: Performing conservative merge
[2014-04-16 21:48:30.766340] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-7: /invoices: Performing conservative merge
[2014-04-16 21:48:31.934962] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 88985878: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)

And on client 2:

[2014-04-16 21:47:45.789187] W [client-rpc-fops.c:327:client3_3_mkdir_cbk] 0-map-prod-client-0: remote operation failed: File exists. Path: /invoices/2B13 (00000000-0000-0000-0000-000000000000) (x14)
[2014-04-16 21:47:45.793708] W [fuse-bridge.c:397:fuse_entry_cbk] 0-glusterfs-fuse: Received NULL gfid for /invoices/2B13. Forcing EIO
[2014-04-16 21:47:45.793735] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 13547296: MKDIR() /invoices/2B13 => -1 (Input/output error)
[2014-04-16 21:47:45.799011] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 13547299: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)

And on both of the cluster's servers, repeated for each of the 8 bricks:

[2014-04-16 21:47:45.788872] I [server-rpc-fops.c:528:server_mkdir_cbk] 0-map-prod-server: 1984624: MKDIR (null) (9239cdb1-b1bc-4b12-b189-99865dc3b5bd/2B13) ==> (File exists)
[2014-04-16 21:47:45.791745] I [server-rpc-fops.c:898:_gf_server_log_setxattr_failure] 0-map-prod-server: 1984628: SETXATTR /invoices/2B13 (c2dcb9e4-1796-4df0-8cd2-f11df213eef1) ==> trusted.glusterfs.dht

--- Additional comment from Pranith Kumar K on 2014-04-23 00:38:54 EDT ---

Vincent,
Could you please upload the complete logs of both bricks and clients so that I can take a look.

Pranith

--- Additional comment from Vincent Caron on 2014-04-25 16:25:16 EDT ---

--- Additional comment from Vincent Caron on 2014-04-25 16:27:00 EDT ---

I only keep 8 days' worth of logs, so I compiled some logs from today (2014-04-25) where the bug occurred at 14:49:04 GMT on /invoices/DA37/1D37/1EEF. Thanks for your attention!

--- Additional comment from Pranith Kumar K on 2014-04-28 13:53:26 EDT ---

Hi Vincent,
There seems to be some problem with the attachment. I am not able to download the complete logs for this bug. When I untar it I get the following error:

pk@localhost - ~/sos/1088649
23:18:22 :) ⚡ tar xvzf root-bug-gluster-logs.tar.gz
client1/
client1/mnt-map-prod.log
client2/
client2/mnt-map-prod.log

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

I am able to download logs for other bugs, so I am not sure what the problem is. Could you upload it again? Did it cross the upload size limit or something?
Pranith

--- Additional comment from Vincent Caron on 2014-04-28 21:58:10 EDT ---

Hello, I redownloaded the tarball from this tracker and it untarred fine with 'tar xzf'. Maybe you only partially downloaded it? FYI its MD5 hash is 25924f84704c8f4ce3ad3d144220de79.

--- Additional comment from Pranith Kumar K on 2014-05-21 05:21:05 EDT ---

Vincent,
Krutika and I found that the brick logs contain lots of messages like the following:

srv-brick2-map-prod.log:[2014-04-23 16:51:01.791637] W [posix-handle.c:631:posix_handle_soft] 0-map-prod-posix: symlink ../../92/39/9239cdb1-b1bc-4b12-b189-99865dc3b5bd/5F1C -> /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 failed (File exists)

Could you check whether /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 is still a symlink or a directory? These look similar to the ones in bug 859581.

Pranith.

--- Additional comment from Vincent Caron on 2014-05-21 17:35:37 EDT ---

Hello, it's a symlink:

lrwxrwxrwx 1 root root 53 Apr 23 18:51 /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 -> ../../92/39/9239cdb1-b1bc-4b12-b189-99865dc3b5bd/5F1C

--- Additional comment from Vincent Caron on 2014-05-21 17:42:02 EDT ---

Meanwhile I stopped the replication because the cluster was way too slow with 16 bricks. We are operating with half of the cluster (I peer-detach'ed a node), which boils down to 8 distributed bricks. This very bug has not been encountered since, or at least it has not been reported to me. Sadly I can't verify that easily: when I run something like 'ionice -c3 find /path/to/brick/ -user root', GlusterFS performance drops dead. It looks related to dentry cache thrashing, because the 8 spindles have free I/O capacity and the server uses most of its 16 GB of RAM for cache. The search covers at least 1 million files.
--- Additional comment from Anand Avati on 2015-01-12 06:22:19 EST ---

REVIEW: http://review.gluster.org/9434 (storage/posix: Set gfid after all xattrs, uid/gid are set) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-13 00:56:59 EST ---

COMMIT: http://review.gluster.org/9434 committed in master by Raghavendra Bhat (raghavendra)
------
commit f91d19a9ce6ec6659c84c3f907ebc3ecaceba176
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jan 12 12:50:50 2015 +0530

    storage/posix: Set gfid after all xattrs, uid/gid are set

    Problem:
    When a new entry is created, the gfid is set even before the uid/gid and xattrs are set on the entry. This can lead to dht/afr healing that file/dir with the uid/gid it sees just after the gfid is set, i.e. root/root. Sometimes setattr/setxattr also fail on that file/dir.

    Fix:
    Set the gfid of the file/directory only after the uid/gid and xattrs are set up properly. Readdirp and lookup either wait for the gfid to be assigned to the entry, or do not update the in-memory inode ctx in the posix-acl xlator, which was producing lots of EACCES/EPERM errors for the application and for dht/afr self-heals.
    Change-Id: I0a6ced579daabe3452f5a85010a00ca6e8579451
    BUG: 1088649
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9434
    Reviewed-by: Niels de Vos <ndevos>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Raghavendra Bhat <raghavendra>

--- Additional comment from Anand Avati on 2015-01-14 06:49:50 EST ---

REVIEW: http://review.gluster.org/9446 (storage/posix: Don't try to set gfid in case of INTERNAL-mknod) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-16 03:55:38 EST ---

COMMIT: http://review.gluster.org/9446 committed in master by Raghavendra Bhat (raghavendra)
------
commit a3bc6cf85d20ab21380f96f8d11cad29a53b3bb4
Author: Pranith Kumar K <pkarampu>
Date:   Wed Jan 14 17:10:41 2015 +0530

    storage/posix: Don't try to set gfid in case of INTERNAL-mknod

    Change-Id: I96540ed07f08e54d2a24a3b22c2437bddd558c85
    BUG: 1088649
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9446
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Raghavendra Bhat <raghavendra>
REVIEW: http://review.gluster.org/9471 (storage/posix: Set gfid after all xattrs, uid/gid are set) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/9472 (storage/posix: Don't try to set gfid in case of INTERNAL-mknod) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/9471 committed in release-3.6 by Raghavendra Bhat (raghavendra)
------
commit d9125608b33181fed8052ff5f867b2cb5c4e679f
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jan 12 12:50:50 2015 +0530

    storage/posix: Set gfid after all xattrs, uid/gid are set

    Backport of http://review.gluster.com/9434

    Problem:
    When a new entry is created, the gfid is set even before the uid/gid and xattrs are set on the entry. This can lead to dht/afr healing that file/dir with the uid/gid it sees just after the gfid is set, i.e. root/root. Sometimes setattr/setxattr also fail on that file/dir.

    Fix:
    Set the gfid of the file/directory only after the uid/gid and xattrs are set up properly. Readdirp and lookup either wait for the gfid to be assigned to the entry, or do not update the in-memory inode ctx in the posix-acl xlator, which was producing lots of EACCES/EPERM errors for the application and for dht/afr self-heals.

    BUG: 1184527
    Change-Id: Ia6dfd492e03db2275665e7f63260611b310e38e6
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9471
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>
COMMIT: http://review.gluster.org/9472 committed in release-3.6 by Raghavendra Bhat (raghavendra)
------
commit 098d823dd771115be5ebbdc4e14ae68451da7f6a
Author: Pranith Kumar K <pkarampu>
Date:   Wed Jan 14 17:10:41 2015 +0530

    storage/posix: Don't try to set gfid in case of INTERNAL-mknod

    Backport of http://review.gluster.org/9446

    BUG: 1184527
    Change-Id: I3e31b0672704cae09c5097f84229b8264a6c0fbe
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9472
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>
This bug is getting closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-v3.6.3, please open a new bug report.

glusterfs-v3.6.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2015-April/021669.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user