Bug 1088649 - Some newly created folders have root ownership although created by unprivileged user
Summary: Some newly created folders have root ownership although created by unprivileged user
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1171847 1184527 1184528 1188082
 
Reported: 2014-04-16 23:38 UTC by Vincent Caron
Modified: 2015-05-14 17:42 UTC
CC List: 4 users

Fixed In Version: glusterfs-3.7.0
Clone Of:
: 1171847 1184527 1184528
Environment:
Last Closed: 2015-05-14 17:25:40 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
logs where the 'root owner' bug occurred at 14:49:04 GMT (9.89 MB, application/gzip)
2014-04-25 20:25 UTC, Vincent Caron

Description Vincent Caron 2014-04-16 23:38:23 UTC
Description of problem:

At a seemingly random rate, folders created on a GlusterFS volume through a fuse mount, by a process with uid=1000, end up with 0:0 uid:gid (as seen from the fuse mount and on the brick replicas).


Version-Release number of selected component (if applicable):

GlusterFS 3.4.2 on Debian 7.0.


How reproducible:

An estimated 1 in 1000 newly created folders ends up in this situation. Even a single occurrence is a nasty security issue for us, and also a blocking bug for the application (it cannot write to the folder it just created).


Steps to Reproduce:

None yet. The application simply creates a <mntpoint>/folder/xxx folder.
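
For reference, a sketch of the kind of loop that could be used to try to trigger and detect this from a fuse client (the test path and iteration count are illustrative assumptions, not something we have actually run):

# Run as uid 1000 from a fuse mount; flag any directory that comes back root-owned.
base=/mnt/map-prod/repro-test          # hypothetical test directory
mkdir -p "$base"
for i in $(seq 1 10000); do
    d="$base/dir-$i"
    mkdir "$d"
    owner=$(stat -c '%u:%g' "$d")
    [ "$owner" != "1000:1000" ] && echo "unexpected owner $owner on $d"
done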


Actual results:

Sometimes this folder ends up with unexpected 0:0 uid:gid.


Expected results:

The folder should be owned by 1000:1000, the effective uid/gid of the creating process. We have extensively checked that no root process is interfering.


Additional info:

This is a fresh install: the volume has been in production for 2 months and its configuration has never changed. It has never had a server, brick or network failure.

The closest thing I could find to my problem seems to be this thread: http://comments.gmane.org/gmane.comp.file-systems.gluster.user/3819 with the remark "It looks like replicate's self-heal reverted the directory's permissions" from Vikas.

It might be related to bug 1016482, but since I use a different version and see different log output, I created a distinct ticket.

From one of the cluster's servers:

storage1:~# gluster volume info
 
Volume Name: map-prod
Type: Distributed-Replicate
Volume ID: 193b461b-b018-4794-bce7-114076ed634f
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: map-storage1:/srv/brick1/map-prod
Brick2: map-storage2:/srv/brick1/map-prod
Brick3: map-storage1:/srv/brick2/map-prod
Brick4: map-storage2:/srv/brick2/map-prod
Brick5: map-storage1:/srv/brick3/map-prod
Brick6: map-storage2:/srv/brick3/map-prod
Brick7: map-storage1:/srv/brick4/map-prod
Brick8: map-storage2:/srv/brick4/map-prod
Brick9: map-storage1:/srv/brick5/map-prod
Brick10: map-storage2:/srv/brick5/map-prod
Brick11: map-storage1:/srv/brick6/map-prod
Brick12: map-storage2:/srv/brick6/map-prod
Brick13: map-storage1:/srv/brick7/map-prod
Brick14: map-storage2:/srv/brick7/map-prod
Brick15: map-storage1:/srv/brick8/map-prod
Brick16: map-storage2:/srv/brick8/map-prod
Options Reconfigured:
cluster.eager-lock: on


One of the most recently affected folders, as seen from the fuse mount:

# stat /mnt/map-prod/invoices/2B13
  File: `/mnt/map-prod/invoices/2B13'
  Size: 48              Blocks: 0          IO Block: 131072 directory
Device: 12h/18d Inode: 10147438021336887025  Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2014-04-16 23:47:45.772550000 +0200
Modify: 2014-04-16 23:47:45.772550000 +0200
Change: 2014-04-16 23:47:45.792550250 +0200

The cluster has 7 clients, all using Fuse. 5 are read/write, 2 are read-only.

I correlated the folder name and date with the logs. On client 1:

[2014-04-16 21:47:45.772824] I [dht-layout.c:638:dht_layout_normalize] 0-map-prod-dht: found anomalies in /invoices/2B13. holes=1 overlaps=0
[2014-04-16 21:47:45.787346] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 88962600: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)
[2014-04-16 21:48:30.765157] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-3: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765525] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-4: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765655] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-5: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765836] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-0: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765953] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-1: /invoices: Performing conservative merge
[2014-04-16 21:48:30.766219] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-6: /invoices: Performing conservative merge
[2014-04-16 21:48:30.766340] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-7: /invoices: Performing conservative merge
[2014-04-16 21:48:31.934962] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 88985878: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)

And on client 2:

[2014-04-16 21:47:45.789187] W [client-rpc-fops.c:327:client3_3_mkdir_cbk] 0-map-prod-client-0: remote operation failed: File exists. Path: /invoices/2B13 (00000000-0000-0000-0000-000000000000)
(the above line repeated 14 times)
[2014-04-16 21:47:45.793708] W [fuse-bridge.c:397:fuse_entry_cbk] 0-glusterfs-fuse: Received NULL gfid for /invoices/2B13. Forcing EIO
[2014-04-16 21:47:45.793735] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 13547296: MKDIR() /invoices/2B13 => -1 (Input/output error)
[2014-04-16 21:47:45.799011] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 13547299: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)

And on both of the cluster's servers, repeated for each of the 8 bricks:

[2014-04-16 21:47:45.788872] I [server-rpc-fops.c:528:server_mkdir_cbk] 0-map-prod-server: 1984624: MKDIR (null) (9239cdb1-b1bc-4b12-b189-99865dc3b5bd/2B13) ==> (File exists)
[2014-04-16 21:47:45.791745] I [server-rpc-fops.c:898:_gf_server_log_setxattr_failure] 0-map-prod-server: 1984628: SETXATTR /invoices/2B13 (c2dcb9e4-1796-4df0-8cd2-f11df213eef1) ==> trusted.glusterfs.dht
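
For reference, the correlation above was done with simple greps over the client logs; an illustrative example (the log path is an assumption based on the mount point, matching the mnt-map-prod.log file names in the attachment):

# Hypothetical log location; adjust to where the fuse client writes its log.
grep 'invoices/2B13' /var/log/glusterfs/mnt-map-prod.log | grep '2014-04-16 21:47'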

Comment 1 Pranith Kumar K 2014-04-23 04:38:54 UTC
Vincent,
      Could you please upload the complete logs of both bricks and clients so that I can take a look.

Pranith

Comment 2 Vincent Caron 2014-04-25 20:25:16 UTC
Created attachment 889897 [details]
logs where the 'root owner' bug occurred at 14:49:04 GMT

Comment 3 Vincent Caron 2014-04-25 20:27:00 UTC
I only keep 8 days' worth of logs, so I compiled some logs from today (2014-04-25) where the bug occurred at 14:49:04 GMT on /invoices/DA37/1D37/1EEF.

Thanks for your attention!

Comment 4 Pranith Kumar K 2014-04-28 17:53:26 UTC
Hi Vincent,
   There seems to be some problem with the attachment: I am not able to download the complete logs for this bug in particular. When I untar what I downloaded, I get the following error:
pk@localhost - ~/sos/1088649 
23:18:22 :) ⚡ tar xvzf root-bug-gluster-logs.tar.gz 
client1/
client1/mnt-map-prod.log
client2/
client2/mnt-map-prod.log

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

I am able to download logs for other bugs, so I am not sure what the problem is. Could you upload it again? Did it exceed the upload size limit or something?

Pranith

Comment 5 Vincent Caron 2014-04-29 01:58:10 UTC
Hello,

I redownloaded the tarball from this tracker and it untarred fine with 'tar xzf'. Maybe you only partially downloaded it?

FYI its MD5 hash is 25924f84704c8f4ce3ad3d144220de79.
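
If useful, the downloaded file can be checked against that hash (filename as listed in comment 4):

md5sum root-bug-gluster-logs.tar.gz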

Comment 6 Pranith Kumar K 2014-05-21 09:21:05 UTC
Vincent,
       Krutika and I found that the brick logs contain many entries like the following: "srv-brick2-map-prod.log:[2014-04-23 16:51:01.791637] W [posix-handle.c:631:posix_handle_soft] 0-map-prod-posix: symlink ../../92/39/9239cdb1-b1bc-4b12-b189-99865dc3b5bd/5F1C -> /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 failed (File exists)".
Could you check whether the file /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 is still a symlink or a directory? I think these look similar to the ones in bug 859581.

Pranith.

Comment 7 Vincent Caron 2014-05-21 21:35:37 UTC
Hello,

it's a symlink:

lrwxrwxrwx 1 root root 53 Apr 23 18:51 /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 -> ../../92/39/9239cdb1-b1bc-4b12-b189-99865dc3b5bd/5F1C

Comment 8 Vincent Caron 2014-05-21 21:42:02 UTC
Meanwhile I stopped the replication because the cluster was way too slow with 16 bricks. We're operating with half of the cluster (I peer-detach'ed a node), which boils down to 8 distributed bricks. This very bug has not been encountered since, or at least it has not been reported to me.

Sadly I can't verify this easily, because when I run something like 'ionice -c3 find /path/to/brick/ -user root', GlusterFS performance drops dead. It looks related to dentry cache thrashing, because the 8 spindles have spare I/O capacity and the server uses most of its 16 GB of RAM for cache. The search covers at least 1 million files.
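
For the record, a narrower scan along the same lines might avoid walking the whole brick; a hedged sketch (the brick path, the age window and the pruning of .glusterfs are assumptions, not something verified here):

# Look only for recently created root-owned directories, at idle I/O priority,
# and skip the large .glusterfs tree on the brick.
ionice -c3 nice -n 19 find /srv/brick1/map-prod -xdev \
    -name .glusterfs -prune -o \
    -type d -user root -mtime -1 -print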

Comment 9 Anand Avati 2015-01-12 11:22:19 UTC
REVIEW: http://review.gluster.org/9434 (storage/posix: Set gfid after all xattrs, uid/gid are set) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 10 Anand Avati 2015-01-13 05:56:59 UTC
COMMIT: http://review.gluster.org/9434 committed in master by Raghavendra Bhat (raghavendra) 
------
commit f91d19a9ce6ec6659c84c3f907ebc3ecaceba176
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jan 12 12:50:50 2015 +0530

    storage/posix: Set gfid after all xattrs, uid/gid are set
    
    Problem:
    When a new entry is created gfid is set even before uid/gid, xattrs
    are set on the entry. This can lead to dht/afr healing that file/dir
    with the uid/gid it sees just after the gfid is set, i.e. root/root.
    Sometimes setattr/setxattr are failing on that file/dir.
    
    Fix:
    Set gfid of the file/directory only after uid/gid, xattrs are setup
    properly. Readdirp, lookup either wait for the gfid to be assigned
    to the entry or not update the in-memory inode ctx in posix-acl
    xlator which was producing lot EACCESS/EPERM to the application
    or dht/afr self-heals.
    
    Change-Id: I0a6ced579daabe3452f5a85010a00ca6e8579451
    BUG: 1088649
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9434
    Reviewed-by: Niels de Vos <ndevos>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Raghavendra Bhat <raghavendra>
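
To paraphrase the fix in shell terms only (an illustrative analogy, not the actual posix xlator code; the path and xattr values below are made up): with the patch, a brick sets ownership and the other xattrs before the gfid, so anything that treats the presence of a gfid as "entry is ready" no longer sees the transient root:root owner.

# Illustrative analogy of the brick-side ordering after the fix -- not GlusterFS source.
d=/srv/brick1/map-prod/invoices/NEWDIR                # hypothetical brick path
mkdir "$d"                                            # entry briefly exists as root:root
chown 1000:1000 "$d"                                  # ownership requested by the client
setfattr -n trusted.glusterfs.dht -v 0x00000001 "$d"  # placeholder value for other xattrs
setfattr -n trusted.gfid -v 0x00112233445566778899aabbccddeeff "$d"  # gfid is now set last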

Comment 11 Anand Avati 2015-01-14 11:49:50 UTC
REVIEW: http://review.gluster.org/9446 (storage/posix: Don't try to set gfid in case of INTERNAL-mknod) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 12 Anand Avati 2015-01-16 08:55:38 UTC
COMMIT: http://review.gluster.org/9446 committed in master by Raghavendra Bhat (raghavendra) 
------
commit a3bc6cf85d20ab21380f96f8d11cad29a53b3bb4
Author: Pranith Kumar K <pkarampu>
Date:   Wed Jan 14 17:10:41 2015 +0530

    storage/posix: Don't try to set gfid in case of INTERNAL-mknod
    
    Change-Id: I96540ed07f08e54d2a24a3b22c2437bddd558c85
    BUG: 1088649
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9446
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Raghavendra Bhat <raghavendra>

Comment 13 Niels de Vos 2015-05-14 17:25:40 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user


