+++ This bug was initially created as a clone of Bug #1088649 +++

Description of problem:
At a random rate, folders created on a Gluster fs through a fuse mount by a process with uid=1000 end up with 0:0 uid:gid (as seen from the fuse mount and from the replicas on the bricks).

Version-Release number of selected component (if applicable):
GlusterFS 3.4.2 on Debian 7.0.

How reproducible:
An estimated 1 in 1000 newly created folders ends up in this situation. Even one occurrence is a nasty security issue for us, and also a blocking bug for the application (it cannot write in the folder it just created).

Steps to Reproduce:
None yet. The application simply creates a <mntpoint>/folder/xxx folder.

Actual results:
Sometimes this folder ends up with an unexpected 0:0 uid:gid.

Expected results:
The folder should be owned by 1000:1000, the effective uid/gid of the creating process. We have extensively checked that no root process is interfering.

Additional info:
This is a fresh install: the volume has been in production for 2 months, its configuration has never changed, and it has never had a server, brick or network failure. The closest match I could find to my problem seems to be this thread: http://comments.gmane.org/gmane.comp.file-systems.gluster.user/3819 with the remark "It looks like replicate's self-heal reverted the directory's permissions" from Vikas. It might be related to #1016482, but since I use a different version and have different log outputs, I created a distinct ticket.
From one of the cluster's servers:

storage1:~# gluster volume info

Volume Name: map-prod
Type: Distributed-Replicate
Volume ID: 193b461b-b018-4794-bce7-114076ed634f
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: map-storage1:/srv/brick1/map-prod
Brick2: map-storage2:/srv/brick1/map-prod
Brick3: map-storage1:/srv/brick2/map-prod
Brick4: map-storage2:/srv/brick2/map-prod
Brick5: map-storage1:/srv/brick3/map-prod
Brick6: map-storage2:/srv/brick3/map-prod
Brick7: map-storage1:/srv/brick4/map-prod
Brick8: map-storage2:/srv/brick4/map-prod
Brick9: map-storage1:/srv/brick5/map-prod
Brick10: map-storage2:/srv/brick5/map-prod
Brick11: map-storage1:/srv/brick6/map-prod
Brick12: map-storage2:/srv/brick6/map-prod
Brick13: map-storage1:/srv/brick7/map-prod
Brick14: map-storage2:/srv/brick7/map-prod
Brick15: map-storage1:/srv/brick8/map-prod
Brick16: map-storage2:/srv/brick8/map-prod
Options Reconfigured:
cluster.eager-lock: on

One of the last faulty folders, as seen from the fuse mount:

# stat /mnt/map-prod/invoices/2B13
  File: `/mnt/map-prod/invoices/2B13'
  Size: 48          Blocks: 0          IO Block: 131072 directory
Device: 12h/18d     Inode: 10147438021336887025  Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2014-04-16 23:47:45.772550000 +0200
Modify: 2014-04-16 23:47:45.772550000 +0200
Change: 2014-04-16 23:47:45.792550250 +0200

The cluster has 7 clients, all using fuse: 5 are read/write, 2 are read-only. I correlated the file name and date with the logs and got, on client 1:

[2014-04-16 21:47:45.772824] I [dht-layout.c:638:dht_layout_normalize] 0-map-prod-dht: found anomalies in /invoices/2B13. holes=1 overlaps=0
[2014-04-16 21:47:45.787346] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 88962600: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)
[2014-04-16 21:48:30.765157] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-3: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765525] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-4: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765655] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-5: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765836] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-0: /invoices: Performing conservative merge
[2014-04-16 21:48:30.765953] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-1: /invoices: Performing conservative merge
[2014-04-16 21:48:30.766219] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-6: /invoices: Performing conservative merge
[2014-04-16 21:48:30.766340] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-map-prod-replicate-7: /invoices: Performing conservative merge
[2014-04-16 21:48:31.934962] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 88985878: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)

And on client 2:

[2014-04-16 21:47:45.789187] W [client-rpc-fops.c:327:client3_3_mkdir_cbk] 0-map-prod-client-0: remote operation failed: File exists. Path: /invoices/2B13 (00000000-0000-0000-0000-000000000000) (x14)
[2014-04-16 21:47:45.793708] W [fuse-bridge.c:397:fuse_entry_cbk] 0-glusterfs-fuse: Received NULL gfid for /invoices/2B13. Forcing EIO
[2014-04-16 21:47:45.793735] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 13547296: MKDIR() /invoices/2B13 => -1 (Input/output error)
[2014-04-16 21:47:45.799011] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 13547299: MKDIR() /invoices/2B13/4F88 => -1 (Permission denied)

And on both of the cluster's servers, repeated for each of the 8 bricks:

[2014-04-16 21:47:45.788872] I [server-rpc-fops.c:528:server_mkdir_cbk] 0-map-prod-server: 1984624: MKDIR (null) (9239cdb1-b1bc-4b12-b189-99865dc3b5bd/2B13) ==> (File exists)
[2014-04-16 21:47:45.791745] I [server-rpc-fops.c:898:_gf_server_log_setxattr_failure] 0-map-prod-server: 1984628: SETXATTR /invoices/2B13 (c2dcb9e4-1796-4df0-8cd2-f11df213eef1) ==> trusted.glusterfs.dht

--- Additional comment from Pranith Kumar K on 2014-04-23 00:38:54 EDT ---

Vincent,
Could you please upload the complete logs of both bricks and clients so that I can take a look.

Pranith

--- Additional comment from Vincent Caron on 2014-04-25 16:25:16 EDT ---

--- Additional comment from Vincent Caron on 2014-04-25 16:27:00 EDT ---

I only keep 8 days' worth of logs, so I compiled some logs from today (2014-04-25) where the bug occurred at 14:49:04 GMT on /invoices/DA37/1D37/1EEF. Thanks for your attention!

--- Additional comment from Pranith Kumar K on 2014-04-28 13:53:26 EDT ---

Hi Vincent,
There seems to be some problem with the attachment. I am not able to download the complete logs for this bug. When I untar it I get the following error:

pk@localhost - ~/sos/1088649
23:18:22 :) ⚡ tar xvzf root-bug-gluster-logs.tar.gz
client1/
client1/mnt-map-prod.log
client2/
client2/mnt-map-prod.log

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

I am able to download logs for other bugs, so I am not sure what the problem is. Could you upload it again? Did it cross the upload size limit or something?
Pranith

--- Additional comment from Vincent Caron on 2014-04-28 21:58:10 EDT ---

Hello, I redownloaded the tarball from this tracker and it untarred fine with 'tar xzf'. Maybe you only partially downloaded it? FYI its MD5 hash is 25924f84704c8f4ce3ad3d144220de79.

--- Additional comment from Pranith Kumar K on 2014-05-21 05:21:05 EDT ---

Vincent,
Krutika and I found that the brick logs contain lots of messages like the following:

srv-brick2-map-prod.log:[2014-04-23 16:51:01.791637] W [posix-handle.c:631:posix_handle_soft] 0-map-prod-posix: symlink ../../92/39/9239cdb1-b1bc-4b12-b189-99865dc3b5bd/5F1C -> /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 failed (File exists)

Could you check whether /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 is still a symlink or a directory? These look similar to the ones in bug 859581.

Pranith.

--- Additional comment from Vincent Caron on 2014-05-21 17:35:37 EDT ---

Hello, it's a symlink:

lrwxrwxrwx 1 root root 53 Apr 23 18:51 /srv/brick2/map-prod/.glusterfs/e5/a9/e5a95672-4d87-4290-bf8e-b45839e1bec2 -> ../../92/39/9239cdb1-b1bc-4b12-b189-99865dc3b5bd/5F1C

--- Additional comment from Vincent Caron on 2014-05-21 17:42:02 EDT ---

Meanwhile I stopped the replication because the cluster was way too slow with 16 bricks. We are operating with half of the cluster (I peer-detach'ed a node), which boils down to 8 distributed bricks. This very bug has not been encountered since, or at least it has not been reported to me. Sadly I can't verify that easily: when I run something like 'ionice -c3 find /path/to/brick/ -user root', GlusterFS performance drops dead. It looks related to dentry cache thrashing, because the 8 spindles have free I/O capacity and the server uses most of its 16 GB of RAM for cache. The search covers at least 1 million files.
--- Additional comment from Anand Avati on 2015-01-12 06:22:19 EST ---

REVIEW: http://review.gluster.org/9434 (storage/posix: Set gfid after all xattrs, uid/gid are set) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-13 00:56:59 EST ---

COMMIT: http://review.gluster.org/9434 committed in master by Raghavendra Bhat (raghavendra)
------
commit f91d19a9ce6ec6659c84c3f907ebc3ecaceba176
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jan 12 12:50:50 2015 +0530

    storage/posix: Set gfid after all xattrs, uid/gid are set

    Problem:
    When a new entry is created, the gfid is set even before the uid/gid and xattrs are set on the entry. This can lead to dht/afr healing that file/dir with the uid/gid it sees just after the gfid is set, i.e. root/root. Sometimes setattr/setxattr also fail on that file/dir.

    Fix:
    Set the gfid of the file/directory only after the uid/gid and xattrs are set up properly. Readdirp and lookup either wait for the gfid to be assigned to the entry, or do not update the in-memory inode ctx in the posix-acl xlator, which was producing lots of EACCES/EPERM errors for the application and for dht/afr self-heals.
    Change-Id: I0a6ced579daabe3452f5a85010a00ca6e8579451
    BUG: 1088649
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9434
    Reviewed-by: Niels de Vos <ndevos>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Raghavendra Bhat <raghavendra>

--- Additional comment from Anand Avati on 2015-01-14 06:49:50 EST ---

REVIEW: http://review.gluster.org/9446 (storage/posix: Don't try to set gfid in case of INTERNAL-mknod) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-16 03:55:38 EST ---

COMMIT: http://review.gluster.org/9446 committed in master by Raghavendra Bhat (raghavendra)
------
commit a3bc6cf85d20ab21380f96f8d11cad29a53b3bb4
Author: Pranith Kumar K <pkarampu>
Date:   Wed Jan 14 17:10:41 2015 +0530

    storage/posix: Don't try to set gfid in case of INTERNAL-mknod

    Change-Id: I96540ed07f08e54d2a24a3b22c2437bddd558c85
    BUG: 1088649
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9446
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Raghavendra Bhat <raghavendra>
REVIEW: http://review.gluster.org/9471 (storage/posix: Set gfid after all xattrs, uid/gid are set) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/9472 (storage/posix: Don't try to set gfid in case of INTERNAL-mknod) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/9471 committed in release-3.6 by Raghavendra Bhat (raghavendra)
------
commit d9125608b33181fed8052ff5f867b2cb5c4e679f
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jan 12 12:50:50 2015 +0530

    storage/posix: Set gfid after all xattrs, uid/gid are set

    Backport of http://review.gluster.com/9434

    Problem:
    When a new entry is created, the gfid is set even before the uid/gid and xattrs are set on the entry. This can lead to dht/afr healing that file/dir with the uid/gid it sees just after the gfid is set, i.e. root/root. Sometimes setattr/setxattr also fail on that file/dir.

    Fix:
    Set the gfid of the file/directory only after the uid/gid and xattrs are set up properly. Readdirp and lookup either wait for the gfid to be assigned to the entry, or do not update the in-memory inode ctx in the posix-acl xlator, which was producing lots of EACCES/EPERM errors for the application and for dht/afr self-heals.

    BUG: 1184527
    Change-Id: Ia6dfd492e03db2275665e7f63260611b310e38e6
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9471
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>
COMMIT: http://review.gluster.org/9472 committed in release-3.6 by Raghavendra Bhat (raghavendra)
------
commit 098d823dd771115be5ebbdc4e14ae68451da7f6a
Author: Pranith Kumar K <pkarampu>
Date:   Wed Jan 14 17:10:41 2015 +0530

    storage/posix: Don't try to set gfid in case of INTERNAL-mknod

    Backport of http://review.gluster.org/9446

    BUG: 1184527
    Change-Id: I3e31b0672704cae09c5097f84229b8264a6c0fbe
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9472
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>
This bug is getting closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-v3.6.3, please open a new bug report.

glusterfs-v3.6.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2015-April/021669.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user