| Summary: | nonroot user getting permission denied error inconsistently when trying to access a directory (bricks are down) | ||
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> |
| Component: | distribute | Assignee: | Susant Kumar Palai <spalai> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Prasad Desala <tdesala> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | rhgs-3.2 | CC: | amukherj, nbalacha, nchilaka, rhinduja, rhs-bugs, sheggodu, spalai, storage-qa-internal |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | dht-perms | ||
| Fixed In Version: | glusterfs-3.12.2-14 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-11-26 10:09:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Nag Pavan Chilakam
2016-10-06 07:18:49 UTC
An update: I am seeing this problem even on a RHEL 7.2 client.
One more observation:
The following logs are seen on only one brick in the 8x2 volume: the first brick on the 4th node, which is up and running,
i.e. Brick 10.70.37.150:/rhs/brick1/distrepvol 49154 0 Y 3918
(this could be the brick to which the directory hashes by default)
[2016-10-06 09:01:56.539646] E [MSGID: 115056] [server-rpc-fops.c:667:server_opendir_cbk] 0-distrepvol-server: 14115751: OPENDIR /rootdir1/renames/dir_samenames/level1.1/level2.2/level3.14/level4.62/level5.73 (8aa88cb3-d299-49f6-a821-a097c5d4c97a) ==> (Permission denied) [Permission denied]
[2016-10-06 09:01:56.542680] E [MSGID: 115056] [server-rpc-fops.c:1806:server_readdirp_cbk] 0-distrepvol-server: 14115755: READDIRP -2 (8aa88cb3-d299-49f6-a821-a097c5d4c97a) ==> (Permission denied) [Permission denied]
[root@dhcp35-191 ~]# gluster v status
Status of volume: distrepvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/distrepvol   N/A       N/A        N       N/A
Brick 10.70.37.187:/rhs/brick1/distrepvol   49154     0          Y       3867
Brick 10.70.35.3:/rhs/brick1/distrepvol     N/A       N/A        N       N/A
Brick 10.70.37.150:/rhs/brick1/distrepvol   49154     0          Y       3918
Brick 10.70.35.191:/rhs/brick2/distrepvol   49155     0          Y       7568
Brick 10.70.37.187:/rhs/brick2/distrepvol   N/A       N/A        N       N/A
Brick 10.70.35.3:/rhs/brick2/distrepvol     49155     0          Y       5341
Brick 10.70.37.150:/rhs/brick2/distrepvol   N/A       N/A        N       N/A
Snapshot Daemon on localhost                49152     0          Y       8358
Self-heal Daemon on localhost               N/A       N/A        Y       7588
Quota Daemon on localhost                   N/A       N/A        Y       8211
Snapshot Daemon on 10.70.37.187             49152     0          Y       4428
Self-heal Daemon on 10.70.37.187            N/A       N/A        Y       3907
Quota Daemon on 10.70.37.187                N/A       N/A        Y       4330
Snapshot Daemon on 10.70.37.150             49152     0          Y       4477
Self-heal Daemon on 10.70.37.150            N/A       N/A        Y       3957
Quota Daemon on 10.70.37.150                N/A       N/A        Y       4380
Snapshot Daemon on 10.70.35.3               49152     0          Y       5858
Self-heal Daemon on 10.70.35.3              N/A       N/A        Y       5361
Quota Daemon on 10.70.35.3                  N/A       N/A        Y       5762
Task Status of Volume distrepvol
------------------------------------------------------------------------------
There are no active volume tasks
This sounds like the permissions for this directory are different on this brick, but I cannot confirm that, as the brick is no longer available due to BZ#1385606. Nag, is there a way this can be recreated? Without that, this BZ cannot be looked at further, as there is no other relevant information available in the bug.

I am able to recreate this. You can log on to the rhs-client24 machine as the non-root user nchilaka (pswd:redhat) and do an ls -lRt of the whole mount. It will take a lot of time, so it is better to output the errors to a file and run it in screen. I see a lot of permission denied errors. For example, do ls /mnt/sysvol//test-arena/kernel-untar/dir.26/linux-4.8.6/Documentation/devicetree/bindings/timer and check the fuse mount log; you can see the permission denied error.

(In reply to nchilaka from comment #6)
Had a look at Nag's system. There is a permission split-brain among different subvols. Subvol replicate-3 (10.70.35.3:/rhs/brick2/sysvol 10.70.37.66:/rhs/brick2/sysvol) has permission "drwxrwxr-x", while replicate-1 (10.70.35.3:/rhs/brick1/sysvol 10.70.37.66:/rhs/brick1/sysvol) has permission "drwx------". The permissions of the directories were most likely changed while one of the subvols was down. Hence, the permission error is seen only "sometimes": if the permission update happens from replicate-3, there is no problem, but if it happens from replicate-1, the user sees a permission error.

The fix for the above scenario depends on the ctime of the directories, which can change due to different factors, e.g. AFR self-heal or quotad. If the ctime of a directory with a bad attr is bumped up as a consequence of the above, we may heal a wrong attr. Given that the fix would be intrusive, I would like to move this bug out of 3.2.0 scope.

Seeing this issue even when all bricks were up, on the final regression run of systemic testing on 3.8.4-14.

Moving this to ON_QA as I found that the patch https://review.gluster.org/#/c/glusterfs/+/20108/ should fix this. Tested locally.
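The ctime concern raised in the analysis above can be illustrated with a small model. This is only a sketch and not the DHT implementation: the `DirAttr` record, the helper names, and the ctime values are hypothetical, and only the subvolume names and the two modes (drwxrwxr-x, drwx------) come from this bug. It assumes a heal that simply trusts the copy with the newest ctime.

```python
from collections import namedtuple

# Hypothetical record: one subvolume's view of the same directory.
DirAttr = namedtuple("DirAttr", ["subvol", "mode", "ctime"])


def has_permission_mismatch(attrs):
    """Return True if the subvolumes disagree on the directory mode."""
    return len({a.mode for a in attrs}) > 1


def pick_heal_source(attrs):
    """Assumed policy: trust the copy with the newest ctime."""
    return max(attrs, key=lambda a: a.ctime)


# Modes taken from the analysis above; ctime values are made up.
good = DirAttr("replicate-3", 0o775, ctime=100)  # drwxrwxr-x
bad = DirAttr("replicate-1", 0o700, ctime=90)    # drwx------

# The two subvolumes disagree, so access depends on which one answers.
mismatch = has_permission_mismatch([good, bad])

# With untouched ctimes, the newest copy is the one with the good mode.
source_before = pick_heal_source([good, bad])

# If something unrelated (e.g. a self-heal or quotad) bumps the ctime on
# the bad copy, the same policy now selects the wrong attributes.
bumped = bad._replace(ctime=120)
source_after = pick_heal_source([good, bumped])

print(mismatch, oct(source_before.mode), oct(source_after.mode))
```

Under this assumed policy, the result flips from the good copy to the bad one purely because of the unrelated ctime bump, which is why a ctime-based fix was considered risky.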