Bug 1976829 - [RHEL-9.0] dd failed as "Permission denied" on nfs mount point with exports as root_squash and set Sticky bit
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kernel
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: beta
Target Release: ---
Assignee: Jeff Layton
QA Contact: Yongcheng Yang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-06-28 10:37 UTC by Yongcheng Yang
Modified: 2022-11-01 07:29 UTC
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-01 07:29:22 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments

Description Yongcheng Yang 2021-06-28 10:37:26 UTC
Description of problem:
One of our test cases exports the directory /mnt/testarea (using "-o rw,root_squash"), mounts it via NFS, and then runs `dd` on that mount point. Since RHEL 9, the `dd` always fails with "Permission denied".

I checked by hand and this failure is reproducible, but other I/O such as `echo` and `touch` succeeds. I'm not sure whether this is an NFS problem. Please help provide some guidance on how to debug it.


Version-Release number of selected component (if applicable):
kernel-5.13.0-0.rc7.51.el9
nfs-utils-2.5.2-1.rc4.el9
coreutils-8.32-28.el9

How reproducible:
always (on a newly installed machine)

Steps to Reproduce:
1. exportfs -i $HOSTNAME:/mnt/testarea -o rw
2. mount $HOSTNAME:/mnt/testarea /mnt/mnt_test
3. dd bs=512 count=1 if=/dev/zero of=/mnt/mnt_test/testfile
dd: failed to open '/mnt/mnt_test/testfile': Permission denied

Actual results:
[root@hp-dl380-gen9-7 ~]# ls -ld /mnt/testarea/
drwxrwxrwt. 2 root root 79 Jun 28 06:07 /mnt/testarea/
[root@hp-dl380-gen9-7 ~]# rpm -qf /mnt/testarea/
restraint-rhts-0.3.3-1.el9.x86_64
[root@hp-dl380-gen9-7 ~]# df /mnt/testarea/
Filesystem                               1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhel_hp--dl380--gen9--7-root  73364480 2217648  71146832   4% /
[root@hp-dl380-gen9-7 ~]# 
[root@hp-dl380-gen9-7 ~]# ./repro.sh 
/mnt/testarea   hp-dl380-gen9-7.gsslab.rdu2.redhat.com(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,root_squash,no_all_squash)
/root/mnt_test from hp-dl380-gen9-7.gsslab.rdu2.redhat.com:/mnt/testarea
 Flags: rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.10.180.172,local_lock=none,addr=10.10.180.172

-rw-r--r--. 1 nobody nobody 32 Jun 28 06:09 /root/mnt_test/testfile1
dd: failed to open '/root/mnt_test/testfile2': Permission denied    <<<<<<<<
-rw-r--r--. 1 nobody nobody 0 Jun 28 06:09 /root/mnt_test/testfile2
[root@hp-dl380-gen9-7 ~]# cat repro.sh 
#!/bin/bash

# provided by package restraint-rhts
EXP=/mnt/testarea
MNT=/root/mnt_test
[ -d $MNT ] || mkdir $MNT

echo "$EXP $HOSTNAME(rw)" > /etc/exports
systemctl restart nfs-server
exportfs -v
mount $HOSTNAME:$EXP $MNT
nfsstat -m

# success
date > $MNT/testfile1
ls -l $MNT/testfile1

# failed
dd bs=512 count=1 if=/dev/zero of=$MNT/testfile2
ls -l $MNT/testfile2

rm -f $MNT/testfile*
umount $MNT

[root@hp-dl380-gen9-7 ~]# 


Expected results:


Additional info:

Comment 1 J. Bruce Fields 2021-07-01 20:01:21 UTC
For what it's worth, I tried with recent upstream (df04fbe8680b) and couldn't reproduce.

Comment 2 Yongcheng Yang 2021-07-13 01:45:36 UTC
(In reply to J. Bruce Fields from comment #1)
> For what it's worth, I tried with recent upstream (df04fbe8680b) and
> couldn't reproduce.

Thanks for the information.

I have also checked with a self-built 5.14.0-0.rc1.15.bx.el9 kernel, and the problem is resolved there. Closing this one for now.

Comment 8 RHEL Program Management 2022-04-28 07:30:16 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 10 Yongcheng Yang 2022-04-29 10:22:17 UTC
I have found that this is related to the sticky bit on the NFS export directory.

As this passes on RHEL 8 but fails since RHEL 9 (see the attached reproducer), I'm reopening this bug to get more information/investigation.

Comment 14 Jeff Layton 2022-07-26 14:49:34 UTC
I can reproduce this. What I see is that the client is sending a GETATTR for the containing directory, but nothing else. There are no errors returned by the server in this reproducer, it all seems to be happening on the client. Here's what the tracepoints show:

              dd-1460    [010] .....   293.790460: nfs_access_enter: fileid=00:32:128 fhandle=0xba5019a4 version=1781172066073452463 
              dd-1460    [010] .....   293.790467: nfs_access_exit: error=-10 (CHILD) fileid=00:32:128 fhandle=0xba5019a4 type=4 (DIR) version=1781172066073452463 size=40 cache_validity=0x0 () nfs_flags=0x4 (ACL_LRU_SET) mask=0x81 permitted=0xffffffff
              dd-1460    [010] .....   293.790468: nfs_access_enter: fileid=00:32:128 fhandle=0xba5019a4 version=1781172066073452463 
              dd-1460    [010] .....   293.790470: nfs_revalidate_inode_enter: fileid=00:32:128 fhandle=0xba5019a4 version=1781172066073452463 
              dd-1460    [010] .....   293.790485: nfs4_setup_sequence: session=0xee461a67 slot_nr=0 seq_nr=27 highest_used_slotid=0
              dd-1460    [010] .....   293.791298: nfs4_map_name_to_uid: error=0 (OK) id=0 name=0
              dd-1460    [010] .....   293.791301: nfs4_map_group_to_gid: error=0 (OK) id=0 name=0
              dd-1460    [010] .....   293.791304: nfs4_sequence_done: error=0 (OK) session=0xee461a67 slot_nr=0 seq_nr=27 highest_slotid=29 target_highest_slotid=29 status_flags=0x0 ()
              dd-1460    [010] .....   293.791315: nfs4_getattr: error=0 (OK) fileid=00:32:128 fhandle=0xba5019a4 valid=TYPE|MODE|NLINK|OWNER|GROUP|RDEV|SIZE|FSID|FILEID|ATIME|MTIME|CTIME|CHANGE|0x400200
              dd-1460    [010] ...1.   293.791318: nfs_refresh_inode_enter: fileid=00:32:128 fhandle=0xba5019a4 version=1781172066073452463 
              dd-1460    [010] ...1.   293.791321: nfs_set_cache_invalid: error=0 (OK) fileid=00:32:128 fhandle=0xba5019a4 type=4 (DIR) version=1781172066073452463 size=40 cache_validity=0x0 () nfs_flags=0x4 (ACL_LRU_SET)
              dd-1460    [010] ...1.   293.791322: nfs_refresh_inode_exit: error=0 (OK) fileid=00:32:128 fhandle=0xba5019a4 type=4 (DIR) version=1781172066073452463 size=40 cache_validity=0x0 () nfs_flags=0x4 (ACL_LRU_SET)
              dd-1460    [010] .....   293.791324: nfs_revalidate_inode_exit: error=0 (OK) fileid=00:32:128 fhandle=0xba5019a4 type=4 (DIR) version=1781172066073452463 size=40 cache_validity=0x0 () nfs_flags=0x4 (ACL_LRU_SET)
              dd-1460    [010] .....   293.791326: nfs_access_exit: error=0 (OK) fileid=00:32:128 fhandle=0xba5019a4 type=4 (DIR) version=1781172066073452463 size=40 cache_validity=0x0 () nfs_flags=0x4 (ACL_LRU_SET) mask=0x1 permitted=0x7

I'm still looking over this, but it looks like the problem is down in the client's cached open code.

Comment 15 Jeff Layton 2022-07-26 17:19:43 UTC
The problem seems to be due to may_create_in_sticky returning -EACCES during the pathwalk. Basically, we do the lookup of the parent and then issue an atomic open. That creates the file and returns its attributes. The file has an owner:group of "nobody:nobody" because of the root squashing. After that, the VFS calls do_open. That eventually calls may_create_in_sticky, which rejects it because it falls afoul of these checks:

            uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))

The uid is not equal. One is root and the other is "nobody". Worse, the file still ends up being created -- we just can't write to it.

Doing this works around the problem:

    # echo 0 > /proc/sys/fs/protected_regular
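
If the workaround needs to survive a reboot, a sysctl drop-in along these lines could be used (the filename is hypothetical, and note that this relaxes the protected_regular sticky-directory hardening system-wide, not just for NFS):

```
# /etc/sysctl.d/90-nfs-sticky-workaround.conf  (hypothetical filename)
# Disable the protected_regular hardening so that root-squashed files
# created in mode-1777 directories can still be opened for write.
fs.protected_regular = 0
```

Applied on the next boot, or immediately with `sysctl --system`.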

I think I'm going to need to take this upstream, as I'm not clear on what the right fix is.

Comment 16 Jeff Layton 2022-08-02 12:05:09 UTC
I proposed a patch and went through a couple of different iterations of it. The latest one is here:

    https://lore.kernel.org/linux-nfs/20220727140014.69091-1-jlayton@kernel.org/

At this point, I'm waiting for Al to (hopefully) take this patch in.

Comment 19 RHEL Program Management 2022-11-01 07:29:22 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

