Bug 1873720 - kernel-5.7.17 and kernel-5.8.4: NFS client can't see files from NFS mount
Keywords:
Status: ON_QA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 32
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1878813
Depends On:
Blocks:
Reported: 2020-08-29 15:10 UTC by Anthony Messina
Modified: 2020-09-28 12:29 UTC (History)
CC List: 28 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug


Attachments
Commit b4487b93545214a9db8cbf32e86411677b0cca21 which should be reverted (1.60 KB, patch)
2020-09-08 18:10 UTC, Edgar Hoch
My example patch for kernel.spec of kernel-5.8.7-200.fc32 (1.48 KB, patch)
2020-09-08 18:12 UTC, Edgar Hoch
Patch for kernel.spec of kernel-5.8.9-200.fc32.src.rpm to include fix for nfs problem (948 bytes, patch)
2020-09-15 11:34 UTC, Edgar Hoch
[PATCH] nfs: Fix security label length not being reset (1.78 KB, patch)
2020-09-15 11:39 UTC, Edgar Hoch

Description Anthony Messina 2020-08-29 15:10:04 UTC
Using an updated 5.7.17 NFS client, I am unable to list files/directories from a 5.7.15 NFS server (NFS v4.2 / Kerberos mounted /home). The files/directories seem to be accessible if I type in their names, but are not available via ls -l.


The same issue, present since the 5.7.17 client, also occurs with 5.8.4: using an updated 5.8.4 NFS client, I am unable to list files/directories from a 5.7.15 NFS server (NFS v4.2 / Kerberos mounted /home). The files/directories seem to be accessible if I type in their names, but are not available via ls -l.

Comment 1 Anthony Messina 2020-08-29 22:33:39 UTC
The issue remains with the client on kernel-5.8.4 and after the NFS server was upgraded to kernel-5.8.4: I am unable to "see" files or directories (via ls -l) in the /home/<username> directory *except* the first "dot" directory, in my case ".aqbanking". If I manually enter a subdirectory, /home/<username>/subdirectory, I am able to see files with ls -l.

Comment 2 Edgar Hoch 2020-08-31 04:29:36 UTC
I have the same problem with kernel-5.7.17-200.fc32.x86_64 and kernel-5.8.4-200.fc32.x86_64 and kernel-5.8.5-200.fc32.x86_64 (on nfs client, nfs server still runs 5.7.12-200.fc32.x86_64).

The problem does not occur with kernel-5.7.12-200.fc32.x86_64 and kernel-5.7.16-200.fc32.x86_64.

It seems that something in the patches between kernel-5.7.16-200.fc32.x86_64 and kernel-5.7.17-200.fc32.x86_64 is the reason for this NFS problem.

Comment 3 Edgar Hoch 2020-08-31 05:33:23 UTC
(In reply to Edgar Hoch from comment #2)
> I have the same problem with kernel-5.7.17-200.fc32.x86_64 and
> kernel-5.8.4-200.fc32.x86_64 and kernel-5.8.5-200.fc32.x86_64 (on nfs
> client, nfs server still runs 5.7.12-200.fc32.x86_64).

On Fedora 32 with the home directory on NFS, I see only some of the files on the NFS client when using kernel 5.7.17. When using kernel 5.7.16, I see all files (the same as on the local (exported) filesystem on the NFS server).

How to test:

1. Export a directory with some files in it from an NFS server. (I have not tested whether it needs to be a home directory or whether any NFS directory will do.) Use kernel 5.7.16 or older.
2. Use an NFS client with kernel 5.7.16 or older and list the NFS directory with "ls -l".
3. Use an NFS client with kernel 5.7.17 or newer (up to 5.8.5) and list the NFS directory with "ls -l".

Current result:
In step 3 there are some files not listed compared to step 2.

Expected result:
The listings in steps 2 and 3 contain the same files.
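
The comparison between steps 2 and 3 can be sketched as a small script. This is only an illustration, assuming the names from each client's listing (for example the output of "ls -1A") have been saved one per line; the function name missing_from_listing and the sample names are hypothetical and are not taken from this report.

```python
def missing_from_listing(expected_names, observed_names):
    """Return the names that appear in the expected listing (step 2)
    but are absent from the observed listing (step 3), sorted."""
    # Drop only the trailing newline so unusual file names are kept intact.
    expected = {name.rstrip("\n") for name in expected_names}
    observed = {name.rstrip("\n") for name in observed_names}
    # Ignore empty lines that may come from a trailing newline in a saved file.
    expected.discard("")
    observed.discard("")
    return sorted(expected - observed)


if __name__ == "__main__":
    # Hypothetical sample data standing in for two saved listings.
    step2 = [".aqbanking\n", "Documents\n", "notes.txt\n"]
    step3 = [".aqbanking\n"]
    for name in missing_from_listing(step2, step3):
        print("missing on affected client:", name)
```

An empty result would correspond to the expected result above; any names returned are the entries missing on the newer kernel.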

Comment 4 Francesco Simula 2020-09-01 13:29:19 UTC
Same thing here:

NFS server (exporting folders with protocol version 4.2) is CentOS 7, untouched in a very long time and perfectly working against other CentOS NFS clients; this means that the problem is on the Fedora 32 clients - I observe this exact behaviour (ls -l does not completely list the home folder's content even if the files accessed directly seem to be all there) for those rebooted with 5.7.17 OR 5.8.4 kernels.

Comment 5 Edgar Hoch 2020-09-01 23:00:37 UTC
I have tested kernel-5.9.0-0.rc3.1.fc32.x86_64 (built from kernel-5.9.0-0.rc3.1.fc34.src.rpm from the koji website). The problem still exists with this kernel.

Comment 6 Edgar Hoch 2020-09-03 17:43:12 UTC
I have tested kernel-5.8.6-200.fc32.x86_64. The problem still exists.

1.) One problem is that on a newly booted machine with autofs enabled and the home directory on NFS via autofs, the NFS autofs directory is not mounted. So ssh login asks for a password (instead of using an ssh key), and if the password is entered, the home directory is missing and the user's login process has a working directory of "/".

2.) After "setenforce 0; systemctl restart autofs" on the machine from the example above, login is possible with an ssh key and the home directory is the current working directory, but "ls" still lists only a subset of the available files in the home directory (compared to the files on the NFS server, which still runs kernel-5.7.12-200.fc32.x86_64).

3.) Booting kernel-5.6.16-300.fc32.x86_64 instead of a newer kernel, without any change to SELinux, works normally. All files are listed on the NFS client as on the NFS server.

Since SELinux stays the same during the tests, something may have changed in the kernel between 5.7.16 and 5.7.17 that changes the interpretation or impact of SELinux rules, etc. Or some code used by the kernel for NFS (or by other helper tools) has changed so that it is now blocked by some security rule. These are just some ideas about where to search for a solution to the problem.

Comment 7 Edgar Hoch 2020-09-06 03:45:24 UTC
I did a binary search of all commits between v5.7.16 and v5.7.17.

The problem is commit 4476b8282f0bdbf21c8a1e5d783ee11a0edfcaf2,
Upstream commit b4487b93545214a9db8cbf32e86411677b0cca21:
nfs: Fix getxattr kernel panic and memory overflow

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b4487b93545214a9db8cbf32e86411677b0cca21

The problem still exists with kernel 5.8.7.

I have built modified kernel 5.8.7 packages with the commit listed above reverted. Reverting this patch solves the problem that some files are missing in the directory listing.


But there is also an SELinux problem with 5.8.x kernels. The implementation seems to have changed, so a new policy rule is required.
See bug 1874338.
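
The binary search over commits mentioned at the start of this comment can be sketched as follows. This is a generic illustration, not the procedure actually used: it assumes the commits are ordered oldest to newest, that the last one is known to be bad, and that every commit after the first bad one is also bad. The names first_bad and is_bad are hypothetical; in practice is_bad would mean building and booting a kernel at that commit and checking the NFS listing, which "git bisect" helps to organize.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last commit is bad,
    and badness is monotone (once bad, all later commits are bad).
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]


if __name__ == "__main__":
    # Hypothetical commit labels; "c6" plays the role of the offending commit.
    commits = ["c%d" % i for i in range(10)]
    culprit = first_bad(commits, lambda c: int(c[1:]) >= 6)
    print("first bad commit:", culprit)
```

Each round halves the remaining range, so the number of kernels to build and test grows only logarithmically with the number of commits between the two releases.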

Comment 8 Edgar Hoch 2020-09-08 18:10:14 UTC
Created attachment 1714163 [details]
Commit b4487b93545214a9db8cbf32e86411677b0cca21 which should be reverted

This is the commit which should be reverted. Then all nfs files are listed again.

I know that reverting the commit does not address the problem that the commit was created to fix. But as it stands, it causes problems.

Comment 9 Edgar Hoch 2020-09-08 18:12:19 UTC
Created attachment 1714164 [details]
My example patch for kernel.spec of kernel-5.8.7-200.fc32

Here is my example patch for the spec file of kernel 5.8.7-200.fc32. It reverts the commit mentioned above.

Comment 10 Bert DeKnuydt 2020-09-10 08:53:36 UTC
I've followed comment #8 and #9, and built a patched kernel. 

For me, this solves the problems only partially. I still get OOPSes from fs/cachefiles/rdwr.c,
which are somehow related to this. Like rhbz 1827549.

Comment 11 Edgar Hoch 2020-09-10 09:02:48 UTC
(In reply to Bert DeKnuydt from comment #10)
> For me, this solves the problems only partially.  I still get OOPS from
> fs/cachefiles/rdwr.c
> which are somehow related to this.  Like rhbz 1827549.

We don't use cachefilesd. This may be the reason why we don't get this OOPS.

Comment 12 Matt Kinni 2020-09-13 05:05:32 UTC
I'm experiencing the same issue using kerberized nfs mounts, and traced it to the "security_label" export parameter.  

In kernel-5.7.16-200.fc32.x86_64 on the client side everything works as expected, but with 5.8.4-200.fc32.x86_64 and later I only get partial directory listings with ls, find, dir, etc. commands (though all files can still be accessed manually using their full paths). 

I found that by removing "security_label" from the server's /etc/exports file, full directory listings are restored even in the latest kernel-5.8.7-200.fc32.x86_64 kernel.

Using Wireshark, I observed that the NFS "READDIR" response packet from the server contains identical directory listings after issuing "ls" on a folder, regardless of whether security_label is exported or not, but for some reason most files/folders are invisible to the client unless they are accessed with the full file path.
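
To check whether a server is exporting with "security_label", the exports file can be scanned with a short script. This is a rough sketch under simplifying assumptions: one export per line, no quoted paths containing spaces, no backslash line continuations, and no "-option" default lists. The function name exports_with_security_label and the sample hosts are hypothetical.

```python
def exports_with_security_label(exports_text):
    """Return (path, client) pairs whose option list contains security_label.

    Handles only the simple "path client(opt,opt,...)" form of /etc/exports.
    """
    hits = []
    for raw_line in exports_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # strip comments
        if not line:
            continue
        fields = line.split()
        path, clients = fields[0], fields[1:]
        for client in clients:
            host, paren, rest = client.partition("(")
            if not paren:
                continue                           # client without options
            options = rest.rstrip(")").split(",")
            if "security_label" in options:
                hits.append((path, host))
    return hits


if __name__ == "__main__":
    # Hypothetical exports content, not taken from any system in this report.
    sample = (
        "/home  *.example.com(rw,sec=krb5p,security_label)\n"
        "/srv   10.0.0.0/24(ro)  # read-only share\n"
    )
    for path, host in exports_with_security_label(sample):
        print(path, "is exported with security_label to", host)
```

On a real server the text would come from reading /etc/exports; removing the option (and re-exporting) is the workaround described above, at the cost of losing NFSv4.2 security labels on the client.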

Comment 13 Edgar Hoch 2020-09-13 16:55:29 UTC
I have tested kernel-5.8.8-200.fc32.x86_64 and kernel-5.9.0-0.rc4.20200911git581cb3a26baf.8.fc32.x86_64 (built from kernel-5.9.0-0.rc4.20200911git581cb3a26baf.8.fc34.src.rpm). With both I still get a partial directory listing.

I also use the "security_label" export parameter.

Comment 14 Enrico Scholz 2020-09-15 08:52:15 UTC
Probably solved by https://www.spinics.net/lists/linux-nfs/msg79253.html

Comment 15 Edgar Hoch 2020-09-15 11:34:55 UTC
Created attachment 1714917 [details]
Patch for kernel.spec of kernel-5.8.9-200.fc32.src.rpm to include fix for nfs problem

I confirm that the patch from https://www.spinics.net/lists/linux-nfs/msg79253.html solves the problem for me.

I built kernel 5.8.9-200.fc32 with this patch applied, and all files of the NFS directory are listed.

Comment 16 Edgar Hoch 2020-09-15 11:39:44 UTC
Created attachment 1714918 [details]
[PATCH] nfs: Fix security label length not being reset

This file contains the patch from https://www.spinics.net/lists/linux-nfs/msg79253.html .

But I'm not on the linux-nfs mailing list, so I have extracted it from the web page. It is not in the same clean format as the other patches of the package. Someone on the mailing list may apply it to the appropriate git repository; the resulting commit would then make a better patch file.

Comment 17 J. Bruce Fields 2020-09-15 14:29:19 UTC
*** Bug 1878813 has been marked as a duplicate of this bug. ***

Comment 18 Fedora Update System 2020-09-17 20:19:51 UTC
FEDORA-2020-957351614b has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2020-957351614b

Comment 19 Fedora Update System 2020-09-17 20:20:18 UTC
FEDORA-2020-9f10c3dfae has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-9f10c3dfae

Comment 20 Fedora Update System 2020-09-17 20:20:35 UTC
FEDORA-2020-a3b3084904 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-a3b3084904

Comment 21 Fedora Update System 2020-09-18 16:41:23 UTC
FEDORA-2020-a3b3084904 has been pushed to the Fedora 31 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-a3b3084904`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-a3b3084904

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 22 Fedora Update System 2020-09-18 16:42:34 UTC
FEDORA-2020-9f10c3dfae has been pushed to the Fedora 32 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-9f10c3dfae`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-9f10c3dfae

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 23 Fedora Update System 2020-09-18 18:59:16 UTC
FEDORA-2020-957351614b has been pushed to the Fedora 33 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-957351614b`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-957351614b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 24 Fedora Update System 2020-09-20 23:58:52 UTC
FEDORA-2020-a3b3084904 has been pushed to the Fedora 31 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 25 Fedora Update System 2020-09-21 00:00:59 UTC
FEDORA-2020-9f10c3dfae has been pushed to the Fedora 32 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 26 Fedora Update System 2020-09-25 17:00:57 UTC
FEDORA-2020-957351614b has been pushed to the Fedora 33 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 27 Francesco Simula 2020-09-28 12:29:31 UTC
(In reply to Bert DeKnuydt from comment #10)
> I've followed comment #8 and #9, and built a patched kernel. 
> 
> For me, this solves the problems only partially.  I still get OOPS from
> fs/cachefiles/rdwr.c
> which are somehow related to this.  Like rhbz 1827549.

In response to this: here, the new kernel 5.8.11 fixes the 'missing files in ls' problem, but the crash in the 'cachefilesd' daemon remains, so it is NOT completely fixed for me.

