2335129 – NFS is slow to mount on 6.12 kernels

Bug 2335129 - NFS is slow to mount on 6.12 kernels

Summary: NFS is slow to mount on 6.12 kernels

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	41
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2025-01-01 07:36 UTC by Ryan
Modified:	2025-06-04 17:34 UTC (History)
CC List:	19 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2025-06-04 17:34:37 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
dmesg.txt (51.51 KB, text/plain) 2025-01-01 07:37 UTC, Ryan	no flags	Details
network trace (6.83 MB, application/octet-stream) 2025-01-03 04:06 UTC, Ryan	no flags	Details
View All

Description Ryan 2025-01-01 07:36:51 UTC

1. Please describe the problem:

NFS mount of an AWS EFS volume consistently takes 3 minutes on a 6.12.x kernel versus immediately mounting on a 6.11.x kernel.


2. What is the Version-Release number of the kernel:

6.12.6-200.fc41.x86_6


3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

It was working fine in the 6.11.x kernels.

The problem first appears in 6.12.0-65.fc42.x86_64.


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

[fedora@ip-10-75-180-241 ~]$ uname -a
Linux ip-10-75-180-241 6.11.11-300.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec  5 18:38:25 UTC 2024 x86_64 GNU/Linux

[root@ip-10-75-180-241 fedora]# time mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 

real    0m0.311s
user    0m0.003s
sys     0m0.005s

** REBOOT **

[fedora@ip-10-75-180-241 ~]$ uname -a
Linux ip-10-75-180-241 6.12.0-65.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Nov 18 13:39:20 UTC 2024 x86_64 GNU/Linux

[root@ip-10-75-180-241 fedora]# time mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 

real    3m0.333s
user    0m0.004s
sys     0m0.483s

** REBOOT **

[root@ip-10-75-180-241 fedora]# uname -a
Linux ip-10-75-180-241 6.12.6-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 19 21:06:34 UTC 2024 x86_64 GNU/Linux

[root@ip-10-75-180-241 fedora]# time mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 

real    3m0.308s
user    0m0.002s
sys     0m0.523s


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes.


6. Are you running any modules that not shipped with directly Fedora's kernel?:

No.


7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Reproducible: Always

Comment 1 Ryan 2025-01-01 07:37:36 UTC

Created attachment 2064407 [details]
dmesg.txt

Comment 2 Ryan 2025-01-01 16:54:35 UTC

Adjusting the "timeo" and "retrans" values yield different, but reproducible, results.

timeo=600 (60 seconds, the default),retrans=2 results in a 3-minute mount.
timeo=600 (60 seconds, the default),retrans=0 results in a 60-second mount.
timeo=150 (15 seconds),retrans=2 results in a 45-second mount.
timeo=150 (15 seconds),retrans=0 results in a 15-second mount.
timeo=900 (90 seconds),retrans=2 results in a 4.5-minute mount.
timeo=900 (90 seconds),retrans=0 results in a 90-second mount.
timeo=10 (1 second),retrans=2 results in a 3-second mount.
timeo=1 (0.1 second),retrans=2 results in a 0.3-second mount.

It seems like the the mount is taking timeo*(retrans+1) seconds to succeed.


[root@ip-10-75-180-241 fedora]# time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
mount.nfs: timeout set for Wed Jan  1 16:11:33 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    3m0.323s
user    0m0.003s
sys     0m0.520s

[root@ip-10-75-180-241 fedora]# umount /var/www/html 
[root@ip-10-75-180-241 fedora]# time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
mount.nfs: timeout set for Wed Jan  1 16:14:46 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    3m0.043s
user    0m0.004s
sys     0m0.810s

[root@ip-10-75-180-241 fedora]# umount /var/www/html
[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=150,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:20:02 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:22:02 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=150,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    0m45.041s
user    0m0.002s
sys     0m0.127s

[root@ip-10-75-180-241 fedora]# umount /var/www/html
[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=150,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:22:12 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:24:12 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=150,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    0m45.038s
user    0m0.000s
sys     0m0.137s

[root@ip-10-75-180-241 fedora]# umount /var/www/html
[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:23:07 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:25:07 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    3m0.039s
user    0m0.003s
sys     0m0.503s

[root@ip-10-75-180-241 fedora]# umount /var/www/html
[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=150,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:26:40 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:28:40 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=150,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    0m45.037s
user    0m0.001s
sys     0m0.235s

[root@ip-10-75-180-241 fedora]# umount /var/www/html
[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=900,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:28:02 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:30:02 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=900,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    4m30.042s
user    0m0.002s
sys     0m0.751s

[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=1,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:42:50 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:44:50 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=1,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    0m0.339s
user    0m0.002s
sys     0m0.010s

[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=10,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:43:17 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:45:17 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=10,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    0m3.033s
user    0m0.001s
sys     0m0.041s

[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=0,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:49:58 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:51:58 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=0,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    1m0.039s
user    0m0.002s
sys     0m0.171s

[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=900,retrans=0,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Wed Jan  1 04:51:53 PM UTC 2025
mount.nfs: timeout set for Wed Jan  1 16:53:53 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=900,retrans=0,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    1m30.042s
user    0m0.004s
sys     0m0.420s

Comment 3 Steve Dickson 2025-01-02 11:21:30 UTC

Would it be possible to get a network trace via 'sudo tshark -o /tmp/trace.pcap host <hostname>'

Comment 4 Ryan 2025-01-03 04:06:46 UTC

Created attachment 2064569 [details]
network trace

[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,soft,timeo=150,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Fri Jan  3 03:59:39 AM UTC 2025
mount.nfs: timeout set for Fri Jan  3 04:01:39 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,soft,timeo=150,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    0m45.038s
user    0m0.003s
sys     0m0.127s

Comment 5 Steve Dickson 2025-01-03 09:44:45 UTC

(In reply to Ryan from comment #4)
> Created attachment 2064569 [details]
> network trace
> 
> [root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o
> nfsvers=4.1,rsize=1048576,wsize=1048576,soft,timeo=150,retrans=2,noresvport
> fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
> Fri Jan  3 03:59:39 AM UTC 2025
> mount.nfs: timeout set for Fri Jan  3 04:01:39 2025
> mount.nfs: trying text-based options
> 'nfsvers=4.1,rsize=1048576,wsize=1048576,soft,timeo=150,retrans=2,noresvport,
> addr=10.75.180.183,clientaddr=10.75.180.241'
> 
> real    0m45.038s
> user    0m0.003s
> sys     0m0.127s

The server (10.75.180.183) is rejecting the TCP
connection due to a RECLAIM_COMPLETE loop... 

Then the session times out (NFS4ERR_BADSESSION) 
and the mount continues... 

Hmm... this only started to happen on a 6.12 kernel.

Comment 6 Benjamin Coddington 2025-01-03 13:08:41 UTC

I don't see a RECLAIM_COMPLETE loop, rather the server seems triggered by something the client sends and just starts resetting any TCP connection attempts.  The last thing the client sent was a RECLAIM_COMPLETE.

I think you'll need to bisect the client to find whats upsetting the server, but more likely it seems theres some very sensitive IPS-like behavior on the server side.  It would be much easier to check out the server logs to find why the server never responds to RECLAIM_COMPLETE and instead starts resetting all connection attempts.

Comment 7 Olga Kornievskaia 2025-01-03 16:08:35 UTC

Looks like a broken client. It starts off by sending exchange_id, create_session (all great), then it skips sending putrootfh compound (?). Then sends a reclaim_complete followed by some garbage rpc bytes which I think makes the server to reset the connection. After TCP handshake is done (and on each of the connection attempts) the client resends reclaim_complete compound with some garbage at the end which wireshark doesn't event parse as an NFS operation (but frame 417 is a good one to see wireshark parsing but you still see garbage bytes in the end). I dont know why this keeps happening and then suddenly in frame 53921 client sends a properly formatted reclaim_complete which gets badsession and etc since (from server's point of view lease hasn't been renewed).

Comment 8 Scott Mayhew 2025-01-03 18:13:01 UTC

Just an observation but the last RPC operation before things go off the rails isn't the RECLAIM_COMPLETE in frame 14, but a LOCALIOPROC_NULL in frame 15.  It seems wireshark isn't able to decode those packets yet (they show up as continuation packets on my system):

0000   80 00 00 28 70 9e 22 ef 00 00 00 00 00 00 00 02
       fraghdr     xid         msgtyp      rpcver
0010   00 06 1a fa 00 00 00 01 00 00 00 00 00 00 00 00
       rpcprog     rpcprogver  flavor      len
0020   00 00 00 00 00 00 00 00 00 00 00 00
       flavor      len         localioproc_null

The xid (0x709e22ef) is the 1 after the xid in the RECLAIM_COMPLETE (0x6f9e22ef) and the program number is 0x61afa (400122) which is NFS_LOCALIO_PROGRAM, so maybe the NFS server doesn't like the LOCALIO probe?

Comment 9 Scott Mayhew 2025-01-03 21:40:43 UTC

Piggybacking on what Ben said, maybe there's some sort of IPS feature on the NFS server that's looking at packets being sent to port 2049 and resetting the connection because it's seeing something other than 0x186a3 (100003) in the RPC program number field.

It might be worthwhile disabling the NFS LOCALIO feature and see if that makes the mounts happen faster.

# echo "options nfs localio_enabled=0" > /etc/modprobe.d/disable-nfs-localio.conf
# reboot
# cat /sys/module/nfs/parameters/localio_enabled

Make sure it says 'N' and then try mounting again.

Comment 10 Ryan 2025-01-03 21:54:19 UTC

> It might be worthwhile disabling the NFS LOCALIO feature and see if that
> makes the mounts happen faster.
> 

Thanks, Scott.  That does appear to fix the problem!


[root@ip-10-75-180-241 fedora]# uname -a
Linux ip-10-75-180-241 6.12.6-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 19 21:06:34 UTC 2024 x86_64 GNU/Linux

[root@ip-10-75-180-241 fedora]# cat /sys/module/nfs/parameters/localio_enabled
N

[root@ip-10-75-180-241 fedora]# date ; time mount -v -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,soft,timeo=150,retrans=2,noresvport fs-XXXX.efs.us-east-1.amazonaws.com:/ /var/www/html/ 
Fri Jan  3 09:52:30 PM UTC 2025
mount.nfs: timeout set for Fri Jan  3 21:54:30 2025
mount.nfs: trying text-based options 'nfsvers=4.1,rsize=1048576,wsize=1048576,soft,timeo=150,retrans=2,noresvport,addr=10.75.180.183,clientaddr=10.75.180.241'

real    0m0.037s
user    0m0.001s
sys     0m0.005s

Comment 11 Steve Dickson 2025-01-06 15:02:47 UTC

Can you give more details the server?

If it is a Linux server what distro is the
server and the version (NVR) is the server. 

Again, if it is a Linux server, has the server
been modified

Comment 12 Ryan 2025-01-06 15:23:53 UTC

(In reply to Steve Dickson from comment #11)
> Can you give more details the server?
> 
> If it is a Linux server what distro is the
> server and the version (NVR) is the server. 
> 
> Again, if it is a Linux server, has the server
> been modified
>

The server is an Amazon (AWS) Elastic File System (EFS) (https://en.wikipedia.org/wiki/Amazon_Elastic_File_System) volume.  I don't have access to the server, but I suspect it's highly modified :).

I opened a support case on their end after we discovered this is related to LOCALIO.

Comment 13 Steve Dickson 2025-01-06 15:33:52 UTC

(In reply to Ryan from comment #12)
> (In reply to Steve Dickson from comment #11)
> > Can you give more details the server?
> > 
> > If it is a Linux server what distro is the
> > server and the version (NVR) is the server. 
> > 
> > Again, if it is a Linux server, has the server
> > been modified
> >
> 
> The server is an Amazon (AWS) Elastic File System (EFS)
> (https://en.wikipedia.org/wiki/Amazon_Elastic_File_System) volume.  I don't
> have access to the server, but I suspect it's highly modified :).
Me Too! ;-) 

> 
> I opened a support case on their end after we discovered this is related to
> LOCALIO.

Do we (the NFS team) have access to that support case?

Comment 14 Mike Snitzer 2025-01-07 18:46:04 UTC

(In reply to Ryan from comment #12)
> (In reply to Steve Dickson from comment #11)
> > Can you give more details the server?
> > 
> > If it is a Linux server what distro is the
> > server and the version (NVR) is the server. 
> > 
> > Again, if it is a Linux server, has the server
> > been modified
> >
> 
> The server is an Amazon (AWS) Elastic File System (EFS)
> (https://en.wikipedia.org/wiki/Amazon_Elastic_File_System) volume.  I don't
> have access to the server, but I suspect it's highly modified :).
> 
> I opened a support case on their end after we discovered this is related to
> LOCALIO.

I'm happy to work with the team at AWS to support LOCALIO (and explain/justify it if need be).

Comment 15 Ryan 2025-01-13 03:24:12 UTC

(In reply to Mike Snitzer from comment #14)

> I'm happy to work with the team at AWS to support LOCALIO (and
> explain/justify it if need be).

Thank you, Mike.  I'm not sure how to facilitate this.  Do you want me to pass along your contact info in the AWS ticket?

Comment 16 Ryan 2025-01-17 15:43:59 UTC

I received an update to my AWS ticket:

I would like to inform you that I've received the following update from Amazon EFS internal team.

"We are able to reproduce the issue and we are working on the solution. In order to getting the customer out of pain, they can temporarily disable LOCALIO on the client instance editing /sys/module/nfs/parameters/localio_enabled with N."

Comment 17 Mike Snitzer 2025-02-25 21:07:36 UTC

(In reply to Ryan from comment #15)
> (In reply to Mike Snitzer from comment #14)
> 
> > I'm happy to work with the team at AWS to support LOCALIO (and
> > explain/justify it if need be).
> 
> Thank you, Mike.  I'm not sure how to facilitate this.  Do you want me to
> pass along your contact info in the AWS ticket?

Yes, that would be perfect.  Please share all 3 of my emails:

snitzer
snitzer
snitzer

Apologies for not responding until now.

Comment 18 Mike Snitzer 2025-06-04 17:34:37 UTC

This issue should be resolved now with the latest upstream kernel via commit 1ff4716f420b ("NFS: always probe for LOCALIO support asynchronously").
See: https://git.kernel.org/linus/1ff4716f420b

Note You need to log in before you can comment on or make changes to this bug.