Bug 740024 - nfs: Mount occasionally fails with EIO
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: i686
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Jeff Layton
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-09-20 18:11 UTC by Anders Blomdell
Modified: 2014-06-18 07:41 UTC

Doc Type: Bug Fix
Last Closed: 2011-09-21 12:58:51 UTC


Attachments
Script to trigger mount bug (405 bytes, text/plain)
2011-09-20 18:13 UTC, Anders Blomdell

Description Anders Blomdell 2011-09-20 18:11:37 UTC
Description of problem:

After many mount/unmount cycles in quick succession, mount starts failing with EIO.

Version-Release number of selected component (if applicable):

2.6.40-4.fc15.i686.PAE 


How reproducible:

Always

Steps to Reproduce:
1. mkdir /x
2. echo '/x localhost(ro,async)' >> /etc/exports
3. exportfs -a
4. Run attached script
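
The attached script itself is not reproduced in this report. As a rough stand-in, the loop below sketches the kind of mount/unmount cycle the description implies. Everything in it is an assumption: the function name `mount_until_failure`, the `vers=4` option (guessed from the strace output further down), and the requirement that /tmp/mnt already exists and the caller is root.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's exit status.
Runner = Callable[[Sequence[str]], int]


def mount_until_failure(run: Runner = subprocess.call,
                        source: str = "localhost:/x",
                        target: str = "/tmp/mnt",
                        max_iterations: int = 10000) -> int:
    """Mount and immediately unmount in a loop (hypothetical reproducer).

    Returns the number of successful mounts before the first failure,
    or max_iterations if no mount failed.
    """
    for done in range(max_iterations):
        # Assumed command line; the original script may have differed.
        if run(["mount", "-t", "nfs", "-o", "vers=4", source, target]) != 0:
            return done
        run(["umount", target])
    return max_iterations
```

Calling `mount_until_failure()` as root would run the real commands; passing a custom `run` callable makes it possible to dry-run the loop without touching the system.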
  
Actual results:

Output:

mount("localhost:/x", "/tmp/mnt", "nfs", 0, "vers=4,addr=127.0.0.1,clientaddr"...) = 0
mount("localhost:/x", "/tmp/mnt", "nfs", 0, "vers=4,addr=127.0.0.1,clientaddr"...) = 0
mount("localhost:/x", "/tmp/mnt", "nfs", 0, "vers=4,addr=127.0.0.1,clientaddr"...) = 0
mount("localhost:/x", "/tmp/mnt", "nfs", 0, "vers=4,addr=127.0.0.1,clientaddr"...) = -1 EIO (Input/output error)
mount.nfs: mount system call failed
mount("localhost:/x", "/tmp/mnt", "nfs", 0, "vers=4,addr=127.0.0.1,clientaddr"...) = -1 EIO (Input/output error)
mount.nfs: mount system call failed

Dmesg:

[459671.539903] RPC:       created transport e3986000 with 16 slots
[459671.539929] RPC: 53234 reserved req e3a2d000 xid e2f61e3e
[459671.539975] RPC: 53234 xprt_connect xprt e3986000 is not connected
[459671.540170] RPC: 53234 xprt_connect_status: error 98 connecting to server localhost
[459671.540177] RPC: 53234 release request e3a2d000
[459671.540195] RPC:       destroying transport e3986000
[459671.540205] RPC:       disconnected transport e3986000
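
The `error 98` in the trace above is a raw errno value. On Linux, 98 is EADDRINUSE, which is the error that ends up being reported to userspace as EIO. A minimal sketch for decoding such values, assuming it is run on the same platform as the kernel that produced the log (errno numbering differs between operating systems); `describe_errno` is an illustrative name:

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (message)' for a numeric errno value."""
    # errno.errorcode maps numbers to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# On a Linux host this decodes the value seen in the dmesg output.
print(describe_errno(98))
```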

Expected results:

infinite number of mounts

Additional info:

The problem also manifests on 2.6.35.13-91.fc14.i686.PAE. The only workaround I have found is:

    echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
    echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse

Comment 1 Anders Blomdell 2011-09-20 18:13:05 UTC
Created attachment 524079
Script to trigger mount bug

Comment 2 Jeff Layton 2011-09-20 19:36:48 UTC
(In reply to comment #0)

> Expected results:
> 
> infinite number of mounts
> 

Sounds like reserved port exhaustion. Does this problem go away if you do the mounts with "-o resvport"?

Comment 3 Anders Blomdell 2011-09-20 20:45:46 UTC
"-o resvport" gives the same problem. With "-o noresvport" the number of sockets in TIME_WAIT seems to stabilize around 700; I will see if I can increase the mount rate to trigger the bug.
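
One way to watch the TIME_WAIT count is to parse /proc/net/tcp, where each row carries the local address as hex `ADDR:PORT` and the socket state as a hex code (06 is TIME_WAIT). The sketch below is only an illustration; `count_time_wait` is a hypothetical helper, and it covers IPv4 only (IPv6 sockets live in /proc/net/tcp6).

```python
from typing import Iterable

TCP_TIME_WAIT = 0x06  # state code used in /proc/net/tcp


def count_time_wait(lines: Iterable[str], max_port: int = 1023) -> int:
    """Count TIME_WAIT sockets whose local port is <= max_port."""
    count = 0
    for line in lines:
        fields = line.split()
        # Skip the header row and anything that is not a socket entry.
        if len(fields) < 4 or ":" not in fields[1]:
            continue
        try:
            local_port = int(fields[1].rsplit(":", 1)[1], 16)
            state = int(fields[3], 16)
        except ValueError:
            continue
        if state == TCP_TIME_WAIT and local_port <= max_port:
            count += 1
    return count


# Usage on a live system:
#   with open("/proc/net/tcp") as f:
#       print(count_time_wait(f))                   # reserved ports only
#   with open("/proc/net/tcp") as f:
#       print(count_time_wait(f, max_port=65535))   # all ports
```

The default `max_port=1023` restricts the count to reserved ports, which is the pool that matters when mounting with resvport.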

Comment 4 Anders Blomdell 2011-09-21 06:41:31 UTC
Would it make sense to convert EADDRINUSE to EAGAIN in xprt_connect_status, in the vain hope that it would eventually retry? Or would it make more sense to allow socket reuse (sock->sk->sk_reuse?), should it then be made an option?
Or should I just accept that the problem exists and use noresvport and cross my fingers that it won't bite me again?

Comment 5 Jeff Layton 2011-09-21 11:18:18 UTC
(In reply to comment #4)

Sorry, I meant -o noresvport before...

> Would it make sense to convert EADDRINUSE to EAGAIN in xprt_connect_status, in
> the vain hope that it would eventually retry?

I don't think that will work.

> Or would it make more sense to allow socket reuse (sock->sk->sk_reuse?),
> should it then be made an option?

I suspect that would get a chilly reception upstream, but you're welcome to propose it there. You'll need some way to deal with "stray" packets that come in after the socket has been closed and reused.

> Or should I just accept that the problem exists and use noresvport and cross my
> fingers that it won't bite me again?

That's probably what I'd recommend. Reserved ports are a limited resource so you'll never get an infinite number of connections with them. Note that the reproducer you have represents the pessimal case. It mounts only to immediately unmount again.
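
To put rough numbers on "limited resource", here is a back-of-the-envelope sketch. It rests on assumptions not stated in this report: that the RPC client draws source ports from 665 to 1023 (believed to be the usual defaults of /proc/sys/sunrpc/min_resvport and max_resvport, worth checking on the system in question), that a closed connection keeps its port in TIME_WAIT for about 60 seconds, and that nothing else is competing for the same ports.

```python
def reserved_port_budget(min_port: int = 665, max_port: int = 1023) -> int:
    """Number of ports in the (assumed) reserved range, inclusive."""
    return max_port - min_port + 1


def max_sustained_connect_rate(min_port: int = 665,
                               max_port: int = 1023,
                               time_wait_s: float = 60.0) -> float:
    """Upper bound on new connections per second before ports run out.

    Each closed connection is assumed to block its port for time_wait_s
    seconds, so at most `budget` connections fit in any such window.
    """
    return reserved_port_budget(min_port, max_port) / time_wait_s


print(reserved_port_budget())                    # ports available
print(round(max_sustained_connect_rate(), 2))    # connections per second
```

Under these assumptions the budget is 359 ports, or roughly six new connections per second, and less in practice when other services also take reserved ports.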

If you have multiple mounts to the same server, then sockets will be shared between them. So, for instance if you were to loop and do a ton of mounts to one server and then another loop to do a ton of unmounts then that will just make one socket (and probably will share superblocks too).

What exactly were you doing when you got "bitten" by this? Do you really have that many individual NFS servers? Or were you mounting and unmounting in quick succession like this for some other reason?

Comment 6 Anders Blomdell 2011-09-21 12:50:33 UTC
(In reply to comment #5)
> (In reply to comment #4)
> 
> Sorry, I meant -o noresvport before...
I forgot a smiley :-)

> > Would it make sense to convert EADDRINUSE to EAGAIN in xprt_connect_status, in
> > the vain hope that it would eventually retry?
> 
> I don't think that will work.
OK, I won't try that at home then :-)

> > Or would it make more sense to allow socket reuse (sock->sk->sk_reuse?),
> > should it then be made an option?
> 
> I suspect that would get a chilly reception upstream, but you're welcome to
> propose it there. You'll need some way to deal with "stray" packets that come
> in after the socket has been closed and reused.
I'm not surprised.

> > Or should I just accept that the problem exists and use noresvport and cross my
> > fingers that it won't bite me again?
> 
> That's probably what I'd recommend. Reserved ports are a limited resource so
> you'll never get an infinite number of connections with them. Note that the
> reproducer you have represents the pessimal case. It mounts only to immediately
> unmount again.
I know, this was the reproducible case :-)

Would have been easier to track down if EADDRINUSE did not get transmogrified into EIO.

> If you have multiple mounts to the same server, then sockets will be shared
> between them. So, for instance if you were to loop and do a ton of mounts to
> one server and then another loop to do a ton of unmounts then that will just
> make one socket (and probably will share superblocks too).
> 
> What exactly were you doing when you got "bitten" by this? Do you really have
> that many individual NFS servers? Or were you mounting and unmounting in quick
> succession like this for some other reason?

It usually only bites during busy nights (backups, etc.) when ypbind and ypserv seem to eat reserved ports (judging from the TIME_WAIT status during light load).

    echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
    echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse

is an OK workaround here: the server is on the local network with only switches in between, so stray packets should be very rare. The long-term goal is to get rid of YP/NIS, so the problem will diminish by itself.

Now the bug will show up in Google and hopefully save somebody else some time. You can close this bug as you see fit!

Comment 7 Jeff Layton 2011-09-21 12:58:51 UTC
OK, I think it's probably safe enough to use noresvport, and that's probably safer than messing around with the tcp_tw parameters. Reserved ports don't give much in the way of security these days anyway. We're stuck, though, with reserved ports as the default for NFS since that's what the RFC specifies.

I'll go ahead and close this WONTFIX.

