Bug 1392133 - nfs client mount fails to fallback to NFSv3 when talking to Ganesha server
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: nfs-ganesha
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kaleb KEITHLEY
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-11-04 23:43 UTC by Valdis Kletnieks
Modified: 2016-11-08 00:05 UTC (History)

Last Closed: 2016-11-07 19:37:47 UTC
Type: Bug


Attachments
tcpdump of failing NFS mount. (15.49 KB, application/octet-stream)
2016-11-04 23:43 UTC, Valdis Kletnieks

Description Valdis Kletnieks 2016-11-04 23:43:59 UTC
Created attachment 1217534
tcpdump of failing NFS mount.

Description of problem:
(This may be an upstream problem, or a server-end problem. We're seeing the same problem on multiple client machines)

We are in the process of deploying a Ganesha-based NFS server (RHEL 7.1, nfs-ganesha 2.3.2).  Due to internal issues that mean multiple auth servers would be involved, we're *not* doing NFSv4, and only NFSv3 (we can fortunately isolate the communities involved to different exports, so there's no issue with UID/GID namespace overlap).  As a result, although the NFS server is NFSv4 capable, there are no v4 exports, and the server is configured for 'default to nfsv3'.

If I do a 'mount server.name:/export/path/here /mnt/nfs', it goes astray:

Client end defaults to trying NFS 4.

Client sends nfsv4 NULL call with AUTH_NULL.
Server replies with 'accepted' and Accept State: program can't support procedure (3).

Client waits 4 seconds and retries the exact same thing, rather than dropping back to NFSv3.  Lather, rinse, repeat until the mount command times out.

Specifying 'mount -o nfsvers=3' makes the mount work.
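
For reference, the "Accept State" shown in the reply is the accept_stat field of an ONC RPC reply, a small enumeration defined in RFC 5531, in which value 3 is PROC_UNAVAIL. A minimal Python sketch of decoding it from a raw reply follows. The sample bytes are built by hand for illustration (they are not taken from the attached tcpdump), and the helper name parse_accept_stat is hypothetical.

```python
import struct
from enum import IntEnum


class AcceptStat(IntEnum):
    """accept_stat values from RFC 5531 (ONC RPC version 2)."""
    SUCCESS = 0        # RPC executed successfully
    PROG_UNAVAIL = 1   # remote hasn't exported the program
    PROG_MISMATCH = 2  # remote can't support the version number
    PROC_UNAVAIL = 3   # program can't support the procedure
    GARBAGE_ARGS = 4   # procedure can't decode the parameters
    SYSTEM_ERR = 5     # e.g. memory allocation failure


MSG_REPLY = 1      # msg_type value for a reply
MSG_ACCEPTED = 0   # reply_stat value for an accepted call


def parse_accept_stat(reply: bytes) -> AcceptStat:
    """Return accept_stat from an accepted RPC reply.

    Assumes the TCP record marker has already been stripped.
    """
    _xid, msg_type, reply_stat = struct.unpack_from(">III", reply, 0)
    if msg_type != MSG_REPLY or reply_stat != MSG_ACCEPTED:
        raise ValueError("not an accepted RPC reply")
    # Verifier: flavor, then opaque body length; the body is padded to 4 bytes.
    _flavor, verf_len = struct.unpack_from(">II", reply, 12)
    offset = 20 + ((verf_len + 3) & ~3)
    (stat,) = struct.unpack_from(">I", reply, offset)
    return AcceptStat(stat)


# Hand-built reply: xid, REPLY, MSG_ACCEPTED, AUTH_NULL verifier
# (flavor 0, length 0), accept_stat 3.
sample_reply = struct.pack(">IIIIII", 0x12345678, MSG_REPLY, MSG_ACCEPTED, 0, 0, 3)
print(parse_accept_stat(sample_reply).name)
```

With the hand-built sample this reports PROC_UNAVAIL, which matches the "program can't support procedure (3)" text that wireshark displays.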


Version-Release number of selected component (if applicable):
nfs-utils-1.3.4-1.rc2.fc26

tcpdump attached.


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 J. Bruce Fields 2016-11-07 13:23:00 UTC
"Client sends nfsv4 NULL call with AUTH_NULL. Server replies with 'accepted' and Accept State: program can't support procedure (3)."

That response makes no sense (the client sent procedure 0, not procedure 3), so there's almost certainly some simple Ganesha bug here.  The client could possibly handle this better (e.g., erroring out instead of retrying), but I think for such a clear server bug it's not worth it.

By the way: "As a result, although the NFS server is NFSv4 capable, there are no v4 exports, and the server is configured for 'default to nfsv3'."

A 'default to nfsv3' server configuration doesn't really make sense: There is no way in the protocol for a server to indicate that it supports v4 but prefers v3, so if you want clients to negotiate down then v4 needs to be completely off.

Comment 2 Soumya Koduri 2016-11-07 14:23:20 UTC
Which version of nfs-ganesha are you using? I checked the behaviour using the latest nfs-ganesha-2.4 branch. The client seems to be falling back to NFSv3.

Comment 3 Valdis Kletnieks 2016-11-07 18:29:25 UTC
The client didn't request procedure 3.

The server sent back a result code 3 which is "program can't support procedure".

Or so says wireshark when looking at the data stream.

Comment 4 J. Bruce Fields 2016-11-07 18:59:14 UTC
(In reply to Valdis Kletnieks from comment #3)
> The client didn't request procedure 3.
> 
> The server sent back a result code 3 which is "program can't support
> procedure".
> 
> Or so says wireshark when looking at the data stream.

D'oh, you're right, I don't know what I was thinking when I said that.

Anyway, that's still a nonsensical response, support for procedure 0 is mandatory, so it should be PROG_MISMATCH if it doesn't support NFS version 4.
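
To illustrate why the distinction matters: per RFC 5531, a PROG_MISMATCH reply carries the lowest and highest program versions the server supports, which gives a client something to negotiate down to, while PROC_UNAVAIL carries no version information at all. The Python sketch below shows that reasoning only. It is not the Linux client's actual logic, and the helper name next_version_to_try is hypothetical.

```python
from enum import IntEnum
from typing import Optional


class AcceptStat(IntEnum):
    """Subset of accept_stat values from RFC 5531."""
    SUCCESS = 0
    PROG_UNAVAIL = 1
    PROG_MISMATCH = 2
    PROC_UNAVAIL = 3


def next_version_to_try(requested: int, stat: AcceptStat,
                        low: Optional[int] = None,
                        high: Optional[int] = None) -> Optional[int]:
    """Hypothetical helper: choose an NFS version after a NULL-call reply.

    Returns the version to use next, or None when the reply gives
    no basis for picking one.
    """
    if stat == AcceptStat.SUCCESS:
        return requested  # the server speaks this version; keep it
    if stat == AcceptStat.PROG_MISMATCH and low is not None and high is not None:
        # The server reported its supported range; take the highest
        # version in that range that is below the one just refused.
        candidate = min(high, requested - 1)
        return candidate if candidate >= low else None
    # PROC_UNAVAIL for procedure 0 (NULL) says nothing about versions,
    # so there is nothing to fall back to.
    return None


print(next_version_to_try(4, AcceptStat.PROG_MISMATCH, low=2, high=3))
print(next_version_to_try(4, AcceptStat.PROC_UNAVAIL))
```

In this sketch a PROG_MISMATCH reply advertising versions 2 through 3 leads to a retry with version 3, while the PROC_UNAVAIL reply seen in this bug yields None.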

Comment 5 Valdis Kletnieks 2016-11-07 19:23:38 UTC
(In reply to J. Bruce Fields from comment #4)
> Anyway, that's still a nonsensical response, support for procedure 0 is
> mandatory, so it should be PROG_MISMATCH if it doesn't support NFS version 4.

Aha.  That's probably the smoking gun.

And deeper digging indicates that our copy of nfs-ganesha came from IBM, which presents the possibility that they broke it somehow.  I'll take it up with them.

Comment 6 J. Bruce Fields 2016-11-07 19:37:47 UTC
Thanks, in that case I think we should close this; reopen if it turns out there's some problem on our side after all.

Comment 7 Valdis Kletnieks 2016-11-07 19:49:03 UTC
OK, that works for me, will re-open if needed.

Comment 8 Frank Filz 2016-11-08 00:05:51 UTC
Hmm, bugs COULD have crept into Ganesha that make it not work right if NFS v4 is disabled. It SHOULD NOT register with rpcbind for NFS v4 if only Protocols=3 is defined in NFS_CORE_PARAM.

If that isn't the case, but all exports are Protocols=3 only, I think Ganesha will still build a PseudoFS with only a root directory, but in that case, it should "work" just not be very interesting...
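
One way to check the rpcbind registration is to look at the output of `rpcinfo -p` for the server and see which versions are listed for the NFS program (number 100003). A minimal Python sketch of that check follows. The sample text is made up for illustration and was not captured from the server in this bug, and the helper name registered_nfs_versions is hypothetical.

```python
NFS_PROGRAM = 100003  # ONC RPC program number assigned to NFS


def registered_nfs_versions(rpcinfo_output: str) -> set:
    """Collect the NFS versions listed in `rpcinfo -p` style output."""
    versions = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port service
        if len(fields) >= 4 and fields[0].isdigit() and fields[1].isdigit():
            if int(fields[0]) == NFS_PROGRAM:
                versions.add(int(fields[1]))
    return versions


# Made-up sample for illustration only.
sample = """\
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100003    3   udp   2049  nfs
    100003    3   tcp   2049  nfs
    100005    3   tcp  20048  mountd
"""
print(sorted(registered_nfs_versions(sample)))
```

If version 4 shows up in the real output even though only Protocols=3 is configured, that would point at the registration problem described above.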

