Red Hat Bugzilla – Full Text Bug Listing
|Summary:||NFS to NFSv3 server is broken in F-13|
|Product:||[Fedora] Fedora||Reporter:||Tom Lane <tgl>|
|Component:||nfs-utils||Assignee:||Steve Dickson <steved>|
|Status:||CLOSED NOTABUG||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||13||CC:||dan, gmrandazzo, hhorak, jlayton, steved, zkabelac|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2010-10-15 13:47:03 EDT||Type:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
Description Tom Lane 2010-06-20 13:10:17 EDT
Description of problem:
I just updated to F-13 and find myself once again unable to access my old UDP-only NFSv3 server. Experimentation suggests that NFSv3 support is completely broken now, because if I try to do the mount manually rather than via automount, I get this:

$ sudo mount -t nfs -s -v -o nosuid,nodev,intr sss2:/ /tmp/zzz
mount.nfs: timeout set for Sun Jun 20 13:06:09 2010
mount.nfs: trying text-based options 'intr,sloppy,vers=4,addr=192.168.168.3,clientaddr=192.168.168.8'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'intr,sloppy,vers=4,addr=192.168.168.3,clientaddr=192.168.168.8'
mount.nfs: mount(2): Connection refused
... (repeat till timeout) ...

The vers=4 bit looks like a smoking gun :-( ...

Version-Release number of selected component (if applicable):
nfs-utils-1.2.2-2.fc13.x86_64
nfs-utils-lib-1.1.5-1.fc13.x86_64

How reproducible: 100%

Steps to Reproduce:
1. Try to mount remote NFS volume from NFSv3, UDP-only server.

Actual results: Times out.

Expected results: Success.

Additional info: See bug #528776 for details of the configuration involved here.
Comment 1 Steve Dickson 2010-06-22 07:02:08 EDT
What is the output of showmount -e sss2 and rpcinfo -p sss2?
Comment 2 Tom Lane 2010-06-23 00:33:50 EDT
$ showmount -e sss2
Export list for sss2:
/     sss,rh1,rh2,hp715,g3,g42,pro
/home sss,rh1,rh2,hp715,g3,g42,pro
/opt  sss,rh1,rh2,hp715,g3,g42,pro
/tmp  sss,rh1,rh2,hp715,g3,g42,pro
/usr  sss,rh1,rh2,hp715,g3,g42,pro
/var  sss,rh1,rh2,hp715,g3,g42,pro

(btw, rh2 is the F-13 machine that's failing here)

[tgl@rh2 ~]$ rpcinfo -p sss2
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp    855  status
    100024    1   tcp    857  status
    100021    1   tcp    861  nlockmgr
    100021    1   udp   1030  nlockmgr
    100021    3   tcp    865  nlockmgr
    100021    3   udp   1031  nlockmgr
    100021    4   tcp    869  nlockmgr
    100021    4   udp   1032  nlockmgr
    100020    1   udp   4045  llockmgr
    100020    1   tcp   4045  llockmgr
    100021    2   tcp    876  nlockmgr
    100099    1   udp   2155
    100068    2   udp   1036
    100068    3   udp   1036
    100068    4   udp   1036
    100068    5   udp   1036
    100083    1   tcp   1036
    351456    1   udp    847
    351456    1   tcp    849
    100005    1   udp    889  mountd
    100005    3   udp    889  mountd
    100005    1   tcp    892  mountd
    100005    3   tcp    892  mountd
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
1342177279    4   tcp   1056
1342177279    1   tcp   1056
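The key detail in the rpcinfo listing is which versions and transports the nfs program (100003) is registered for. As a rough sketch, assuming the output has been saved as text, a few lines of Python can pull that out. The function name nfs_offerings and the shortened SAMPLE text are illustrative only and are not part of nfs-utils.

```python
from collections import defaultdict

NFS_PROGRAM = 100003  # ONC RPC program number registered for NFS


def nfs_offerings(rpcinfo_text):
    """Return {version: {protocols}} for the NFS program in `rpcinfo -p` output."""
    offered = defaultdict(set)
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port [service]
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip the header row and anything malformed
        program, version, proto = int(fields[0]), int(fields[1]), fields[2]
        if program == NFS_PROGRAM:
            offered[version].add(proto)
    return dict(offered)


# Shortened sample: a subset of the rows reported in this comment.
SAMPLE = """\
program vers proto port service
100000 2 tcp 111 portmapper
100005 3 udp 889 mountd
100005 3 tcp 892 mountd
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
"""

if __name__ == "__main__":
    for version, protos in sorted(nfs_offerings(SAMPLE).items()):
        print(f"NFS v{version}: {', '.join(sorted(protos))}")
```

For this server the result contains only versions 2 and 3, each over udp: there is no NFS v4 registration and no NFS registration over TCP.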
Comment 3 Giuseppe Marco Randazzo 2010-07-06 12:49:22 EDT
Hi! Edit /etc/nfsmount.conf and, in the "[ NFSMount_Global_Options ]" section, set "Defaultvers=3". That works for me.
Comment 4 Tom Lane 2010-07-08 12:19:42 EDT
I can confirm that the workaround in comment #3 gets things going again for me.
Comment 5 Steve Dickson 2010-10-14 11:11:32 EDT
My apologies for disappearing on this... comment #3 is the workaround... but I would like to figure out the problem... I see your server is only advertising v2 and v3. What OS is your server running? Also, could you post a bzip2-compressed binary network trace, captured with something similar to:

yum install wireshark
tshark -w /tmp/data.pcap host <server>
bzip2 /tmp/data.pcap

tia....
Comment 6 Tom Lane 2010-10-14 11:21:44 EDT
It's HPUX 10.20, probably fifteen years old at this point. I'm not sure whether anybody still cares about compatibility with that --- I'm perfectly willing to use the Defaultvers workaround.
Comment 7 Steve Dickson 2010-10-14 11:29:06 EDT
Understood... but if we figure out why mounts are not dialling back from v4 to v3 automatically, I could fix it in an upcoming release. Meaning things would just work out of the box... Unfortunately I don't have access to an HPUX machine, so if possible, could you please post a network trace as described in comment #5? It would be much appreciated!
Comment 9 Tom Lane 2010-10-14 17:46:12 EDT
OK, there you go. This is a tshark trace of the following interaction:

[tgl@rh3 ~]$ ls /net/sss2
home/ opt/ tmp/ usr/ var/
[tgl@rh3 ~]$ ls /net/sss2/home
ls: cannot open directory /net/sss2/home: No such file or directory
[tgl@rh3 ~]$

Each command sat for a minute or two before responding. Note that what the first command is reporting is the names of exported volumes on the server, but not any of the loose files that are in the server's root directory.

Other than the delay, the symptoms look very very much like bug #528776, which you might want to consult for additional details about my setup and the expected results from these commands. You commented there that the critical point was lack of TCP support in this server, not so much the NFS protocol version.

Again, setting Defaultvers=3 in /etc/nfsmount.conf solves the problem immediately.
Comment 10 Steve Dickson 2010-10-14 18:47:05 EDT
Thank you for taking the time... I'll need to digest this but I do appreciate you making the effort!
Comment 11 Giuseppe Marco Randazzo 2010-10-15 04:09:35 EDT
Alternatively, if you do not set Defaultvers=3 in nfsmount.conf, you can edit /etc/fstab and specify the NFS version per mount, as in this line:

18.104.22.168:/home/pippa /mnt/remote_pippa nfs defaults,nfsvers=3 0 0

That way you can mount various NFS shares with various versions \o/ ;)
END
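To check which NFS version a given fstab entry pins, the options field can be split apart as in the sketch below. The helper names and the example entry (fileserver.example exporting /export/home) are hypothetical. The sketch relies on nfsvers= and vers= being alternative spellings of the same option, as documented in nfs(5).

```python
def parse_fstab_line(line):
    """Split one fstab entry into its device, mount point, type and options."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 whitespace-separated fields")
    device, mountpoint, fstype, raw_options = fields[:4]
    options = {}
    for item in raw_options.split(","):
        key, _, value = item.partition("=")
        # Flag-style options such as "defaults" carry no value.
        options[key] = value if value else True
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options,
    }


def requested_nfs_version(entry):
    """Return the pinned NFS version as a string, or None if not pinned."""
    for key in ("nfsvers", "vers"):
        if key in entry["options"]:
            return entry["options"][key]
    return None


if __name__ == "__main__":
    # Hypothetical entry, shaped like the fstab line in this comment.
    entry = parse_fstab_line(
        "fileserver.example:/export/home /mnt/home nfs defaults,nfsvers=3 0 0"
    )
    print(entry["mountpoint"], "->", requested_nfs_version(entry))
```

An entry without nfsvers= or vers= yields None, meaning the client's default version negotiation applies to that mount.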
Comment 12 Steve Dickson 2010-10-15 09:58:05 EDT
When doing a verbose mount (i.e. mount -v) without specifying v3, do the messages displayed contain: mount.nfs: mount(2): Connection refused ?
Comment 13 Tom Lane 2010-10-15 10:41:29 EDT
Hm, sorry, I don't usually do any explicit mounts in this setup. What command do you want me to try, exactly?
Comment 14 Steve Dickson 2010-10-15 11:55:44 EDT
Sorry... I got my answer from looking at your opening description... I see the problem...

The f13 client now, by default, first tries 'NFS v4 over TCP' when initiating the mount to the server. In the past, 'NFS v3 over TCP' was tried first. The idea is for the server to return a "NO SUPPORT VERSION" error, causing the client to dial back to 'NFS v3 over TCP'. If that combination does not work, the client dials back again to 'NFS v3 over UDP'. This type of negotiation continues until 'NFS v2 over UDP' fails. Then the mount fails.

In your case the server only supports 'NFS v3 over UDP' and 'NFS v2 over UDP'. So what should happen is that the 'NFS v4 over TCP' mount attempt fails, as does the 'NFS v3 over TCP' attempt. Then the 'NFS v3 over UDP' attempt should succeed...

Here is the problem: when the f13 client sends the 'NFS v4 over TCP' request, your server fails the mount with "Connection refused". Unfortunately "Connection refused" can have multiple meanings. It can mean there is no TCP listener (which is true in this case), but it can also mean the server is down and may be on its way back up... So we must keep trying (using the same 'NFS v4 over TCP' combo), assuming the server is on the way up.

Unfortunately I don't have an answer for this problem. Actually I think this is a long-standing problem, because if the server only supported 'NFS v2 over UDP' (which is very uncommon), the same scenario would occur when the legacy mount sends the 'NFS v3 over TCP' request.

My suggestion is to use the '[ Server "Server_Name" ]' section in /etc/nfsmount.conf to make all mounts to that server default to NFS v3. See nfsmount.conf(5) for details.

I do thank you for your time... it was much appreciated.
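The fallback behaviour described in this comment can be illustrated with a toy model. This is a sketch only, not the mount.nfs source. The attempt order lists just the four combinations named above, the reply strings are made up for illustration, and max_retries is a hypothetical stand-in for the real mount timeout.

```python
# Toy model of the negotiation described in this comment; not mount.nfs code.
# Only the four combinations named in the comment are modelled.
ATTEMPT_ORDER = [(4, "tcp"), (3, "tcp"), (3, "udp"), (2, "udp")]


def server_reply(supported, has_tcp_listener, version, proto):
    """Hypothetical server: what the client sees for one mount attempt."""
    if proto == "tcp" and not has_tcp_listener:
        return "connection refused"  # nothing is listening on the TCP port
    if (version, proto) in supported:
        return "ok"
    return "version not supported"


def negotiate(supported, has_tcp_listener, max_retries=3):
    """Return ((version, proto) or None, log of every attempt made)."""
    log = []
    for version, proto in ATTEMPT_ORDER:
        reply = None
        for _ in range(max_retries):
            reply = server_reply(supported, has_tcp_listener, version, proto)
            log.append((version, proto, reply))
            if reply != "connection refused":
                break  # a definite answer: stop retrying this combination
        if reply == "ok":
            return (version, proto), log
        if reply == "connection refused":
            # The server might just be rebooting, so the client kept retrying
            # the same combination until it timed out; it never dials back.
            return None, log
        # "version not supported": dial back to the next combination.
    return None, log


if __name__ == "__main__":
    old_server = {(3, "udp"), (2, "udp")}  # UDP-only v2/v3, as in this bug
    result, log = negotiate(old_server, has_tcp_listener=False)
    print("mounted with:", result)
    for attempt in log:
        print(attempt)
```

With a UDP-only server the model never gets past 'NFS v4 over TCP', which matches the repeated "Connection refused" lines in the opening description. A server that has a TCP listener but no v4 support gives a definite "version not supported" answer, and the model dials back to v3 over TCP.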
Comment 15 Tom Lane 2010-10-15 12:09:37 EDT
Hmm ... so I guess the remaining question is what about that behavior changed in F-13? Because it used to work fine without any configuration hacking.
Comment 16 Steve Dickson 2010-10-15 13:24:09 EDT
> Hmm ... so I guess the remaining question is what about that behavior changed
> in F-13? Because it used to work fine without any configuration hacking.

Good point... It was decided with v4 mounts not to do any pre-checking with the remote portmapper to see if the server supports that version and protocol. The reason has to do with firewalls. With v4, only port 2049 has to be open for the mount to succeed, since the mounting protocol is built into v4. This is unlike legacy NFS versions, where the mount was a separate protocol needing a separate daemon (rpc.mountd) listening on dynamic ports... Very firewall unfriendly... Especially when it means you also have to open up the portmapper port so the client can get the port of rpc.mountd.

But I do agree, having to do "configuration hacking" is a pain...

Question: why doesn't your server support NFS over TCP? TCP support has been around for many years, and TCP as the transport is by far superior to UDP in a number of ways... Actually that's another workaround... turn on TCP support on the server and the "configuration hacking" will not be needed.
Comment 17 Tom Lane 2010-10-15 13:47:03 EDT
(In reply to comment #16)
> It was decided with v4 mounts not to do any pre-checking with the
> remote portmapper see if the server supports that version
> and protocol. The reason has to do with firewalls.

I see. That's a pretty fair reason.

> Question, why doesn't your server support NFS over TCP?

AFAICT it's just too old. There's no indication in the nfsd docs that it has any ability to do TCP. Sooner or later I'll get around to replacing it.

Anyway, thanks for your time. Since there's a defensible reason for changing this behavior, it's clearly NOTABUG.