Description of problem:
A successful NFS mount/umount consists of an NFS RPC call and a MOUNT RPC call. If the share is mounted without specifying the protocol, the MOUNT call may go via UDP, yet the MOUNT (UMNT) call at umount time still goes via TCP, because /proc/mounts holds the protocol used for the NFS call.

Version-Release number of selected component (if applicable):
util-linux-2.12a-EL4.20

How reproducible:

Steps to Reproduce:
1. Capture traffic
2. Mount an NFS share without specifying the protocol to use
3. umount
4. Analyze the capture. You will see that the NFS part is handled via TCP and the mount call via UDP at mount time, but the umount (MOUNT/UMNT) call then goes via TCP.

Actual results:
If a firewall between the hosts only allows mount calls via UDP, the umount operation fails (due to the blocked TCP packet).

Expected results:
Anyone who is able to mount should be able to umount too. That means that /proc/mounts should hold the protocol used for MOUNT instead of the protocol used for NFS.

Additional info:
I'm not sure if this patch is the optimal solution or why the protocol used for NFS is stored under /proc/mounts. It's just wrong behaviour that the umount uses a different protocol (TCP) than the mount (UDP).
True, if no protocol is specified, the NULL RPC ping and the call to rpc.mountd will be tried using UDP first and then TCP. This is done to keep the reserved port pool from being eaten up by ports in TIMEWAIT state. Specifying a protocol will cause the mount to use only the given protocol... and from util-linux-2.12a-16.EL4.3 onwards the umount will use the protocol the mount is using.

Which util-linux version are you using?
Created attachment 138515
NFS mount/umount capture

umount uses a different protocol (TCP) than mount (UDP)
(In reply to comment #1)
> Which util-linux version are you using?

util-linux-2.12a-16.EL4.20

> and from util-linux-2.12a-16.EL4.3 and beyond
> the umount will use the protocol the mount is using.

See attached file. I experience different behaviour.
Looking back, it appears the umount problem has already been fixed...

* Mon Nov 29 2004 Steve Dickson <SteveD> 2.12a-16.EL4.3
- Made NFS mounts adhere to the IP protocol if specified on command line as well as made NFS umounts adhere to the current IP protocol. Fix #140016

But the network trace you posted does show tcp being used to rpc ping the server.... which is strange.... Are you explicitly using the -o udp mount option?
The package used is the newest available from RH (4.3), so I expect fixes for bugs from 2005 to be included already...

To summarize: The command uses (_without_ specifying a protocol) TCP for RPC NFS (NULL call/ping) and UDP for MOUNTD. Additionally, 'tcp' gets written to /proc/mounts, so the umount command uses TCP for MOUNTD, which is not the expected result.

BTW: The "udp" and "tcp" options both work as expected...
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Ok... Sorry for being a bit brain dead on this one... but last night I realized that to show this "feature" all one needs to do is use the -v mount option when mounting an NFS fs.

Now the reason we do a TCP ping to the server is that TCP is the default transport and we want to see if the server is up and accepting TCP connections. It would not make sense to do a UDP ping to the server and then do a TCP mount.

Now the reason we do a UDP ping to the remote mountd is basically to save ports from going into TIMEWAIT, as I talked about in Comment #1. The point being, it does not matter which transport is used to get the information from the mountd when it comes to which transport the mount will use (hopefully that makes sense)...

So I think the problem here is that when the TCP ping fails (due to being firewall-ed off) the mount should "fall back" and try a UDP ping... And you're saying that does not happen?
(In reply to comment #7)
> Now the reason we do a UDP ping to the remote mountd is basically
> to save ports from going into TIMEWAIT, as I talked about in Comment #1.
> The point being, it does not matter which transport is used to get the
> information from the mountd when it comes to which transport the mount
> will use (hopefully that makes sense)...

Partly. By 'the information' you mean the information from the reply to the request of mountd procedure 1 (MNT) and 3 (GETPORT)?

> So I think the problem here is that when the TCP ping fails (due to being
> firewall-ed off) the mount should "fall back" and try a UDP ping...
> And you're saying that does not happen?

No. All pings succeed at 'mount' time. In fact, the whole mount succeeds without complications. This is the short version of the packet capture attachment:

mount
=====
*)TCP 3-way-handshake Server-Port 111
*)Client ==> TCP_RPC(111): GETPORT CALL NFS TCP ==> Server
*)Server ==> TCP_RPC(111): GETPORT REPLY PORT 2049 ==> Client
*)TCP 3-way-handshake Server-Port 2049
*)Client ==> TCP_NFS(2049): NULL CALL ==> Server
*)Server ==> TCP_NFS(2049): NULL REPLY ==> Client
*)Client ==> UDP_RPC(111): GETPORT CALL MOUNT UDP ==> Server
*)Server ==> UDP_RPC(111): GETPORT REPLY PORT 946 ==> Client
*)Client ==> UDP_MOUNT(946): NULL CALL ==> Server
*)Server ==> UDP_MOUNT(946): NULL REPLY ==> Client
*)Client ==> UDP_MOUNT(946): MNT CALL ==> Server
*)Server ==> UDP_MOUNT(946): MNT REPLY ==> Client

This looks like a normal mount procedure to me. Everything is fine, except that /proc/mounts reports 'tcp' as the protocol used.
This affects the 'umount' program, as you can see in the following capture:

umount
======
*)TCP 3-way-handshake Server-Port 111
*)Client ==> TCP_RPC(111): GETPORT CALL MOUNT TCP ==> Server
*)Server ==> TCP_RPC(111): GETPORT REPLY PORT 949 ==> Client
*)TCP 3-way-handshake:
*)Client ==> TCP_SYN PORT 949 ==> Server
*)Firewall intercepts
*)umount hangs

As /proc/mounts reports 'tcp' as the protocol, the client requests TCP for mountd, which is not the same behaviour as at 'mount' time.

And that's it: The 'mount' and 'umount' programs behave differently. That's what I think is wrong.

Hope that things are clear now :)
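The mismatch described in the two captures can be restated as a small model. This is only an illustrative sketch in Python, not util-linux code: the function names and the dictionary layout are hypothetical, and the protocols are hard-coded from the capture summary above.

```python
# Illustrative model only; names are hypothetical, not util-linux code.

def default_mount():
    """Protocols seen for a default mount in the capture summary."""
    nfs_proto = "tcp"      # NFS GETPORT and NULL ping went over TCP
    mountd_proto = "udp"   # MOUNT GETPORT, NULL and MNT went over UDP
    recorded = nfs_proto   # /proc/mounts reflects the NFS socket only
    return {"nfs": nfs_proto, "mountd": mountd_proto, "recorded": recorded}


def umount_mountd_proto(recorded):
    """umount reuses the recorded protocol for the MOUNT (UMNT) call."""
    return recorded


state = default_mount()
umnt_proto = umount_mountd_proto(state["recorded"])
mismatch = umnt_proto != state["mountd"]
print(f"MNT via {state['mountd']}, UMNT via {umnt_proto}, mismatch={mismatch}")
```

With a firewall that only passes UDP to mountd, the `mismatch` case is the one where the umount hangs.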
Ok.. so all RPCs destined for the remote mountd go over UDP and all the rest of the RPCs (including all the umount ones) go over TCP... I do see this... So the problem is? or you would rather see???
(In reply to comment #9)
> So the problem is?
> or you would rather see???

I would like the umount and mount commands to behave the same way, which includes using the same protocols for the same services at mount and umount time.
Well as I explained, the reason UDP is used for the mountd RPCs is to save on TCP connections from going into TIMEWAIT, which means those ports are not usable for a minute or so... which is a bad thing when auto mount tries to mount thousands of mounts...

The reason the protocol found in /proc/mounts is used for umounts is in case there are firewall rules set up, only allowing that particular protocol...

Note, if there are TCP-only firewall rules set up, the mount will take a bit longer since the UDP rpc sent out will fail.... but TCP will be tried immediately after that failure.

Now if one only wants one protocol to be used, explicitly define the protocol with a mount option.

So in the end, what I'm trying to do is find a happy medium for all environments... So unless a mount is failing or the extra UDP rpcs are causing other problems, I'm leaning toward closing this bz as NOTABUG...
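To put a rough number on the TIMEWAIT concern, here is a back-of-the-envelope sketch in Python. All three parameters are assumptions for illustration: 512 usable reserved ports, 60 seconds in TIMEWAIT ("a minute or so"), and one TCP connection to mountd per mount. The real figures depend on the system.

```python
def max_tcp_mounts_per_minute(usable_ports=512, time_wait_s=60, conns_per_mount=1):
    """Upper bound on sustained mounts per minute if every mountd call used TCP.

    Each connection pins one reserved source port for time_wait_s seconds,
    so at most usable_ports connections can be opened per TIMEWAIT window.
    All defaults are assumed values, not measurements.
    """
    mounts_per_window = usable_ports // conns_per_mount
    return mounts_per_window * 60 // time_wait_s


print(max_tcp_mounts_per_minute())
print(max_tcp_mounts_per_minute(conns_per_mount=2))
```

Under these assumptions an automounter asked for thousands of mounts at once would exhaust the reserved ports well before finishing, which is the case the UDP-first behaviour is meant to avoid.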
(In reply to comment #11)

OK. I think we're still not talking about the same thing...

> Well as I explained, the reason UDP is used for the mountd RPCs
> is to save on TCP connections from going into TIMEWAIT, which
> means those ports are not usable for a minute or so... which
> is a bad thing when auto mount tries to mount thousands of
> mounts...

I totally agree on that. UDP is just great, works as expected and results in a working system. I'm very happy with UDP. You said UDP is being used first to prevent ports from going into TIMEWAIT state, and that's exactly what happens on my boxes, as I already wrote in the bug report itself, comment #2, comment #5 and comment #8.
--> So to cut that short, can we just agree that UDP is being used first, works fine and is doing a great job?

> The reason the protocol found in /proc/mounts is used
> for umounts is in case there are firewall rules set up,
> only allowing that particular protocol...

I think it's also a great idea to store the protocol in /proc/mounts. I think it would also be great if the protocol stored there were used for the right purpose during umount.

Now the protocol stored there during 'mount' is the one being used for the rpc-service NFS (2049).
--> Can we agree on that?

The stored protocol (the 'tcp' or the 'udp' string in /proc/mounts) is being used for the rpc-service mountd during 'umount'!
--> Can we agree on that?

But this usage of /proc/mounts just makes no sense to me. If 'mount' stored the protocol of the rpc-service mountd, it would be traceable.

> Now if one only wants one protocol to be used,
> explicitly define the protocol with a mount
> option.

As I stated in comment #5, I know that and it is working for me...

> So unless a mount is
> failing or the extra UDP rpcs are causing other problems,
> I'm leaning toward closing this bz as NOTABUG...

True. UDP RPCs are fine. Mount is also working.

Still, I think the protocol of the wrong service is stored under /proc/mounts and I think that's a bug!
After taking a second look, I'm thinking the bug is the fact that a udp mount like "mount -o udp server:/export /mnt" looks like a tcp mount in /proc/mounts because there is no "udp" in the /proc/mounts line:

For example, doing a:
mount -o udp rhelxen:/home /mnt/home_udp

has the following line in /proc/mounts:
rhelxen:/home /mnt/home_udp nfs rw,v3,rsize=32768,wsize=32768,hard,
tcp,lock,proto=tcp,timeo=600,retrans=5,addr=rhelxen 0 0
^^^ should be udp instead of tcp

If the correct protocol was being recorded, then umount would do the right thing...
Created attachment 177321
patch -- default to UDP umount when no protocol is specified and check /etc/mtab before /proc/mounts

This patch should replace the other proposed patch. /proc/mounts reports the NFS_MOUNT_TCP flag as a mount option regardless of whether it was used to do the mount or not. This means we can't reliably use it to tell whether someone specified the tcp option when they mounted.

I don't think we can fix this by changing how the kernel reports this. The question we need to answer is "did the user explicitly set the tcp option on the command line?". With the current design, the kernel has no way to know this, so we need to change it to query /etc/mtab instead.
Hmm if the kernel is reporting TCP on a UDP mount then that sounds like a different issue. I think this problem is different, however: Currently when someone mounts with no options, we use UDP to do all of the mount requests and TCP for the NFS socket. The kernel reports "tcp" in the options list in /proc/mounts in this case. The kernel doesn't have any concept of what protocol was used to do the mount request. I think we need to change umount to look at /etc/mtab first and to default to udp if the tcp option wasn't explicitly recorded there for the mount.
Also, my RHEL4 box doesn't show the same issue with /proc/mounts that you're seeing:

# mount -o udp server:/test/user1 /mnt/test
# grep /mnt/test /proc/mounts
server:/test/user1 /mnt/test nfs rw,v3,rsize=32768,wsize=32768,hard,lock,proto=udp,timeo=11,retrans=5,addr=server 0 0
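For comparing what different boxes report, the transport can be pulled out of a /proc/mounts line mechanically. The following is a minimal Python sketch, not part of util-linux; `nfs_transport` is a hypothetical helper that prefers the `proto=` option and falls back to a bare `tcp` or `udp` flag.

```python
def nfs_transport(mounts_line):
    """Return the transport recorded in one /proc/mounts line, or None.

    Prefers an explicit proto=... option; falls back to a bare tcp/udp flag.
    Returns None for non-NFS entries or when no transport is listed.
    """
    fields = mounts_line.split()
    if len(fields) < 4 or not fields[2].startswith("nfs"):
        return None
    options = fields[3].split(",")
    for opt in options:
        if opt.startswith("proto="):
            return opt.split("=", 1)[1]
    for opt in options:
        if opt in ("tcp", "udp"):
            return opt
    return None


# The line reported on the RHEL4 box above
line = ("server:/test/user1 /mnt/test nfs "
        "rw,v3,rsize=32768,wsize=32768,hard,lock,proto=udp,"
        "timeo=11,retrans=5,addr=server 0 0")
print(nfs_transport(line))
```

Note that this value only describes the NFS socket; as discussed in this report, it says nothing about the protocol used to reach mountd.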
Steve, thoughts on patch in comment #31? Also, I'm not seeing the same issue you are with the wrong protocol listed in /proc/mounts. On what kernel rev are you seeing that?
2.6.9-55.ELxenU
util-linux-2.12a-16.EL4.25

> This patch should replace the other proposed patch. /proc/mounts reports the
> NFS_MOUNT_TCP flag as a mount option regardless of whether it was used to do
> the mount or not. This means we can't reliably use it to tell whether someone
> specified the tcp option when they mounted.

I believe this is the problem.. if we fix /proc/mounts everything works...
Jeff, Isn't https://bugzilla.redhat.com/show_bug.cgi?id=171712 basically the same problem?
No, that's an NFS4-only problem, and it only occurs because NFS4 uses TCP by default but doesn't bother to set the NFS_MOUNT_TCP flag.

The original problem description is a case where there is a firewall between client and server. The firewall lets TCP traffic through for nfsd and drops TCP traffic to mountd.

I don't think this can be fixed by making sure that /proc/mounts shows tcp or udp. The info that /proc/mounts shows just reflects the protocol used for the NFS socket. I think to fix this, we want to make sure that the umount request uses the same protocol as the mount request. That info isn't recorded by the kernel so we have to look at /etc/mtab.
> I don't think this can be fixed by making sure that /proc/mounts shows tcp or
> udp.

The firewall just pointed out that there's a strange behaviour (a bug in my opinion). Without the firewall, everything works fine, but the behaviour is still wrong, and with the right protocol in /proc/mounts, it's fixed.

> The info that /proc/mounts shows just reflects the protocol used for the
> NFS socket.

And that's the problem. In my opinion, it should reflect the protocol used for the MOUNT socket, as this value is being used for MOUNT during the 'umount' call.

Please find below the patch we've used to get the expected behaviour. It should show what I mean:

--- util-linux-2.12a_orig/mount/nfsmount.c 2006-10-09 08:51:28.000000000 +0200
+++ util-linux-2.12a/mount/nfsmount.c 2006-10-09 13:00:35.000000000 +0200
@@ -1308,9 +1308,9 @@
 #endif
 	}
 
-	/* create nfs socket for kernel */
+	/* create mount socket for kernel */
 
-	if (nfs_pmap->pm_prot == IPPROTO_TCP)
+	if (mnt_pmap->pm_prot == IPPROTO_TCP)
 		fsock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
 	else
 		fsock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
@@ -1339,7 +1339,7 @@
 	}
 
 #if NFS_MOUNT_VERSION >= 2
-	if (nfs_pmap->pm_prot == IPPROTO_TCP)
+	if (mnt_pmap->pm_prot == IPPROTO_TCP)
 		data.flags |= NFS_MOUNT_TCP;
 	else
 		data.flags &= ~NFS_MOUNT_TCP;
>> The info that /proc/mounts shows just reflects the protocol used for the
>> NFS socket.

> And that's the problem. In my opinion, it should reflect the protocol used for
> the MOUNT socket, as this value is being used for MOUNT during the 'umount'
> call.

Yes, I agree. So the above patch fixes the problem?
Yes, it does!
I'm afraid I don't understand. On the one hand you're saying that the problem is that the kernel doesn't record what protocol was used to do the mount command, but then state that a util-linux patch fixes it?

That patch does entirely the wrong thing, IMO. Rather than fixing umount so that it's using the same protocol as the mount call, it's forcing the NFS socket to use the same protocol as the mount call. This will change the default to be UDP for both sockets unless someone adds the -o tcp option. I don't think this is what we want here.

To fix this the right way, I can see two options:

1) the patch that I've proposed in comment #31 which uses the info recorded in the mtab to determine what protocol was used for the mount call

2) add a patch to the kernel to record what mount protocol was used (reported as mountproto= or something), and fix up util-linux to look at that option

The only reason not to go with #1 is if we think that the info in /etc/mtab isn't reliable when we go to unmount.
(In reply to comment #41)
> I'm afraid I don't understand. On the one hand you're saying that the problem is
> that the kernel doesn't record what protocol was used to do the mount command,
> but then state that a util-linux patch fixes it?

Uhmm.. we're 3 people here, right? I just said /proc/mounts records the NFS and not the MOUNT protocol used.... Additionally, in my bug report, I mentioned that I'm not sure if this patch is the right thing to do.

> That patch does entirely the wrong thing, IMO. Rather than fixing umount so that
> it's using the same protocol as the mount call, it's forcing the NFS socket to
> use the same protocol as the mount call.

True. I didn't realize that the normal NFS tasks, while mounted, are then done via UDP. So this patch is definitely not the right thing to do.

> To fix this the right way, I can see two options:
>
> 1) the patch that I've proposed in comment #31 which uses the info recorded in
> the mtab to determine what protocol was used for the mount call

I don't have access to this patch. But mounting without the proto= option does not result in any protocol being written to mtab?!

> 2) add a patch to the kernel to record what mount protocol was used (reported as
> mountproto= or something), and fix up util-linux to look at that option

I think that would be the best thing to do.
> I don't have access to this patch. But mounting without the proto= option does not
> result in any protocol being written to mtab?!

Ahh sorry. I've opened it up, you should be able to see it now...

Yes, /etc/mtab does not record the proto= option. When there isn't one we'll just have the umount default to UDP. That should make sure that the umount uses the same protocol as the mount.
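The selection rule described here (use TCP for the umount only if the mtab entry explicitly says so, otherwise default to UDP) can be sketched as follows. This is a Python illustration of the idea, not the code in the attached patch; `umount_proto_from_mtab` is a hypothetical name, and treating `proto=tcp` the same as a bare `tcp` flag is an assumption.

```python
def umount_proto_from_mtab(mtab_options):
    """Pick the mountd transport for umount from an /etc/mtab options string.

    TCP is chosen only when it was explicitly recorded; otherwise UDP,
    matching the UDP-first default that mount uses for mountd calls.
    """
    opts = mtab_options.split(",")
    if "tcp" in opts or "proto=tcp" in opts:   # proto=tcp handling is assumed
        return "tcp"
    return "udp"


print(umount_proto_from_mtab("rw,addr=server"))
print(umount_proto_from_mtab("rw,tcp,addr=server"))
```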
Steve pointed out an issue with the patch in comment #31.

The current mount command attempts to use UDP to talk to mountd first, and then falls back to using TCP. So just because there is no "tcp" option listed in /etc/mtab, it doesn't mean that we necessarily used UDP to talk to mountd. So the patch in comment #31 won't work in 100% of cases.

I think to fix this the right way we would need to change it so that umount uses a fallback mechanism similar to the one mount uses. I'm not sure of the feasibility of that, however...
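A umount-side fallback of the kind suggested here might look roughly like this. It is only a Python sketch of the control flow: `send_umnt` is a hypothetical callable standing in for the real UMNT RPC, and the UDP-then-TCP order simply mirrors what mount is described as doing.

```python
def umount_with_fallback(send_umnt, protocols=("udp", "tcp")):
    """Try the UMNT call over each protocol in order.

    send_umnt(proto) is a hypothetical callable: it returns True on success
    and returns False or raises OSError (e.g. a timeout) on failure.
    Returns the protocol that worked, or None if all of them failed.
    """
    for proto in protocols:
        try:
            if send_umnt(proto):
                return proto
        except OSError:
            continue  # blocked or timed out, try the next transport
    return None


def fake_send_umnt(proto):
    """Stand-in for the scenario in this report: TCP to mountd is firewalled."""
    if proto == "tcp":
        raise TimeoutError("TCP to mountd blocked")
    return True


print(umount_with_fallback(fake_send_umnt))
```

With a fallback like this the order of attempts only affects how long the umount takes, not whether it eventually succeeds.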
Created attachment 179221
Proposed RHEL4 Patch

I'm thinking the main problem is that the protocol rollback mechanism is broken, which in turn causes the unmount to use the wrong protocol. This patch fixes the rollback mechanism by dealing with TCP timeouts better.

I tested this patch by disabling all TCP to the server and then simply doing a default mount (i.e. did not specify the protocol). After the TCP timeout, a UDP mount did occur and then only UDP was used during the unmount.
Fixed in util-linux-2.12a-16.EL4.29
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0785.html