Description of problem:
We have an NFS server with 1000Mbps links to our switch. We have several client machines that have 100Mbps connections to the switch. When we try to do NFS installs, they take 4-6 hours to complete.
Version-Release number of selected component (if applicable):
We have seen this problem with FC3 - FC5 (we have not tested FC1 or FC2). We have also seen this problem with RHEL3 & RHEL4.
How reproducible:
When the client & server have different "speed" network connections to the switch, this problem always appears.
Steps to Reproduce:
1. Set up a network install server with a 1000Mbps connection.
2. Launch an NFS install on a client with a 100Mbps connection (actually, just make sure that the client & server have different speed connections).
Actual results:
The install takes several hours. One of my upgrades, on a box with only ~500 packages installed, took 6 hours to complete. I redid that install using another NFS server with a 100Mbps link and it took about 15 minutes.
Expected results:
The install should be just as fast as if both the client & server had links of the same speed as the slower of the two.
Additional info:
We believe that the issue is caused by anaconda's use of UDP-based NFS connections. It appears that when there is a difference in connection speed into the switch between the client & server, these UDP datagrams are significantly slowed down. We believe that if anaconda were updated to use TCP NFS (v3?) by default, the installation time would be what is expected.
Please try with rawhide or a Fedora 7 test release. With kickstart installs, you can pass mount options for NFS installs. We don't let you do this through an interactive install. Pass the --opts= parameter in your kickstart file with the mount(8) options you want to use for NFS. Let us know if that works or not.
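For anyone who wants to try this, here is a minimal sketch of what the kickstart line could look like, written as a small Python helper that renders it. The server name and export path are made-up placeholders, and `tcp` and `nfsvers=3` are just the mount options that seem relevant to this report; the helper name `nfs_kickstart_line` is hypothetical and not part of anaconda or pykickstart.

```python
def nfs_kickstart_line(server, directory, mount_opts=()):
    """Render a kickstart `nfs` command.

    mount_opts is a sequence of mount(8)-style NFS options; they are
    joined with commas and passed through the --opts= parameter.
    """
    parts = ["nfs", f"--server={server}", f"--dir={directory}"]
    if mount_opts:
        parts.append("--opts=" + ",".join(mount_opts))
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder server and path; replace with your own install server.
    print(nfs_kickstart_line("installserver.example.com",
                             "/export/fedora",
                             ["tcp", "nfsvers=3"]))
    # prints: nfs --server=installserver.example.com --dir=/export/fedora --opts=tcp,nfsvers=3
```

The printed line is what would go into the kickstart file in place of a plain `nfs --server=... --dir=...` entry.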
We worked around the problem here by reconfiguring our managed switch to turn on flow control. As cheaper, non-managed switches have this on by default, we expect that this issue would not appear for admins with such switches. However, this could still be an issue in environments with managed switches. There are two possible fixes:
1. Users will have to enable flow control on their managed switches. This could be an issue where the systems admins are not the ones in control of the networking infrastructure, such as in most medium to large enterprises or if they have other applications for which they need to leave flow control off.
2. Modify anaconda to use TCP for NFS. This solution fixes it for everyone, everywhere regardless of switch configuration.
Personally, I feel that #2 is the right answer. We could try turning flow control off in our switch and run an NFS install of F7T[whatever]. However, if it's still using UDP for NFS, the results will be the same as before.
UDP is still the default proto option for mount (see man 5 mount) and we're hesitant to change to TCP in anaconda because of this. Basically, we'll create a whole new set of problems for a different class of people by changing the proto we're using. If you really do need to tweak the settings to this degree, consider using a kickstart install with the --opts= parameter as I mentioned earlier. If you'd like to see TCP become the default for NFS mounts (after which we'd be much more likely to change how anaconda works) then please file a bug against the nfs component asking for that change. Thanks for the report.
UDP is not the default protocol for NFS v3. Is anaconda using NFS v2? I'm not seeing a mount(5) man page. There are man pages for mount in other sections, but not in section 5 on FC6, RHEL5, or several other distros I have access to, nor in several minutes of Google searching. Looking at all the other man pages for mount, I didn't find TCP or UDP mentioned anywhere. Where are you getting this information? I'm not finding it. What set of problems are you anticipating? TCP on a LAN isn't some kind of exotic thing. Linux systems have supported NFS v3 for years. TCP is the default for NFS mounts. Simply running tcpdump or wireshark while mounting shares or accessing files shows that. The problem here is that anaconda (not mount) is using UDP-based NFS to access the installation "media" from the network. With different speed links between the NFS server and the machine being installed on, it takes several hours to install. The reason is that many managed switches do not have flow control turned on by default (non-managed switches do). Because of this, the problem is only going to occur for people who have managed switches where they haven't turned on the flow control options in their switch(es). In other words, small networks won't experience the problem, but big ones (including enterprise networks) will. The way to fix this for everyone is to have anaconda use NFS over TCP. This solution also does not require enterprise clients to fight the fight with their networking people to get flow control turned on in the switches just to be able to do network installs.
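Besides a packet capture, the transport of an already-mounted share can be read from the mount table. The following is a rough sketch, not anything anaconda ships: it assumes the Linux `/proc/mounts` layout, where NFS entries carry a `proto=tcp` or `proto=udp` option (I believe some older kernels print a bare `tcp` or `udp` instead, so both forms are handled). The function name `nfs_transports` and the sample entries are made up for illustration.

```python
def nfs_transports(mount_table_text):
    """Map each NFS mount point to 'tcp', 'udp', or 'unknown'.

    mount_table_text is text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = {}
    for line in mount_table_text.splitlines():
        fields = line.split()
        # Only look at real NFS client mounts (skips e.g. the nfsd pseudo-fs).
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        proto = "unknown"
        for opt in fields[3].split(","):
            if opt in ("tcp", "udp"):          # bare form (assumed older kernels)
                proto = opt
            elif opt.startswith("proto="):     # does not match mountproto=
                proto = opt.split("=", 1)[1]
        result[fields[1]] = proto
    return result


# Made-up sample entries for illustration only.
SAMPLE = """\
server:/export/fedora /mnt/source nfs rw,vers=3,proto=udp,mountproto=tcp,addr=192.0.2.10 0 0
/dev/sda1 / ext3 rw 0 0
"""

if __name__ == "__main__":
    print(nfs_transports(SAMPLE))
```

On a live system the same function could be fed the contents of `/proc/mounts`, for example from a shell on the installer's second virtual console, assuming a Python interpreter is available there.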
I encountered the same problem on FC7. I thought NFS install was broken completely. Tried ftp install from the same server ... worked fine. Ran cable to plug new client system into the same 1G unmanaged switch as NFS server ... worked fine. Client has Intel E1000 1Gb NIC, but was originally plugged into a 100Mb unmanaged full-duplex switch on the same subnet as the server. Lamont said: ... small networks won't experience the problem, but big ones (including enterprise networks) will. I have a small network and I am experiencing the problem. I agree that NFS should be using TCP. Adding options to a kickstart file is not a practical solution for many/most people. RHEL4 says that TCP is the default NFS protocol. http://www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/ref-guide/ch-nfs.html Let me know if you need me to do any testing or data collection on this.
Fedora apologizes that these issues have not been resolved yet. We're sorry it's taken so long for your bug to be properly triaged and acted on. We appreciate the time you took to report this issue and want to make sure no important bugs slip through the cracks. If you're currently running a version of Fedora Core between 1 and 6, please note that Fedora no longer maintains these releases. We strongly encourage you to upgrade to a current Fedora release. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained and closing them. http://fedoraproject.org/wiki/LifeCycle/EOL If this bug is still open against Fedora Core 1 through 6, thirty days from now, it will be closed 'WONTFIX'. If you can reproduce this bug in the latest Fedora version, please change to the respective version. If you are unable to do this, please add a comment to this bug requesting the change. Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we are following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again. And if you'd like to join the bug triage team to help make things better, check out http://fedoraproject.org/wiki/BugZappers
This bug is open for a Fedora version that is no longer maintained and will not be fixed by Fedora. Therefore we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.