Created attachment 366988: copy of boot.log

Description of problem:
The boot log shows an NFS error at the login screen.

Version-Release number of selected component (if applicable):
nfsd module for kernel 2.6.31.5-96.fc12.i686.PAE

How reproducible:
Boot up the computer.

Steps to Reproduce:
1. Boot up the computer.
2. At the login screen, click on the yellow error icon.
3. Read boot.log.

Actual results:
The boot.log error lines read:

    Starting NFS daemon: rpc.nfsd: unable to resolve ANYADDR:nfs to inet address: Name or service not known
    rpc.nfsd: unable to set any sockets for nfsd
    [FAILED]

Expected results:
No error at the login screen.

Additional info:
Running Fedora 12 Beta, fully updated as of 10/30/09.

I can see the place in /etc/init.d/nfs where the error is being raised:

    echo -n $"Starting NFS daemon: "
    daemon rpc.nfsd $RPCNFSDARGS $RPCNFSDCOUNT
    RETVAL=$?
    echo
    [ $RETVAL -ne 0 ] && exit $RETVAL

I experimented by putting values into /etc/exports to see if it had any effect, but the error still came up. Comparing a fully updated F11 setup to the F12 setup, the only difference I found was the new 2.6.31 kernel and its accompanying nfsd module. On F11 with the 2.6.30 kernel, the error does not occur.
This is caused by a race condition between NFS and NetworkManager bringing up the network. Even though NetworkManager is started before NFS, there is no network interface configured when NFS starts, which is the cause of the failure. Looking at the debugging logs (enabled by setting RPCNFSDARGS="-d -s" in /etc/sysconfig/nfs), you can clearly see NFS is looking for the interface before NetworkManager has it configured.

Dan, any clue as to why this might be happening?
NetworkManager brings the network up asynchronously (always has). This means that init scripts *cannot* depend on the network resource they need being available when they start. There are a few ways to deal with this:

1) the sledgehammer: set NETWORKWAIT=yes in /etc/sysconfig/network, which will cause your entire bootup to stall for 30 seconds while NetworkManager tries to get a network connection. This is the same behavior as 'ifup'

2) make the init scripts smart about what resources they require; if they need a network connection via eth0, then they should check and see if eth0 is up and if not, wait for it to come up

3) use a 'dispatcher script' to restart the service when a specific interface goes up or down; 'man NetworkManager' will give you more information about this, and the 'autofs' package does this already

Basically, initscripts are pretty stupid these days, and they assume that when they start, all the resources they need will be present. That's not always the case, but we also don't always want to stall bootup for 30 seconds just because the network cable isn't plugged in. The services should really be starting on demand, and Fedora 13 should provide that with the 'upstart' project, which we have shipped since Fedora 10, but in a sysvinit compatibility mode rather than a pure dependency mode.
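As an illustration of option 2, here is a minimal sketch of how an init script could wait for an interface before starting its daemon. It is not taken from any shipped initscript. The function names, the SYSFS_NET variable, and the 30-second default are assumptions made for the example. It only checks the kernel's link state in sysfs (operstate), which does not guarantee that an address has been assigned yet.

```shell
#!/bin/sh
# Sketch only: wait for a network interface to report "up" before
# starting a service. SYSFS_NET, the function names, and the default
# timeout are hypothetical choices for this example.

SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# iface_is_up IFACE
# Succeeds if the kernel reports the interface as operationally up.
# Note: this is link state only; an address may not be configured yet.
iface_is_up() {
    state_file="$SYSFS_NET/$1/operstate"
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "up" ]
}

# wait_for_iface IFACE [TIMEOUT_SECONDS]
# Polls once per second until the interface is up or the timeout expires.
# Returns 0 if the interface came up, 1 on timeout.
wait_for_iface() {
    iface="$1"
    remaining="${2:-30}"
    while ! iface_is_up "$iface"; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Possible use inside an init script's start() function:
#   wait_for_iface eth0 30 || echo "eth0 not up, starting anyway" >&2
```

Unlike NETWORKWAIT=yes, a check like this only delays the one service that needs the interface, and only for as long as the interface is actually down.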
> NetworkManager brings the network up asynchronously (always has). This means
> that init scripts *cannot* depend on the network resource they need being
> available when they start.

After further review, it appears there has been a change to rpc.nfsd that makes it more sensitive to the interface not being configured.

> 2) make the init scripts smart about what resources they require;

Bill, would you happen to know if any intelligence is going to be added to the initscripts to support this type of smartness?

> 3) use a 'dispatcher script' to restart the service when a specific
> interface goes up or down;

I'm assuming this will probably be the best answer at this point...

> Fedora 13 should provide that with the 'upstart' project which we
> have shipped since Fedora 10, but in a sysvinit compatibility mode
> rather than a pure dependency mode

I had a similar thought... The current system has gotten us pretty far, but it's not clear how much further it will be able to go.
(In reply to comment #3)
> > 2) make the init scripts smart about what resources they require;
> Bill, Would you happen to know if there is going to be any
> intelligence added to the initscripts that would support
> this type of smartness?

There's no coherent protocol for a script to specify "I need interface <foo> with address <bar> to be up"; I don't see how you can solve that in any global initscripts way.

> > 3) use a 'dispatcher script' to restart the service when a specific
> > interface goes up or down;
>
> I'm assuming this will probably be the best answer at this point...

Yes, if you're sensitive to network configuration, you should go to the source of it.
*** Bug 533893 has been marked as a duplicate of this bug. ***
>> I'm assuming this will probably be the best answer at this point...
>
> Yes, if you're sensitive to network configuration, you should go to
> the source of it.

I don't think dispatcher scripts are the answer either. Here is what I did and the results.

I added the following line (borrowed from the netfs init script) to the top of the NFS init script:

    [ ! -f /var/lock/subsys/network -a ! -f /var/lock/subsys/NetworkManager ] && exit 0

I created /etc/NetworkManager/dispatcher.d/04-nfs with the following in it:

    if [ "$2" = "up" ]; then
        /sbin/ip route ls | grep -q ^default &&
        {
            /sbin/chkconfig nfs && /etc/rc.d/init.d/nfs start || :
        } && { :; }
    fi

It does work most of the time, meaning the script starts up without failing. One time the nfs initscript failed because the server was already running. Also, when the 04-nfs script is used to start the service, there is no status on the console, which means there is no way to debug failures since they are not logged anywhere. Or am I missing something?
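One possible way to address the two problems described above (the start failing when the server is already running, and failures going unlogged) is sketched below. This is an untested illustration, not the script used in this report. It assumes the nfs init script supports a `status` action that returns 0 when the service is running, and that `logger` is available to write to syslog. The NFS_INIT variable, the `log` and `handle_event` names, and the `04-nfs` tag are hypothetical. The default-route and chkconfig checks from the script above are left out for brevity. NetworkManager passes the interface name as the first argument and the action as the second.

```shell
#!/bin/sh
# Sketch of a dispatcher script body that logs what it does.
# NFS_INIT, log(), and handle_event() are hypothetical names for this example.

NFS_INIT="${NFS_INIT:-/etc/rc.d/init.d/nfs}"

# Send one message to syslog, tagged so it is easy to grep for.
log() {
    logger -t 04-nfs "$1"
}

# handle_event IFACE ACTION
# Starts nfs when an interface comes up, unless it is already running,
# and records the outcome (including the init script's output on failure).
handle_event() {
    iface="$1"
    action="$2"

    # Only react to "up" events.
    [ "$action" = "up" ] || return 0

    # Assumption: "status" returns 0 when the service is already running.
    if "$NFS_INIT" status >/dev/null 2>&1; then
        log "nfs already running, nothing to do ($iface up)"
        return 0
    fi

    if output=$("$NFS_INIT" start 2>&1); then
        log "nfs started after $iface came up"
    else
        rc=$?
        log "nfs start failed after $iface came up (exit $rc): $output"
        return "$rc"
    fi
}

# In an actual dispatcher script the last line would be:
#   handle_event "$@"
```

With something along these lines, a failed start would at least leave a tagged line in the system log, even though nothing appears on the console.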
I just proposed the following patch for upstream acceptance...

commit 2905358524c0835311501bad04c521479b0525ff
Author: Steve Dickson <steved>
Date:   Thu Nov 12 14:16:12 2009 -0500

    Remove the AI_ADDRCONFIG hint flag to getaddrinfo() when it's call
    by nfsd to set up the file descriptors that are sent to the kernel.
    The flag causes the getaddrinfo() to fail, with EAI_NONAME, when
    there is not a non-loopback network interface configured.

    Signed-off-by: Steve Dickson <steved>

diff --git a/utils/nfsd/nfssvc.c b/utils/nfsd/nfssvc.c
index 12d3253..b8028bb 100644
--- a/utils/nfsd/nfssvc.c
+++ b/utils/nfsd/nfssvc.c
@@ -212,7 +212,7 @@ int
 nfssvc_set_sockets(const int family, const unsigned int protobits,
 		   const char *host, const char *port)
 {
-	struct addrinfo hints = { .ai_flags = AI_PASSIVE | AI_ADDRCONFIG };
+	struct addrinfo hints = { .ai_flags = AI_PASSIVE };

 	hints.ai_family = family;
nfs-utils-1.2.1-1.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/nfs-utils-1.2.1-1.fc12
nfs-utils-1.2.1-1.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update nfs-utils'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2009-11573
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
nfs-utils-1.2.1-1.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.