Bug 532270

Summary: rpc.nfsd: unable to resolve ANYADDR:nfs error appears at login screen
Product: [Fedora] Fedora
Reporter: GoinEasy9 <GoinEasy9>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 12
CC: davej, dcbw, jlayton, maxiberta, notting, steved
Hardware: i686
OS: Linux
Fixed In Version: 1.2.1-1.fc12
Doc Type: Bug Fix
Last Closed: 2009-11-18 09:18:13 EST
Bug Blocks: 542663
Attachments: copy of boot.log

Description GoinEasy9 2009-10-31 20:11:11 EDT
Created attachment 366988: copy of boot.log

Description of problem: Boot log shows nfs error at login screen.



Version-Release number of selected component (if applicable): nfsd module for kernel 2.6.31.5-96.fc12.i686.PAE


How reproducible: Boot up computer


Steps to Reproduce:
1. Boot up computer
2. At login screen, click on the yellow error icon.
3. Read the boot.log
  
Actual results: The boot.log error lines read:
Starting NFS daemon: rpc.nfsd: unable to resolve ANYADDR:nfs to inet address: Name or service not known
rpc.nfsd: unable to set any sockets for nfsd [FAILED]


Expected results: No error at login screen


Additional info: Running Fedora 12 Beta, fully updated as of 10/30/09.
I can see the place in /etc/init.d/nfs where the error is coming from:
echo -n $"Starting NFS daemon: "
daemon rpc.nfsd $RPCNFSDARGS $RPCNFSDCOUNT
RETVAL=$?
echo
[ $RETVAL -ne 0 ] && exit $RETVAL

I experimented by putting values into /etc/exports to see if it had any effect, but the error still came up.
I compared a fully updated F11 setup to the F12 setup; the only difference I found was the new 2.6.31 kernel and its accompanying nfsd module.  On F11 with the 2.6.30 kernel, the error does not occur.
Comment 1 Steve Dickson 2009-11-02 07:32:23 EST
This is caused by a race condition between NFS startup and
NetworkManager bringing up the network.

Even though NetworkManager is started before NFS, there is
no network interface configured when NFS starts, which is the
cause of the failure.

Looking at the debugging logs (by setting RPCNFSDARGS="-d -s"
in /etc/sysconfig/nfs), you can clearly see NFS is looking
for the interface before NetworkManager has it configured.

Dan, any clue as to why this might be happening?
Comment 2 Dan Williams 2009-11-02 12:30:20 EST
NetworkManager brings the network up asynchronously (always has).  This means that init scripts *cannot* depend on the network resource they need being available when they start.  There are a few ways to deal with this:

1) the sledgehammer: set NETWORKWAIT=yes in /etc/sysconfig/network, which will cause your entire bootup to stall for 30 seconds while NetworkManager tries to get a network connection.  This is the same behavior as 'ifup'

2) make the init scripts smart about what resources they require; if they need a network connection via eth0, then they should check and see if eth0 is up and if not, wait for it to come up

3) use a 'dispatcher script' to restart the service when a specific interface goes up or down; 'man NetworkManager' will give you more information about this, and the 'autofs' package does this already


Basically, initscripts are pretty stupid these days, and they assume that when they start, all the resources they need will be present.  That's not always the case, but we also don't always want to stall bootup for 30 seconds just because the network cable isn't plugged in.  The services should really be starting on demand, and Fedora 13 should provide that with the 'upstart' project which we have shipped since Fedora 10, but in a sysvinit compatibility mode rather than a pure dependency mode.
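
Option 2 above (check whether the needed interface is up, and wait if it is not) could be sketched roughly as follows. This is a minimal illustration in C, not code from initscripts or NetworkManager: the helper names interface_is_configured and wait_for_interface are hypothetical, and "up with an IPv4 address" is an assumed definition of "configured". It relies on getifaddrs(), as provided by Linux/glibc.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes IFF_UP in <net/if.h> on glibc */
#endif

#include <assert.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if `ifname` is up and has an IPv4
 * address assigned, 0 otherwise (including when the lookup fails). */
int interface_is_configured(const char *ifname)
{
    struct ifaddrs *list = NULL;
    struct ifaddrs *ifa;
    int found = 0;

    assert(ifname != NULL);
    if (getifaddrs(&list) != 0)
        return 0; /* treat a failed lookup as "not configured" */

    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL)
            continue; /* entry carries no address */
        if (strcmp(ifa->ifa_name, ifname) == 0 &&
            ifa->ifa_addr->sa_family == AF_INET &&
            (ifa->ifa_flags & IFF_UP)) {
            found = 1;
            break;
        }
    }
    freeifaddrs(list);
    return found;
}

/* Hypothetical helper: poll once per second for at most
 * `timeout_seconds`; return 1 if the interface became configured,
 * 0 if the timeout expired first. A timeout of 0 checks exactly once. */
int wait_for_interface(const char *ifname, unsigned int timeout_seconds)
{
    unsigned int waited;

    for (waited = 0; ; waited++) {
        if (interface_is_configured(ifname))
            return 1;
        if (waited >= timeout_seconds)
            return 0;
        sleep(1);
    }
}
```

An init script could, for example, wrap this in a small helper program and call wait_for_interface("eth0", 30) before starting the daemon; the interface name and the 30-second limit are only examples taken from the discussion above.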
Comment 3 Steve Dickson 2009-11-03 07:31:19 EST
> NetworkManager brings the network up asynchronously (always has).  This means
> that init scripts *cannot* depend on the network resource they need being
> available when they start.
After further review, it appears there has been a change to rpc.nfsd
that makes it more sensitive to the interface not being configured.

> 2) make the init scripts smart about what resources they require;
Bill, Would you happen to know if there is going to be any 
intelligence added to the initscripts that would support
this type of smartness?

> 3) use a 'dispatcher script' to restart the service when a specific 
> interface goes up or down;
I'm assuming this will probably be the best answer at this point...


> Fedora 13 should provide that with the 'upstart' project which we
> have shipped since Fedora 10, but in a sysvinit compatibility mode 
> rather than a pure dependency mode
I had a similar thought... The current system has gotten us pretty 
far, but it's not clear how much further it will be able to go..
Comment 4 Bill Nottingham 2009-11-03 13:41:43 EST
(In reply to comment #3)
> > 2) make the init scripts smart about what resources they require;
> Bill, Would you happen to know if there is going to be any 
> intelligence added to the initscripts that would support
> this type of smartness?

There's no coherent protocol for a script to specify "I need interface <foo> with address <bar> to be up"; I don't see how you can solve that
in any global initscripts way.

> > 3) use a 'dispatcher script' to restart the service when a specific 
> > interface goes up or down;
>
> I'm assuming this will probably be the best answer at this point...

Yes, if you're sensitive to network configuration, you should go to
the source of it.
Comment 5 Steve Dickson 2009-11-12 10:27:51 EST
*** Bug 533893 has been marked as a duplicate of this bug. ***
Comment 6 Steve Dickson 2009-11-12 13:24:23 EST
>> I'm assuming this will probably be the best answer at this point...
>
> Yes, if you're sensitive to network configuration, you should go to
> the source of it. 

I don't think dispatcher scripts are the answer either... here is 
what I did and the results... 

Added the following line (which was borrowed from the netfs init script)
to the top of the NFS init script:

[ ! -f /var/lock/subsys/network -a ! -f /var/lock/subsys/NetworkManager ] && exit 0

Created /etc/NetworkManager/dispatcher.d/04-nfs with the following
in it:

if [ "$2" = "up" ]; then
        /sbin/ip route ls | grep -q ^default && {
                /sbin/chkconfig nfs && /etc/rc.d/init.d/nfs start || :
        } && { :; }
fi

It does work most of the time... meaning the script starts up
without failing... one time the nfs initscript failed because the
server was already running... and when the 04-nfs script
is used to start the service there is no status on the console...
Which means there is no way to debug failures since they are
not logged anywhere... or am I missing something??
Comment 7 Steve Dickson 2009-11-12 14:29:10 EST
I just proposed the following patch for upstream acceptance... 
 

commit 2905358524c0835311501bad04c521479b0525ff
Author: Steve Dickson <steved@redhat.com>
Date:   Thu Nov 12 14:16:12 2009 -0500

    Remove the AI_ADDRCONFIG hint flag to getaddrinfo() when it's
    called by nfsd to set up the file descriptors that are
    sent to the kernel. The flag causes getaddrinfo()
    to fail, with EAI_NONAME, when there is no non-loopback
    network interface configured.
    
    Signed-off-by: Steve Dickson <steved@redhat.com>

diff --git a/utils/nfsd/nfssvc.c b/utils/nfsd/nfssvc.c
index 12d3253..b8028bb 100644
--- a/utils/nfsd/nfssvc.c
+++ b/utils/nfsd/nfssvc.c
@@ -212,7 +212,7 @@ int
 nfssvc_set_sockets(const int family, const unsigned int protobits,
 		   const char *host, const char *port)
 {
-	struct addrinfo hints = { .ai_flags = AI_PASSIVE | AI_ADDRCONFIG };
+	struct addrinfo hints = { .ai_flags = AI_PASSIVE };
 
 	hints.ai_family = family;
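
To try out the effect of the flag described in the commit message, here is a minimal standalone sketch. It is not part of nfs-utils: the helper name passive_lookup is hypothetical, and the numeric port "2049" (the standard NFS port) is an assumption used in place of the "nfs" service name so the example does not depend on /etc/services. It performs the same kind of passive wildcard lookup as nfssvc_set_sockets(), with the extra hint flags left to the caller.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* exposes getaddrinfo() and the AI_* flags */
#endif

#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve the wildcard ("any") address for `port`
 * in the given address family, using AI_PASSIVE plus `extra_flags`.
 * Returns getaddrinfo()'s result code: 0 on success, an EAI_* value
 * on failure. */
int passive_lookup(int family, const char *port, int extra_flags)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(port != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_PASSIVE | extra_flags;
    hints.ai_family = family;
    hints.ai_socktype = SOCK_STREAM;

    /* A NULL host together with AI_PASSIVE asks for the wildcard address. */
    rc = getaddrinfo(NULL, port, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Per the commit message, passive_lookup(AF_INET, "2049", AI_ADDRCONFIG) would be expected to return EAI_NONAME on a host with no non-loopback interface configured, while passive_lookup(AF_INET, "2049", 0) should succeed either way. gai_strerror() converts the return code into a readable message.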
Comment 8 Fedora Update System 2009-11-13 09:04:38 EST
nfs-utils-1.2.1-1.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/nfs-utils-1.2.1-1.fc12
Comment 9 Fedora Update System 2009-11-16 02:33:05 EST
nfs-utils-1.2.1-1.fc12 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with:
su -c 'yum --enablerepo=updates-testing update nfs-utils'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2009-11573
Comment 10 Bug Zapper 2009-11-16 09:48:16 EST
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 11 Fedora Update System 2009-11-18 09:18:07 EST
nfs-utils-1.2.1-1.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.