Bug 625531

Summary: sm-notify needs to call res_init() before each try
Product: [Fedora] Fedora Reporter: Orion Poplawski <orion>
Component: nfs-utils Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 13 CC: jlayton, steved
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: nfs-utils-1.2.2-6.fc13 Doc Type: Bug Fix
Last Closed: 2010-10-27 22:32:25 UTC Type: ---
Attachments (Description, Flags):
Call res_init() before getaddrinfo(), none
Call res_init() before getaddrinfo(), none

Description Orion Poplawski 2010-08-19 18:10:39 UTC
Created attachment 439749
Call res_init() before getaddrinfo()

Description of problem:

On boot I'm seeing:

Aug 19 11:04:33 cappello sm-notify[1069]: Version 1.2.2 starting
Aug 19 11:04:33 cappello sm-notify[1069]: Backgrounding to notify hosts...
Aug 19 11:04:33 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:04:35 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:04:40 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:04:48 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:05:04 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:05:37 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:06:41 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:08:41 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:10:41 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:12:41 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:14:41 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:16:41 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:18:41 cappello sm-notify[1070]: DNS resolution of earth.cora.nwra.com failed; retrying later
Aug 19 11:20:41 cappello sm-notify[1070]: Unable to notify earth.cora.nwra.com, giving up

Network came up at 11:04:38:
Aug 19 11:04:38 cappello NetworkManager[1043]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.

As a result locks are not being released.

The attached patch should fix this.

Version-Release number of selected component (if applicable):
nfs-utils-1.2.2-2.fc13.i686

Comment 1 Orion Poplawski 2010-08-19 18:32:16 UTC
Created attachment 439751
Call res_init() before getaddrinfo()

Ah, proper includes are needed too.  This actually compiles.

Comment 2 Steve Dickson 2010-10-14 15:34:10 UTC
I'm curious as to why only that getaddrinfo() needs to
do a res_init().... It seems like that is just masking
the real problem...

Comment 3 Orion Poplawski 2010-10-14 15:53:35 UTC
There has been endless discussion on this topic.  See bug 442172 for an example.  Perhaps there is an sssd solution these days, but in this case it seems easiest just to call res_init() since it will only be a few calls.

Comment 4 Steve Dickson 2010-10-14 17:19:12 UTC
Well looking at the /etc/nscd.conf that's on my F13 machine
the 'check-files hosts yes' entry seems to exist (the workaround Uli 
was talking about in bug 443172). Now since I know I didn't put it 
there, that entry must be enabled by default these days...

Now what to do when nscd is not installed... I guess I could
make nfs-utils depend on nscd, but I would be worried that would
cause an avalanche of other dependencies...

I guess I could make a Fedora-only patch, since I'm pretty sure
a fix like this would not be accepted upstream... but
I'm never a fan of that...

How prevalent are those error messages?

Comment 5 Orion Poplawski 2010-10-14 17:29:02 UTC
nscd is deprecated in favor of sssd.  This error will occur any time sm-notify is run before the network is fully up, which is happening more and more with parallel startup systems.  The res_init() call is simple, safe, and quick, and a patch to use it should be able to go upstream.  Presumably the whole reason sm-notify tries several times is to wait for possible changes to the network configuration, but without calling res_init() it will never be aware of those changes.  It only tries a maximum of 13 times or so, so this is not a lot of overhead.
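
For illustration, the retry behavior described in this comment might be sketched roughly as follows.  This is not the attached patch: resolve_with_retry(), its parameters, and the fixed delay between attempts are hypothetical simplifications, and only res_init() and getaddrinfo() come from the report.  The idea is simply that each attempt re-initializes the resolver first, so name servers that appear in /etc/resolv.conf after the process started can be picked up.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/*
 * Hypothetical helper, NOT the sm-notify code: try to resolve `name`
 * up to `max_tries` times, sleeping `delay_secs` between attempts.
 * Returns 0 and stores the result in *out on success (the caller frees
 * it with freeaddrinfo()), or the last getaddrinfo() error otherwise.
 */
int resolve_with_retry(const char *name, int max_tries,
                       unsigned int delay_secs, struct addrinfo **out)
{
    struct addrinfo hints;
    int err = EAI_FAIL;
    int attempt;

    *out = NULL;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;

    for (attempt = 1; attempt <= max_tries; attempt++) {
        /* Re-initialize the resolver so that a resolv.conf written
         * after this process started is taken into account. */
        res_init();

        err = getaddrinfo(name, NULL, &hints, out);
        if (err == 0)
            return 0;

        fprintf(stderr, "DNS resolution of %s failed (%s); retrying later\n",
                name, gai_strerror(err));
        if (attempt < max_tries)
            sleep(delay_secs);
    }
    return err;
}
```

A caller would use it along the lines of resolve_with_retry("earth.cora.nwra.com", 13, 2, &ai), where the count and delay are placeholder values rather than sm-notify's actual schedule.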

Comment 6 Fedora Update System 2010-10-15 22:47:17 UTC
nfs-utils-1.2.2-6.fc13 has been submitted as an update for Fedora 13.
https://admin.fedoraproject.org/updates/nfs-utils-1.2.2-6.fc13

Comment 7 Fedora Update System 2010-10-17 04:49:54 UTC
nfs-utils-1.2.2-6.fc13 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update nfs-utils'.  You can provide feedback for this update here: https://admin.fedoraproject.org/updates/nfs-utils-1.2.2-6.fc13

Comment 8 Fedora Update System 2010-10-27 22:32:01 UTC
nfs-utils-1.2.2-6.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.