Bug 682833 - Autofs doesn't resolve correctly with dnsnames after a connection.
Summary: Autofs doesn't resolve correctly with dnsnames after a connection.
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: autofs
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ian Kent
QA Contact: yanfu,wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-07 17:57 UTC by Patrik Martinsson
Modified: 2011-08-15 12:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-08-15 12:34:38 UTC
Target Upstream Version:
Embargoed:


Attachments
Name lookup alternate program (722 bytes, text/plain), 2011-03-08 04:44 UTC, Ian Kent
Logs, recording and getaddr-sample. (9.10 MB, application/x-gzip), 2011-03-08 10:13 UTC, Patrik Martinsson
Another name lookup test program (966 bytes, text/plain), 2011-03-14 11:16 UTC, Ian Kent
Video of autofs tests. (62 bytes, text/plain), 2011-03-16 15:52 UTC, Patrik Martinsson

Description Patrik Martinsson 2011-03-07 17:57:59 UTC
Description of problem:
Autofs doesn't seem to be able to get host info through getaddrinfo() when using DNS names. The feature works as expected while the network is up, but if the connection goes down and is then brought up again, it doesn't seem to work properly.


Version-Release number of selected component (if applicable):
autofs-5.0.5-23.el6.x86_64

How reproducible:
Always. 

Steps to Reproduce:
1. Create a map, 
   test -auto,intr,vers=3  fileserver:/vol/test
   
2. /usr/sbin/automount -d -f -n 1 
   
3. cd /autofs/test 

4. Disconnect network. 

5. Connect network. 

6. Run a simple getaddrinfo program to check that we can actually reach the filer.
====
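// gcc getaddr.c -o getaddr
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/socket.h>
#include <netdb.h>

int main(int argc, char *argv[])
{
  struct addrinfo hints, *ni = NULL;
  int ret;

  /* The host name to look up is taken from the first argument. */
  if (argc < 2) {
    fprintf(stderr, "usage: %s <hostname>\n", argv[0]);
    exit(EXIT_FAILURE);
  }

  /* AI_ADDRCONFIG: only return address families configured on this host. */
  memset(&hints, 0, sizeof(struct addrinfo));
  hints.ai_flags = AI_ADDRCONFIG;
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_DGRAM;

  ret = getaddrinfo(argv[1], NULL, &hints, &ni);
  if (ret) {
    fprintf(stderr, "getaddrinfo: %s (%s)\n", gai_strerror(ret), argv[1]);
    exit(EXIT_FAILURE);
  }

  fprintf(stdout, "getaddrinfo: Success on %s \n", argv[1]);

  /* Release the result list allocated by getaddrinfo(). */
  freeaddrinfo(ni);
  return 0;
}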
====

7. Connection verified, try to cd /autofs/test

Actual results:
autofs complains about, 
add_host_addrs: hostname lookup failed: Temporary failure in name resolution

Expected results:
Successful mount. 


Additional info:
If using an IP address instead of a DNS name, this works as expected.
The time it takes after a disconnect/connect before a mount succeeds varies: sometimes it works after a couple of seconds, sometimes it takes minutes, and for each failed try you will see "add_host_addrs: hostname lookup failed: Temporary failure in name resolution", even though the sample program runs fine.

Any help appreciated. Maybe I'm way off here, but I'm very confused as to why my sample getaddrinfo program is able to return an addrinfo structure while autofs is not, which makes the mount fail.

Comment 2 Ian Kent 2011-03-08 02:02:53 UTC
Yes, that's strange behaviour.
I'll check it out.
Ian

Comment 3 Ian Kent 2011-03-08 04:41:13 UTC
(In reply to comment #0)
> Description of problem:
> Autofs doesn't seem to be able to get host info trough getaddrinfo() when using
> dns-names. The feature works well and as expected when having a network, but if
> the connection goes down, and then brought up again it doesn't seem to work
> properly. 

Initial impression is I can't duplicate this but I haven't yet
setup proper DNS.

> 
> 
> Version-Release number of selected component (if applicable):
> autofs-5.0.5-23.el6.x86_64
> 
> How reproducible:
> Always. 
> 

There are a couple of things wrong with this procedure as
well. Perhaps you can redo your test taking account of the
comments below.

> Steps to Reproduce:
> 1. Create a map, 
>    test -auto,intr,vers=3  fileserver:/vol/test

OK, good start, use an invalid mount option, auto, in the map
entry. Probably not an actual problem though.

> 
> 2. /usr/sbin/automount -d -f -n 1 
> 
> 3. cd /autofs/test 
> 
> 4. Disconnect network. 
> 
> 5. Connect network. 
> 
> 6. Run simple getaddrinfo-program to test just to see that we actually can
> reach filer.  

snip ...

Right, so run an independent lookup .... mmm, which doesn't
check if glibc is caching results and is somehow affected by
the network disconnect ..... again probably not the case.

> 
> 6. connection verified, try to cd /autofs/test  

But your working directory is already /autofs/test so
the previous mount should still be mounted and this
should do nothing but set the directory again without
contacting the autofs daemon.

> 
> Actual results:
> autofs complains about, 
> add_host_addrs: hostname lookup failed: Temporary failure in name resolution

You have run the daemon with debug but haven't included the
output so we have no idea what actually happened. Like whether
the mount was expired (but you would have had to change working
directory away, which isn't included in your description) and a
new mount was attempted.

How about redoing the test with a little more detailed feedback
please.

Ian

Comment 4 Ian Kent 2011-03-08 04:44:04 UTC
Created attachment 482834 [details]
Name lookup alternate program

Running the before, during and after as part of the test
should provide a check for some sort of glibc caching
causing a problem.

Comment 5 Patrik Martinsson 2011-03-08 10:12:11 UTC
Thanks for the quick response, sorry for being somewhat sloppy in the bugreport. 

In your getaddr program I added ni = NULL; after the declaration, otherwise freeaddrinfo segfaults if a host is not reachable. I also changed it so that the program never exits. I've attached it together with a debug log and a recording of how I ran the test.

I've tested it with the approach you suggest, 

1. map is now, 
temp -defaults,intr,vers=3 fs5:/vol/vol3/no_backup/temp

2. /usr/sbin/automount -d -f -n 1, the log is attached. 

3. Having your getaddr app running. 

4. cd /autofs/test, mount works. 

5. Pressing enter on getaddr program shows successful getaddrinfo on host.

6. cd / 

7. Disconnect network. 

8. Pressing enter on getaddr program shows unsuccessful getaddrinfo on host.

9. cd /autofs/test, mount does not work, as expected with no network. 

10. Connect network again. 

11. Pressing enter on getaddr program shows successful getaddrinfo on host.

12. cd /autofs/test, mount does not work, debug output says temporary failure in name resolution, retrying this a couple of times with same result. 

13. Pressing enter on getaddr program shows successful getaddrinfo on host.

14. cd /autofs/test, mount does not work, debug output says temporary failure in name resolution. 

15. Closing test. 

I've recorded the test so you can see exactly how it's done. I cannot reproduce this when using an IP address instead of a hostname. This is really not a big issue since we never change IP addresses on the filers; however, I thought I should mention the problem in case anybody else has the same issue. And as I mentioned earlier, sometimes it works after a couple of tries, and sometimes I never seem to be able to mount without restarting the daemon.
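
The interactive test described in this comment can be sketched roughly as follows. This is not the attached program (its source is not shown in this report); it is a minimal illustration that assumes the same hints as the program in the description, initializes the result pointer to NULL so that nothing is freed after a failed lookup, and repeats the lookup each time Enter is pressed. The helper name try_lookup is hypothetical.

```c
/* gcc getaddr-loop.c -o getaddr-loop */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper (not the attachment): perform one lookup,
 * report the outcome and return the getaddrinfo() result code.
 */
static int try_lookup(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *ni = NULL;   /* stays NULL if the lookup fails */
    int ret;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_ADDRCONFIG;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;

    ret = getaddrinfo(host, NULL, &hints, &ni);
    if (ret)
        fprintf(stderr, "getaddrinfo: %s (%s)\n", gai_strerror(ret), host);
    else
        printf("getaddrinfo: Success on %s\n", host);

    /* Only free a result list that was actually returned. */
    if (ni)
        freeaddrinfo(ni);
    return ret;
}

int main(int argc, char *argv[])
{
    int c;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <hostname>\n", argv[0]);
        return 1;
    }

    /* Look up once at start, then again on every Enter; EOF (Ctrl-D) ends the loop. */
    do {
        try_lookup(argv[1]);
        while ((c = getchar()) != '\n' && c != EOF)
            ;
    } while (c != EOF);

    return 0;
}
```

Keeping a single process alive across the disconnect and reconnect is the point of the loop: it exercises the resolver state of one long-running process, which is closer to the daemon's situation than starting a fresh process for each lookup.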

Comment 6 Patrik Martinsson 2011-03-08 10:13:34 UTC
Created attachment 482867 [details]
Logs, recording and getaddr-sample.

Comment 7 Ian Kent 2011-03-09 08:16:33 UTC
(In reply to comment #5)
> Thanks for the quick response, sorry for being somewhat sloppy in the
> bugreport. 

No problem.

> 
> In you getaddr-program I added, ni = NULL; after the declaration, otherwise
> freeaddrinfo segfaults if a host is not reachable. Also changed so we never
> exit the program, I've attached it together with a debuglog and a recording of
> how I made the test. 

Got that.

At first I was wondering why the directory change after the
interface comes back up caused a new mount but I see that the
existing mount gets umounted when the interface goes down,
probably by NetworkManager (in my case), it's definitely not
autofs doing it. I wasn't aware of that behaviour.

At this point I still don't see the name lookup problem you
are seeing but that's on Fedora 14. I don't have an up to
date RHEL-6 install, so that's the next thing I need to do.

I will also update the dns lookup program to do the same
thing as the daemon does. It does an address lookup on the
passed in string and then progresses to a name lookup if
that fails, in case the name is actually an address string.

Ian

Comment 8 Ian Kent 2011-03-14 11:16:49 UTC
Created attachment 484147 [details]
Another name lookup test program

This is basically what the daemon does when looking up a name.
Does using this for the name lookup exhibit the same problem
you are seeing?
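
For readers without access to the attachment, the two-step lookup described in comment 7 (address lookup first, name lookup if that fails) might look something like the sketch below. It is an illustration only, not the attached program and not autofs source; the helper name lookup_host and the choice of flags for each pass are assumptions.

```c
/* gcc lookup.c -o lookup */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper (not autofs code): try the string as a numeric
 * address first, then fall back to a host name lookup.
 * Returns 0 on success and stores the result list in *res;
 * the caller frees it with freeaddrinfo().
 */
static int lookup_host(const char *name, struct addrinfo **res)
{
    struct addrinfo hints;
    int ret;

    *res = NULL;

    /* Pass 1: AI_NUMERICHOST accepts only address strings, no name resolution. */
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_NUMERICHOST;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    ret = getaddrinfo(name, NULL, &hints, res);
    if (ret == 0)
        return 0;

    /* Pass 2: the string was not an address, so resolve it as a name. */
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_ADDRCONFIG;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    *res = NULL;
    return getaddrinfo(name, NULL, &hints, res);
}

int main(int argc, char *argv[])
{
    struct addrinfo *ni = NULL;
    int ret;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <host-or-address>\n", argv[0]);
        return 1;
    }

    ret = lookup_host(argv[1], &ni);
    if (ret) {
        fprintf(stderr, "lookup failed: %s (%s)\n", gai_strerror(ret), argv[1]);
        return 1;
    }

    printf("lookup succeeded: %s\n", argv[1]);
    freeaddrinfo(ni);
    return 0;
}
```

With an IP address in the map entry only the first pass runs, which would be consistent with the report that IP addresses are unaffected by the problem.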

Comment 9 Patrik Martinsson 2011-03-16 15:39:34 UTC
Hey Ian, 

I've redone the tests with the new name lookup program, and I'm seeing the same behaviour as earlier; you can watch the video.

The test is not conclusive since it sometimes works and sometimes doesn't. As you can see in the video, the first mount after the disconnect works as expected, but when I disconnect/connect again I can't get it to mount.

Offhand, I think this sounds like some kind of caching issue, as you've mentioned earlier, but I don't know how much time and effort we should spend on this matter, since it appears that I'm the only one with this problem and it's very random.

As I said earlier, we use the IP-addresses instead and it works fine. 

Although it's always a bit annoying to not find the real issue :)

Best regards, 
Patrik Martinsson

Comment 10 Patrik Martinsson 2011-03-16 15:52:24 UTC
Created attachment 485771 [details]
Video of autofs tests.

Comment 11 Ian Kent 2011-03-17 11:15:13 UTC
(In reply to comment #9)
> Hey Ian, 
> 
> I've redone the tests with the new name lookup program, and I'm seeing the same
> behaviour as earlier, you can watch the video. 

I was hoping the name lookup program would fail in the same way
as automount. I thought maybe it was an initialization problem
specific to your machine because you had to add initialization
to my original program and I didn't see that problem when I used
it. It may still be that since the environment within automount
isn't the same as the test program.

> 
> The test is not conclusive since it tends to work sometimes and sometimes not,
> as you can see in the video the first mount after the disconnect works as
> expected, but then when i disconnect/connect again I cant get it to mount. 
> 
> Spontaneously I think this sounds like a some caching issue as you've mentioned
> earlier, but i don't know how much time and effort we should spend on this
> matter - since it appears that I'm the only one with this problem and it's very
> random. 

Yes, it's quite odd.

Adding specific initialization to the name lookup parts of
autofs is straightforward; it's done in only two places.
I can make a test package on the chance it would fix the
problem.

Another thing from the video is that it looks like the "-n 1"
option isn't being honoured. The video clearly shows the mount
request immediately returning a fail long after the one second
negative timeout. That deserves some investigation. I'll have
a look around and see if I can see what might cause that.

Comment 14 Ian Kent 2011-06-10 01:51:11 UTC
I'm setting devel_ack+ on this bug so I can investigate (and
fix) the observed problem with setting the negative timeout.
Since I can't reproduce the DNS problem at all I can't fix
it until we get a report that has a scenario that gives a
different view of the problem and a lead to follow as to
what the problem is.

Comment 16 Ian Kent 2011-08-04 14:37:51 UTC
I have been unable to reproduce this problem despite a fair
amount of effort. Consequently I don't have a resolution
ready for RHEL-6.2.

Deferring to 6.3.

Comment 17 Patrik Martinsson 2011-08-15 12:12:32 UTC
Hi again Ian, 

I haven't looked into this issue since we changed to rhel 6.1, and since we also changed to ip-addr instead of hostnames. I know that our dns setup is somewhat "not the way it should be", so maybe that has something to do with it (although you would figure that the test-program would get the same result then). 

I think you can close this as invalid and I can look into it in the future and reopen the bug if I hit this issue again. 

Thanks for the great work anyway!

Comment 18 Ian Kent 2011-08-15 12:34:38 UTC
(In reply to comment #17)
> Hi again Ian, 
> 
> I haven't looked into this issue since we changed to rhel 6.1, and since we
> also changed to ip-addr instead of hostnames. I know that our dns setup is
> somewhat "not the way it should be", so maybe that has something to do with it
> (although you would figure that the test-program would get the same result
> then). 

Yes, puzzling.

> 
> I think you can close this as invalid and I can look into it in the future and
> reopen the bug if I hit this issue again. 
> 
> Thanks for the great work anyway!

OK, thanks for getting back to us.
I'll close this INSUFFICIENT_DATA for want of a better status.

Ian

