Description of problem: [Could this be a glibc issue?] I seem to be suffering from (seemingly random) DNS failures on my system. Below shows three consecutive runs of "curl 'http://mirrors.fedoraproject.org/....'", the first and last of which succeed, but the second one fails. [I ran these with no more than 2-3 seconds pause between them.] Of course, this is causing challenges for other applications (like yum, firefox, ....). This appears to happen both with and without nscd running. Also, any idea why these all seem to be taking a minimum of 5 seconds? If I run "nslookup", I get no such failures, and the time is much less than 5 seconds: [root@localhost ~]# time curl 'http://mirrors.fedoraproject.org/mirrorlist?repo=rawhide&arch=i386' # repo = rawhide arch = i386 country = US country = CA http://fedora.mirrors.tds.net/pub/fedora/development/i386/os http://www.gtlib.gatech.edu/pub/fedora.redhat/linux/development/i386/os http://mirrors.cat.pdx.edu/fedora/linux/development/i386/os ftp://fedora.bu.edu/development/i386/os http://mirrors.kernel.org/fedora/development/i386/os http://mirror.cc.vt.edu/pub/fedora/linux/development/i386/os http://ftp.ndlug.nd.edu/pub/fedora/linux/development/i386/os http://archive.linux.duke.edu/pub/fedora/linux/development/i386/os http://mirror.its.uidaho.edu/pub/fedora/linux/development/i386/os http://mirror.hiwaay.net/pub/fedora/linux/development/i386/os http://gulus.usherbrooke.ca/pub/distro/fedora/linux/development/i386/os real 0m5.561s user 0m0.000s sys 0m0.019s [root@localhost ~]# time curl 'http://mirrors.fedoraproject.org/mirrorlist?repo=rawhide&arch=i386' curl: (6) Couldn't resolve host 'mirrors.fedoraproject.org' real 0m5.018s user 0m0.002s sys 0m0.016s [root@localhost ~]# time curl 'http://mirrors.fedoraproject.org/mirrorlist?repo=rawhide&arch=i386' # repo = rawhide arch = i386 country = US country = CA http://mirrors.kernel.org/fedora/development/i386/os http://www.gtlib.gatech.edu/pub/fedora.redhat/linux/development/i386/os ftp://fedora.bu.edu/development/i386/os http://fedora.mirrors.tds.net/pub/fedora/development/i386/os http://mirrors.cat.pdx.edu/fedora/linux/development/i386/os http://mirror.hiwaay.net/pub/fedora/linux/development/i386/os http://mirror.its.uidaho.edu/pub/fedora/linux/development/i386/os http://mirror.cc.vt.edu/pub/fedora/linux/development/i386/os http://ftp.ndlug.nd.edu/pub/fedora/linux/development/i386/os http://archive.linux.duke.edu/pub/fedora/linux/development/i386/os http://gulus.usherbrooke.ca/pub/distro/fedora/linux/development/i386/os real 0m5.191s user 0m0.002s sys 0m0.016s [root@localhost ~]# [root@localhost ~]# while :; do time nslookup mirrors.fedoraproject.org; sleep 2; done Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 209.132.176.122 Name: fedoraproject.org Address: 66.35.62.162 real 0m0.030s user 0m0.006s sys 0m0.013s Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 209.132.176.122 Name: fedoraproject.org Address: 66.35.62.162 real 0m0.101s user 0m0.005s sys 0m0.014s Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 209.132.176.122 Name: fedoraproject.org Address: 66.35.62.162 real 0m0.031s user 0m0.003s sys 0m0.016s Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 66.35.62.162 Name: fedoraproject.org Address: 209.132.176.122 real 0m0.034s user 0m0.005s sys 0m0.011s Version-Release number of selected component (if applicable): glibc-2.8.90-8.i686 bind-utils-9.5.0-37.fc10.i386 glibc-common-2.8.90-8.i386 curl-7.18.2-3.fc10.i386 glibc-headers-2.8.90-8.i386 nscd-2.8.90-8.i386 glibc-debuginfo-common-2.8.90-8.i386 curl-debuginfo-7.18.2-3.fc10.i386 glibc-devel-2.8.90-8.i386 glibc-debuginfo-2.8.90-8.i686 glibc-utils-2.8.90-8.i386 How reproducible: Very, very often Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Here are consecutive runs of "yum -x gnome-sharp\* update", with the first 2 failing and the third succeeding: [root@localhost ~]# yum -x gnome-sharp\* update Loaded plugins: fastestmirror, presto, refresh-packagekit, versionlock Loading mirror speeds from cached hostfile Could not retrieve mirrorlist http://mirrors.fedoraproject.org/mirrorlist?repo=rawhide&arch=i386 error was [Errno 4] IOError: <urlopen error (-2, 'Name or service not known')> Error: Cannot find a valid baseurl for repo: rawhide [root@localhost ~]# [root@localhost ~]# yum -x gnome-sharp\* update Loaded plugins: fastestmirror, presto, refresh-packagekit, versionlock Loading mirror speeds from cached hostfile Could not retrieve mirrorlist http://mirrors.fedoraproject.org/mirrorlist?repo=rawhide&arch=i386 error was [Errno 4] IOError: <urlopen error (-2, 'Name or service not known')> Error: Cannot find a valid baseurl for repo: rawhide [root@localhost ~]# yum -x gnome-sharp\* update Loaded plugins: fastestmirror, presto, refresh-packagekit, versionlock Loading mirror speeds from cached hostfile Could not retrieve mirrorlist http://mirrors.fedoraproject.org/mirrorlist?repo=rawhide&arch=i386 error was [Errno 4] IOError: <urlopen error (-2, 'Name or service not known')> Error: Cannot find a valid baseurl for repo: rawhide [root@localhost ~]# yum -x gnome-sharp\* update Loaded plugins: fastestmirror, presto, refresh-packagekit, versionlock Loading mirror speeds from cached hostfile * rawhide: mirrors.cat.pdx.edu * rawhide-debuginfo: mirrors.cat.pdx.edu Setting up and reading Presto delta metadata rawhide | 2.4 kB 00:00 primary.sqlite.bz2 | 6.5 MB 00:04 No Presto metadata available for rawhide rawhide-debuginfo | 1.9 kB 00:00 primary.sqlite.bz2 | 1.1 MB 00:00 No Presto metadata available for rawhide-debuginfo Excluding Packages in global exclude list Finished Reading version lock configuration Setting up Update Process Resolving Dependencies --> Running transaction check ---> Package openoffice.org-xsltfilter.i386 1:3.0.0-0.0.22.1.fc10 set to be updated <<<<SNIP>>>>
Some more data, this time with www.nytimes.com. I ran 2 shell loops: the first does 5 "curl http://www.nytimes.com", the second does 5 "nslookup www.nytimes.com". All of the curl's took a bit more than 5 seconds but the fourth: it took about .75 seconds. All the nslookup's were less than .1 seconds. [root@localhost ~]# for i in 1 1 1 1 1; do time curl http://www.nytimes.com >/dev/null; done % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 103k 0 103k 0 0 18777 0 --:--:-- 0:00:05 --:--:-- 240k real 0m5.667s user 0m0.005s sys 0m0.032s % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 104k 0 104k 0 0 18781 0 --:--:-- 0:00:05 --:--:-- 233k real 0m5.687s user 0m0.002s sys 0m0.025s % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 104k 0 104k 0 0 18870 0 --:--:-- 0:00:05 --:--:-- 232k real 0m5.685s user 0m0.003s sys 0m0.022s % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 103k 0 103k 0 0 145k 0 --:--:-- --:--:-- --:--:-- 229k real 0m0.751s user 0m0.002s sys 0m0.030s % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 104k 0 104k 0 0 18855 0 --:--:-- 0:00:05 --:--:-- 230k real 0m5.696s user 0m0.002s sys 0m0.024s [root@localhost ~]# for i in 1 1 1 1 1; do time nslookup www.nytimes.com >/dev/null; done real 0m0.054s user 0m0.003s sys 0m0.016s real 0m0.032s user 0m0.000s sys 0m0.012s real 0m0.024s user 0m0.002s sys 0m0.010s real 0m0.026s user 0m0.003s sys 0m0.009s real 0m0.026s user 0m0.001s sys 0m0.011s [root@localhost ~]#
timing nslookup vs. curl isn't exactly fair nor interesting, as curl does a lot of stuff nslookup doesn't. You can use getent hosts www.nytimes.com and/or getent ahosts www.nytimes.com to do a more useful comparisons.
Sorry for the miscue with 'nslookup'. Trying the above plus 'getent hosts' at work (different ISP, different DNS) fails to reproduce the issue: [root@localhost ~]# time getent hosts www.nytimes.com 199.239.137.200 www.nytimes.com 199.239.137.245 www.nytimes.com 199.239.136.200 www.nytimes.com 199.239.136.245 www.nytimes.com real 0m0.015s user 0m0.001s sys 0m0.007s [root@localhost ~]# time getent hosts mirrors.fedoraproject.org 209.132.176.122 fedoraproject.org mirrors.fedoraproject.org 66.35.62.162 fedoraproject.org mirrors.fedoraproject.org real 0m0.401s user 0m0.002s sys 0m0.006s [root@localhost ~]# time curl http://mirrors.fedoraproject.org >/dev/null % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 155k 100 155k 0 0 167k 0 --:--:-- --:--:-- --:--:-- 214k real 0m0.945s user 0m0.000s sys 0m0.025s [root@localhost ~]# Home ISP is Comcast. I'll retest tonight when I get home..... [If there is some ISP related issue, what should I be looking for.... ?]
Back at home (Comcast), and there is a difference: curl takes about 5 seconds about 4 out of 5 times, getent and nslookup take a few hundreths of a second each. I'll attach the complete output as an attachment. Basically, there are about half a dozen consecutive runs with the curl taking more than five seconds, followed by 2 runs with the curl taking less than half a second, followed by the remaining runs back at more than 5 seconds. There seems to be something about my Comcast connection. This affects more than just curl, I also get about 5 second delays in firefox, yum, etc. Any suggestions on how to "track this down" will be greatly appreciated! [tbl@localhost ~]$ for i in 1 1 1 1 1 1 1 1 1 1; do time curl http://mirrors.fedoraproject.org >/dev/null; time getent hosts mirrors.fedoraproject.org; time nslookup mirrors.fedoraproject.org; done % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 155k 100 155k 0 0 29682 0 0:00:05 0:00:05 --:--:-- 579k real 0m5.390s user 0m0.003s sys 0m0.048s 209.132.176.122 fedoraproject.org mirrors.fedoraproject.org 66.35.62.162 fedoraproject.org mirrors.fedoraproject.org real 0m0.031s user 0m0.001s sys 0m0.007s Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 66.35.62.162 Name: fedoraproject.org Address: 209.132.176.122 real 0m0.101s user 0m0.001s sys 0m0.018s <<<<SNIP ..... more of the above>>>>> <<<<Now 2 in a row at under half a second>>>> % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 155k 100 155k 0 0 449k 0 --:--:-- --:--:-- --:--:-- 637k real 0m0.361s user 0m0.005s sys 0m0.032s 209.132.176.122 fedoraproject.org mirrors.fedoraproject.org 66.35.62.162 fedoraproject.org mirrors.fedoraproject.org real 0m0.032s user 0m0.002s sys 0m0.006s Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 66.35.62.162 Name: fedoraproject.org Address: 209.132.176.122 real 0m0.039s user 0m0.001s sys 0m0.016s % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 155k 100 155k 0 0 362k 0 --:--:-- --:--:-- --:--:-- 515k real 0m0.445s user 0m0.005s sys 0m0.024s 209.132.176.122 fedoraproject.org mirrors.fedoraproject.org 66.35.62.162 fedoraproject.org mirrors.fedoraproject.org real 0m0.032s user 0m0.001s sys 0m0.007s Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 209.132.176.122 Name: fedoraproject.org Address: 66.35.62.162 real 0m0.028s user 0m0.003s sys 0m0.013s <<<<< Now back to 5.x seconds for curl >>>>> % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 155k 100 155k 0 0 29780 0 0:00:05 0:00:05 --:--:-- 585k real 0m5.359s user 0m0.003s sys 0m0.020s 209.132.176.122 fedoraproject.org mirrors.fedoraproject.org 66.35.62.162 fedoraproject.org mirrors.fedoraproject.org real 0m0.033s user 0m0.000s sys 0m0.004s Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 66.35.62.162 Name: fedoraproject.org Address: 209.132.176.122 real 0m0.044s user 0m0.002s sys 0m0.011s <<<<< SNIP >>>>> [tbl@localhost ~]$
Created attachment 311215 [details] output of 10 conscutive runs of "curl; getent; nslookup" on mirrors.fedoraproject.org
(In reply to comment #5) > Back at home (Comcast), and there is a difference: curl takes about 5 seconds > about 4 out of 5 times, getent and nslookup take a few hundreths of a second > each. I'll attach the complete output as an attachment. Since getent and nslookup behave (approximately) the same this has nothing to do with name resolution. It's either a problem with the networking code or just a problem with your connection. curl actually downloads something and this means the server and network play into it. This might very well explain the difference in timings. Use strace -tt -T <program> to show where the time is spent. Anyway, you haven't shown any DNS problems, just slow network traffic. If there are problems with your network connection this of course can also explain DNS problems. Unless you say that you still also see DNS issues you should close this bug. If and when you can see problems with the networking code you can file a bug for the kernel. I very much doubt there is anything like this, though.
Created attachment 311265 [details] Output of "strace -tt curl http://mirrors.fedoraproject.org" Thanks again for helping here, as this has gotten me flustered..... I attach the output of "strace -tt curl http://mirrors.fedoraproject.org". Here is a snippet showing the 5 second "pause". My limited understanding of this is that the process sits and waits 5 seconds on a poll([{fd=3, events=POLLIN}]) after receiving (partial?) response from DNS. This appears to be in a call after loading /lib/libnss_dns.so.2. Sorry if I am "crying wolf" here.... but this is quite reproducible..... 06:24:21.019907 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3 06:24:21.020347 connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("68.87.76.178")}, 28) = 0 06:24:21.020591 fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR) 06:24:21.020749 fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 06:24:21.020893 gettimeofday({1215523461, 20976}, NULL) = 0 06:24:21.021080 poll([{fd=3, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1 06:24:21.021277 send(3, "\250\345\1\0\0\1\0\0\0\0\0\0\7mirrors\rfedoraproje"..., 43, MSG_NOSIGNAL) = 43 06:24:21.021674 poll([{fd=3, events=POLLIN|POLLOUT, revents=POLLOUT}], 1, 5000) = 1 06:24:21.022121 send(3, "W\375\1\0\0\1\0\0\0\0\0\0\7mirrors\rfedoraproje"..., 43, MSG_NOSIGNAL) = 43 06:24:21.022566 gettimeofday({1215523461, 22725}, NULL) = 0 06:24:21.022824 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 4998) = 1 06:24:21.033435 ioctl(3, FIONREAD, [104]) = 0 06:24:21.033781 recvfrom(3, "W\375\201\200\0\1\0\1\0\1\0\0\7mirrors\rfedoraproje"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("68.87.76.178")}, [16]) = 104 06:24:21.034510 gettimeofday({1215523461, 34602}, NULL) = 0 06:24:21.034705 poll([{fd=3, events=POLLIN}], 1, 4986) = 0 06:24:26.021112 close(3) = 0 06:24:26.022035 time(NULL) = 1215523466 06:24:26.022219 write(2, "c", 1c) = 1 06:24:26.022398 write(2, "u", 1u) = 1
Manually clearing NEEDINFO.
(In reply to comment #8) > My limited understanding of this is that the process sits and waits 5 seconds > on a poll([{fd=3, events=POLLIN}]) after receiving (partial?) response from > DNS. You need to collect the DNS trafic using wireshark as well. The strace output alone is not sufficient. I.e., collect strace and wireshark output at the same time.
Also, if curl uses getaddrinfo rather than gethostbyname, you should try getent ahosts rather than getent hosts.
Furthermore as I explained before, use -tt -T for strace, not just -tt. Without we don't know for sure how long the syscall takes.
Sorry, I missed the '-T' above; my oversight. I'll rerun as you describe. Debugging my script (converting to ahosts), I get the following "puzzling" result. The first run of "curl" fails, but the 'getent ahosts' seem to be taking about 5 seconds now, just like curl. I'm working on the script to generate the strace output, and I'll figure out how to run/capture wireshark. Thanks again for your help. [tbl@localhost ~]$ for i in 1 1 1 1 1 1 1 1 1 1; do echo curl; time curl http://mirrors.fedoraproject.org >/dev/null; echo getent ahosts; time getent ahosts mirrors.fedoraproject.org; echo nslookup; time nslookup mirrors.fedoraproject.org; done curl curl: (6) Couldn't resolve host 'mirrors.fedoraproject.org' real 0m5.053s user 0m0.004s sys 0m0.020s getent ahosts 209.132.176.122 STREAM fedoraproject.org 209.132.176.122 DGRAM 209.132.176.122 RAW 66.35.62.162 STREAM 66.35.62.162 DGRAM 66.35.62.162 RAW real 0m5.036s user 0m0.000s sys 0m0.010s nslookup Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 66.35.62.162 Name: fedoraproject.org Address: 209.132.176.122 real 0m0.339s user 0m0.006s sys 0m0.021s curl % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 155k 100 155k 0 0 29413 0 0:00:05 0:00:05 --:--:-- 491k real 0m5.428s user 0m0.001s sys 0m0.024s getent ahosts 66.35.62.162 STREAM fedoraproject.org 66.35.62.162 DGRAM 66.35.62.162 RAW 209.132.176.122 STREAM 209.132.176.122 DGRAM 209.132.176.122 RAW real 0m5.008s user 0m0.000s sys 0m0.008s nslookup Server: 68.87.76.178 Address: 68.87.76.178#53 Non-authoritative answer: mirrors.fedoraproject.org canonical name = fedoraproject.org. Name: fedoraproject.org Address: 66.35.62.162 Name: fedoraproject.org Address: 209.132.176.122 real 0m0.041s user 0m0.002s sys 0m0.017s
Created attachment 311282 [details] output of 'strace -tt -T curl ....' Here is a '5 second' curl output. I'm still working on the wireshark, and I'm looking at the strace output for the calls to curl that fail with "Couldn't resolve ...."
OK. I ran [tbl@localhost ~]$ time strace -o curl-strace.wire -tt -T curl http://mirrors.fedoraproject.org >/dev/null curl: (6) Couldn't resolve host 'mirrors.fedoraproject.org' real 0m5.157s user 0m0.012s sys 0m0.140s [tbl@localhost ~]$ 5 seconds, and "Couldn't resolve". I attach below first the "strace" output, and then the wireshark capture (sorry for a few extra packets). Hope this is what you asked for. If not I can run again....
Created attachment 311290 [details] Output of 'strace -tt -T curl ...' when capturing with wireshark This strace is of "Couldn't resolve" run
Created attachment 311291 [details] wireshark capture of packets (including DNS) during "strace -tt -T curl ..." Wireshark capture.....
Here's another run: [tbl@localhost ~]$ time strace -o curl-strace.txt -tt -T curl http://mirrors.fedoraproject.org >/dev/null % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 155k 100 155k 0 0 29458 0 0:00:05 0:00:05 --:--:-- 568k real 0m5.558s user 0m0.032s sys 0m0.296s [tbl@localhost ~]$ The wireshark trace shows the DNS traffic and what appears to be a 5 second pause with no DNS retries. Strace shows: 06:18:55.465098 ioctl(3, FIONREAD, [89]) = 0 <0.000029> 06:18:55.465385 recvfrom(3, "\201\331\201\200\0\1\0\3\0\0\0\0\7mirrors\rfedoraproje"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("68.87.76.178")}, [16]) = 89 <0.000034> 06:18:55.465960 gettimeofday({1215609535, 466055}, NULL) = 0 <0.000027> 06:18:55.466192 poll([{fd=3, events=POLLIN}], 1, 4985) = 0 <4.985732> 06:19:00.452707 close(3) This seems to happen to me from home (Comcast), not at work (NTT/ATT) (or not as frequently). I attach below the output of strace and the capture file.
Created attachment 311373 [details] output of 'strace -tt -T curl http://mirrors.fedoraproject.org'
Created attachment 311374 [details] Wireshark capture of packets during 'strace -tt -T http://mirrors.fedoraproject.org' Shows 5 second pause after DNS ....
For completeness, here is output of "strace -tt -T getent ahosts mirrors.fedoraproject.org" with the same 5 second pause. getent appears to load in 'lib/resolve.so before it 'hangs'. [tbl@localhost ~]$ strace -tt -T getent ahosts mirrors.fedoraproject.org 06:55:06.661389 execve("/usr/bin/getent", ["getent", "ahosts", "mirrors.fedoraproject.org"], [/* 40 vars */]) = 0 <0.000716> 06:55:06.663825 brk(0) = 0x81e5000 <0.000058> 06:55:06.664229 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000116> 06:55:06.664754 open("/etc/ld.so.cache", O_RDONLY) = 3 <0.000128> 06:55:06.665397 fstat64(3, {st_mode=S_IFREG|0644, st_size=104170, ...}) = 0 <0.000061> 06:55:06.666061 mmap2(NULL, 104170, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80e4000 <0.000071> 06:55:06.666373 close(3) = 0 <0.000058> 06:55:06.666702 open("/lib/libc.so.6", O_RDONLY) = 3 <0.000136> 06:55:06.667107 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\267`\0004\0\0\0"..., 512) = 512 <0.000069> 06:55:06.667563 fstat64(3, {st_mode=S_IFREG|0755, st_size=1511052, ...}) = 0 <0.000060> 06:55:06.668224 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb80e3000 <0.000066> 06:55:06.668549 mmap2(0x5f5000, 1517136, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x5f5000 <0.000071> 06:55:06.668848 mmap2(0x762000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16c) = 0x762000 <0.000091> 06:55:06.669211 mmap2(0x765000, 9808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x765000 <0.000073> 06:55:06.669495 close(3) = 0 <0.000058> 06:55:06.669816 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb80e2000 <0.000064> 06:55:06.670090 set_thread_area({entry_number:-1 -> 6, base_addr:0xb80e26c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 <0.000059> 06:55:06.670524 mprotect(0x804d000, 4096, PROT_READ) = 0 <0.000073> 06:55:06.670803 mprotect(0x762000, 8192, PROT_READ) = 0 <0.000070> 06:55:06.671105 mprotect(0x5f1000, 4096, PROT_READ) = 0 <0.000070> 06:55:06.671408 munmap(0xb80e4000, 104170) = 0 <0.000093> 06:55:06.672053 brk(0) = 0x81e5000 <0.000059> 06:55:06.672335 brk(0x8206000) = 0x8206000 <0.000063> 06:55:06.672649 open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3 <0.000138> 06:55:06.673140 fstat64(3, {st_mode=S_IFREG|0644, st_size=80040704, ...}) = 0 <0.000060> 06:55:06.673788 mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7ee2000 <0.000181> 06:55:06.674244 close(3) = 0 <0.000033> 06:55:06.674546 socket(PF_NETLINK, SOCK_RAW, 0) = 3 <0.000066> 06:55:06.674752 bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0 <0.000059> 06:55:06.674981 getsockname(3, {sa_family=AF_NETLINK, pid=5515, groups=00000000}, [12]) = 0 <0.000034> 06:55:06.675552 time(NULL) = 1215611706 <0.000033> 06:55:06.675762 sendto(3, "\24\0\0\0\26\0\1\3:\303tH\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20 <0.000080> 06:55:06.676073 recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"0\0\0\0\24\0\2\0:\303tH\213\25\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 168 <0.000080> 06:55:06.676512 recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"@\0\0\0\24\0\2\0:\303tH\213\25\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 192 <0.000073> 06:55:06.676923 recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\3\0\2\0:\303tH\213\25\0\0\0\0\0\0\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 20 <0.000045> 06:55:06.677312 close(3) = 0 <0.000087> 06:55:06.677589 socket(PF_FILE, SOCK_STREAM, 0) = 3 <0.000070> 06:55:06.677786 fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000035> 06:55:06.678013 connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000083> 06:55:06.678529 close(3) = 0 <0.000063> 06:55:06.678741 socket(PF_FILE, SOCK_STREAM, 0) = 3 <0.000060> 06:55:06.678928 fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000039> 06:55:06.679129 connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000073> 06:55:06.679546 close(3) = 0 <0.000058> 06:55:06.679759 open("/etc/nsswitch.conf", O_RDONLY) = 3 <0.000083> 06:55:06.680030 fstat64(3, {st_mode=S_IFREG|0644, st_size=1678, ...}) = 0 <0.000057> 06:55:06.680416 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb80fd000 <0.000045> 06:55:06.680591 read(3, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 4096) = 1678 <0.000100> 06:55:06.680918 read(3, "", 4096) = 0 <0.000042> 06:55:06.681082 close(3) = 0 <0.000046> 06:55:06.681265 munmap(0xb80fd000, 4096) = 0 <0.000055> 06:55:06.681453 open("/etc/host.conf", O_RDONLY) = 3 <0.000078> 06:55:06.681692 fstat64(3, {st_mode=S_IFREG|0644, st_size=17, ...}) = 0 <0.000040> 06:55:06.682058 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb80fd000 <0.000059> 06:55:06.682245 read(3, "order hosts,bind\n", 4096) = 17 <0.000079> 06:55:06.682498 read(3, "", 4096) = 0 <0.000041> 06:55:06.682659 close(3) = 0 <0.000046> 06:55:06.682824 munmap(0xb80fd000, 4096) = 0 <0.000051> 06:55:06.683005 getpid() = 5515 <0.000038> 06:55:06.683185 open("/etc/resolv.conf", O_RDONLY) = 3 <0.000077> 06:55:06.683426 fstat64(3, {st_mode=S_IFREG|0644, st_size=152, ...}) = 0 <0.000039> 06:55:06.683790 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb80fd000 <0.000043> 06:55:06.683961 read(3, "# generated by NetworkManager, d"..., 4096) = 152 <0.000078> 06:55:06.684273 read(3, "", 4096) = 0 <0.000041> 06:55:06.684434 close(3) = 0 <0.000045> 06:55:06.684600 munmap(0xb80fd000, 4096) = 0 <0.000048> 06:55:06.684796 open("/etc/ld.so.cache", O_RDONLY) = 3 <0.000078> 06:55:06.685038 fstat64(3, {st_mode=S_IFREG|0644, st_size=104170, ...}) = 0 <0.000304> 06:55:06.685671 mmap2(NULL, 104170, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80e4000 <0.000046> 06:55:06.685840 close(3) = 0 <0.000041> 06:55:06.686035 open("/lib/libnss_files.so.2", O_RDONLY) = 3 <0.000112> 06:55:06.686347 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0P\31\0\0004\0\0\0"..., 512) = 512 <0.000070> 06:55:06.686624 fstat64(3, {st_mode=S_IFREG|0755, st_size=46728, ...}) = 0 <0.000040> 06:55:06.686999 mmap2(NULL, 49812, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x111000 <0.000048> 06:55:06.687200 mmap2(0x11c000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xa) = 0x11c000 <0.000053> 06:55:06.687399 close(3) = 0 <0.000039> 06:55:06.687598 mprotect(0x11c000, 4096, PROT_READ) = 0 <0.000046> 06:55:06.687789 munmap(0xb80e4000, 104170) = 0 <0.000056> 06:55:06.687986 open("/etc/hosts", O_RDONLY|0x80000 /* O_??? */) = 3 <0.000094> 06:55:06.688242 fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC) <0.000039> 06:55:06.688415 fstat64(3, {st_mode=S_IFREG|0644, st_size=140, ...}) = 0 <0.000039> 06:55:06.688780 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb80fd000 <0.000043> 06:55:06.688953 read(3, "# Do not remove the following li"..., 4096) = 140 <0.000079> 06:55:06.689265 read(3, "", 4096) = 0 <0.000041> 06:55:06.689428 close(3) = 0 <0.000045> 06:55:06.689594 munmap(0xb80fd000, 4096) = 0 <0.000047> 06:55:06.689771 open("/etc/ld.so.cache", O_RDONLY) = 3 <0.000077> 06:55:06.690012 fstat64(3, {st_mode=S_IFREG|0644, st_size=104170, ...}) = 0 <0.000040> 06:55:06.690394 mmap2(NULL, 104170, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80e4000 <0.000045> 06:55:06.690559 close(3) = 0 <0.000038> 06:55:06.690747 open("/lib/libnss_dns.so.2", O_RDONLY) = 3 <0.000108> 06:55:06.691029 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\f\0\0004\0\0\0"..., 512) = 512 <0.000081> 06:55:06.695474 fstat64(3, {st_mode=S_IFREG|0755, st_size=22100, ...}) = 0 <0.000042> 06:55:06.695878 mmap2(NULL, 24708, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x11e000 <0.000053> 06:55:06.696068 mmap2(0x123000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4) = 0x123000 <0.000055> 06:55:06.696288 close(3) = 0 <0.000040> 06:55:06.696473 open("/lib/libresolv.so.2", O_RDONLY) = 3 <0.000127> 06:55:06.696773 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\221m\0024\0\0\0"..., 512) = 512 <0.000068> 06:55:06.697045 fstat64(3, {st_mode=S_IFREG|0755, st_size=77496, ...}) = 0 <0.000055> 06:55:06.697425 mmap2(0x26d7000, 84164, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x26d7000 <0.000445> 06:55:06.698008 mmap2(0x26e8000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x11) = 0x26e8000 <0.000073> 06:55:06.698241 mmap2(0x26ea000, 6340, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x26ea000 <0.000048> 06:55:06.698414 close(3) = 0 <0.000040> 06:55:06.698617 mprotect(0x26e8000, 4096, PROT_READ) = 0 <0.000048> 06:55:06.698793 mprotect(0x123000, 4096, PROT_READ) = 0 <0.000047> 06:55:06.698970 munmap(0xb80e4000, 104170) = 0 <0.000061> 06:55:06.699252 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3 <0.000082> 06:55:06.699480 connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("68.87.76.178")}, 28) = 0 <0.000057> 06:55:06.699734 fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR) <0.000040> 06:55:06.699896 fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000039> 06:55:06.700055 gettimeofday({1215611706, 700138}, NULL) = 0 <0.000056> 06:55:06.700255 poll([{fd=3, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1 <0.000040> 06:55:06.700467 send(3, "\253\10\1\0\0\1\0\0\0\0\0\0\7mirrors\rfedoraproje"..., 43, MSG_NOSIGNAL) = 43 <0.000106> 06:55:06.700773 poll([{fd=3, events=POLLIN|POLLOUT, revents=POLLOUT}], 1, 5000) = 1 <0.000040> 06:55:06.700983 send(3, "\357G\1\0\0\1\0\0\0\0\0\0\7mirrors\rfedoraproje"..., 43, MSG_NOSIGNAL) = 43 <0.000066> 06:55:06.701296 gettimeofday({1215611706, 701366}, NULL) = 0 <0.000040> 06:55:06.701474 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 4998) = 1 <0.082569> 06:55:06.784246 ioctl(3, FIONREAD, [104]) = 0 <0.000043> 06:55:06.784543 recvfrom(3, "\357G\201\200\0\1\0\1\0\1\0\0\7mirrors\rfedoraproje"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("68.87.76.178")}, [16]) = 104 <0.000051> 06:55:06.785102 gettimeofday({1215611706, 785191}, NULL) = 0 <0.000040> 06:55:06.785302 poll([{fd=3, events=POLLIN}], 1, 4914) = 0 <4.914651> 06:55:11.700232 close(3) = 0 <0.000117> 06:55:11.700662 exit_group(2) = ? [tbl@localhost ~]$
Narrowing this a bit..... I booted up my wife's Fedora 9 system @home (on same LAN, using same router, same ISP (Comcast)), and I do not see this problem. 'time curl http://mirrors.fedoraproject.org' always succeeds, and always takes about 1.5 seconds or less. Is there some additional debugging/tracing I can do?
More data: I changed the nameserver entries in my /etc/resolv.conf to point at the nameservers from my "@work" ISP. All worked perfectly and fast (< 1 sec). So something different between Comcast and other DNS? Here are the nameservers that "work": nameserver 208.67.222.222 nameserver 208.67.220.220 Here are the Comcast nameservers that "don't work" with Rawhide (but do with F9): nameserver 68.87.76.178 nameserver 68.87.78.130 Here is snippet from strace of curl: 07:44:30.259107 send(3, "3\376\1\0\0\1\0\0\0\0\0\0\7mirrors\rfedoraproje"..., 43, MSG_NOSIGNAL) = 43 <0.000052> 07:44:30.259327 gettimeofday({1215614670, 259384}, NULL) = 0 <0.000027> 07:44:30.259466 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 4999) = 1 <0.015200> 07:44:30.274875 ioctl(3, FIONREAD, [57]) = 0 <0.000058> 07:44:30.275178 recvfrom(3, "3\376\201\200\0\1\0\1\0\0\0\0\7mirrors\rfedoraproje"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("208.67.222.222")}, [16]) = 57 <0.000044> 07:44:30.275590 gettimeofday({1215614670, 275651}, NULL) = 0 <0.000031> 07:44:30.275738 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 4982) = 1 <0.087030> 07:44:30.362993 ioctl(3, FIONREAD, [89]) = 0 <0.000090> 07:44:30.363370 recvfrom(3, "cm\201\200\0\1\0\3\0\0\0\0\7mirrors\rfedoraproje"..., 1991, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("208.67.222.222")}, [16]) = 89 <0.000082> 07:44:30.364083 close(3) = 0 <0.000109> 07:44:30.364494 open("/etc/gai.conf", O_RDONLY) = -1 ENOENT (No such file or directory) <0.000124>
OK. This is a bit spookey...... This morning I decided to edit out the first nameserver from /etc/resolv.conf: [root@localhost ~]# cat /etc/resolv.conf # generated by NetworkManager, do not edit! domain hsd1.ca.comcast.net. search hsd1.ca.comcast.net. # nameserver 68.87.76.178 nameserver 68.87.78.130 [root@localhost ~]# The result is that the issue vanishes: [root@localhost ~]# time getent ahosts mirrors.fedoraproject.org 209.132.176.122 STREAM fedoraproject.org 209.132.176.122 DGRAM 209.132.176.122 RAW 66.35.62.162 STREAM 66.35.62.162 DGRAM 66.35.62.162 RAW real 0m0.118s user 0m0.000s sys 0m0.007s [root@localhost ~]# Uncommenting the nameserver line makes the 5 second pause come back: [tbl@localhost ~]$ time getent ahosts mirrors.fedoraproject.org real 0m5.008s user 0m0.001s sys 0m0.008s [tbl@localhost ~]$ So this appears to be an issue with 'nameserver 68.87.76.178' (but not with 'nameserver 68.87.78.130'). Sigh. I'll do some 'differential comparisons' of wireshark captures. Double sigh....
I've captured two wireshark runs with "getent ahosts mirrors.fedoraproject.org". The first (using the "good" nameserver) shows 4 DNS packets: 2 outgoing request packets to the nameserver, and 2 response packets from the nameserver, the second with the "A 66.35.62.162" answer. The second (using the "bad" nameserver) shows 3 DNS packets: the same 2 outgoing request packets to the nameserver, and 1 response packet from the nameserver with the "A 66.35.62.162". The resolver then hangs for 5 seconds in a poll(). Could the resolver code still be wanting the other packet even though it has gotten the response? I'll attach text dumps of the 2 capture files below.
Created attachment 311485 [details] wireshark capture showing "successful" 'getent ahosts mirrors.fedoraproject.org' This one has 2 DNS response packets. getent completes in a few tenths of a second.
Created attachment 311486 [details] wireshark capture showing "5 second" 'getent ahosts mirrors.fedoraproject.org' This one has only one DNS response packet. getent takes 5 seconds and fails.
(In reply to comment #25) > > The first (using the "good" nameserver) shows 4 DNS packets: 2 outgoing request > packets to the nameserver, and 2 response packets from the nameserver, the > second with the "A 66.35.62.162" answer. > > The second (using the "bad" nameserver) shows 3 DNS packets: the same 2 outgoing > request packets to the nameserver, and 1 response packet from the nameserver > with the "A 66.35.62.162". The resolver then hangs for 5 seconds in a poll(). OK, this is expected. This would also happen for lookups with older releases. The only time you wouldn't see the delay if the program explicitly requests only IPv4 addresses (i.e., it is not prepared for IPv6). The nameserver simply doesn't react to IPv6 requests. This is a server configuration problem and you should contact your ISV. The second server in your file does it right, it replies with a CNAME but nothing else. Since the lookup happens using UDP the only way for the resolver to notice that the server is not replying is to wait for the timeout (which is 5 secs). To confirm you can try an F9 installation (or older). Use getent ahosts mirrors.fedoraproject.org (note: ahosts, not hosts). With this you'll see that the program will issue two requests and also only receive one reply. With this code the replies are issued one after the other instead of concurrently (that's an improvement made in newer code) so the order might look different. If you see a different behaviour using this code I'd be surprised. This would likely mean the requests we issue changed. I don't think it's the case. So, please report on the behaviour of old code. If you don't have an older installation you can extract the file files needed from some old RPMs. Start getent in this case like /path/to/ld-linux-x86-64.so.2 --library-path /path/to/libs /usr/bin/getent ahosts mirrors.fedoraproject.org where /path/to[...] points to the extracted files.
I think I tested this before on my wife's FC9 system, and I did not observe this issue there. I can test this again when I get home tonight. To be clear, you are expecting this to fail in the same way on F9 as on Rawhide, right? If I need to wrestle with Comcast, I'd like to have all the data in place.... Thanks again for your help here.
(In reply to comment #29) > I think I tested this before on my wife's FC9 system, and I did not observe this > issue there. This really would surprise me. Don't use curl or so, use getent ahosts. Who knows whether curl was changed. > To be clear, you are expecting this to fail in the same way on F9 as on Rawhide, > right? WRT to getent ahosts, yes. If this does show a difference then the server reacts differently. This must mean the requests somehow differ. This is easy to verify using tcpdump. Whether applications behave the same is a completely different issues. Oh, and of course getent ahosts will not try to query IPv6 addresses if no IPv6 interface exists. So, that playing field has to be levelled, too.
Sorry for the delay, but I discovered that my wife's system was an older F8, not F9. So I spent this morning doing a clean install of F9 on that box (along with released updates), afterwhich I ran "time getent ahosts mirrors.fedoraproject.org". The bottom line is that the stock F9 system does not display this condition. "time getent ahosts mirrors.fedoraproject.org" with the unmodified /etc/resolv.conf completes in less than a tenth of a second. At the same time, my Rawhide system, connected to the same Linksys router takes 5 seconds (and fails): [root@localhost ~]# time getent ahosts mirrors.fedoraproject.org real 0m5.008s user 0m0.002s sys 0m0.006s [root@localhost ~]# So this F9 system is not failing the same way as Rawhide against the same DNS server. I attach below an strace output of running "time getent ahosts mirrors.fedoraproject.org" (sorry, I forgot the "-tt -T", but the output from time shows this taking 0.09 seconds). I make my Rawhide system "friendly" by commenting out the first nameserver in /etc/resolv.conf. Would wireshark outputs from F9 be of use?
Created attachment 311652 [details] Console output from F9 system running "cat /etc/resolv.conf; time strace getent ahosts mirrors.fedoraproject.org" Console output from fresh F9 system running "cat /etc/resolv.conf; time strace getent ahosts mirrors.fedoraproject.org"
Sorry for the delay. I've now installed wireshark on a basically stock F9 system and traced "time getent ahosts mirrors.fedoraproject.org". I attach the text dump from wireshark below. Compare with the wireshark dump in https://bugzilla.redhat.com/show_bug.cgi?id=454206#c27 obtained on a rawhide system. The bottom line: F9 system completes in fractions of a second, Rawhide system takes 5 seconds and ususally prints nothing. I'm sorry if I am being obtuse here..... I just want to make sure that before I approach Comcast (ugh....) that all the bases are covered. Currently, I manually edit /etc/resolv.conf on my Rawhide system on each boot (when I am at home) to comment out the first nameserver entry. [tbl@localhost ~]$ cat /etc/resolv.conf # generated by NetworkManager, do not edit! domain hsd1.ca.comcast.net. search hsd1.ca.comcast.net. #nameserver 68.87.76.178 nameserver 68.87.78.130 [tbl@localhost ~]$ Here is the (identical but for the comment) resolv.conf from F9: [tbl@localhost ~]$ cat resolv.conf.f9 # generated by NetworkManager, do not edit! domain hsd1.ca.comcast.net. search hsd1.ca.comcast.net. nameserver 68.87.76.178 nameserver 68.87.78.130 [tbl@localhost ~]$ Wireshark dump below....
Created attachment 312482 [details] wireshark capture of "time getent ahosts mirrors.fedoraproject.org" Wireshark capture from stock F9 system on same LAN as mis-behaving rawhide system.
Additional data..... : Comcast added a new name server to my resolv.conf this morning. There are now 3 name servers listed, the same two as before (68.87.76.178, 68.87.78.130), plus a new one (68.87.69.146). Commenting out the 2 "old ones" so the system only uses the "new one" reproduces the "takes 5 seconds" and fails sometimes behavior: [tbl@localhost ~]$ cat /etc/resolv.conf # generated by NetworkManager, do not edit! domain hsd1.ca.comcast.net. search hsd1.ca.comcast.net. #nameserver 68.87.76.178 #nameserver 68.87.78.130 nameserver 68.87.69.146 [tbl@localhost ~]$ [root@localhost ~]# time getent ahosts mirrors.fedoraproject.org 66.35.62.162 STREAM fedoraproject.org 66.35.62.162 DGRAM 66.35.62.162 RAW 209.132.176.122 STREAM 209.132.176.122 DGRAM 209.132.176.122 RAW real 0m5.008s user 0m0.002s sys 0m0.007s [root@localhost ~]# time getent ahosts mirrors.fedoraproject.org real 0m5.008s user 0m0.000s sys 0m0.009s [root@localhost ~]# Obivously Comcast is unaware that they are causing problems....
OK. I read your responses a bit more carefully (esp. #28). After turning off IPV6 on my rawhide system by creating a file in /etc/modprobe.d with: [root@localhost modprobe.d]# cat kill-ipv6 install ipv6 /bin/true [root@localhost modprobe.d]# and turning off ip6tables with chkconfig, the system runs fine with the "faulty (at least for IPv6) DNS server". I believe that fact that IPv6 DNS seems to work "everywhere" but this server shows (per #28) that the server is not properly configured to handle IPv6 DNS requests. I will attempt to work this with Comcast. Thanks for all your help here.....
For completeness, I post here a replay of my Chat with Comcast. Result is not too positive..... :-( -------------------------------------- user Tom_ has entered room Tom(Mon Jul 28 2008 11:03:38 GMT-0700 (PDT))> Comcast DNS servers are all not configured to handle IPv6 DNS requests. How to fix? analyst Robin has entered room Robin(Mon Jul 28 2008 11:03:41 GMT-0700 (PDT))> Hello Tom_, Thank you for contacting Comcast Live Chat Support. My name is Robin. Please give me one moment to review your information. Robin(Mon Jul 28 2008 11:03:56 GMT-0700 (PDT))> Good Morning! Tom_(Mon Jul 28 2008 08:04:00 GMT-0700 (PDT))> hi Robin(Mon Jul 28 2008 11:04:34 GMT-0700 (PDT))> I am not quite sure what you are asking for. Are you able to explain in a little bit more detail please? Tom_(Mon Jul 28 2008 08:05:12 GMT-0700 (PDT))> Sure. Comcast DHCP provides my system with 3 "nameserver" DNS servers: Tom_(Mon Jul 28 2008 08:05:19 GMT-0700 (PDT))> nameserver 68.87.76.178 nameserver 68.87.78.130 nameserver 68.87.69.146 Tom_(Mon Jul 28 2008 08:06:00 GMT-0700 (PDT))> Only the second of these (68.86.79.130) properly responds to my systems IPV6 DNS requests. The other 2 provide buggy responses. Tom_(Mon Jul 28 2008 08:06:35 GMT-0700 (PDT))> Can you have Tier 3/4 tech support review the configurations for these servers? Robin(Mon Jul 28 2008 11:06:32 GMT-0700 (PDT))> Okay, Before we get started, For your protection, and the security of your account; Can you please provide *one* of the following items? The last four-digits of the Social Security Number on the account, - OR - the last payment amount to post to the account, - OR - the account number - OR - the customer number. Tom_(Mon Jul 28 2008 08:07:06 GMT-0700 (PDT))> Uhhhh Robin(Mon Jul 28 2008 11:08:49 GMT-0700 (PDT))> Thank you, At this time I’ll need to transfer you to our Advanced Technical Support team. One moment please. Robin(Mon Jul 28 2008 11:09:12 GMT-0700 (PDT))> Please wait, while the problem is escalated to another analyst analyst Ron has entered room Ron(Mon Jul 28 2008 11:09:22 GMT-0700 (PDT))> Thank you for waiting while the chat was transferred to me. My name is Ron. One moment please while I read the chat session so far, access your account, and assess the issue. analyst Robin has left room Ron(Mon Jul 28 2008 11:11:04 GMT-0700 (PDT))> Hello Tom. Tom_(Mon Jul 28 2008 08:11:09 GMT-0700 (PDT))> For full technical details, including packet traces/dumps Tom_(Mon Jul 28 2008 08:11:17 GMT-0700 (PDT))> please review: https://bugzilla.redhat.com/show_bug.cgi?id=454206 Tom_(Mon Jul 28 2008 08:11:36 GMT-0700 (PDT))> hi Ron(Mon Jul 28 2008 11:16:45 GMT-0700 (PDT))> While I understand what you are requesting, Comcast does not offer technical support for it. The DNS servers are updated on a regular bases. Well within the two week guidelines set forth by industry standards. Tom_(Mon Jul 28 2008 08:17:13 GMT-0700 (PDT))> Sorry, but you don't understand. Tom_(Mon Jul 28 2008 08:17:24 GMT-0700 (PDT))> Nothing to do with updating the servers for new entries. Tom_(Mon Jul 28 2008 08:17:46 GMT-0700 (PDT))> Your servers are not properly configured to correctly respond to DNS reqeusts from IPv6 DNS clients. Tom_(Mon Jul 28 2008 08:17:55 GMT-0700 (PDT))> Perhaps you can escalate this? Tom_(Mon Jul 28 2008 08:18:37 GMT-0700 (PDT))> One of your DNS servers works properly. The other 2 do not. Tom_(Mon Jul 28 2008 08:18:53 GMT-0700 (PDT))> the packet traces provided in the link above show the details. Ron(Mon Jul 28 2008 11:21:41 GMT-0700 (PDT))> One moment please. Tom_(Mon Jul 28 2008 08:23:41 GMT-0700 (PDT))> Drat. This is not efficient. Is there another way for me to report this/get this fixed? Tom_(Mon Jul 28 2008 08:25:54 GMT-0700 (PDT))> Can I have a case#/ticket# to refer to if/when I try to "call" back? Ron(Mon Jul 28 2008 11:30:03 GMT-0700 (PDT))> I apologize for the delay. Ron(Mon Jul 28 2008 11:30:26 GMT-0700 (PDT))> I send the information you gave me to our networking group. The said they will look into it. Ron(Mon Jul 28 2008 11:30:45 GMT-0700 (PDT))> Since I did this via an internal email to them, there isn't a ticket number. Tom_(Mon Jul 28 2008 08:32:35 GMT-0700 (PDT))> Any ETA for any kind of response? Will they contact me? Ron(Mon Jul 28 2008 11:34:07 GMT-0700 (PDT))> They normally look into the issues I email them the same day as I send them. However, they do have up to 72 hours. Ron(Mon Jul 28 2008 11:34:41 GMT-0700 (PDT))> They just responded to me. Ron(Mon Jul 28 2008 11:35:24 GMT-0700 (PDT))> Currently the Comcast network is designed to use IPv4. Updates with be made when needed for future support of IPv6.
(In reply to comment #34) > Created an attachment (id=312482) [edit] > wireshark capture of "time getent ahosts mirrors.fedoraproject.org" > > Wireshark capture from stock F9 system on same LAN as mis-behaving rawhide > system. This trace shows that IPv6 lookups are not performed. No wonder it's faster in the presence of a name server which doesn't respond to IPv6 queries.
(In reply to comment #27) > Created an attachment (id=311486) [edit] > wireshark capture showing "5 second" 'getent ahosts mirrors.fedoraproject.org' > > This one has only one DNS response packet. getent takes 5 seconds and fails. OK, this one does an IPv6 lookup but there is no answer. Therefore the 5 sec delay. This means there is nothing wrong with the resolver. This is the expected behaviour. If you want the "old" behaviour in the presence of the broken name servers you have to disable all IPv6 addresses on your machine.