Bug 209954
Summary: Caching nameserver setup gets no responses from root servers
Product: [Fedora] Fedora
Component: bind
Version: 6
Hardware: All
OS: Linux
Status: CLOSED RAWHIDE
Severity: medium
Priority: medium
Reporter: Axel Thimm <axel.thimm>
Assignee: Adam Tkac <atkac>
QA Contact: Ben Levenson <benl>
CC: ovasik
Keywords: Reopened
Doc Type: Bug Fix
Last Closed: 2007-04-10 17:27:13 UTC

Description
Axel Thimm
2006-10-08 22:01:28 UTC
I forgot to finish the description: When using the caching nameserver configuration a query on some random name results in a query to one of the root servers which is never replied to.

Have you pinged these servers? Please attach named's messages from /var/log/messages (from named start). btw. do you use network manager?

Yes, the servers pinged fine and no, there was no network manager involved. I replaced the config files with the config files from a working FC5 installation and everything started working, so it looks like an issue with the default config files. Here are the log messages from back then:

Oct 8 22:54:38 fifty named[14218]: starting BIND 9.3.2 -u named -c /etc/named.caching-nameserver.conf
Oct 8 22:54:38 fifty named[14218]: found 2 CPUs, using 2 worker threads
Oct 8 22:54:38 fifty named[14218]: loading configuration from '/etc/named.caching-nameserver.conf'
Oct 8 22:54:38 fifty named[14218]: listening on IPv6 interface lo, ::1#53
Oct 8 22:54:38 fifty named[14218]: listening on IPv4 interface lo, 127.0.0.1#53
Oct 8 22:54:38 fifty named[14218]: command channel listening on 127.0.0.1#953
Oct 8 22:54:38 fifty named[14218]: command channel listening on ::1#953
Oct 8 22:54:38 fifty named[14218]: zone 0.in-addr.arpa/IN/localhost_resolver: loaded serial 42
Oct 8 22:54:38 fifty named[14218]: zone 0.0.127.in-addr.arpa/IN/localhost_resolver: loaded serial 1997022700
Oct 8 22:54:38 fifty named[14218]: zone 255.in-addr.arpa/IN/localhost_resolver: loaded serial 42
Oct 8 22:54:38 fifty named[14218]: zone 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa/IN/localhost_resolver: loaded serial 1997022700
Oct 8 22:54:38 fifty named[14218]: zone localdomain/IN/localhost_resolver: loaded serial 42
Oct 8 22:54:38 fifty named[14218]: zone localhost/IN/localhost_resolver: loaded serial 42
Oct 8 22:54:38 fifty named[14218]: running

Which version works for you?

(from FC5) The config files I copied over from FC5 were not the unmodified ones; I had added local zones around May/June, so these are config files matching the beginning of FC5. I haven't tried with pure FC5 config files from recent bind updates.

I just switched back to the default FC6 config files and the issue is still there, i.e. no response from any root server. Switching to the working setup yields immediate responses from the root servers. So in the default setup something must be getting into the query packets that causes them to be dropped on the root servers' side.

Hm, I'm asking you because I can't reproduce it with any configuration, so it's quite hard to fix it...

Have you tried on ppc hardware? (I don't see a reason for it to be ppc specific, but the system I'm testing this on is FC6/ppc.) Do you want root access to this system? You can do with the nameserver on it as you please, I'll point resolv.conf elsewhere. If you'd like to look at it on the system itself contact me in PM. Other than that I can only offer captured dumps.

It could be a dupe of this one: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=211282

Thanks for your offers, I've found one box where I can reproduce it.

Could you please check bind-9.3.3-6.fc6? There's a new option enable-edns, so try to disable it. (see /usr/share/doc/bind-9.3.3/misc/options)

Shouldn't the option be off by default to ensure proper operation?

Definitely not. At the very least it would break DNSSEC.

But now it breaks on standard router hardware in the path to root servers, I think this is more severe, or not?

EDNS is a really good thing that should be used, and it has been around for many years. I don't think we should deform our package because of wrongly configured routers/firewalls. btw. the generic upstream package doesn't even allow disabling EDNS for all queries....

In that case this bug is unrelated to EDNS (and therefore not a sibling of bug #211282), i.e. EDNS is not the issue, since this network segment has been running DNS services for > 2 decades now. It also works with a config setup that didn't/doesn't have to enable/disable EDNS, and in fact was running before bind was patched to allow for turning off EDNS (see comment #5).

I did some tests and it looks like the problem is in your firewall configuration. Are you sure that the firewall doesn't drop responses from the root servers? (When I was behind a firewall I got no response, and when I completely disabled the firewall everything worked fine.) I tried it with rawhide's caching-nameserver-9.4.0-3.fc7. Please tell me your results. Regards, -A-

Adam, the bug seems to be in the default config, not firewalling or the code. See comment #3 where I copied over the config of an FC5 system and the queries worked again. I'll try again and report back, after all there were two minor upstream releases since and many package updates as well.

Created attachment 151186
test config file
I can't believe that this is a bug. Please try this config file. If bind does
not work correctly with this configuration, that means something is blocking DNS
responses and this isn't a bind problem. If this configuration works correctly
and caching-nameserver's does not, we can discuss a proposed fix.
Regards, -A-

After thinking about this bug some more: could you please try to telnet to port 53 on the affected computer? (From outside the network, of course.) If you can't, something must be throwing DNS responses away. Tell me your results, please. -A-

The default caching-nameserver setup again yields the same issue as the original
report:
0.000000 127.0.0.1 -> 127.0.0.1 DNS Standard query A test.domain.tld
0.003267 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld
0.003267 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld
0.003290 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld
2.007135 <ip> -> 128.63.2.53 DNS Standard query A test.domain.tld
2.007135 <ip> -> 128.63.2.53 DNS Standard query A test.domain.tld
2.007147 <ip> -> 128.63.2.53 DNS Standard query A test.domain.tld
4.011329 <ip> -> 193.0.14.129 DNS Standard query A test.domain.tld
4.011329 <ip> -> 193.0.14.129 DNS Standard query A test.domain.tld
4.011341 <ip> -> 193.0.14.129 DNS Standard query A test.domain.tld
5.003525 127.0.0.1 -> 127.0.0.1 DNS Standard query A test.domain.tld
6.015534 <ip> -> 198.32.64.12 DNS Standard query A test.domain.tld
6.015534 <ip> -> 198.32.64.12 DNS Standard query A test.domain.tld
6.015546 <ip> -> 198.32.64.12 DNS Standard query A test.domain.tld
8.019744 <ip> -> 128.8.10.90 DNS Standard query A test.domain.tld
8.019744 <ip> -> 128.8.10.90 DNS Standard query A test.domain.tld
8.019757 <ip> -> 128.8.10.90 DNS Standard query A test.domain.tld
10.023971 <ip> -> 192.5.5.241 DNS Standard query A test.domain.tld
10.023971 <ip> -> 192.5.5.241 DNS Standard query A test.domain.tld
10.023983 <ip> -> 192.5.5.241 DNS Standard query A test.domain.tld
12.028143 <ip> -> 192.112.36.4 DNS Standard query A test.domain.tld
12.028143 <ip> -> 192.112.36.4 DNS Standard query A test.domain.tld
12.028156 <ip> -> 192.112.36.4 DNS Standard query A test.domain.tld
14.032346 <ip> -> 202.12.27.33 DNS Standard query A test.domain.tld
14.032346 <ip> -> 202.12.27.33 DNS Standard query A test.domain.tld
14.032358 <ip> -> 202.12.27.33 DNS Standard query A test.domain.tld
16.036555 <ip> -> 198.41.0.4 DNS Standard query A test.domain.tld
16.036555 <ip> -> 198.41.0.4 DNS Standard query A test.domain.tld
16.036567 <ip> -> 198.41.0.4 DNS Standard query A test.domain.tld
18.040757 <ip> -> 192.33.4.12 DNS Standard query A test.domain.tld
18.040757 <ip> -> 192.33.4.12 DNS Standard query A test.domain.tld
18.040769 <ip> -> 192.33.4.12 DNS Standard query A test.domain.tld
20.044960 <ip> -> 192.203.230.10 DNS Standard query A test.domain.tld
20.044960 <ip> -> 192.203.230.10 DNS Standard query A test.domain.tld
20.044972 <ip> -> 192.203.230.10 DNS Standard query A test.domain.tld
22.049166 <ip> -> 192.58.128.30 DNS Standard query A test.domain.tld
22.049166 <ip> -> 192.58.128.30 DNS Standard query A test.domain.tld
22.049177 <ip> -> 192.58.128.30 DNS Standard query A test.domain.tld
24.053367 <ip> -> 192.36.148.17 DNS Standard query A test.domain.tld
24.053367 <ip> -> 192.36.148.17 DNS Standard query A test.domain.tld
24.053379 <ip> -> 192.36.148.17 DNS Standard query A test.domain.tld
26.057647 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld
26.057647 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld
26.057659 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld
28.061779 <ip> -> 128.63.2.53 DNS Standard query A test.domain.tld
28.061779 <ip> -> 128.63.2.53 DNS Standard query A test.domain.tld
28.061791 <ip> -> 128.63.2.53 DNS Standard query A test.domain.tld
30.010001 127.0.0.1 -> 127.0.0.1 DNS Standard query response, Server failure
30.010067 127.0.0.1 -> 127.0.0.1 DNS Standard query response, Server failure
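
A capture like this one can be summarised mechanically to confirm that no upstream server ever answers. The following is only a minimal sketch, assuming lines shaped like the ones in this capture (timestamp, source, `->`, destination, `DNS`, description); the names `LINE`, `summarise` and `sample` are hypothetical and not part of any tool mentioned in this report.

```python
import re
from collections import Counter

# Assumed line shape: "<time> <src> -> <dst> DNS <description>"
LINE = re.compile(
    r"^\s*(?P<ts>\d+\.\d+)\s+(?P<src>\S+)\s+->\s+(?P<dst>\S+)\s+DNS\s+(?P<info>.*)$"
)

def summarise(lines):
    """Return (queries per destination, list of responses seen)."""
    queries = Counter()
    responses = []
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # ignore anything that is not a DNS summary line
        if "response" in m.group("info"):
            responses.append((m.group("src"), m.group("dst"), m.group("info")))
        else:
            queries[m.group("dst")] += 1
    return queries, responses

# A few lines copied from the capture in this report
sample = [
    "0.000000 127.0.0.1 -> 127.0.0.1 DNS Standard query A test.domain.tld",
    "0.003267 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld",
    "0.003267 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld",
    "0.003290 <ip> -> 192.228.79.201 DNS Standard query A test.domain.tld",
    "30.010001 127.0.0.1 -> 127.0.0.1 DNS Standard query response, Server failure",
]

if __name__ == "__main__":
    queries, responses = summarise(sample)
    for dst, count in queries.items():
        print(dst, count)
    for src, dst, info in responses:
        print("response:", src, "->", dst, info)
```

Run over the full capture, the only responses it would list are the local Server failure replies from 127.0.0.1, since no line shows a root server as the source.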
Telnetting works OK:
# telnet 192.228.79.201 53
Trying 192.228.79.201...
Connected to b.root-servers.net (192.228.79.201).
Escape character is '^]'.
I also tried the named.conf in attachment #151186, but it didn't work either.
I also just tried a bare-metal install of RHEL5 on another system and pulled in
virgin caching-nameserver and bind packages, started named and had the same issue.
After investigation, the problem turned out to be the query-source option. It will be disabled in the next release.
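
One way to check a diagnosis like this from the affected host is to send the same query once from an ephemeral source port and once from a fixed one, then compare. This is only a minimal sketch, under the assumption that the problematic setting pinned outgoing queries to a fixed source port such as 53 (the report does not quote the exact configuration line). `build_query` and `probe` are hypothetical helper names. Binding to port 53 normally needs root and fails if another process already holds that port.

```python
import socket
import struct

def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (no EDNS) for `name`; qtype 1 is A."""
    # Header: id, flags (0 = standard query, no recursion), 1 question, 0 other records
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN
    return header + question

def probe(server, name, source_port=0, timeout=3.0):
    """Send one UDP query from `source_port` (0 = ephemeral).

    Returns True if any datagram comes back before the timeout.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("", source_port))
        sock.sendto(build_query(name), (server, 53))
        try:
            sock.recvfrom(4096)
            return True
        except socket.timeout:
            return False
```

For example, `probe("192.228.79.201", "test.domain.tld")` and `probe("192.228.79.201", "test.domain.tld", source_port=53)` could be compared (the address is the one seen in the capture in this report and may no longer be current). A reply on the ephemeral port but none on the fixed port would be consistent with something on the path dropping traffic tied to that source port.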