275091 – Too many timeouts resolving $DOMAIN (in $DOMAIN?): disabling EDNS

Summary: Too many timeouts resolving $DOMAIN (in $DOMAIN?): disabling EDNS

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	bind
Sub Component:
Version:	9
Hardware:	All
OS:	All
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Adam Tkac
QA Contact:	Fedora Extras Quality Assurance
Duplicates (3):	387191 445080 494948

Reported:	2007-09-03 11:58 UTC by Robert Scheck
Modified:	2018-04-11 08:41 UTC
CC List:	6 users
Last Closed:	2008-11-11 15:04:14 UTC

Attachments
My local /etc/named.conf (1.32 KB, text/plain) 2007-09-18 14:13 UTC, Jan Kratochvil
simple patch (1.45 KB, patch) 2008-05-05 20:49 UTC, Adam Tkac
Output from dig (1.36 KB, text/plain) 2008-05-12 20:04 UTC, Adam Tkac

Description Robert Scheck 2007-09-03 11:58:52 UTC

Description of problem:
Since upgrading to BIND 9.5, my syslog has been flooded with messages like these:

Sep  3 12:17:07 tux named[10547]: too many timeouts resolving 'kysu.edu/A' (in 'kysu.EDU'?): disabling EDNS
Sep  3 12:17:07 tux named[10547]: too many timeouts resolving 'kysu.edu/A' (in 'kysu.EDU'?): disabling EDNS

Version-Release number of selected component (if applicable):
bind-9.5.0-11.a6

How reproducible:
Every time, and all the time, for me.

Actual results:
Flooded syslog with dumb messages.

Expected results:
Silence as before, with a usable way to turn this pseudo-feature off
globally (!): http://wilmer.gaast.net/downloads/bind-9.3-edns-global.diff
Please apply this patch for the next rebuild.

Comment 1 Adam Tkac 2007-09-04 12:00:31 UTC

EDNS is NOT a pseudo-feature (yes, only BIND implements it, but it's a good
idea :) ). As for the patch: we have our own modification (please see
bind-9.3.3-edns.patch) which adds an enable-edns option. So specify
"enable-edns yes_or_no;" in your options statement or view statement.

Adam

Comment 2 Robert Scheck 2007-09-06 00:38:03 UTC

Adam, I can't agree with you. There seems to be no option in your patch that
allows disabling EDNS completely. Putting "enable-edns no;" into "options {};"
doesn't turn it off at all, which would be the expected behaviour; thus re-opening.

Comment 3 Adam Tkac 2007-09-06 11:45:10 UTC

Hm, if this really doesn't work, I wonder whether Wilmer's patch will work.
Could you please test it and report back whether it works as expected? (I will
build binaries for you if you want.)

Thanks

Comment 4 Robert Scheck 2007-09-06 14:59:54 UTC

I'll build binaries myself, as this isn't hard for me. If it works, maybe you
can merge these patches somehow and get the result upstream as a feature. I'll
let you know in the next few days whether Wilmer's patch works better for me.

Comment 5 Robert Scheck 2007-09-08 23:07:38 UTC

Adam, here's my status: Wilmer's patch doesn't solve the problem for me either.
It is currently not possible to disable EDNS completely. My syslog still
contains many, many timeout error messages from BIND.

Comment 6 Adam Tkac 2007-09-17 12:27:35 UTC

Could you test 9.5.0-12.1.a6 (from http://people.redhat.com/atkac/test_srpms/ ),
please? You should be able to specify "edns no;" in options/view statement.

Adam

Comment 7 Jan Kratochvil 2007-09-18 14:13:02 UTC

Created attachment 198401
My local /etc/named.conf

Running on:
bind-9.5.0-12.1.a6.fc8.x86_64
bind-chroot-9.5.0-12.1.a6.fc8.x86_64
bind-libs-9.5.0-12.1.a6.fc8.x86_64
bind-utils-9.5.0-12.1.a6.fc8.x86_64
using the attached `/etc/named.conf'
and still get zillions of:
Sep 18 16:08:16 host0 named[20017]: too many timeouts resolving '12.83.144.176.combined-HIB.dnsiplists.completewhois.com/A' (in 'dnsiplists.completewhois.com'?): disabling EDNS
Sep 18 16:08:16 host0 named[20017]: too many timeouts resolving '12.247.13.87.combined-HIB.dnsiplists.completewhois.com/A' (in 'dnsiplists.completewhois.com'?): disabling EDNS
while having
options {
	edns no;
};

Comment 8 Adam Tkac 2007-09-19 16:26:19 UTC

Thanks for the report. Could you please test this package with more improvements?
http://kojiweb.fedoraproject.org/koji/taskinfo?taskID=166219

Comment 9 Jan Kratochvil 2007-09-19 18:23:59 UTC

Yes, it looks fine now:
default .conf - Messages still produced
options { edns no; }; - Messages suppressed

bind-9.5.0-12.2.a6.fc8.x86_64
bind-chroot-9.5.0-12.2.a6.fc8.x86_64
bind-libs-9.5.0-12.2.a6.fc8.x86_64
bind-utils-9.5.0-12.2.a6.fc8.x86_64

Still, I would rather turn off only the warnings, as I expect the EDNS
feature would then still be used for the servers supporting it.  Something
similar to the following useful setting, but targeted at EDNS:
logging {
        category lame-servers { null; };        
};  
I assume both types of warnings should be turned on by default.
(I still do not understand it well; I am just hinting at a possibly missing feature.)

Comment 10 Adam Tkac 2007-09-20 07:54:34 UTC

There already is an option which should disable those messages. Put "category
edns-disabled { null; };" into your logging statement. Also, let me discuss
the proposed global edns option with upstream.
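
The suggestion above, combined with the lame-servers setting from comment #9,
could look roughly like the fragment below. This is only a sketch: it assumes
named.conf has no logging statement yet. If one already exists, the category
lines should be merged into it rather than added as a second statement.

```conf
logging {
        // Discard the "too many timeouts ... disabling EDNS" notices
        // (category name as given in comment #10).
        category edns-disabled { null; };

        // Optional: also discard lame-server warnings (from comment #9).
        category lame-servers { null; };
};
```

This only hides the messages. named still falls back from EDNS to plain DNS
for the affected servers exactly as before.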

Comment 11 Robert Scheck 2007-09-20 11:37:53 UTC

I have been using bind-9.5.0-12.2.a6 since yesterday evening and there has been
nothing EDNS-specific in the logs so far. But last time, the messages came up
after a day.

Comment 12 Adam Tkac 2007-09-21 07:03:12 UTC

After discussion with upstream I'm going to throw away the global edns option
completely (http://marc.info/?t=119027612100006&r=1&w=2). There are three kinds
of breakage that affect EDNS (with solutions):

- bad remote server - use the edns option in a server statement
- heavy packet loss - simply disable EDNS-related logging (comment #10)
- bad firewall/router (most cases) - report a bug to your router vendor

Adam

Comment 13 Adam Tkac 2007-09-21 07:07:37 UTC

(In reply to comment #12)
> - bad firewall/router (most cases) - report bug to your router vendor
- setting the EDNS packet size should also help (edns-udp-size, max-udp-size options)
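
For illustration, the first and third remedies might be written as in the
fragment below. It is a sketch under assumptions: 192.0.2.53 is a placeholder
documentation address standing in for one known-bad remote server, and 512 is
the smallest value these options accept, not a recommended default.

```conf
options {
        // EDNS buffer size advertised in outgoing queries.
        edns-udp-size 512;
        // Largest UDP response named itself will send.
        max-udp-size 512;
};

// Placeholder address: replace with the server that mishandles EDNS.
server 192.0.2.53 {
        edns no;
};
```

Lowering the sizes targets firewalls and routers that drop large or fragmented
UDP packets, while the server statement avoids EDNS only for the one host named
in it.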

Comment 14 Robert Scheck 2007-09-21 08:27:20 UTC

Adam, I can't agree with you. I can't add thousands of bad remote servers to my
configuration. And there are cases where I can't fix the firewall/router and
the stuff *has* to work. So there *has* to be a workaround for such (IMHO)
pseudo-features. Please keep this patch downstream, because not every upstream
decision is clever, and upstream doesn't want to see reality in this case.

Comment 15 Adam Tkac 2007-09-21 08:53:45 UTC

(In reply to comment #14)
> Adam, I can't agree with you. I can't add thousands of bad remote servers to my 
> configuration. And there are cases where I can't fix the firewall/router and 
> the stuff *has* to work. So there *has* to be a workaround for such IMHO pseudo 
> features. Please keep this patch at downstream, because not every upstream 
> decision is clever and upstream doesn't want to see the reality in this case.

Yes, upstream can be stupid sometimes, but not in this case.

If you have one or more forwarders, I don't believe that many of them lack EDNS
support. If they do, it means that you are using old software with security
bugs etc. And setting a server statement for one of them isn't so annoying. I
also believe that all (or nearly all) root servers and frontline servers
support EDNS.

If you have a bad firewall, the options "edns-udp-size 512;" (for outgoing
queries) and "max-udp-size 512;" (for incoming queries) should help you.

If you have a bad network, simply disable EDNS logging.

So there is really no argument for adding a global edns option, is there?

Comment 16 Adam Tkac 2007-09-24 12:03:11 UTC

As written in comments #12, #13 and #15, closing. A global edns option isn't needed.

Comment 17 Robert Scheck 2007-09-24 12:19:27 UTC

My forwarders support EDNS. But many authoritative DNS servers in eastern
Europe, in Asia and, going further, in the Pacific region don't. Most of
these timeouts are from eastern EMEA and I don't want to configure eastern
EMEA's authoritative DNS servers as exceptions in my BIND configuration. So
there has to be a *real* solution for this. Just disabling logging doesn't
solve the problem that DNS is very slow for these servers. IMHO you have never
used EDNS in a huge environment where countless DNS queries are done. And
there *must* be a good reason that vendors like HP are introducing similar
patches for BIND regarding this "feature". On HP-UX, I'm able to
disable the stuff completely, because HP recognized that there *are* cases like
the ones I described above, where EDNS slows things down heavily.

You'll have to change this for RHEL6 anyway, so what is the problem with adding
this patch to Fedora 8, too? And please don't tell me you won't add a patch
to BIND for RHEL6. There will be many customers killing you for this feature,
as I'm currently doing. Do you really want to make me switch to HP-UX? ;-)

Comment 18 Adam Tkac 2007-09-24 12:58:00 UTC

In the end I've added that patch to f8 build...

Comment 19 Adam Tkac 2007-11-26 17:25:44 UTC

*** Bug 387191 has been marked as a duplicate of this bug. ***

Comment 20 Adam Tkac 2007-11-26 18:11:03 UTC

After a big discussion on bind-workers
(http://marc.info/?t=119566296700004&r=1&w=2) this option is going to be removed
in F9. I'm not going to rebel against all the BIND developers, although I think
this is a good option. Please use

server ::/0 { edns no; };
server 0.0.0.0/0 { edns no; };

instead of "edns no;".

As I wrote above, the real problem is often a bad router. Please report it to
the router vendor.

Comment 21 Robert Scheck 2007-11-26 18:38:06 UTC

Accepted, as long as your suggestion works the same way.

And just once more: how do I report such problems to the router vendor when
many companies running DNS servers in Asia Pacific (sorry, but the problems
always come up the further east you go) are just too dumb either to
configure their routers correctly or to buy working routers? :)

Comment 22 Adam Tkac 2007-11-26 19:21:04 UTC

(In reply to comment #21)
> Accepted as far as your suggestion works the same way.
> 
> And just once more: How to report to the router vendor such problems when
> many companies of DNS servers in Asia Pacific (sorry, but the problems are 
> always coming up when walking more and more east) are just too dumb to 
> configure their routers either correctly or not buying working routers? :)

Yes, it is a problem. You can only tell them that they don't honour an
eight-year-old standard. Best would be to send those people to jail and let
them go after they promise never to violate EDNS again :)

Comment 23 Robert Scheck 2008-03-27 22:06:05 UTC

Adam, "server ::/0 { edns no; }; server 0.0.0.0/0 { edns no; };" does NOT work
for me. It produces the same error messages as if it weren't there. Reopening.

Comment 24 Adam Tkac 2008-04-14 12:39:54 UTC

Hm, what do you mean by "doesn't work"? Are there messages in the system log?

Comment 25 Robert Scheck 2008-04-14 16:25:14 UTC

"server ::/0 { edns no; }; server 0.0.0.0/0 { edns no; };" causes the same "Too 
many timeouts resolving $DOMAIN (in $DOMAIN): disabling EDNS" messages for me,
and DNS gets awfully slow. My forwarders do support EDNS, but some DNS servers
abroad don't. I know that I can hide the log messages, but that doesn't speed
up DNS resolution for me. Anyway, shouldn't setting "edns no" as above really
disable EDNS in BIND?

Comment 26 Robert Scheck 2008-04-15 21:07:36 UTC

OUCH! Is the segfault somehow related to the EDNS thing?

Apr 15 02:27:54 tux named[5132]: too many timeouts resolving 'roberthansen.com/MX' (in 'roberthansen.com'?): disabling EDNS
Apr 15 02:27:59 tux named[5132]: too many timeouts resolving 'roberthansen.com/MX' (in 'roberthansen.com'?): disabling EDNS
Apr 15 02:34:59 tux kernel: named[5134]: segfault at 00000000 eip 0016079f esp b74feec0 error 4

Comment 27 Adam Tkac 2008-05-05 20:39:28 UTC

*** Bug 445080 has been marked as a duplicate of this bug. ***

Comment 28 Adam Tkac 2008-05-05 20:48:38 UTC

After investigation: the "server 0.0.0.0 { edns no; };" statement doesn't work
at the moment. I'm going to release an update with some additional patches
right after the F9 release.

Comment 29 Adam Tkac 2008-05-05 20:49:28 UTC

Created attachment 304558
simple patch

Comment 30 Robert Scheck 2008-05-08 19:29:41 UTC

32:9.5.0-31.b3 doesn't seem to contain the patch, yet.

Comment 31 Adam Tkac 2008-05-12 19:47:54 UTC

(In reply to comment #30)
> 32:9.5.0-31.b3 doesn't seem to contain the patch, yet.

You are right. After more investigation, this patch isn't sufficient. The final
patch is going to be different and is waiting for review upstream.

Note that if you specify "server 0.0.0.0/0 { edns no; };" (and also for ::/0;
make sure you include the "/0"), EDNS really is disabled globally (you can check
it with tcpdump or dnscap). Only the log message leaks when a query times out
(not due to EDNS, but because the server is really unreachable).

Btw, could you please check whether your problem is really in EDNS? What does
"dig @<your_forwarder_IP> <some_name> +edns=0" say? And if it fails, can you add
the +bufsize=512 parameter, please? From comment #26 it looks more like you have
heavy UDP packet loss than that your forwarder doesn't support EDNS. Thanks

Comment 32 Robert Scheck 2008-05-12 20:00:06 UTC

I'm using /0, of course. My forwarders are both working fine. Interestingly, the
following causes a timeout and thus a log message as in the summary:

tux:~ # dig @127.0.0.1 redrumterriers.com +edns=0

; <<>> DiG 9.5.0b3 <<>> @127.0.0.1 redrumterriers.com +edns=0
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached
tux:~ # 

AFAIK I've got no special configuration of bind on my server (just caching DNS 
which provides the same forwarders as in /etc/resolv.conf).

Comment 33 Adam Tkac 2008-05-12 20:04:26 UTC

Created attachment 305175
Output from dig

Look at this dig output. From comment #26 it might look as if the
roberthansen.com. server doesn't support EDNS, but it does support it. The log
messages were due to query timeouts (packet loss, server down etc.), so this is
no problem related to EDNS.

I recommend not disabling EDNS, but first checking whether the servers really
don't support it. And if they don't, then the responsible administrator should
be informed that he doesn't conform to a nine-year-old standard and should fix
his server.

Comment 34 Robert Scheck 2008-05-12 20:06:43 UTC

tux:~ # grep -c "too many timeouts resolving" /var/log/messages*
/var/log/messages:619
/var/log/messages-20080420:2146
/var/log/messages-20080427:2782
/var/log/messages-20080504:2942
/var/log/messages-20080511:4064
tux:~ #

You or me writing these e-mails? ;-)

Comment 35 Fedora Update System 2008-05-13 07:43:44 UTC

bind-9.5.0-31.b3.fc9 has been submitted as an update for Fedora 9

Comment 36 Bug Zapper 2008-05-14 03:10:25 UTC

Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 37 Fedora Update System 2008-05-14 21:31:10 UTC

bind-9.5.0-31.b3.fc9 has been pushed to the Fedora 9 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update bind'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-3861

Comment 38 Fedora Update System 2008-05-23 11:14:29 UTC

bind-9.5.0-32.rc1.fc9 has been submitted as an update for Fedora 9

Comment 39 Fedora Update System 2008-06-03 07:28:29 UTC

bind-9.5.0-32.rc1.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 40 Jan Kratochvil 2008-06-10 18:41:52 UTC

(In reply to comment #33)
> I recommend do not disable EDNS but first check if servers really doesn't
> support it. And if they don't then proper administrator should be informed that
> he doesn't conform nine year old standard and he should fix his server.

Sorry but I am not going to contact admins of about 3000 servers per day, this
log is for the last two days on my workstation:
  http://www.jankratochvil.net/priv/disabling-edns
  named[24415]: too many timeouts resolving 'LINE-CONTENT/xxx' (in 'yyy'?):
  disabling EDNS
Connectivity is over ADSL on public static IP (through openvpn), 30ms RTT.

Comment 42 Adam Tkac 2008-10-03 12:57:23 UTC

This problem is addressed in the upcoming 9.5.1 release. Currently we have 9.5.1b2 in rawhide, so anyone using the F10 tree could test bind-9.5.1-0.7.b2.fc10 and check whether the situation is better.

Comment 43 Jan Kratochvil 2008-10-04 18:26:28 UTC

bind-9.5.1-0.7.b2.fc10.x86_64
Still a flood of them but they are of two kinds now:
Oct  4 20:12:54 host0 named[17739]: too many timeouts resolving 'homebanc.com/TXT' (in 'homebanc.COM'?): reducing the advertised EDNS UDP packet size to 512 octets
Oct  4 20:13:06 host0 named[17739]: too many timeouts resolving 'scissorlok.com/A' (in 'scissorlok.COM'?): disabling EDNS
Oct  4 20:15:21 host0 named[17739]: too many timeouts resolving 'valent.com/TXT' (in 'valent.COM'?): reducing the advertised EDNS UDP packet size to 512 octets
Oct  4 20:15:29 host0 named[17739]: too many timeouts resolving 'imaginemedia.co.uk.fulldom.rfc-ignorant.org/A' (in 'fulldom.rfc-ignorant.org'?): reducing the advertised EDNS UDP packet size to 512 octets

(full MTU/MRU 1500 - openvpn)

Comment 44 Adam Tkac 2008-11-11 15:04:14 UTC

After many discussions about EDNS0, there is no better way to solve this problem. named works as RFC 2671 says, and these messages are usually helpful, so they are logged by default. If you want to suppress them, use "category edns-disabled { null; };" in the logging statement in your named.conf. Closing.
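
As a variation on the suppression described above, the messages could be moved
out of syslog into their own file instead of being discarded. That keeps them
available for checks like the ones in comments #31 and #34. This is a sketch
only: the channel name edns_log and the file path are made up for
illustration, the path must be writable by named (and is relative to the chroot
when bind-chroot is used), and the lines belong in the existing logging
statement if there is one.

```conf
logging {
        // Hypothetical channel name and path, chosen for this example.
        channel edns_log {
                file "/var/log/named/edns.log" versions 3 size 5m;
                severity info;
                print-time yes;
        };

        // Send only the EDNS fallback notices to that channel.
        category edns-disabled { edns_log; };
};
```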

Comment 45 Adam Tkac 2009-04-10 10:08:39 UTC

*** Bug 494948 has been marked as a duplicate of this bug. ***
