Bug 275091

Summary: Too many timeouts resolving $DOMAIN (in $DOMAIN?): disabling EDNS
Product: [Fedora] Fedora
Reporter: Robert Scheck <redhat-bugzilla>
Component: bind
Assignee: Adam Tkac <atkac>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: medium
Version: 9
CC: aliahsan81, davej, jan.kratochvil, mcepl, ndbecker2, ovasik
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: All
OS: All
Doc Type: Bug Fix
Last Closed: 2008-11-11 15:04:14 UTC
Attachments:
My local /etc/named.conf (flags: none)
simple patch (flags: none)
Output from dig (flags: none)

Description Robert Scheck 2007-09-03 11:58:52 UTC
Description of problem:
Since upgrading to BIND 9.5, my syslog has been flooded with messages like these:

Sep  3 12:17:07 tux named[10547]: too many timeouts resolving 'kysu.edu/A' (in 'kysu.EDU'?): disabling EDNS
Sep  3 12:17:07 tux named[10547]: too many timeouts resolving 'kysu.edu/A' (in 'kysu.EDU'?): disabling EDNS

Version-Release number of selected component (if applicable):
bind-9.5.0-11.a6

How reproducible:
Every time, all the time for me.

Actual results:
Flooded syslog with dumb messages.

Expected results:
Silence as before, with a usable way to turn this pseudo-feature off globally
(!): http://wilmer.gaast.net/downloads/bind-9.3-edns-global.diff - please apply
this patch for the next rebuild.

Comment 1 Adam Tkac 2007-09-04 12:00:31 UTC
EDNS is NOT a pseudo-feature (yes, only BIND implements it, but it's a good idea :) ).
As for the patch - we have our own modification (please see bind-9.3.3-edns.patch)
which adds an enable-edns option. So specify enable-edns yes_or_no in your options
statement or view statement.

Adam

Comment 2 Robert Scheck 2007-09-06 00:38:03 UTC
Adam, I can't agree with you. There seems to be no option in your patch that
allows disabling EDNS completely. Putting "enable-edns no;" into "options {};"
doesn't turn it off at all, which would be the expected behaviour, thus re-opening.

Comment 3 Adam Tkac 2007-09-06 11:45:10 UTC
Hm, if this really doesn't work, I wonder whether Wilmer's patch will work. Could
you please test it and report back to me if it works as expected? (I will build
binaries for you if you want.)

Thanks

Comment 4 Robert Scheck 2007-09-06 14:59:54 UTC
I'll build binaries myself, as this isn't hard for me. If it works, you can
maybe merge these patches somehow and get it upstream as a feature. I'll let you
know in the next few days whether Wilmer's patch works better for me.

Comment 5 Robert Scheck 2007-09-08 23:07:38 UTC
Adam, here's my status: Wilmer's patch doesn't solve the problem for me either.
It is currently not possible to disable EDNS completely. My syslog still
contains many, many timeout error messages from BIND.

Comment 6 Adam Tkac 2007-09-17 12:27:35 UTC
Could you test 9.5.0-12.1.a6 (from http://people.redhat.com/atkac/test_srpms/ ),
please? You should be able to specify "edns no;" in options/view statement.

Adam

Comment 7 Jan Kratochvil 2007-09-18 14:13:02 UTC
Created attachment 198401 [details]
My local /etc/named.conf

Running on:
bind-9.5.0-12.1.a6.fc8.x86_64
bind-chroot-9.5.0-12.1.a6.fc8.x86_64
bind-libs-9.5.0-12.1.a6.fc8.x86_64
bind-utils-9.5.0-12.1.a6.fc8.x86_64
using the attached `/etc/named.conf'
and still get zillions of:
Sep 18 16:08:16 host0 named[20017]: too many timeouts resolving '12.83.144.176.combined-HIB.dnsiplists.completewhois.com/A' (in 'dnsiplists.completewhois.com'?): disabling EDNS
Sep 18 16:08:16 host0 named[20017]: too many timeouts resolving '12.247.13.87.combined-HIB.dnsiplists.completewhois.com/A' (in 'dnsiplists.completewhois.com'?): disabling EDNS
while having
options {
	edns no;
};

Comment 8 Adam Tkac 2007-09-19 16:26:19 UTC
Thanks for the report. Could you please test the package with more improvements?
http://kojiweb.fedoraproject.org/koji/taskinfo?taskID=166219

Comment 9 Jan Kratochvil 2007-09-19 18:23:59 UTC
Yes, it looks fine now:
default .conf - Messages still produced
options { edns no; }; - Messages suppressed

bind-9.5.0-12.2.a6.fc8.x86_64
bind-chroot-9.5.0-12.2.a6.fc8.x86_64
bind-libs-9.5.0-12.2.a6.fc8.x86_64
bind-utils-9.5.0-12.2.a6.fc8.x86_64

Still, I would rather turn off only the warnings, as I expect the EDNS
feature would still be used for the servers supporting it.  Something similar
to the following useful setting but targeted at EDNS:
logging {
        category lame-servers { null; };        
};  
I assume both types of the warnings should be turned on by default.
(Still I do not understand it much, just giving the possibly missing feature hint.)


Comment 10 Adam Tkac 2007-09-20 07:54:34 UTC
There already is an option which should disable those messages. Put "category
edns-disabled { null; };" into your logging statement. Also, let me discuss the
proposed global edns option with upstream.

Comment 11 Robert Scheck 2007-09-20 11:37:53 UTC
I've been using bind-9.5.0-12.2.a6 since yesterday evening and there is nothing
EDNS-specific in the logs so far. But last time, the messages came up after a day.

Comment 12 Adam Tkac 2007-09-21 07:03:12 UTC
After discussion with upstream I'm going to completely throw away the global edns
option (http://marc.info/?t=119027612100006&r=1&w=2). There are three broken
things which break EDNS (with solutions):

- bad remote server - use the edns option of the server statement
- heavy packet loss - simply disable EDNS-related logging (comment #10)
- bad firewall/router (most cases) - report bug to your router vendor

Adam

Comment 13 Adam Tkac 2007-09-21 07:07:37 UTC
(In reply to comment #12)
> - bad firewall/router (most cases) - report bug to your router vendor
- also setting the EDNS packet size should help (edns-udp-size, max-udp-size options)



Comment 14 Robert Scheck 2007-09-21 08:27:20 UTC
Adam, I can't agree with you. I can't add thousands of bad remote servers to my 
configuration. And there are cases where I can't fix the firewall/router and 
the stuff *has* to work. So there *has* to be a workaround for such IMHO pseudo 
features. Please keep this patch at downstream, because not every upstream 
decision is clever and upstream doesn't want to see the reality in this case.

Comment 15 Adam Tkac 2007-09-21 08:53:45 UTC
(In reply to comment #14)
> Adam, I can't agree with you. I can't add thousands of bad remote servers to my 
> configuration. And there are cases where I can't fix the firewall/router and 
> the stuff *has* to work. So there *has* to be a workaround for such IMHO pseudo 
> features. Please keep this patch at downstream, because not every upstream 
> decision is clever and upstream doesn't want to see the reality in this case.

Yes, upstream can be stupid sometimes, but not in this case.

If you have one or more forwarders, I don't believe that many of them lack EDNS
support. If they do, it means that you are using old software with security bugs etc.
And setting a server statement for one of them isn't so annoying. I also believe
that all (or nearly all) root servers and frontline servers support EDNS.

If you have a bad firewall, the options edns-udp-size 512; (for outgoing queries) and
max-udp-size 512; (for incoming queries) should help you.

If you have a bad network, simply disable EDNS logging.

There's really no argument for adding a global edns option, is there?

Comment 16 Adam Tkac 2007-09-24 12:03:11 UTC
As written in comments #12, #13 and #15, closing. A global edns option isn't needed.

Comment 17 Robert Scheck 2007-09-24 12:19:27 UTC
My forwarders support EDNS. But many authoritative DNS servers in eastern
Europe, Asia and, wandering further, the Pacific region don't. Most of
these timeouts are from eastern EMEA and I don't want to configure eastern
EMEA's authoritative DNS servers as exceptions in my BIND configuration. So
there has to be a *real* solution for this. Just disabling logging doesn't
solve the problem that DNS is very slow for these servers. IMHO you have never
used EDNS in a huge environment where countless DNS queries are done. And
there *must* be a good reason that vendors like HP are introducing similar
patches for BIND regarding this "feature". On HP-UX, I'm able to
disable the stuff completely, because HP recognized that there *are* cases like
the one I described above, where EDNS slows things down heavily.

You'll have to change this for RHEL6 anyway, so what is the problem with adding
this patch to Fedora 8, too? And please don't tell me you won't add a patch
to BIND for RHEL6. There will be many customers killing you for this feature,
as I'm currently doing. Do you really want to make me switch to HP-UX? ;-)

Comment 18 Adam Tkac 2007-09-24 12:58:00 UTC
In the end I've added that patch to the f8 build...

Comment 19 Adam Tkac 2007-11-26 17:25:44 UTC
*** Bug 387191 has been marked as a duplicate of this bug. ***

Comment 20 Adam Tkac 2007-11-26 18:11:03 UTC
After a big discussion on bind-workers
(http://marc.info/?t=119566296700004&r=1&w=2) this option is going to be removed
in F9. I'm not going to rebel against all the bind developers, even though I think
this is a good option. Please use

server ::/0 { edns no; };
server 0.0.0.0/0 { edns no; };

instead of edns no;

As I wrote above, the real problem is often a bad router. Please report it to the
router vendor.

Comment 21 Robert Scheck 2007-11-26 18:38:06 UTC
Accepted as far as your suggestion works the same way.

And just once more: How to report to the router vendor such problems when
many companies of DNS servers in Asia Pacific (sorry, but the problems are 
always coming up when walking more and more east) are just too dumb to 
configure their routers either correctly or not buying working routers? :)

Comment 22 Adam Tkac 2007-11-26 19:21:04 UTC
(In reply to comment #21)
> Accepted as far as your suggestion works the same way.
> 
> And just once more: How to report to the router vendor such problems when
> many companies of DNS servers in Asia Pacific (sorry, but the problems are 
> always coming up when walking more and more east) are just too dumb to 
> configure their routers either correctly or not buying working routers? :)

Yes, it is a problem. You can only tell them that they don't honour an eight-year-old
standard. Best would be to send those people to jail and let them go after they
promise that they will never violate EDNS again :)

Comment 23 Robert Scheck 2008-03-27 22:06:05 UTC
Adam, "server ::/0 { edns no; }; server 0.0.0.0/0 { edns no; };" does NOT work
for me. It causes the same error messages as if it weren't there. Reopening.

Comment 24 Adam Tkac 2008-04-14 12:39:54 UTC
Hm, what do you mean by "doesn't work"? Are there still messages in the system log?

Comment 25 Robert Scheck 2008-04-14 16:25:14 UTC
"server ::/0 { edns no; }; server 0.0.0.0/0 { edns no; };" causes the same "Too 
many timeouts resolving $DOMAIN (in $DOMAIN): disabling EDNS" messages for me
and DNS gets horribly slow (my forwarders do support EDNS, but some DNS servers
abroad don't). I know that I can hide the log stuff, but that doesn't speed
up DNS resolving for me. Anyway, shouldn't setting "edns no" as above really
disable EDNS in BIND?

Comment 26 Robert Scheck 2008-04-15 21:07:36 UTC
OUCH! Is the segfault somehow related to the EDNS thing?

Apr 15 02:27:54 tux named[5132]: too many timeouts resolving 'roberthansen.com/MX' (in 'roberthansen.com'?): disabling EDNS
Apr 15 02:27:59 tux named[5132]: too many timeouts resolving 'roberthansen.com/MX' (in 'roberthansen.com'?): disabling EDNS
Apr 15 02:34:59 tux kernel: named[5134]: segfault at 00000000 eip 0016079f esp b74feec0 error 4

Comment 27 Adam Tkac 2008-05-05 20:39:28 UTC
*** Bug 445080 has been marked as a duplicate of this bug. ***

Comment 28 Adam Tkac 2008-05-05 20:48:38 UTC
After investigation: the server 0.0.0.0 { edns no; }; statement doesn't work right now. I'm
going to release an update with some additional patches right after the F9 release.

Comment 29 Adam Tkac 2008-05-05 20:49:28 UTC
Created attachment 304558 [details]
simple patch

Comment 30 Robert Scheck 2008-05-08 19:29:41 UTC
32:9.5.0-31.b3 doesn't seem to contain the patch, yet.

Comment 31 Adam Tkac 2008-05-12 19:47:54 UTC
(In reply to comment #30)
> 32:9.5.0-31.b3 doesn't seem to contain the patch, yet.

You are right. After more investigation, this patch isn't sufficient. The final patch
is going to be different and is waiting for review upstream.

Note that if you specify "server 0.0.0.0/0 { edns no; };" (and also for ::/0,
make sure you include the "/0") EDNS really is disabled globally (you can check it
with tcpdump or dnscap); only the log message leaks out when a query times out (not
due to EDNS, but because the server really is unreachable).

Btw, could you please check whether your problem really is in EDNS? What does
"dig @<your_forwarder_IP> <some_name> +edns=0" say? And if it fails, can you add the
+bufsize=512 parameter, please? From comment #26 it looks more like you have heavy UDP
packet loss than that your forwarder doesn't support EDNS. Thanks
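
For anyone repeating this check, here is a rough sketch of what the two dig flags change on the wire, assuming the OPT pseudo-record layout from RFC 2671. The helper names (encode_name, build_query), the fixed query ID and the 4096-byte default are made up for illustration; this is not how dig or named actually build their packets.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    labels = [label for label in name.split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"

def build_query(name, qtype=1, edns_version=None, bufsize=4096, query_id=0x1234):
    """Build a DNS query; append an OPT record only when edns_version is set.

    bufsize=4096 and query_id=0x1234 are illustrative defaults (assumptions).
    """
    arcount = 0 if edns_version is None else 1
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    if edns_version is None:
        return header + question
    # OPT pseudo-RR: root name, TYPE=41, CLASS carries the advertised UDP size,
    # TTL = extended RCODE (8 bits) | version (8 bits) | flags (16 bits), RDLEN=0
    ttl = edns_version << 16
    opt = b"\x00" + struct.pack("!HHIH", 41, bufsize, ttl, 0)
    return header + question + opt

plain = build_query("example.com")                              # no EDNS
edns = build_query("example.com", edns_version=0)               # like +edns=0
small = build_query("example.com", edns_version=0, bufsize=512) # plus +bufsize=512
print(len(plain), len(edns), len(small))
```

The only difference between the plain and the EDNS query is the 11-byte OPT record in the additional section, and +bufsize=512 only changes the size advertised in its CLASS field. A server or middlebox that drops such queries, or drops the larger replies they allow, produces the timeouts discussed here.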

Comment 32 Robert Scheck 2008-05-12 20:00:06 UTC
I'm using /0 of course. My forwarders are both working fine, interestingly the 
following causes a timeout and thus a log message as in summary:

tux:~ # dig @127.0.0.1 redrumterriers.com +edns=0

; <<>> DiG 9.5.0b3 <<>> @127.0.0.1 redrumterriers.com +edns=0
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached
tux:~ # 

AFAIK I've got no special configuration of bind on my server (just caching DNS 
which provides the same forwarders as in /etc/resolv.conf).

Comment 33 Adam Tkac 2008-05-12 20:04:26 UTC
Created attachment 305175 [details]
Output from dig

Look at this dig output - from comment #26 it might look as if the roberthansen.com.
server doesn't support EDNS, but it does. The log messages were due to query
timeouts (packet loss, server was down etc.), so the problem is not related to EDNS.

I recommend not disabling EDNS but first checking whether the servers really don't
support it. And if they don't, then the responsible administrator should be informed
that he doesn't conform to a nine-year-old standard and he should fix his server.

Comment 34 Robert Scheck 2008-05-12 20:06:43 UTC
tux:~ # grep -c "too many timeouts resolving" /var/log/messages*
/var/log/messages:619
/var/log/messages-20080420:2146
/var/log/messages-20080427:2782
/var/log/messages-20080504:2942
/var/log/messages-20080511:4064
tux:~ #

You or me writing these e-mails? ;-)
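
The per-file totals above don't show which zones the messages come from. A minimal sketch for grouping them by zone follows; the regular expression is inferred from the messages quoted in this report (an assumption, not taken from the BIND sources), and tally_zones is a made-up helper name.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this bug (assumption).
MESSAGE = re.compile(
    r"too many timeouts resolving '(?P<query>[^']+)' "
    r"\(in '(?P<zone>[^']+)'\?\): (?P<action>.+)$"
)

def tally_zones(lines):
    """Count matching messages per zone, ignoring case and a trailing dot."""
    counts = Counter()
    for line in lines:
        match = MESSAGE.search(line)
        if match:
            counts[match.group("zone").lower().rstrip(".")] += 1
    return counts

# Sample lines copied from earlier comments in this report.
sample = [
    "Sep  3 12:17:07 tux named[10547]: too many timeouts resolving 'kysu.edu/A' (in 'kysu.EDU'?): disabling EDNS",
    "Sep  3 12:17:07 tux named[10547]: too many timeouts resolving 'kysu.edu/A' (in 'kysu.EDU'?): disabling EDNS",
    "Apr 15 02:27:54 tux named[5132]: too many timeouts resolving 'roberthansen.com/MX' (in 'roberthansen.com'?): disabling EDNS",
    "Apr 15 02:34:59 tux kernel: named[5134]: segfault at 00000000 eip 0016079f esp b74feec0 error 4",
]

zone_counts = tally_zones(sample)
for zone, count in zone_counts.most_common():
    print(count, zone)
```

For a real log, an open file object can be passed instead of the sample list, e.g. tally_zones(open("/var/log/messages")). Whether a handful of zones dominates the result would show if per-server exceptions are even an option.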

Comment 35 Fedora Update System 2008-05-13 07:43:44 UTC
bind-9.5.0-31.b3.fc9 has been submitted as an update for Fedora 9

Comment 36 Bug Zapper 2008-05-14 03:10:25 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 37 Fedora Update System 2008-05-14 21:31:10 UTC
bind-9.5.0-31.b3.fc9 has been pushed to the Fedora 9 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update bind'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-3861

Comment 38 Fedora Update System 2008-05-23 11:14:29 UTC
bind-9.5.0-32.rc1.fc9 has been submitted as an update for Fedora 9

Comment 39 Fedora Update System 2008-06-03 07:28:29 UTC
bind-9.5.0-32.rc1.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 40 Jan Kratochvil 2008-06-10 18:41:52 UTC
(In reply to comment #33)
> I recommend not disabling EDNS but first checking whether the servers really don't
> support it. And if they don't, then the responsible administrator should be informed
> that he doesn't conform to a nine-year-old standard and he should fix his server.

Sorry but I am not going to contact admins of about 3000 servers per day, this
log is for the last two days on my workstation:
  http://www.jankratochvil.net/priv/disabling-edns
  named[24415]: too many timeouts resolving 'LINE-CONTENT/xxx' (in 'yyy'?):
  disabling EDNS
Connectivity is over ADSL on public static IP (through openvpn), 30ms RTT.


Comment 42 Adam Tkac 2008-10-03 12:57:23 UTC
This problem is addressed in the upcoming 9.5.1 release. Currently we have 9.5.1b2 in rawhide, so anyone using the F10 tree could test bind-9.5.1-0.7.b2.fc10 and check whether the situation is better.

Comment 43 Jan Kratochvil 2008-10-04 18:26:28 UTC
bind-9.5.1-0.7.b2.fc10.x86_64
Still a flood of them but they are of two kinds now:
Oct  4 20:12:54 host0 named[17739]: too many timeouts resolving 'homebanc.com/TXT' (in 'homebanc.COM'?): reducing the advertised EDNS UDP packet size to 512 octets
Oct  4 20:13:06 host0 named[17739]: too many timeouts resolving 'scissorlok.com/A' (in 'scissorlok.COM'?): disabling EDNS
Oct  4 20:15:21 host0 named[17739]: too many timeouts resolving 'valent.com/TXT' (in 'valent.COM'?): reducing the advertised EDNS UDP packet size to 512 octets
Oct  4 20:15:29 host0 named[17739]: too many timeouts resolving 'imaginemedia.co.uk.fulldom.rfc-ignorant.org/A' (in 'fulldom.rfc-ignorant.org'?): reducing the advertised EDNS UDP packet size to 512 octets

(full MTU/MRU 1500 - openvpn)
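
To see how the two kinds of message are distributed, the lines can be split by the action text after the zone. This is only a sketch based on the four lines above; split_by_action is a made-up helper, and the message layout it relies on is assumed from these samples.

```python
from collections import defaultdict

def split_by_action(lines):
    """Map each action text to the set of zones it was logged for."""
    marker = "too many timeouts resolving "
    zones_by_action = defaultdict(set)
    for line in lines:
        if marker not in line:
            continue
        rest = line.split(marker, 1)[1]
        # rest is assumed to look like: 'name/TYPE' (in 'zone'?): action
        head, sep, action = rest.partition("'?): ")
        if not sep:
            continue
        zone = head.rsplit("(in '", 1)[-1]
        zones_by_action[action.strip()].add(zone.lower())
    return zones_by_action

# The four lines quoted in this comment.
sample = [
    "Oct  4 20:12:54 host0 named[17739]: too many timeouts resolving 'homebanc.com/TXT' (in 'homebanc.COM'?): reducing the advertised EDNS UDP packet size to 512 octets",
    "Oct  4 20:13:06 host0 named[17739]: too many timeouts resolving 'scissorlok.com/A' (in 'scissorlok.COM'?): disabling EDNS",
    "Oct  4 20:15:21 host0 named[17739]: too many timeouts resolving 'valent.com/TXT' (in 'valent.COM'?): reducing the advertised EDNS UDP packet size to 512 octets",
    "Oct  4 20:15:29 host0 named[17739]: too many timeouts resolving 'imaginemedia.co.uk.fulldom.rfc-ignorant.org/A' (in 'fulldom.rfc-ignorant.org'?): reducing the advertised EDNS UDP packet size to 512 octets",
]

by_action = split_by_action(sample)
for action, zones in by_action.items():
    print(len(zones), action)
```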

Comment 44 Adam Tkac 2008-11-11 15:04:14 UTC
After many discussions about EDNS0, there is no better way to solve this problem. named works as RFC 2671 says and those messages are usually helpful, so they are logged by default. If you want to suppress those messages, use "category edns-disabled { null; };" in your logging statement in named.conf. Closing.

Comment 45 Adam Tkac 2009-04-10 10:08:39 UTC
*** Bug 494948 has been marked as a duplicate of this bug. ***