Bug 556366 - bind repeatedly requests DNSKEY records after getting responses with unknown keys
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: bind
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Adam Tkac
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 575604
Depends On:
Blocks: 572848 572850
 
Reported: 2010-01-18 05:03 UTC by Kieran Clancy
Modified: 2013-04-30 23:45 UTC (History)

Fixed In Version: bind-9.6.2-2.P1.fc12
Clone Of:
: 572848 (view as bug list)
Environment:
Last Closed: 2010-03-27 00:55:16 UTC
Type: ---
Embargoed:


Description Kieran Clancy 2010-01-18 05:03:26 UTC
Description of problem:
In bug 556365 I reported that, because the DNSSEC reverse zone keys were not up to date, bind was generating a lot of unwanted messages. To copy a bit from that report:

-----
I noticed that bind had begun to generate thousands of messages like this in /var/log/messages:
Jan 17 20:07:47 localhost named[1521]: no valid KEY resolving '80.in-addr.arpa/DNSKEY/IN': 202.12.29.59#53

In fact, in a little over 24 hours I had 240,000 of these messages, all for
different reverse zones, and bind was chewing through a surprising amount of
bandwidth. This was prompted solely by the occasional reverse lookup by sshd.

It seems bind forgets that it has no valid key and just repeatedly requests DNSKEYs for the same reverse zones from the same servers. For example, for just the 80.in-addr.arpa zone, in 24 hours bind retried the DNSKEY query almost 2000 times for each of 9 DNS servers (nearly 18000 retries in total)! Multiply this by the number of reverse zones with keys and this becomes fairly annoying. I'm almost surprised that I didn't receive a complaint from my ISP.
-----
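
The per-zone and per-server retry counts quoted above can be reproduced from the log with a short script. This is only a rough sketch: it assumes the message format shown in the sample line, and the function names `count_retries` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Matches the message quoted above, e.g.
#   named[1521]: no valid KEY resolving '80.in-addr.arpa/DNSKEY/IN': 202.12.29.59#53
# Group 1 is the zone, group 2 is the server address (IPv4 or IPv6).
PATTERN = re.compile(
    r"no valid KEY resolving '([^/']+)/DNSKEY/IN': ([0-9a-fA-F.:]+)#\d+"
)


def count_retries(lines):
    """Count failed DNSKEY lookups per (zone, server) pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def summarize(counts):
    """Collapse the per-server counts into a total per zone."""
    per_zone = Counter()
    for (zone, _server), n in counts.items():
        per_zone[zone] += n
    return per_zone
```

Passing an open file object for /var/log/messages to `count_retries` and printing `summarize(...).most_common(10)` would list the noisiest zones first.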

Version: bind-9.6.1-13.P2.fc12.x86_64

I don't really understand how DNSSEC works, so please excuse me if I'm wrong, but this is what I think is happening:
- bind sends a DNSKEY request for a particular reverse zone
- bind receives a response, but it is signed with a key that bind doesn't know

Now, bind shouldn't stop trying after the first response with an unknown key; otherwise a malicious third-party host might be able to forge responses with bad keys so that bind would give up its search for the DNSKEY.

However, is there a point at which, after a number of responses with unknown keys (and not a single response for that zone with a known key), bind should decide to wait a period of time before trying that zone again?
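
To make the idea concrete, here is a minimal sketch of such a hold-down. It is not how bind implements anything; the class name `UnknownKeyBackoff`, the threshold of 5 failures, and the 600-second wait are all assumptions chosen for illustration.

```python
import time


class UnknownKeyBackoff:
    """Pause DNSKEY queries for a zone after repeated unknown-key responses.

    Hypothetical illustration only: the threshold and hold-down period
    are assumed values, not anything taken from bind.
    """

    def __init__(self, max_failures=5, holddown_seconds=600.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.holddown_seconds = holddown_seconds
        self.clock = clock
        self._failures = {}  # zone -> consecutive unknown-key responses
        self._retry_at = {}  # zone -> earliest time we may query again

    def may_query(self, zone):
        """Return True if a DNSKEY query for this zone is allowed now."""
        retry_at = self._retry_at.get(zone)
        if retry_at is None:
            return True
        if self.clock() >= retry_at:
            # Hold-down expired: forget the old failures and try again.
            del self._retry_at[zone]
            self._failures.pop(zone, None)
            return True
        return False

    def record_unknown_key(self, zone):
        """Note one response signed with a key we do not know."""
        count = self._failures.get(zone, 0) + 1
        self._failures[zone] = count
        # A single bad (possibly forged) response is not enough to stop
        # trying; only a run of them starts the hold-down.
        if count >= self.max_failures:
            self._retry_at[zone] = self.clock() + self.holddown_seconds

    def record_valid_key(self, zone):
        """A response validated with a known key clears all state."""
        self._failures.pop(zone, None)
        self._retry_at.pop(zone, None)
```

The threshold is what keeps a single forged response from silencing lookups, while the hold-down bounds how often a zone with stale keys is retried.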

I hope that makes some amount of sense.

Comment 1 Adam Tkac 2010-03-19 11:05:23 UTC
This issue is addressed in 9.6.2-P1 upstream release.

Comment 2 Fedora Update System 2010-03-19 11:27:47 UTC
bind-9.6.2-2.P1.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/bind-9.6.2-2.P1.fc11

Comment 3 Fedora Update System 2010-03-19 11:27:51 UTC
bind-9.6.2-2.P1.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/bind-9.6.2-2.P1.fc12

Comment 4 Fedora Update System 2010-03-23 02:19:18 UTC
bind-9.6.2-2.P1.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with:
 su -c 'yum --enablerepo=updates-testing update bind'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/bind-9.6.2-2.P1.fc12

Comment 5 Fedora Update System 2010-03-23 02:20:17 UTC
bind-9.6.2-2.P1.fc11 has been pushed to the Fedora 11 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with:
 su -c 'yum --enablerepo=updates-testing update bind'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/bind-9.6.2-2.P1.fc11

Comment 6 Adam Tkac 2010-03-26 14:13:47 UTC
*** Bug 575604 has been marked as a duplicate of this bug. ***

Comment 7 Fedora Update System 2010-03-27 00:55:11 UTC
bind-9.6.2-2.P1.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 8 Fedora Update System 2010-03-27 01:01:53 UTC
bind-9.6.2-2.P1.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 9 josip@icase.edu 2010-03-27 14:36:49 UTC
The news is both bad and good.  Bad first:

bind-9.6.2-2.P1.fc12 doesn't fix related bug #575604 and in fact makes it worse because DNS stops answering queries -- it only generates numerous syslog messages.

However, reconfiguring named to *not* use forwarders restores normal operation.

Suggestion: bind should be able to function even if forwarders don't handle DNSSEC and/or DLV properly.

Comment 10 Eddie Lania 2010-03-27 17:31:28 UTC
Shouldn't this bug be reopened, then?

Comment 11 josip@icase.edu 2010-03-28 15:38:58 UTC
Maybe this bug should be reopened, although its symptoms have changed.

Even without forwarders, the latest bind produces thousands of messages daily, about *both* successes and failures, e.g.

success resolving... after reducing the advertised EDNS UDP packet size to 512 octets
lame server resolving...
connection refused resolving...
validating @x... ... no valid signature
broken trust chain resolving...
unexpected RCODE (REFUSED) resolving...
unexpected RCODE (SERVFAIL) resolving...
must-be-secure resolving...
client... RFC 1918 response from Internet for...

Before I removed my forwarders, virtually all messages said "broken trust chain resolving..." and DNS queries went unresolved.  Without forwarders (i.e. working from named.ca) DNSSEC+DLV works but logs storms of INFO messages every few seconds.  This level of detail is more appropriate for a debugging version of bind than for a production package.

Named w/DNSSEC+DLV generates >100,000 messages/week = about 3/4 of all syslog messages on my server -- that's way too much for comfort.  I get no benefit from so much raw information.  Named should work competently, with *rare* problems coalesced & reported, or DNSSEC+DLV isn't ready for normal use.

Comment 12 Eddie Lania 2010-03-28 17:57:24 UTC
Okay,

I also encountered this issue in the past and solved it my own way.
I configured the logging channels roughly as follows, and the syslog clutter went away:


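logging {
        // Syslog (daemon facility), messages at severity info and above.
        channel default_syslog {
            syslog daemon;
            severity info;
        };
        // Syslog (daemon facility); severity follows the server's current debug level.
        channel standard_syslog {
            syslog daemon;
            severity dynamic;
        };
        // Separate log file, with timestamp, severity and category on each line.
        channel named_logfile {
            file "/var/log/named.log";
            print-time yes;
            print-severity yes;
            print-category yes;
            severity dynamic;
        };
        // Dynamic update messages stay in syslog.
        category update { standard_syslog; };
        // The noisy resolver categories go to the file instead of syslog.
        category lame-servers { named_logfile; };
        category query-errors { named_logfile; };
        category edns-disabled { named_logfile; };
};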

