Bug 554316

Summary: ISC BIND (named) crashes with "keytable.c:286: REQUIRE(nextnodep != ((void *)0) && *nextnodep == ((void *)0)) failed"
Product: [Fedora] Fedora
Reporter: Adam Tkac <atkac>
Component: bind
Assignee: Adam Tkac <atkac>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: low
Version: 12
CC: atkac, chrisw, franta, gary, jan.kratochvil, ovasik, pwouters, raytodd
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: bind-9.6.1-15.P3.fc12
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 549284
Environment:
Last Closed: 2010-01-25 11:58:57 UTC
Type: ---
Bug Depends On: 549284    
Bug Blocks:    

Description Adam Tkac 2010-01-11 11:25:21 UTC
+++ This bug was initially created as a clone of Bug #549284 +++

Description of problem:
named crashes repeatedly (on several independent sites); "/var/log/messages" contains lines such as:
Dec 21 07:21:43 ns named[1602]: no valid KEY resolving '77.in-addr.arpa/DNSKEY/IN': 192.36.125.2#53
Dec 21 07:21:43 ns named[1602]: no valid KEY resolving '77.in-addr.arpa/DNSKEY/IN': 199.212.0.53#53
Dec 21 07:21:43 ns named[1602]: no valid KEY resolving '77.in-addr.arpa/DNSKEY/IN': 193.0.0.195#53
Dec 21 07:21:43 ns named[1602]: no valid KEY resolving '77.in-addr.arpa/DNSKEY/IN': 202.12.28.140#53
Dec 21 07:21:44 ns named[1602]: keytable.c:286: REQUIRE(nextnodep != ((void *)0) && *nextnodep == ((void *)0)) failed
Dec 21 07:21:44 ns named[1602]: exiting (due to assertion failure)



Version-Release number of selected component (if applicable):
9.6.1-P2-RedHat-9.6.1-7.P2.fc11

Additional info:
On the affected nodes, named runs with DNSSEC enabled. The crashes may be related to dlv.isc.org being inaccessible, but crashing is not acceptable in any case.
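
To see how often named has exited this way, the log can be searched for the final message it writes before exiting. The following is a minimal sketch, not part of the original report: `count_named_assertions` is a hypothetical helper name, and the default path is simply the /var/log/messages file quoted above.

```shell
#!/bin/bash
# Sketch: count assertion-failure exits of named recorded in a syslog file.
# count_named_assertions is a hypothetical helper; the default path is the
# /var/log/messages file mentioned in this report.
count_named_assertions() {
  local logfile="${1:-/var/log/messages}"
  # Match the line named logs when it exits on a failed REQUIRE().
  # grep -c prints the number of matching lines (0 if there are none).
  grep -c 'named\[[0-9]*]: exiting (due to assertion failure)' "$logfile"
}
```

Calling `count_named_assertions` without arguments reads /var/log/messages; a rotated log can be passed as the first argument.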

--- Additional comment from franta on 2009-12-28 08:37:41 EST ---

Disabling DNSSEC (I assume that commenting out these named.conf lines

//      dnssec-enable yes;
//      dnssec-validation yes;
//      dnssec-lookaside . trust-anchor dlv.isc.org.;

does it) does not help.
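
One way to double-check which DNSSEC statements are still active after such an edit is to filter out the commented lines. This is only a sketch under assumptions: `active_dnssec_options` is a hypothetical helper, /etc/named.conf is assumed as the configuration path, and only `//` comments at the start of a line are recognised (not `#` or `/* */` comments).

```shell
#!/bin/bash
# Sketch: list dnssec-* statements in named.conf that are not commented out.
# active_dnssec_options is a hypothetical helper; /etc/named.conf is an
# assumed default path. Only lines starting with "//" count as comments.
active_dnssec_options() {
  local conf="${1:-/etc/named.conf}"
  # Drop //-commented lines, then keep lines that begin with "dnssec-".
  grep -v '^[[:space:]]*//' "$conf" | grep -E '^[[:space:]]*dnssec-'
}
```

If the function prints nothing, no `dnssec-*` statement is set explicitly in that file, and named falls back to its built-in defaults for those options.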

I am asking for the priority of this bug to be raised.

--- Additional comment from raytodd on 2009-12-29 14:51:36 EST ---

For me this seems to be related to a large number of failed lookups or to in-addr.arpa lookups (at least, that is when I see it).

Example
Dec 28 07:47:23 *  named[3491]: network unreachable resolving '92.in-addr.arpa/DNSKEY/IN': 2001:660:3006:1::1:1#53
Dec 28 07:47:23 * named[3491]: network unreachable resolving '92.in-addr.arpa/DNSKEY/IN': 2001:dc0:1:0:4777::140#53

By the way, we are not using IPv6, but the system regularly insists on trying lookups against IPv6 addresses.

Hope this helps.

--- Additional comment from franta on 2010-01-05 23:29:41 EST ---

At my sites we are not using IPv6 either (loading of the ipv6.ko kernel module is suppressed, so no interface has an IPv6 address), and bind does not do IPv6 lookups.

But messages such as:
Dec 21 07:21:42 ns named[1912]: no valid KEY resolving '95.in-addr.arpa/DNSKEY/IN': 199.212.0.53#53

Dec 21 07:21:42 ns named[1912]: unexpected RCODE (SERVFAIL) resolving '95.in-addr.arpa/DNSKEY/IN': 200.3.13.11#53

Jan  1 05:20:12 ns named[23401]: not insecure resolving '228.9.60.86.in-addr.arpa/PTR/IN': 192.36.125.2#53

Jan  1 05:20:18 ns named[23401]: no valid RRSIG resolving '228.9.60.86.in-addr.arpa/PTR/IN': 193.0.0.195#53

Jan  1 05:55:14 ns named[23401]: unexpected RCODE (REFUSED) resolving 'cache.freebsd.lublin.pl/A/IN': 77.79.235.102#53

appear frequently in /var/log/messages, the first one as often as 10 times per second.
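
A rough way to measure that rate is to count matching lines per syslog timestamp. The sketch below is illustrative only: `busiest_seconds` is a hypothetical helper name, and it assumes the classic "Mon DD HH:MM:SS" syslog prefix seen in the excerpts above.

```shell
#!/bin/bash
# Sketch: report the busiest seconds for one named log message.
# busiest_seconds is a hypothetical helper name.
busiest_seconds() {
  local pattern="$1" logfile="${2:-/var/log/messages}"
  # Syslog lines start with "Mon DD HH:MM:SS"; count matches per timestamp,
  # then print the five timestamps with the most matches.
  grep -F -- "$pattern" "$logfile" \
    | awk '{ count[$1 " " $2 " " $3]++ }
           END { for (ts in count) print count[ts], ts }' \
    | sort -rn | head -n 5
}
```

For example, `busiest_seconds 'no valid KEY resolving'` prints up to five lines, each with a count followed by the timestamp it was seen at.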

--- Additional comment from atkac on 2010-01-11 05:27:07 EST ---

*** Bug 553814 has been marked as a duplicate of this bug. ***

--- Additional comment from atkac on 2010-01-11 05:55:24 EST ---

Created an attachment (id=382949)
proposed patch

The patch has been sent upstream for review and will be part of the next update.

Comment 1 Adam Tkac 2010-01-11 11:26:16 UTC
*** Bug 551003 has been marked as a duplicate of this bug. ***

Comment 2 Jan Kratochvil 2010-01-12 21:16:47 UTC
bind-9.6.1-13.P2.fc12.x86_64

Comment 3 Ray Todd Stevens 2010-01-12 22:24:12 UTC
Still have the problem with bind-9.6.1-13.P2.fc12.i686

Comment 4 Adam Tkac 2010-01-13 11:07:41 UTC
I have just built an updated package, but I am not going to release it because I expect an upstream release soon (in about 1 week). You can use it if you would like to fix this issue right now. The build is located at http://kojiweb.fedoraproject.org/koji/buildinfo?buildID=150708.

Comment 5 Ray Todd Stevens 2010-01-14 19:28:57 UTC
Tell them to hurry up, I have about 2 crashes a day from this.

Comment 6 Gary Myers 2010-01-14 20:40:20 UTC
I have created a crude script to monitor the daemon via a cron job for the servers I maintain.


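#!/bin/bash
# Restart named if "service named status" does not report it as up.

# Count the status lines containing "server is up" (0 means named is down).
CHECK=$(/sbin/service named status | grep -c "server is up")

if [ "$CHECK" -eq 0 ]; then
  # Use the full path: cron's default PATH may not include /sbin.
  /sbin/service named restart
fi

exit 0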


I saved this file as /root/named-check with 700 permissions and root:root ownership. It is run from /etc/cron.d/named-monitor thus:


# Cron script to run named-check every 5 minutes.

*/5 * * * * root /root/named-check


If the named daemon is not running for any reason, it will be restarted. If it is running, the script simply exits. This should keep us going until the new packages are released.
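
A possible variant of the script above also writes a syslog entry on every restart, so that the crash frequency can be read from the log afterwards. This is a sketch under stated assumptions, not a tested replacement: the function names, the `--run` switch and the `named-monitor` syslog tag are hypothetical, and it relies, like the original, on `service named status` printing "server is up" while named is running.

```shell
#!/bin/bash
# Sketch of a watchdog that also records each restart in syslog.
# Assumption: "service named status" prints "server is up" while named runs,
# as the script above relies on. Function names, the --run switch and the
# "named-monitor" tag are hypothetical.

# Succeeds when the status text on stdin reports a running server.
named_is_up() {
  grep -q "server is up"
}

check_and_restart() {
  if ! /sbin/service named status 2>/dev/null | named_is_up; then
    # Leave a trace so the number of crashes can be counted later.
    logger -t named-monitor "named not running, restarting"
    /sbin/service named restart
  fi
}

# Only act when asked to, so the file can be sourced without side effects.
if [ "${1:-}" = "--run" ]; then
  check_and_restart
fi
```

With this variant the cron entry would call the script with `--run`, and `grep named-monitor /var/log/messages` would then list the recorded restarts.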

HTH  :)

Comment 7 Frantisek Hanzlik 2010-01-14 21:25:36 UTC
On four production servers, bind-9.6.1-8.P2.fc11.i586 from koji has run for over 13 hours without failure.
This version is probably patched similarly to bind-9.6.1-14.P2.fc12.

Comment 8 Frantisek Hanzlik 2010-01-18 10:00:30 UTC
On four servers, bind-9.6.1-8.P2.fc11.i586 has run successfully for four days, so this bug is probably solved.

Comment 9 Adam Tkac 2010-01-25 11:58:57 UTC
Fixed in bind-9.6.1-15.P3.fc12.