Bug 837165

Summary: BIND crashes with assertion failure
Product: Red Hat Enterprise Linux 6
Reporter: Daniel McNamara <daniel>
Component: bind
Assignee: Adam Tkac <atkac>
Status: CLOSED ERRATA
QA Contact: qe-baseos-daemons
Severity: high
Priority: high
Version: 6.3
CC: azelinka, Colin.Simpson, ddumas, derek, dgherman, jmontleo, john.mora, kbooth, ksquizza, manish, mchappel, mooney, nc, ovasik, pingale, plyons, redhat-bugzilla, rhn, robert.scheck
Target Milestone: rc
Keywords: ZStream
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2013-02-21 10:58:50 UTC
Type: Bug
Bug Blocks: 838956
Attachments: Proposed patch (flags: none)

Description Daniel McNamara 2012-07-03 03:21:29 UTC
Description of problem:

Ever since the BIND update in the RHEL 6.3 round-up, named has been crashing with assertion failures. The crashes appear to be triggered by invalid queries (I have yet to confirm this manually, however).

Version-Release number of selected component (if applicable):

bind-chroot-9.8.2-0.10.rc1.el6.x86_64
bind-9.8.2-0.10.rc1.el6.x86_64
bind-utils-9.8.2-0.10.rc1.el6.x86_64
bind-libs-9.8.2-0.10.rc1.el6.x86_64

How reproducible:

So far I have been unable to locate the query type that causes the crash; it has occurred several times since the package update, however.


Steps to Reproduce:

1. Update RHEL 6.3 to bind-9.8.2-0.10.rc1.el6.x86_64
2. Run script with multiple invalid query types
3. Named crashes with assertion failure
  
Actual results:

Named crashes out:

--
Jun 26 19:19:47 f2 named[16303]: error (network unreachable) resolving 'attendee.role.req.participant.partstat.needs.action.rsvp.true.cn.sdowns.aom.c.org/A/IN': 2001:500:22::254#53
Jun 26 19:19:49 f2 named[16303]: rbtdb.c:1619: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jun 26 19:19:49 f2 named[16303]: exiting (due to assertion failure)
--
Jun 27 03:20:12 f2 named[17699]: error (connection refused) resolving 'zapaska.biz/A/IN': 74.208.64.145#53
Jun 27 03:20:13 f2 named[17699]: rbtdb.c:1619: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jun 27 03:20:13 f2 named[17699]: exiting (due to assertion failure)
--
Jun 28 06:53:03 f2 named[30490]:   validating @0x7ff4f80a38a0: I60UU5FGC7SPHUOLS5OC7RKBT1EJ66RG.tw NSEC3: no valid signature found
Jun 28 06:53:04 f2 named[30490]: rbtdb.c:1857: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jun 28 06:53:04 f2 named[30490]: exiting (due to assertion failure)
--
Jun 29 21:47:26 f2 named[25616]: error (network unreachable) resolving 'm2.nstld.net/A/IN': 2001:503:231d::2:30#53
Jun 29 21:47:26 f2 named[25616]: rbtdb.c:1619: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jun 29 21:47:26 f2 named[25616]: exiting (due to assertion failure)
--
Jun 30 01:32:20 f2 named[13112]: error (network unreachable) resolving 'ns2.byet.org/AAAA/IN': 2001:500:40::1#53
Jun 30 01:32:20 f2 named[13112]: rbtdb.c:1857: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jun 30 01:32:20 f2 named[13112]: exiting (due to assertion failure)
--
Jun 30 20:05:43 f2 named[7233]:   validating @0x7f4bd008b5d0: EEE0K4ONQCCHCJQTQ5VJD52NKJTEHAJN.net NSEC3: no valid signature found
Jun 30 20:05:45 f2 named[7233]: rbtdb.c:1511: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jun 30 20:05:45 f2 named[7233]: exiting (due to assertion failure)
--
Jul  2 13:25:06 f2 named[26254]:   validating @0x7ffa5400fc30: 3RL0HJSI26SCTO21AV9TVIGIPUVPJAI1.com NSEC3: no valid signature found
Jul  2 13:25:08 f2 named[26254]: rbtdb.c:1511: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jul  2 13:25:08 f2 named[26254]: exiting (due to assertion failure)
--
Jul  3 02:43:16 f2 named[3275]: error (unexpected RCODE SERVFAIL) resolving 'academyunion.net/AAAA/IN': 67.220.238.207#53
Jul  3 02:43:16 f2 named[3275]: rbtdb.c:1619: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
Jul  3 02:43:16 f2 named[3275]: exiting (due to assertion failure)
--

Expected results:

DNS requests should not cause named to crash out on an assertion failure.

Additional info:

Machines that have shown this so far are not IPv6 enabled - this may or may not be relevant to the issue
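
For anyone triaging similar crashes, the assertion lines quoted above can be tallied by source location to see which INSIST sites fire most often. The following is only a minimal Python sketch and is not part of the original report: the regular expression is inferred solely from the log lines shown in this bug, and `tally_assertions` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the excerpts in this report, e.g.
# "Jun 26 19:19:49 f2 named[16303]: rbtdb.c:1619: INSIST(...) failed".
# REQUIRE and ENSURE are included on the assumption that other BIND
# assertion macros are logged in the same shape.
ASSERT_RE = re.compile(
    r"named\[(?P<pid>\d+)\]: (?P<file>[\w.]+):(?P<line>\d+): "
    r"(?P<kind>INSIST|REQUIRE|ENSURE)\((?P<expr>.*)\) failed"
)


def tally_assertions(lines):
    """Count assertion failures per (source file, line number)."""
    counts = Counter()
    for text in lines:
        match = ASSERT_RE.search(text)
        if match:
            key = (match.group("file"), int(match.group("line")))
            counts[key] += 1
    return counts
```

Passing it the lines of a syslog file, for example `tally_assertions(open("/var/log/messages"))` (the path is an assumption about where named logs on the affected host), returns a count per source location. For the eight excerpts quoted in this description that would be rbtdb.c:1619 four times, and rbtdb.c:1511 and rbtdb.c:1857 twice each.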

Comment 4 Daniel McNamara 2012-07-09 03:55:45 UTC
Hey guys,

This issue has become extremely problematic and is occurring 2-3 times each day. I've had to put a script in place that restarts the named service on death - this is not an appropriate long-term solution.

I still haven't located any one particular request that can trigger the crash but it is becoming extremely frustrating. Has there been any movement on this bug?
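
The restart-on-death stopgap mentioned above could look roughly like the sketch below. This is not the reporter's actual script, which was never posted; it is a hedged illustration in Python that assumes the stock RHEL 6 init script, where `service named status` exits 0 while named is running. The function names and the 30-second interval are made up for the example.

```python
import subprocess
import time


def named_is_running(run=subprocess.call):
    """Return True if `service named status` exits with code 0."""
    return run(["service", "named", "status"]) == 0


def check_and_restart(run=subprocess.call):
    """Restart named if it is down; return True if a restart was attempted."""
    if named_is_running(run):
        return False
    run(["service", "named", "restart"])
    return True


def watch(interval=30, run=subprocess.call, sleep=time.sleep, max_checks=None):
    """Poll named every `interval` seconds and count restart attempts.

    `max_checks=None` loops forever; a finite value makes the loop testable.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if check_and_restart(run):
            restarts += 1
        checks += 1
        sleep(interval)
    return restarts
```

Calling `watch()` as root would poll indefinitely. As the comment says, this only masks the crash and loses the cache on every restart; it is a workaround until a fixed package is installed, not a solution.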

Comment 7 Adam Tkac 2012-07-09 11:51:34 UTC
Created attachment 597049 [details]
Proposed patch

Comment 11 rhn 2012-07-09 19:18:52 UTC
This problem also affects the 32-bit version of 6.3

Comment 16 Adam Tkac 2012-07-16 09:48:41 UTC
*** Bug 840299 has been marked as a duplicate of this bug. ***

Comment 18 Tim Mooney 2012-07-16 19:19:36 UTC
We're seeing this on a regular basis too.

What's the timeline for getting an erratum issued for this?  It definitely looks like 3284 is the right fix to backport, assuming a rebase against 9.8.2 final isn't in the cards.

Comment 19 Dumitru Gherman 2012-07-17 23:08:42 UTC
We started to see this more and more frequently in our infra.
I'd say this should be a blocker bug since production DNS servers are affected by this unresolved issue.

Comment 20 Adam Tkac 2012-07-18 11:50:44 UTC
*** Bug 840788 has been marked as a duplicate of this bug. ***

Comment 21 Robert Scheck 2012-07-18 12:10:04 UTC
We are also experiencing this issue, case #00678617 is open in the Red Hat
Customer Portal.

Comment 23 Manish Gupta 2012-07-31 16:55:18 UTC
Do we have any timeframe by when the fix would be out for this issue? 

We recently migrated all of our DNS from Solaris to RHEL. And, this bug gave our management an opportunity to roll everything back to Solaris. Unless we have a patch soon, we may have to bid RHEL goodbye as far as DNS is concerned.

Comment 24 Adam Tkac 2012-07-31 18:26:18 UTC
(In reply to comment #23)
> Do we have any timeframe by when the fix would be out for this issue? 
> 
> We recently migrated all of our DNS from Solaris to RHEL. And, this bug gave
> our management an opportunity to roll everything back to Solaris. Unless
> we have a patch soon, we may have to bid RHEL goodbye as far as DNS is
> concerned.

Please check bug #838956 and http://rhn.redhat.com/errata/RHBA-2012-1107.html. This issue is already fixed, just update to the latest released packages.

Comment 28 errata-xmlrpc 2013-02-21 10:58:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0475.html