Bug 840299

Summary: race condition in rbtnode.deadlink
Product: Red Hat Enterprise Linux 6
Reporter: derek
Component: bind
Assignee: Adam Tkac <atkac>
Status: CLOSED DUPLICATE
QA Contact: qe-baseos-daemons
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.3
CC: ovasik
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-07-16 09:48:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description derek 2012-07-15 14:37:53 UTC
Description of problem:
named crashes intermittently (twice in 3 weeks) with an assertion failure.

named[19833]: 14-Jul-2012 12:18:28.122 general: critical: rbtdb.c:1619: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
named[19833]: 14-Jul-2012 12:18:28.131 general: critical: exiting (due to assertion failure)
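
For context, the failed condition in the log requires node->deadlink.prev to equal the sentinel (void *)(-1), which reads as "this node must not currently be on a dead-node list". The sketch below illustrates that sentinel convention in isolation. The names node_t, dead_list_t, link_init, link_linked, dead_list_append, and dead_list_unlink are hypothetical stand-ins chosen for this example; this is not BIND's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not BIND's actual types. */
typedef struct node node_t;
struct node {
    struct { node_t *prev, *next; } deadlink;
};
typedef struct { node_t *head, *tail; } dead_list_t;

/* Sentinel meaning "this node is on no list". */
#define UNLINKED ((node_t *)(-1))

/* Mark a node as not being a member of any list. */
static void link_init(node_t *n) {
    n->deadlink.prev = UNLINKED;
    n->deadlink.next = UNLINKED;
}

/* Same shape as the condition in the log: prev != (void *)(-1). */
static int link_linked(const node_t *n) {
    return (void *)n->deadlink.prev != (void *)(-1);
}

/* Append to the tail; the node must not already be on a list. */
static void dead_list_append(dead_list_t *l, node_t *n) {
    assert(!link_linked(n));
    n->deadlink.prev = l->tail;
    n->deadlink.next = NULL;
    if (l->tail != NULL)
        l->tail->deadlink.next = n;
    else
        l->head = n;
    l->tail = n;
}

/* Remove from the list and restore the "unlinked" sentinel. */
static void dead_list_unlink(dead_list_t *l, node_t *n) {
    assert(link_linked(n));
    if (n->deadlink.prev != NULL)
        n->deadlink.prev->deadlink.next = n->deadlink.next;
    else
        l->head = n->deadlink.next;
    if (n->deadlink.next != NULL)
        n->deadlink.next->deadlink.prev = n->deadlink.prev;
    else
        l->tail = n->deadlink.prev;
    link_init(n);
}
```

Under this convention, appending a node that is already on a list trips the assertion. That would be consistent with two code paths handling the same node without adequate locking, but this report does not confirm the exact sequence.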

abrtd captured the crash; here is the stack trace:

#0  0x00007f22d57af8a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f22d57b1085 in abort () at abort.c:92
#2  0x00007f22d7f76e14 in assertion_failed (file=<value optimized out>, line=<value optimized out>, type=<value optimized out>, cond=<value optimized out>) at ./main.c:219
#3  0x00007f22d692d89a in isc_assertion_failed (file=<value optimized out>, line=<value optimized out>, type=<value optimized out>, cond=<value optimized out>) at assertions.c:57
#4  0x00007f22d78074a2 in cleanup_dead_nodes (rbtdb=0x7f22d0609010, bucketnum=<value optimized out>) at rbtdb.c:1600
#5  0x00007f22d7807726 in reactivate_node (rbtdb=0x7f22d0609010, node=0x7f22c075cb68, treelocktype=isc_rwlocktype_write) at rbtdb.c:1662
#6  0x00007f22d78092cb in findnodeintree (rbtdb=0x7f22d0609010, tree=0x7f22d060d010, name=0x7f22c3621920, create=isc_boolean_true, nodep=0x7f22d3a10c70) at rbtdb.c:2573
#7  0x00007f22d786929f in cache_name (fctx=0x7f22c363d438, name=0x7f22c3621920, addrinfo=0x7f22c092b430, now=1340718604) at resolver.c:4427
#8  0x00007f22d786f4c0 in cache_message (task=0x7f22d065f380, event=<value optimized out>) at resolver.c:4740
#9  resquery_response (task=0x7f22d065f380, event=<value optimized out>) at resolver.c:7125
#10 0x00007f22d694c2f8 in dispatch (uap=0x7f22d7ee7010) at task.c:1012
#11 run (uap=0x7f22d7ee7010) at task.c:1157
#12 0x00007f22d6301851 in start_thread (arg=0x7f22d3a12700) at pthread_create.c:301
#13 0x00007f22d586467d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

This crash appears to be fixed upstream:
 
 https://lists.isc.org/pipermail/bind-users/2012-July/088082.html


Version-Release number of selected component (if applicable):
bind-9.8.2-0.10.rc1.el6.x86_64

How reproducible:
It is unclear how to reproduce the crash manually; we have not been able to capture the daemon's state at the time it dies.

Comment 2 Adam Tkac 2012-07-16 09:48:41 UTC

*** This bug has been marked as a duplicate of bug 837165 ***