Bug 1175321 - named is crashing in load_configuration due to race condition in isc__task_beginexclusive
Summary: named is crashing in load_configuration due to race condition in isc__task_beginexclusive
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: bind
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: 6.7
Assignee: Tomáš Hozza
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 1126841
 
Reported: 2014-12-17 13:47 UTC by Mohit Agrawal
Modified: 2019-09-12 08:08 UTC
CC List: 9 users

Fixed In Version: bind-9.8.2-0.34.rc1.el6
Doc Type: Bug Fix
Doc Text:
Due to a race condition in the beginexclusive() function, the BIND DNS server (named) could terminate unexpectedly while loading its configuration. A patch has been applied to fix this bug, and the race condition no longer occurs.
Clone Of:
Environment:
Last Closed: 2015-07-22 05:50:21 UTC
Target Upstream Version:
Embargoed:


Attachments
Possible patch (11.19 KB, patch), 2014-12-18 18:20 UTC, Tomáš Hozza
Patch for the issue (10.64 KB, patch), 2015-01-28 11:51 UTC, Tomáš Hozza


Links
Red Hat Product Errata RHBA-2015:1250 (normal, SHIPPED_LIVE): bind bug fix and enhancement update. Last updated: 2015-07-20 17:50:10 UTC

Description Mohit Agrawal 2014-12-17 13:47:25 UTC
Description of problem:
named is crashing in load_configuration due to race condition in isc__task_beginexclusive

Version-Release number of selected component (if applicable):
bind-9.8.2-0.10.rc1.el6.x86_64

How reproducible:
Unknown.

Steps to Reproduce:
1.
2.
3.

Actual results:
named crashes in load_configuration due to a race condition in isc__task_beginexclusive.

Expected results:
named should not crash.
Additional info:

Comment 1 Mohit Agrawal 2014-12-17 13:50:26 UTC
As per the bt pattern, it seems named is crashing in isc__task_beginexclusive. Thread 5 is also waiting in the same function, so the call in thread 1 returns a bad result and named crashes.

(gdb) bt
#0  0x00007f01bcc858a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f01bcc87085 in abort () at abort.c:92
#2  0x00007f01bf44cb6c in library_fatal_error (file=0x7f01bf4945fc "server.c", line=<value optimized out>, format=0x7f01bde383e2 "RUNTIME_CHECK(%s) %s", args=0x7f01baac3070) at ./main.c:260
#3  0x00007f01bde06174 in isc_error_fatal (file=<value optimized out>, line=<value optimized out>, format=<value optimized out>) at error.c:74
#4  0x00007f01bde061d4 in isc_error_runtimecheck (file=0x7f01bf4945fc "server.c", line=4493, expression=0x7f01bf49a41a "result == 0") at error.c:81
#5  0x00007f01bf46d5c3 in load_configuration (filename=0x7f01baac32b0 "\360#H\214\001\177", server=0x7f01bf3de010, first_time=isc_boolean_false) at server.c:4493
#6  0x00007f01bf46f8c6 in loadconfig (server=0x7f01bf3de010) at server.c:5805
#7  0x00007f01bf46fffe in reconfig (server=<value optimized out>, args=<value optimized out>) at server.c:5845
#8  ns_server_reconfigcommand (server=<value optimized out>, args=<value optimized out>) at server.c:6067
#9  0x00007f01bf445c67 in ns_control_docommand (message=<value optimized out>, text=0x7f01baac3880) at control.c:104
#10 0x00007f01bf449346 in control_recvmessage (task=0x7f01bf3ea010, event=<value optimized out>) at controlconf.c:458
#11 0x00007f01bde222f8 in dispatch (uap=0x7f01bf3d5010) at task.c:1012
#12 run (uap=0x7f01bf3d5010) at task.c:1157
#13 0x00007f01bd7d7851 in start_thread (arg=0x7f01baac4700) at pthread_create.c:301
#14 0x00007f01bcd3a67d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) f 5
#5  0x00007f01bf46d5c3 in load_configuration (filename=0x7f01baac32b0 "\360#H\214\001\177", server=0x7f01bf3de010, first_time=isc_boolean_false) at server.c:4493
4493			RUNTIME_CHECK(result == ISC_R_SUCCESS);
(gdb) l
4488		}
4489	
4490		/* Ensure exclusive access to configuration data. */
4491		if (!exclusive) {
4492			result = isc_task_beginexclusive(server->task);
4493			RUNTIME_CHECK(result == ISC_R_SUCCESS);
4494			exclusive = ISC_TRUE;
4495		}
4496	
4497		/*
(gdb) thread 5
[Switching to thread 5 (Thread 0x7f01bb4c5700 (LWP 16711))]#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
162	62:	movl	(%rsp), %edi
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f01bde219bc in isc__task_beginexclusive (task0=<value optimized out>) at task.c:1456
#2  0x00007f01bec90c16 in grow_entries (task=0x7f018eedf7a0, ev=0x0) at adb.c:520
#3  0x00007f01bde222f8 in dispatch (uap=0x7f01bf3d5010) at task.c:1012
#4  run (uap=0x7f01bf3d5010) at task.c:1157
#5  0x00007f01bd7d7851 in start_thread (arg=0x7f01bb4c5700) at pthread_create.c:301
#6  0x00007f01bcd3a67d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) f 1
#1  0x00007f01bde219bc in isc__task_beginexclusive (task0=<value optimized out>) at task.c:1456
1456			WAIT(&manager->exclusive_granted, &manager->lock);
(gdb) p manager
$1 = (isc__taskmgr_t *) 0x7f01bf3d5010
(gdb) p *manager
$2 = {common = {impmagic = 1414744909, magic = 1098149223, methods = 0x7f01be04b480}, mctx = 0x7f01c0b992d0, lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 1, __kind = 0, 
      __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 12 times>, "\001", '\000' <repeats 26 times>, __align = 0}, workers = 2, threads = 0x7f01bf3d2078, 
  default_quantum = 5, tasks = {head = 0x7f01bf3ea010, tail = 0x7f01972c4b50}, ready_tasks = {head = 0x7f01932435d0, tail = 0x7f018eb90f90}, work_available = {__data = {__lock = 0, 
      __futex = 18822202, __total_seq = 9411101, __wakeup_seq = 9411101, __woken_seq = 9411101, __mutex = 0x7f01bf3d5028, __nwaiters = 0, __broadcast_seq = 222}, 
    __size = "\000\000\000\000:4\037\001\035\232\217\000\000\000\000\000\035\232\217\000\000\000\000\000\035\232\217\000\000\000\000\000(P=\277\001\177\000\000\000\000\000\000\336\000\000", 
    __align = 80840742028705792}, exclusive_granted = {__data = {__lock = 0, __futex = 257, __total_seq = 129, __wakeup_seq = 128, __woken_seq = 128, __mutex = 0x7f01bf3d5028, __nwaiters = 2, 
      __broadcast_seq = 0}, 
    __size = "\000\000\000\000\001\001\000\000\201\000\000\000\000\000\000\000\200\000\000\000\000\000\000\000\200\000\000\000\000\000\000\000(P=\277\001\177\000\000\002\000\000\000\000\000\000", 
    __align = 1103806595072}, tasks_running = 2, exclusive_requested = isc_boolean_true, exiting = isc_boolean_false}
(gdb)
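
For orientation, the taskmgr dump above (exclusive_requested = isc_boolean_true, tasks_running = 2, and a thread parked on exclusive_granted) lines up with the following simplified paraphrase of isc__task_beginexclusive(). This is a sketch reconstructed from the backtraces and the fields shown in gdb, not a verbatim copy of the bind-9.8.2 source:

/* Simplified paraphrase -- symbol and field names follow the gdb output
 * above; the exact shipped code may differ. */
isc_result_t
isc__task_beginexclusive(isc_task_t *task0) {
	isc__task_t *task = (isc__task_t *)task0;
	isc__taskmgr_t *manager = task->manager;

	LOCK(&manager->lock);
	if (manager->exclusive_requested) {
		/* A second concurrent caller ends up here with
		 * ISC_R_LOCKBUSY, which then trips
		 * RUNTIME_CHECK(result == ISC_R_SUCCESS) in
		 * load_configuration() (frame #5, server.c:4493). */
		UNLOCK(&manager->lock);
		return (ISC_R_LOCKBUSY);
	}
	manager->exclusive_requested = ISC_TRUE;
	while (manager->tasks_running > 1) {
		/* The first caller blocks here (thread 5, frame #1,
		 * task.c:1456) until the other workers pause. */
		WAIT(&manager->exclusive_granted, &manager->lock);
	}
	UNLOCK(&manager->lock);
	return (ISC_R_SUCCESS);
}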

Comment 3 Tomáš Hozza 2014-12-17 14:07:55 UTC
Result from the investigation:

The beginexclusive function should be called by a single task server-wide, but from the backtrace it is clear that it was executed in two different threads. One thread is inside the function, while the second called it and got a return value other than SUCCESS. From the code of the beginexclusive function it is clear that it returns a value other than SUCCESS (LOCKBUSY) only when some other task has already called the function.
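
To make the failure mode concrete, the small standalone C program below (hypothetical names, plain pthreads instead of the isc task library; not part of the attached patch) reproduces the pattern: two threads race to request exclusive access, the loser gets a busy result, and an assert modelled on RUNTIME_CHECK(result == ISC_R_SUCCESS) aborts the whole process, just as named does when the RUNTIME_CHECK at server.c:4493 fails.

/* Hypothetical demo -- not BIND code.  Build with:
 *   gcc -pthread beginexclusive_race.c -o beginexclusive_race
 * The winner of the "exclusive" request blocks waiting for the other
 * worker; the loser gets R_LOCKBUSY and the assert aborts the process. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum { R_SUCCESS = 0, R_LOCKBUSY = 1 };

struct taskmgr {
	pthread_mutex_t lock;
	pthread_cond_t exclusive_granted;
	bool exclusive_requested;
	int tasks_running;
};

static int beginexclusive(struct taskmgr *mgr) {
	pthread_mutex_lock(&mgr->lock);
	if (mgr->exclusive_requested) {
		/* Another task already requested exclusive mode. */
		pthread_mutex_unlock(&mgr->lock);
		return R_LOCKBUSY;
	}
	mgr->exclusive_requested = true;
	while (mgr->tasks_running > 1) {
		/* Wait for the other worker threads to pause. */
		pthread_cond_wait(&mgr->exclusive_granted, &mgr->lock);
	}
	pthread_mutex_unlock(&mgr->lock);
	return R_SUCCESS;
}

static void *worker(void *arg) {
	int result = beginexclusive(arg);
	/* Both callers insist on success, as RUNTIME_CHECK() does in
	 * server.c:4493, so the losing thread takes the whole process down. */
	assert(result == R_SUCCESS);
	return NULL;
}

int main(void) {
	static struct taskmgr mgr = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.exclusive_granted = PTHREAD_COND_INITIALIZER,
		.exclusive_requested = false,
		.tasks_running = 2,	/* two workers, as in the dump above */
	};
	pthread_t t1, t2;

	pthread_create(&t1, NULL, worker, &mgr);
	pthread_create(&t2, NULL, worker, &mgr);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}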

Comment 6 Tomáš Hozza 2014-12-18 18:20:41 UTC
Created attachment 970700 [details]
Possible patch

Comment 8 Tomáš Hozza 2015-01-28 11:51:10 UTC
Created attachment 985105 [details]
Patch for the issue

New patch, with a fix for a bug that I found during the backport.

Reported also upstream:
[ISC-Bugs #38470] Bug in IF condition in lib/dns/adb.c:new_adbentry()

Comment 24 errata-xmlrpc 2015-07-22 05:50:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1250.html

