Bug 1175321
| Field | Value |
|---|---|
| Summary | named is crashing in load_configuration due to race condition in isc__task_beginexclusive |
| Product | Red Hat Enterprise Linux 6 |
| Component | bind |
| Version | 6.3 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | urgent |
| Reporter | Mohit Agrawal <moagrawa> |
| Assignee | Tomáš Hozza <thozza> |
| QA Contact | qe-baseos-daemons |
| CC | gnaik, hmatsumo, mmatsuya, ovasik, psklenar, pspacek, shane.seymour, thozza, yozone |
| Target Milestone | rc |
| Target Release | 6.7 |
| Keywords | OtherQA, Patch |
| Fixed In Version | bind-9.8.2-0.34.rc1.el6 |
| Doc Type | Bug Fix |
| Doc Text | Due to a race condition in the beginexclusive() function, the BIND DNS server (named) could terminate unexpectedly while loading configuration. To fix this bug, a patch has been applied, and the race condition no longer occurs. |
| Last Closed | 2015-07-22 05:50:21 UTC |
| Type | Bug |
| Bug Blocks | 1126841 |
Description
Mohit Agrawal
2014-12-17 13:47:25 UTC
Going by the backtrace pattern, named appears to crash around isc__task_beginexclusive: thread 5 is waiting inside that function, so the call in thread 1 returns a bad result and named aborts.

(gdb) bt
#0  0x00007f01bcc858a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f01bcc87085 in abort () at abort.c:92
#2  0x00007f01bf44cb6c in library_fatal_error (file=0x7f01bf4945fc "server.c", line=<value optimized out>, format=0x7f01bde383e2 "RUNTIME_CHECK(%s) %s", args=0x7f01baac3070) at ./main.c:260
#3  0x00007f01bde06174 in isc_error_fatal (file=<value optimized out>, line=<value optimized out>, format=<value optimized out>) at error.c:74
#4  0x00007f01bde061d4 in isc_error_runtimecheck (file=0x7f01bf4945fc "server.c", line=4493, expression=0x7f01bf49a41a "result == 0") at error.c:81
#5  0x00007f01bf46d5c3 in load_configuration (filename=0x7f01baac32b0 "\360#H\214\001\177", server=0x7f01bf3de010, first_time=isc_boolean_false) at server.c:4493
#6  0x00007f01bf46f8c6 in loadconfig (server=0x7f01bf3de010) at server.c:5805
#7  0x00007f01bf46fffe in reconfig (server=<value optimized out>, args=<value optimized out>) at server.c:5845
#8  ns_server_reconfigcommand (server=<value optimized out>, args=<value optimized out>) at server.c:6067
#9  0x00007f01bf445c67 in ns_control_docommand (message=<value optimized out>, text=0x7f01baac3880) at control.c:104
#10 0x00007f01bf449346 in control_recvmessage (task=0x7f01bf3ea010, event=<value optimized out>) at controlconf.c:458
#11 0x00007f01bde222f8 in dispatch (uap=0x7f01bf3d5010) at task.c:1012
#12 run (uap=0x7f01bf3d5010) at task.c:1157
#13 0x00007f01bd7d7851 in start_thread (arg=0x7f01baac4700) at pthread_create.c:301
#14 0x00007f01bcd3a67d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) f 5
#5  0x00007f01bf46d5c3 in load_configuration (filename=0x7f01baac32b0 "\360#H\214\001\177", server=0x7f01bf3de010, first_time=isc_boolean_false) at server.c:4493
4493            RUNTIME_CHECK(result == ISC_R_SUCCESS);
(gdb) l
4488        }
4489
4490        /* Ensure exclusive access to configuration data. */
4491        if (!exclusive) {
4492            result = isc_task_beginexclusive(server->task);
4493            RUNTIME_CHECK(result == ISC_R_SUCCESS);
4494            exclusive = ISC_TRUE;
4495        }
4496
4497        /*
(gdb) thread 5
[Switching to thread 5 (Thread 0x7f01bb4c5700 (LWP 16711))]#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
162     62:     movl    (%rsp), %edi
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f01bde219bc in isc__task_beginexclusive (task0=<value optimized out>) at task.c:1456
#2  0x00007f01bec90c16 in grow_entries (task=0x7f018eedf7a0, ev=0x0) at adb.c:520
#3  0x00007f01bde222f8 in dispatch (uap=0x7f01bf3d5010) at task.c:1012
#4  run (uap=0x7f01bf3d5010) at task.c:1157
#5  0x00007f01bd7d7851 in start_thread (arg=0x7f01bb4c5700) at pthread_create.c:301
#6  0x00007f01bcd3a67d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) f 1
#1  0x00007f01bde219bc in isc__task_beginexclusive (task0=<value optimized out>) at task.c:1456
1456            WAIT(&manager->exclusive_granted, &manager->lock);
(gdb) p manager
$1 = (isc__taskmgr_t *) 0x7f01bf3d5010
(gdb) p *manager
$2 = {common = {impmagic = 1414744909, magic = 1098149223, methods = 0x7f01be04b480}, mctx = 0x7f01c0b992d0,
  lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 12 times>, "\001", '\000' <repeats 26 times>, __align = 0},
  workers = 2, threads = 0x7f01bf3d2078, default_quantum = 5,
  tasks = {head = 0x7f01bf3ea010, tail = 0x7f01972c4b50}, ready_tasks = {head = 0x7f01932435d0, tail = 0x7f018eb90f90},
  work_available = {__data = {__lock = 0, __futex = 18822202, __total_seq = 9411101, __wakeup_seq = 9411101, __woken_seq = 9411101, __mutex = 0x7f01bf3d5028, __nwaiters = 0, __broadcast_seq = 222}, __size = "\000\000\000\000:4\037\001\035\232\217\000\000\000\000\000\035\232\217\000\000\000\000\000\035\232\217\000\000\000\000\000(P=\277\001\177\000\000\000\000\000\000\336\000\000", __align = 80840742028705792},
  exclusive_granted = {__data = {__lock = 0, __futex = 257, __total_seq = 129, __wakeup_seq = 128, __woken_seq = 128, __mutex = 0x7f01bf3d5028, __nwaiters = 2, __broadcast_seq = 0}, __size = "\000\000\000\000\001\001\000\000\201\000\000\000\000\000\000\000\200\000\000\000\000\000\000\000\200\000\000\000\000\000\000\000(P=\277\001\177\000\000\002\000\000\000\000\000\000", __align = 1103806595072},
  tasks_running = 2, exclusive_requested = isc_boolean_true, exiting = isc_boolean_false}
(gdb)

Result from the investigation:
The beginexclusive function should be called by a single task server-wide, but the backtraces make it clear that it was executed in two different threads. One thread is still inside the function; the second called it and got back a return value other than SUCCESS. From the beginexclusive code it is clear that it can return a value other than SUCCESS (LOCKBUSY) only when some other task has already called the function.

Created attachment 970700 [details]
Possible patch
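
To make the investigation result concrete (this is not the attached patch), here is a minimal sketch of the check described above, assuming beginexclusive refuses a request whenever exclusive_requested is already set. All sketch_* names and the SKETCH_R_* result codes are hypothetical stand-ins; only the field names are taken from the gdb dump of the manager. To stay deterministic the demo is single-threaded: tasks_running is set to 1 so the first call does not block, which leaves the request outstanding just as it is while thread 5 waits.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the SUCCESS and LOCKBUSY results. */
typedef enum { SKETCH_R_SUCCESS = 0, SKETCH_R_LOCKBUSY = 1 } sketch_result_t;

/* Only the manager fields that matter here; names follow the gdb dump. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t exclusive_granted;
    unsigned int tasks_running;
    bool exclusive_requested;
} sketch_taskmgr_t;

/* Assumed shape of beginexclusive: refuse if a request is already pending. */
sketch_result_t sketch_beginexclusive(sketch_taskmgr_t *mgr) {
    pthread_mutex_lock(&mgr->lock);
    if (mgr->exclusive_requested) {
        pthread_mutex_unlock(&mgr->lock);
        return SKETCH_R_LOCKBUSY;
    }
    mgr->exclusive_requested = true;
    /* Waiting releases the lock, so a second caller can reach the check
     * above while the first one is still parked here. */
    while (mgr->tasks_running > 1)
        pthread_cond_wait(&mgr->exclusive_granted, &mgr->lock);
    pthread_mutex_unlock(&mgr->lock);
    return SKETCH_R_SUCCESS;
}

/* Returns what a second caller sees while the first request is outstanding. */
sketch_result_t sketch_second_caller_result(void) {
    sketch_taskmgr_t mgr;
    pthread_mutex_init(&mgr.lock, NULL);
    pthread_cond_init(&mgr.exclusive_granted, NULL);
    mgr.tasks_running = 1; /* keeps the first call from blocking in this demo */
    mgr.exclusive_requested = false;

    sketch_result_t first = sketch_beginexclusive(&mgr);
    assert(first == SKETCH_R_SUCCESS);
    /* The request is still outstanding, so this call is refused. */
    sketch_result_t second = sketch_beginexclusive(&mgr);

    pthread_cond_destroy(&mgr.exclusive_granted);
    pthread_mutex_destroy(&mgr.lock);
    return second;
}
```

In this model the second call comes back with the busy result, which is the value that a check like RUNTIME_CHECK(result == ISC_R_SUCCESS) at server.c:4493 would turn into an abort.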
Created attachment 985105 [details]
Patch for the issue
New patch, including a fix for a bug that I found during the backport.
Also reported upstream:
[ISC-Bugs #38470] Bug in IF condition in lib/dns/adb.c:new_adbentry()
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-1250.html