Bug 727856 - bind-dyndb-ldap: race condition in semaphore_wait() function
Summary: bind-dyndb-ldap: race condition in semaphore_wait() function
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: bind-dyndb-ldap
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Adam Tkac
QA Contact: Chandrasekar Kannan
URL:
Whiteboard:
Depends On:
Blocks: 734003
 
Reported: 2011-08-03 12:38 UTC by Adam Tkac
Modified: 2015-01-04 23:50 UTC

Fixed In Version: bind-dyndb-ldap-0.2.0-3.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 734003 (view as bug list)
Environment:
Last Closed: 2011-12-06 17:57:09 UTC


Attachments
Proposed patch (477 bytes, patch)
2011-08-03 12:49 UTC, Adam Tkac
pstack of hung named (4.12 KB, application/octet-stream)
2011-08-07 00:56 UTC, Phil Anderson


Links
System: Red Hat Product Errata
ID: RHBA-2011:1715
Priority: normal
Status: SHIPPED_LIVE
Summary: bind-dyndb-ldap bug fix update
Last Updated: 2011-12-06 01:02:17 UTC

Description Adam Tkac 2011-08-03 12:38:09 UTC
Description of problem:
The current implementation of the semaphore_wait() function is not fully thread-safe.

Version-Release number of selected component (if applicable):
bind-dyndb-ldap-0.2.0-1.el6

How reproducible:
Sometimes, when the server is under heavy load.

Steps to Reproduce:
1. Send many queries for RRs whose authoritative zones are served via the bind-dyndb-ldap plugin.
2. Wait for the server to lock up.
  
Actual results:
All of the server's worker threads are blocked in semaphore_wait().

Expected results:
thread-safe semaphore_wait()
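
For illustration only, here is a minimal sketch of what a thread-safe counting-semaphore wait usually looks like when built on plain POSIX threads. It is not the attached patch and not the bind-dyndb-ldap code: every name below (sketch_sem_t, sketch_sem_wait() and so on) is hypothetical, and the plugin itself may use different locking primitives. The key point is that the wait predicate is re-checked in a loop while the mutex is held, so a spurious wakeup or a slot taken by another thread cannot drive the counter negative or leave waiters stuck.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical type and function names; NOT the bind-dyndb-ldap API. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             value;  /* number of free slots, never negative */
} sketch_sem_t;

static int sketch_sem_init(sketch_sem_t *sem, int initial)
{
    assert(initial >= 0);
    sem->value = initial;
    if (pthread_mutex_init(&sem->mutex, NULL) != 0)
        return -1;
    if (pthread_cond_init(&sem->cond, NULL) != 0) {
        pthread_mutex_destroy(&sem->mutex);
        return -1;
    }
    return 0;
}

static void sketch_sem_destroy(sketch_sem_t *sem)
{
    pthread_cond_destroy(&sem->cond);
    pthread_mutex_destroy(&sem->mutex);
}

static void sketch_sem_wait(sketch_sem_t *sem)
{
    pthread_mutex_lock(&sem->mutex);
    /* Loop, not "if": the wakeup may be spurious, or another thread may
     * have taken the slot between the signal and this thread running. */
    while (sem->value <= 0)
        pthread_cond_wait(&sem->cond, &sem->mutex);
    sem->value--;
    pthread_mutex_unlock(&sem->mutex);
}

static void sketch_sem_signal(sketch_sem_t *sem)
{
    pthread_mutex_lock(&sem->mutex);
    sem->value++;
    pthread_cond_signal(&sem->cond);
    pthread_mutex_unlock(&sem->mutex);
}

/* Small stress helper: with an initial value of 1 the semaphore acts as a
 * lock around the shared counter, so no increment may be lost. */
#define SKETCH_MAX_THREADS 16

struct sketch_job {
    sketch_sem_t *sem;
    long         *counter;
    int           iters;
};

static void *sketch_worker(void *arg)
{
    struct sketch_job *job = arg;
    for (int i = 0; i < job->iters; i++) {
        sketch_sem_wait(job->sem);
        (*job->counter)++;
        sketch_sem_signal(job->sem);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if the semaphore cannot be set up. */
static long sketch_sem_stress(int nthreads, int iters)
{
    pthread_t tids[SKETCH_MAX_THREADS];
    sketch_sem_t sem;
    long counter = 0;
    struct sketch_job job = { &sem, &counter, iters };
    int started = 0;

    assert(nthreads > 0 && nthreads <= SKETCH_MAX_THREADS);
    if (sketch_sem_init(&sem, 1) != 0)
        return -1;
    for (; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, sketch_worker, &job) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    sketch_sem_destroy(&sem);
    return counter;
}
```

Calling sketch_sem_stress(4, 10000) from a program linked with -pthread should return 40000 and, more to the point, should return at all: a wait that can miss a wakeup or mis-check the counter tends to show up as the kind of permanent hang described above.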

Comment 1 Adam Tkac 2011-08-03 12:49:28 UTC
Created attachment 516506 [details]
Proposed patch

Comment 2 Phil Anderson 2011-08-07 00:56:58 UTC
Created attachment 517022 [details]
pstack of hung named

I recently upgraded my DNS server from an older dual-core CPU to a quad-core Xeon E3, and now named locks up after the first few queries.  The stack trace is attached.

I was able to work around the problem by reducing the number of worker threads named starts by adding the following line to /etc/sysconfig/named:
OPTIONS="-n 1"

Comment 3 Adam Tkac 2011-08-08 08:37:28 UTC
(In reply to comment #2)
> Created attachment 517022 [details]
> pstack of hung named
> 
> I recently upgraded my DNS server from an older dual-core CPU to a quad-core
> Xeon E3, and now named locks up after the first few queries.  Stack trace
> attached.
> 
> I was able to work around the problem by reducing the number of worker threads
> named starts by adding the following line to /etc/sysconfig/named:
> OPTIONS="-n 1"

Which version of bind, bind-libs and bind-dyndb-ldap do you use, please?

Comment 6 Martin Foster 2011-08-22 06:25:03 UTC
I was experiencing the same semaphore error as described on the freeipa-users list.  Besides serving records for IPA, my bind install is also an authoritative DNS server.

I rebuilt bind + bind-dyndb-ldap from Adam's proposed patches:
bind-dyndb-ldap-0.2.0-1.el6.1.src.rpm; and
bind-9.7.3-2.el6_1.P3.2.5.rh725577.src.rpm

The resolver has now been running for 6+ hours, where previously it would hang on the semaphore issue within an hour.

Comment 7 Michael Gregg 2011-11-08 18:25:04 UTC
Given that all of the servers we have in QA have not been locking up, even under load, and that this patch was submitted quite a while ago, I am going to mark this bug as verified.

Verified against:
bind-dyndb-ldap-0.2.0-7.el6.x86_64
ipa-server-2.1.3-8.el6.x86_64

Comment 8 errata-xmlrpc 2011-12-06 17:57:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1715.html

