Bug 979169 - allow setting db deadlock rejection policy
Summary: allow setting db deadlock rejection policy
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assignee: Rich Megginson
QA Contact: Sankar Ramalingam
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-06-27 19:50 UTC by Rich Megginson
Modified: 2013-11-21 21:09 UTC (History)

Fixed In Version: 389-ds-base-1.2.11.15-22.el6
Doc Type: Bug Fix
Doc Text:
Cause: Under certain conditions, with a mix of concurrent search, update, and outgoing replication operations, deadlocks can occur in the changelog db, leading to error messages like this:

NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (XXXXXXX); db error - -30994 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock

This is caused by a deadlock between the changelog readers, writers, and main database writers.

Consequence: Update operations fail with the above error message in the directory server errors log.

Fix: A new configuration parameter is introduced:

dn: cn=config,cn=ldbm database,cn=plugins,cn=config
nsslapd-db-deadlock-policy: 9

With the default policy 9 (DB_LOCK_YOUNGEST), the last locker gets killed when there is a deadlock. If that locker is the changelog writer, the write fails, and the entire update fails with it. Users who frequently see the above errors in the errors log are advised to change this setting to 6 (DB_LOCK_MINWRITE), which will instead kill the locker that has the fewest write locks (that is, the changelog reader). The changelog reader code has been changed to handle this deadlock condition and retry. The setting can be changed like this:

ldapmodify -x -D "cn=directory manager" -W <<EOF
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-deadlock-policy
nsslapd-db-deadlock-policy: 6
EOF

You may ask why the default is not changed to 6. The answer is that the setting applies to _all_ threads, so changing it could cause regular search requests to fail if the directory server is under a heavy update load. In our testing, we did not see this happen, but we cannot guarantee that changing this value to 6 will not impact regular search requests.

Result: After changing nsslapd-db-deadlock-policy to 6, updates succeed and no longer cause errors like the above.
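
The reader-side retry mentioned in the Fix can be illustrated with a minimal sketch. This is not the 389-ds-base code (which is C, on top of Berkeley DB); it is a Python illustration under stated assumptions. DeadlockError, read_with_retry, and max_attempts are hypothetical names, and a real Berkeley DB reader would also have to close its cursor or abort its transaction before trying again.

```python
class DeadlockError(Exception):
    """Hypothetical stand-in for a DB_LOCK_DEADLOCK (-30994) return code."""


def read_with_retry(read_once, max_attempts=5):
    """Call read_once() until it succeeds or it has deadlocked max_attempts times.

    Assumption: read_once releases whatever it held before raising, so
    simply calling it again is safe.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return read_once()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the deadlock to the caller


# Usage: a fake changelog reader that is picked as the deadlock victim
# twice and then succeeds on the third attempt.
calls = {"n": 0}


def flaky_read():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise DeadlockError("locker killed to resolve a deadlock")
    return "changelog entry"


result = read_with_retry(flaky_read)
print(result, "after", calls["n"], "attempts")
```

The point of the sketch is that a reader can absorb being the victim, while a failed changelog write fails the whole update, which is why policy 6 is aimed at the reader.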
Clone Of:
Environment:
Last Closed: 2013-11-21 21:09:56 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:1653 normal SHIPPED_LIVE 389-ds-base bug fix update 2013-11-20 21:53:19 UTC

Description Rich Megginson 2013-06-27 19:50:10 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47409

The deadlock_threadmain DBENV->lock_detect atype is currently hardcoded to DB_LOCK_YOUNGEST which works in most cases.  We need the ability to change this.
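
As a rough illustration of what "the ability to change this" means, the sketch below replaces a hardcoded constant with a value read from configuration. It is Python rather than the server's C, and everything except the attribute name and the two numeric values (6 and 9, as given in this bug's doc text) is an assumption: resolve_deadlock_policy is a hypothetical helper, Berkeley DB defines more policies than the two listed, and falling back to the default on a bad value is a choice made for the sketch, not confirmed server behavior.

```python
# Only the two policy values named in this bug are listed here.
DB_LOCK_MINWRITE = 6   # kill the locker holding the fewest write locks
DB_LOCK_YOUNGEST = 9   # kill the youngest locker (the previous hardcoded choice)

DEFAULT_POLICY = DB_LOCK_YOUNGEST
KNOWN_POLICIES = {
    DB_LOCK_MINWRITE: "DB_LOCK_MINWRITE",
    DB_LOCK_YOUNGEST: "DB_LOCK_YOUNGEST",
}


def resolve_deadlock_policy(config):
    """Return the policy to hand to the deadlock detector.

    config is a plain dict standing in for the ldbm config entry.
    Missing, non-numeric, or unlisted values fall back to the default
    (an assumption of this sketch).
    """
    raw = config.get("nsslapd-db-deadlock-policy")
    if raw is None:
        return DEFAULT_POLICY
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_POLICY
    return value if value in KNOWN_POLICIES else DEFAULT_POLICY


# Usage: unset keeps the old behavior, "6" switches the policy.
for cfg in ({}, {"nsslapd-db-deadlock-policy": "6"}):
    policy = resolve_deadlock_policy(cfg)
    print(policy, KNOWN_POLICIES[policy])
```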

Comment 2 Milan Kubík 2013-07-08 13:34:55 UTC
Bugzilla covered in basic acceptance suite.

Comment 3 Sankar Ramalingam 2013-07-08 13:51:48 UTC
Unchecking the "Red Hat Employee (internal)" group since it was accidentally selected.

Comment 5 Rich Megginson 2013-08-08 19:33:28 UTC
NOTE: doc text is the same as https://bugzilla.redhat.com/show_bug.cgi?id=975250

Comment 6 Milan Kubík 2013-08-09 13:51:44 UTC
All related test cases in the basic acceptance suite pass with 1.2.11.15-22. Marking VERIFIED.

Comment 7 errata-xmlrpc 2013-11-21 21:09:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html

