Bug 730387 - Use POSIX RW locks instead of NSPR implementation
Summary: Use POSIX RW locks instead of NSPR implementation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: 389
Classification: Retired
Component: Directory Server
Version: 1.2.9
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Nathan Kinder
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Duplicates: 528567 (view as bug list)
Depends On:
Blocks: 690319 730394 730395 730403 730434 730436 389_1.2.10 743979
Reported: 2011-08-12 18:23 UTC by Nathan Kinder
Modified: 2015-12-07 17:01 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, 389 Directory Server used the Netscape Portable Runtime (NSPR) implementation of the read/write locking mechanism. This implementation allowed deadlocks to occur if 389 Directory Server was under a heavy load, which caused the server to become unresponsive. With this update, 389 Directory Server now uses the POSIX implementation of the locking mechanism, and deadlocks no longer occur under a heavy load.
Clone Of:
: 743979 (view as bug list)
Environment:
Last Closed: 2015-12-07 17:01:05 UTC


Attachments (Terms of Use)
rwlock-test (8.11 KB, application/x-gzip)
2011-08-12 19:13 UTC, Rich Megginson
no flags Details
stack trace from ipa update (4.95 KB, text/plain)
2011-08-12 19:14 UTC, Rich Megginson
no flags Details
Patch (1.20 MB, patch)
2011-08-16 21:43 UTC, Nathan Kinder
no flags Details | Diff
Revised Patch (1.20 MB, patch)
2011-08-17 15:49 UTC, Nathan Kinder
no flags Details | Diff
Revised Patch (1.20 MB, patch)
2011-08-17 16:21 UTC, Nathan Kinder
nhosoi: review+
Details | Diff

Description Nathan Kinder 2011-08-12 18:23:51 UTC
The NSPR RW lock implementation does not safely allow re-entrant use of reader locks.  If a writer lock is waiting, all reader lock requests are blocked.  This includes requests from threads that already hold a reader lock and are trying to obtain another reader lock.  Such a thread can never release its first reader lock, so the writer waits forever as well, and the result is a deadlock.  This issue has been reported to the NSPR developers, but they are hesitant to fix it.  NSPR does not currently keep track of the threads that own locks, so there's no way for it to differentiate a thread asking for its first reader lock from one that already holds a reader lock.  The NSPR developers are hesitant to add this tracking as they feel it would degrade performance.
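
The sequence described above can be sketched as a small single-threaded model.  This is only a toy illustration of a writer-preferring lock, not the NSPR source; the toy_* names and the two-counter state are assumptions made for the example.

```c
#include <stdbool.h>

/* Toy model of a writer-preferring RW lock (hypothetical, NOT NSPR code).
 * It tracks counts only, with no record of which thread owns a read lock. */
typedef struct {
    int active_readers;   /* read locks currently held */
    int waiting_writers;  /* writers queued behind those readers */
} toy_rwlock;

/* A new read request is refused while any writer is waiting. */
static bool toy_read_would_succeed(const toy_rwlock *l)
{
    return l->waiting_writers == 0;
}

/* A writer can proceed only once every read lock has been released. */
static bool toy_write_would_succeed(const toy_rwlock *l)
{
    return l->active_readers == 0;
}

/* Replays the scenario from the description and reports whether
 * both threads end up blocked on each other. */
bool toy_scenario_deadlocks(void)
{
    toy_rwlock l = {0, 0};

    l.active_readers++;   /* thread A takes its first read lock */
    l.waiting_writers++;  /* thread B asks for the write lock and waits */

    /* Thread A now asks for a nested read lock. */
    bool a_blocked = !toy_read_would_succeed(&l);   /* blocked by waiting B */
    bool b_blocked = !toy_write_would_succeed(&l);  /* blocked by A's first lock */

    return a_blocked && b_blocked;
}
```

Because the model has no notion of lock ownership, it cannot tell thread A's nested request apart from a request by a brand new reader, which is the limitation described above.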

POSIX RW locks safely allow re-entrant reader locks to be used.  We should use the POSIX implementation to avoid deadlocks in ns-slapd, as we do have areas where we use reader locks in a re-entrant fashion.

To switch the RW lock implementation, we need to refactor the 389-ds-base code to use a new slapi_rwlock_* API anywhere we use RW locks.  We currently call PR_RWLock_*() functions from many places within the code.  The slapi_rwlock_* API should be able to switch implementations between NSPR RW locks and POSIX RW locks based on defines.  We can then add a configure test to use POSIX RW locks if available, with an override switch to use NSPR locks if that is needed on certain platforms.

Comment 1 Rich Megginson 2011-08-12 19:13:44 UTC
Created attachment 518096 [details]
rwlock-test

Comment 2 Rich Megginson 2011-08-12 19:14:40 UTC
Created attachment 518097 [details]
stack trace from ipa update

Comment 3 Nathan Kinder 2011-08-16 21:43:42 UTC
Created attachment 518567 [details]
Patch

Comment 4 Nathan Kinder 2011-08-17 15:49:50 UTC
Created attachment 518711 [details]
Revised Patch

This corrects some search-and-replace errors in the previous patch.

Comment 5 Nathan Kinder 2011-08-17 16:21:04 UTC
Created attachment 518717 [details]
Revised Patch

Comment 6 Nathan Kinder 2011-08-17 18:25:25 UTC
Pushed to master.  Thanks to Noriko for her review!

Counting objects: 151, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (76/76), done.
Writing objects: 100% (76/76), 53.22 KiB, done.
Total 76 (delta 70), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   a150e8e..f9b199e  master -> master

Comment 9 Rich Megginson 2012-01-06 22:20:11 UTC
*** Bug 528567 has been marked as a duplicate of this bug. ***

Comment 10 Rich Megginson 2012-01-10 20:18:53 UTC
Upstream ticket:
https://fedorahosted.org/389/ticket/247

Comment 11 Miroslav Svoboda 2012-01-17 11:32:44 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, 389 Directory Server used the Netscape Portable Runtime (NSPR) implementation of the read/write locking mechanism. This implementation allowed deadlocks to occur if 389 Directory Server was under a heavy load, which caused the server to become unresponsive. With this update, 389 Directory Server now uses the POSIX implementation of the locking mechanism, and deadlocks no longer occur under a heavy load.

Comment 12 Amita Sharma 2012-04-19 06:25:19 UTC
No Regressions, Marking as VERIFIED.

