Bug 730387

Summary: Use POSIX RW locks instead of NSPR implementation
Product: [Retired] 389 Reporter: Nathan Kinder <nkinder>
Component: Directory Server Assignee: Nathan Kinder <nkinder>
Status: CLOSED CURRENTRELEASE QA Contact: Viktor Ashirov <vashirov>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 1.2.9 CC: amsharma, jhradile, rmeggins
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, 389 Directory Server used the Netscape Portable Runtime (NSPR) implementation of the read/write locking mechanism. This implementation allowed deadlocks to occur if 389 Directory Server was under a heavy load, which caused the server to become unresponsive. With this update, 389 Directory Server now uses the POSIX implementation of the locking mechanism, and deadlocks no longer occur under a heavy load.
Story Points: ---
Clone Of:
: 743979 (view as bug list) Environment:
Last Closed: 2015-12-07 17:01:05 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 690319, 730394, 730395, 730403, 730434, 730436, 743970, 743979    
Attachments:
Description                    Flags
rwlock-test                    none
stack trace from ipa update    none
Patch                          none
Revised Patch                  none
Revised Patch                  nhosoi: review+

Description Nathan Kinder 2011-08-12 18:23:51 UTC
The NSPR RW lock implementation does not safely allow re-entrant use of reader locks.  If a writer lock is waiting, all reader lock requests are blocked.  This includes requests from threads that already hold a reader lock and are trying to obtain another reader lock.  Such a thread can never release the reader lock that the writer is waiting on, which leads to a deadlock.  This issue has been reported to the NSPR developers, but they are hesitant to fix it.  NSPR does not currently keep track of the threads that own locks, so there's no way for it to differentiate a thread asking for its first reader lock from one that already holds a reader lock.  The NSPR developers are hesitant to add this tracking as they feel it would degrade performance.

POSIX RW locks safely allow re-entrant reader locks to be used.  We should use the POSIX implementation to avoid deadlocks in ns-slapd, as we do have areas where we use reader locks in a re-entrant fashion.

To switch the RW lock implementation, we need to refactor the 389-ds-base code to use a new slapi_rwlock_* API anywhere we use RW locks.  We currently call PR_RWLock_*() functions from many places within the code.  The slapi_rwlock_* API should be able to switch implementations between NSPR RW locks and POSIX RW locks based on defines.  We can then add a configure test to use POSIX RW locks if available, with an override switch to use NSPR locks if that is needed on certain platforms.

Comment 1 Rich Megginson 2011-08-12 19:13:44 UTC
Created attachment 518096 [details]
rwlock-test

Comment 2 Rich Megginson 2011-08-12 19:14:40 UTC
Created attachment 518097 [details]
stack trace from ipa update

Comment 3 Nathan Kinder 2011-08-16 21:43:42 UTC
Created attachment 518567 [details]
Patch

Comment 4 Nathan Kinder 2011-08-17 15:49:50 UTC
Created attachment 518711 [details]
Revised Patch

This corrects some search-and-replace errors in the previous patch.

Comment 5 Nathan Kinder 2011-08-17 16:21:04 UTC
Created attachment 518717 [details]
Revised Patch

Comment 6 Nathan Kinder 2011-08-17 18:25:25 UTC
Pushed to master.  Thanks to Noriko for her review!

Counting objects: 151, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (76/76), done.
Writing objects: 100% (76/76), 53.22 KiB, done.
Total 76 (delta 70), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   a150e8e..f9b199e  master -> master

Comment 9 Rich Megginson 2012-01-06 22:20:11 UTC
*** Bug 528567 has been marked as a duplicate of this bug. ***

Comment 10 Rich Megginson 2012-01-10 20:18:53 UTC
Upstream ticket:
https://fedorahosted.org/389/ticket/247

Comment 11 Miroslav Svoboda 2012-01-17 11:32:44 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, 389 Directory Server used the Netscape Portable Runtime (NSPR) implementation of the read/write locking mechanism. This implementation allowed deadlocks to occur if 389 Directory Server was under a heavy load, which caused the server to become unresponsive. With this update, 389 Directory Server now uses the POSIX implementation of the locking mechanism, and deadlocks no longer occur under a heavy load.

Comment 12 Amita Sharma 2012-04-19 06:25:19 UTC
No Regressions, Marking as VERIFIED.