Bug 195920

Summary: slapd deadlock when adding entries
Product: Red Hat Enterprise Linux 4
Reporter: Aleksandar Milivojevic <alex>
Component: openldap
Assignee: Jan Safranek <jsafrane>
Status: CLOSED ERRATA
QA Contact: Jay Turner <jturner>
Severity: high
Priority: medium
Version: 4.0
CC: aleksey, jplans, setup, srevivo
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0739
Doc Type: Bug Fix
Last Closed: 2007-11-15 16:04:05 UTC

Description Aleksandar Milivojevic 2006-06-19 16:47:51 UTC
Description of problem:
From time to time, slapd deadlocks when adding a new entry.  I'm using the bdb
database backend and syncrepl to replicate to slave servers.  All updates are
done on a single host (the master), and that is the one where the problem occurs.
Slaves are read-only and do not seem to be affected (the change never seems to
get into the master's database, so it is never replicated to the slaves).  When
the deadlock occurs, I'm still able to bind to the master LDAP server, but any
query simply hangs.  If I restart slapd on the master, it hangs while BDB is
initializing (it never binds to the network port).
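
The symptom described above (binds succeed but queries never return) can be
detected from a script by running a query under a hard timeout.  The following
is only a minimal sketch, assuming the OpenLDAP ldapsearch client is installed.
The helper names, the URI and the suffix are hypothetical.  The search targets
an entry inside the bdb database rather than the root DSE, on the assumption
that only a query served by the backend would hang.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return 'ok', 'failed', or 'hung'.

    'hung' means the command was still running after timeout_s seconds;
    subprocess.run() kills the child process before raising TimeoutExpired.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


def suffix_query(uri, suffix):
    """Build an anonymous base-scope search of the database suffix entry."""
    return ["ldapsearch", "-x", "-H", uri, "-s", "base", "-b", suffix,
            "(objectClass=*)"]
```

For example, run_with_timeout(suffix_query("ldap://localhost", "dc=example,dc=com"), 10)
would return "hung" if the master accepted the connection but never answered.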

The only way out is to stop the slapd process, remove all files from
/var/lib/ldap, and use ldapadd to recreate the database.  If I then attempt the
same operation that caused the deadlock, it happens again.  I need to make some
other type of update to the database first, then retry the operation that caused
the problem (and pray it won't deadlock again).

I remember that the last update of the openldap packages was supposed to solve
some database deadlock issues.  Whatever fix was in there, it hasn't solved the
problem completely.

Version-Release number of selected component (if applicable):
openldap-2.2.13-4

How reproducible:
Occurs rarely; however, once it happens it tends to repeat itself.

Steps to Reproduce:
1. add a new entry to the database
  

Comment 1 Aleksandar Milivojevic 2006-06-21 20:43:27 UTC
I've asked about the problem on the OpenLDAP mailing list.  Other than upgrading
to OpenLDAP 2.3 (which everybody strongly suggested), two things were suggested.
First, with slapd not running, run db_recover on the /var/lib/ldap directory,
which seems to solve the hang on restart.  Second, create a DB_CONFIG file in
/var/lib/ldap to tune DB4 for use with OpenLDAP.  It would be nice if
/etc/init.d/ldap invoked db_recover on startup when a previous unclean shutdown
is detected (which was most likely the source of my problem) and if a DB_CONFIG
with some sensible values were included in the RPM package.
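
The two suggestions above could be scripted roughly as follows.  This is a
sketch under stated assumptions, not what the init script or the errata
actually does.  The helper names are hypothetical, the DB_CONFIG values are
illustrative placeholders rather than tuned recommendations, and the recovery
tool may be installed under a different name on some systems (hence the tool
parameter).  slapd must be stopped before recover() is called.

```python
import os
import subprocess

# Illustrative values only (assumptions, not tuned recommendations).
DB_CONFIG_LINES = [
    "set_cachesize 0 268435456 1",  # 0 GB + 256 MB of cache, one segment
    "set_lg_bsize 2097152",         # 2 MB in-memory log buffer
    "set_lg_regionmax 262144",      # 256 KB log region
]


def write_db_config(db_dir, lines=DB_CONFIG_LINES):
    """Create DB_CONFIG in db_dir unless one exists; return True if written."""
    path = os.path.join(db_dir, "DB_CONFIG")
    if os.path.exists(path):
        return False  # never overwrite an administrator's settings
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")
    return True


def recover_command(db_dir, tool="db_recover"):
    """Build the recovery command: -h names the environment, -v is verbose."""
    return [tool, "-v", "-h", db_dir]


def recover(db_dir, runner=subprocess.check_call):
    """Run recovery on db_dir.  The caller must stop slapd first."""
    return runner(recover_command(db_dir))
```

With slapd stopped, write_db_config("/var/lib/ldap") followed by
recover("/var/lib/ldap") would correspond to the manual steps described above.
The runner parameter exists only so the command can be inspected without
executing it.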

For more details, search the OpenLDAP mailing list archives, thread "Database
deadlock when adding new entry".

Comment 2 Aleksey Nogin 2006-10-31 01:12:01 UTC
Potentially related - bug 213167.

Comment 3 Aleksandar Milivojevic 2006-10-31 04:59:50 UTC
I haven't updated this bug report.  Some time ago I switched to OpenLDAP 2.3.
If memory serves me right, I just recompiled the Fedora Core SRPM on RHEL4.
I haven't experienced a single problem since.  It seems that whatever the bug
was, it was fixed upstream in 2.3.

Comment 4 RHEL Program Management 2007-06-19 08:15:27 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 9 errata-xmlrpc 2007-11-15 16:04:05 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0739.html