Description of problem:
From time to time slapd deadlocks when adding a new entry. I'm using the bdb
database backend and syncrepl to replicate to slave servers. All updates are
done on a single host (master), and that is the one where the problem occurs.
Slaves are read-only and do not seem to be affected (well, the change doesn't
seem to ever get into the master's database, so it is never replicated to the
slaves). When the deadlock occurs, I'm still able to bind to the master LDAP
server, however any query will simply hang. If I restart slapd on the master,
then it simply hangs while BDB is initializing (it never binds to the network port).
The only way to get out of it is to stop the slapd process, remove all files from
/var/lib/ldap and use ldapadd to recreate the database. If I attempt to perform
the same operation that caused the deadlock, it happens again. I need to do some
other type of update to the database, then try the operation that caused the
problem (and pray it won't deadlock again).
I remember that the last update for the openldap packages was supposed to solve some
deadlock issues with the database. Whatever fix was in there, it hasn't solved the
Version-Release number of selected component (if applicable):
Occurs rarely, however once it happens it tends to repeat itself.
Steps to Reproduce:
1. add a new entry to the database
I've asked about the problem on the OpenLDAP mailing list. Other than upgrading to
OpenLDAP 2.3 (which everybody strongly suggested), two things were suggested:
that (with slapd not running) I try db_recover on the /var/lib/ldap directory,
which seems to solve the problem on restart, and that I create a DB_CONFIG file
in /var/lib/ldap to optimize DB4 for use with OpenLDAP. It would be nice if
/etc/init.d/ldap invoked db_recover on startup when a previous unclean shutdown
is detected (which was most likely the source of my problem) and if a DB_CONFIG
with some sensible values was included in the RPM package.
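
To make the request above concrete, here is a rough sketch of what such a
pre-start step could look like. It is only an illustration, not what the
init script or the package actually does. Assumptions: the marker file
.slapd_running is hypothetical (slapd does not create it; a wrapper would
have to create it on start and remove it on clean stop), the DB_CONFIG values
are illustrative and need tuning for the real database, and the helper names
are made up. The only external command used is db_recover with its -h option
(environment home directory).

```python
import os
import subprocess

DB_DIR = "/var/lib/ldap"  # database directory from this report
# Hypothetical marker file: assumed to be created by a wrapper when slapd
# starts and removed on clean shutdown. slapd itself does not create it.
MARKER = os.path.join(DB_DIR, ".slapd_running")

# Illustrative DB_CONFIG values only; tune them for the real database.
DB_CONFIG_LINES = [
    "set_cachesize 0 268435456 1",  # 0 GB + 256 MB of cache, one segment
    "set_lg_regionmax 262144",
    "set_lg_bsize 2097152",
]


def recovery_command(db_dir):
    """Return the db_recover invocation for a BDB environment directory."""
    return ["db_recover", "-h", db_dir]


def needs_recovery(marker_path):
    """Treat a leftover marker file as a sign of an unclean shutdown."""
    return os.path.exists(marker_path)


def ensure_db_config(db_dir, lines=DB_CONFIG_LINES):
    """Write DB_CONFIG only if none exists; return True if it was created."""
    path = os.path.join(db_dir, "DB_CONFIG")
    if os.path.exists(path):
        return False
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")
    return True


def prepare(db_dir=DB_DIR, marker_path=MARKER, run=subprocess.check_call):
    """Run before starting slapd; slapd must not be running at this point.

    Returns True if recovery was run, False otherwise.
    """
    ensure_db_config(db_dir)
    if needs_recovery(marker_path):
        run(recovery_command(db_dir))  # raises if db_recover fails
        os.remove(marker_path)
        return True
    return False
```

A wrapper would call prepare() right before launching slapd. The run
parameter exists only so the db_recover call can be replaced when trying the
sketch on a machine without Berkeley DB tools installed.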
For more details, search the OpenLDAP mailing list archives, thread "Database
deadlock when adding new entry".
Potentially related - bug 213167.
I haven't updated this bug report. Some time ago I switched to OpenLDAP 2.3.
If memory serves me right, I think I just recompiled Fedora Core SRPM on RHEL4.
I haven't experienced a single problem since. It seems that whatever the bug
was, it was fixed upstream in 2.3.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.