Bug 972976 - ldbm errors when adding/modifying/deleting entries
ldbm errors when adding/modifying/deleting entries
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base (Show other bugs)
6.4
x86_64 Linux
unspecified Severity unspecified
: rc
: ---
Assigned To: Rich Megginson
Sankar Ramalingam
:
: 979170 (view as bug list)
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-06-10 20:30 EDT by Venkat Mahadevan
Modified: 2013-11-21 16:08 EST (History)
7 users (show)

See Also:
Fixed In Version: 389-ds-base-1.2.11.15-22.el6
Doc Type: Bug Fix
Doc Text:
Cause: The directory server is using replication and DNA. The client does a sequence of ADD/DELETE requests for the same entry over and over again. Consequence: The directory server will give errors containing "modify_switch_entries failed". Fix: When replication is used, there is an entry in the directory server that holds the metadata for the operation (the RUV), which is updated and written to the database for each operation. This includes internal operations, such as DNA. The nested DNA operation was overwriting the pointer to the RUV held by the main operation (ADD, DELETE). The fix is for the outer operation to acquire a pointer to this RUV _after_ the inner operation has completed. Result: There are no more "modify_switch_entries failed" errors when doing many ADD/DELETE operations.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-21 16:08:57 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
mkubik: needinfo+


Attachments (Terms of Use)
JMeter test plan to reproduce bug (16.49 KB, application/xml)
2013-06-10 20:30 EDT, Venkat Mahadevan
no flags Details

  None (edit)
Description Venkat Mahadevan 2013-06-10 20:30:17 EDT
Created attachment 759394 [details]
JMeter test plan to reproduce bug

Description of problem:
The issue I am encountering are the following errors when an entry is being added to the directory or deleted from the directory – it seems an operations error err=1 corresponds to a failure in the backend ldbm (see log snippets below).

Version-Release number of selected component (if applicable):
389-ds-base.x86_64              1.2.11.15-14.el6_4 @rhel-x86_64-server-6
Linux 2.6.32-358.2.1.el6.x86_64 #1 SMP Wed Feb 20 12:17:37 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 6.4 (Santiago)

How reproducible:
Consistently reproducible.

Steps to Reproduce:
I can reproduce these errors consistently by running a JMeter add/delete loop test to add and delete entries (see the attached testplan_. It always transpires that some entries will fail to add or delete and will throw an operations error err=1 and a corresponding ldbm error.

Actual results:
err=1 and ldbm errors; entry fails to add or delete.

Expected results:
Entries should be added and deleted successfully.

Additional info:
The DNA plugin is active on the server. When sent a uidNumber of 999 or gidNumber of 999, it will generate a new POSIX uid/gid.
The servers are in a replicated environment (2 MMR masters and 3 replicas).
The error seems to occur regardless of whether the DNA plugin is used or not (i.e. when sending a non-POSIX entry vs. a POSIX entry to LDAP).
Comment 1 Venkat Mahadevan 2013-06-10 20:35:50 EDT
Sorry, forgot to attach actual error messages:

In the error log I see things like:
[10/Jun/2013:14:46:43 -0700] - ldbm_back_delete: modify_switch_entries failed
[10/Jun/2013:14:50:58 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:50:59 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:50:59 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:50:59 -0700] - ldbm_back_add: modify_switch_entries failed
[10/Jun/2013:14:50:59 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:51:00 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:51:00 -0700] - ldbm_back_add: modify_switch_entries failed
[10/Jun/2013:14:51:00 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:51:01 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:51:02 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:51:04 -0700] - ldbm_back_modify: modify_switch_entries failed
[10/Jun/2013:14:51:04 -0700] - ldbm_back_add: modify_switch_entries failed

In the access log I see things like:
[10/Jun/2013:14:50:59 -0700] conn=3516 SSL 256-bit AES
[10/Jun/2013:14:50:59 -0700] conn=3516 op=0 BIND dn="cn=Directory Manager" method=128 version=3
[10/Jun/2013:14:50:59 -0700] conn=3516 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=directory manager"
[10/Jun/2013:14:50:59 -0700] conn=3516 op=1 ADD dn="uid=jmeter25,dc=dev,dc=id,dc=ubc,dc=ca"
[10/Jun/2013:14:50:59 -0700] conn=3516 op=1 RESULT err=1 tag=105 nentries=0 etime=0 csn=51b64a46001b01f50000
[10/Jun/2013:14:50:59 -0700] conn=3516 op=2 DEL dn="uid=jmeter25,dc=dev,dc=id,dc=ubc,dc=ca"
[10/Jun/2013:14:50:59 -0700] conn=3516 op=2 RESULT err=32 tag=107 nentries=0 etime=0
[10/Jun/2013:14:50:59 -0700] conn=3516 op=3 UNBIND
[10/Jun/2013:14:50:59 -0700] conn=3516 op=3 fd=107 closed - U1
[10/Jun/2013:14:51:00 -0700] conn=3533 SSL 256-bit AES
[10/Jun/2013:14:51:00 -0700] conn=3533 op=0 BIND dn="cn=Directory Manager" method=128 version=3
[10/Jun/2013:14:51:00 -0700] conn=3533 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=directory manager"
[10/Jun/2013:14:51:00 -0700] conn=3533 op=1 ADD dn="uid=jmeter57,dc=dev,dc=id,dc=ubc,dc=ca"
[10/Jun/2013:14:51:00 -0700] conn=3533 op=1 RESULT err=1 tag=105 nentries=0 etime=0 csn=51b64a46004d01f50000
[10/Jun/2013:14:51:00 -0700] conn=3533 op=2 DEL dn="uid=jmeter57,dc=dev,dc=id,dc=ubc,dc=ca"
[10/Jun/2013:14:51:00 -0700] conn=3533 op=2 RESULT err=32 tag=107 nentries=0 etime=0
[10/Jun/2013:14:51:00 -0700] conn=3533 op=3 UNBIND
[10/Jun/2013:14:51:00 -0700] conn=3533 op=3 fd=110 closed - U1
Comment 3 Venkat Mahadevan 2013-06-11 17:05:37 EDT
Just an update on this. Using a suggestion from Thierry Bordaz here (https://lists.fedoraproject.org/pipermail/389-users/2013-June/016001.html), I increased the nsslapd-cachememsize and nsslapd-dbcachesize from their default values of about 10 megabytes to 1 gigabyte. The database was very small (about 400 entries), but doing this seems to have resolved the issue and my test does not trigger the error anymore.
Comment 5 Nathan Kinder 2013-06-16 18:22:58 EDT
Upstream ticket:
https://fedorahosted.org/389/ticket/47392
Comment 9 Nathan Kinder 2013-06-28 15:58:10 EDT
*** Bug 979170 has been marked as a duplicate of this bug. ***
Comment 13 Sankar Ramalingam 2013-10-17 13:52:35 EDT
1). Configured 2 way MMR with the latest 389-ds-base.
2). Enabled DNA plugin, AutoMembers and managedEntry plugins. Added DNA entries like this...to M1

dn: cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
changetype: modify
replace: nsslapd-pluginEnabled
nsslapd-pluginEnabled: On

dn: cn=Posix IDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
changetype: add
objectClass: top
objectClass: extensibleObject
cn: Posix IDs
dnafilter: (|(objectclass=posixAccount)(objectClass=posixGroup))
dnamagicregen: 999
dnamaxvalue: 4294967295
dnanextvalue: 131073
dnascope: dc=passsync,dc=com
dnasharedcfgdn: cn=posix-ids,cn=dna,cn=plugins,cn=configuration,ou=ELDAP,ou=Services,dc=passsync,dc=com
dnathreshold: 1000
dnatype: uidNumber
dnatype: gidNumber

3).Added the following entry to M2...

dn: cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
changetype: modify
replace: nsslapd-pluginEnabled
nsslapd-pluginEnabled: On

dn: cn=Posix IDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
changetype: add
objectClass: top
objectClass: extensibleObject
cn: Posix IDs
dnafilter: (|(objectclass=posixAccount)(objectClass=posixGroup))
dnamagicregen: 999
dnamaxvalue: 0
dnanextvalue: 0
dnascope: dc=passsync,dc=com
dnasharedcfgdn: cn=posix-ids,cn=dna,cn=plugins,cn=configuration,ou=ELDAP,ou=Services,dc=passsync,dc=com
dnathreshold: 1000
dnatype: uidNumber
dnatype: gidNumber

4). Restarted both M1 and M2.

5). Simultaneous add/delete/search of user entries to M1.

dn: uid=bug975250new7666,ou=people,dc=passsync,dc=com
telephoneNumber: 989898197666
mail: bug975250new7666@redhat.com">bug975250new7666@redhat.com
uid: bug975250new7666
givenName: bug975250new7666
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
objectclass: posixAccount
sn: bug975250new7666
cn: bug975250new7666
homeDirectory: /home/bug975250new7666
loginShell: /bin/bash
userPassword: {SSHA}aBIV4atRWyMZqiWucSiZgYGVEw1bJa7V
uidNumber: 999
gidNumber: 999

6) No error messages encountered while running the tests. Hence, marking the bug as Verified.

Checked both M1 and M2 error logs.
Comment 14 errata-xmlrpc 2013-11-21 16:08:57 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html

Note You need to log in before you can comment on or make changes to this bug.