Red Hat Bugzilla – Bug 972976
ldbm errors when adding/modifying/deleting entries
Last modified: 2013-11-21 16:08:57 EST
Created attachment 759394 [details] JMeter test plan to reproduce bug Description of problem: The issue I am encountering are the following errors when an entry is being added to the directory or deleted from the directory – it seems an operations error err=1 corresponds to a failure in the backend ldbm (see log snippets below). Version-Release number of selected component (if applicable): 389-ds-base.x86_64 1.2.11.15-14.el6_4 @rhel-x86_64-server-6 Linux 2.6.32-358.2.1.el6.x86_64 #1 SMP Wed Feb 20 12:17:37 EST 2013 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux Server release 6.4 (Santiago) How reproducible: Consistently reproducible. Steps to Reproduce: I can reproduce these errors consistently by running a JMeter add/delete loop test to add and delete entries (see the attached testplan_. It always transpires that some entries will fail to add or delete and will throw an operations error err=1 and a corresponding ldbm error. Actual results: err=1 and ldbm errors; entry fails to add or delete. Expected results: Entries should be added and deleted successfully. Additional info: The DNA plugin is active on the server. When sent a uidNumber of 999 or gidNumber of 999, it will generate a new POSIX uid/gid. The servers are in a replicated environment (2 MMR masters and 3 replicas). The error seems to occur regardless of whether the DNA plugin is used or not (i.e. when sending a non-POSIX entry vs. a POSIX entry to LDAP).
Sorry, forgot to attach actual error messages: In the error log I see things like: [10/Jun/2013:14:46:43 -0700] - ldbm_back_delete: modify_switch_entries failed [10/Jun/2013:14:50:58 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:50:59 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:50:59 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:50:59 -0700] - ldbm_back_add: modify_switch_entries failed [10/Jun/2013:14:50:59 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:51:00 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:51:00 -0700] - ldbm_back_add: modify_switch_entries failed [10/Jun/2013:14:51:00 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:51:01 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:51:02 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:51:04 -0700] - ldbm_back_modify: modify_switch_entries failed [10/Jun/2013:14:51:04 -0700] - ldbm_back_add: modify_switch_entries failed In the access log I see things like: [10/Jun/2013:14:50:59 -0700] conn=3516 SSL 256-bit AES [10/Jun/2013:14:50:59 -0700] conn=3516 op=0 BIND dn="cn=Directory Manager" method=128 version=3 [10/Jun/2013:14:50:59 -0700] conn=3516 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=directory manager" [10/Jun/2013:14:50:59 -0700] conn=3516 op=1 ADD dn="uid=jmeter25,dc=dev,dc=id,dc=ubc,dc=ca" [10/Jun/2013:14:50:59 -0700] conn=3516 op=1 RESULT err=1 tag=105 nentries=0 etime=0 csn=51b64a46001b01f50000 [10/Jun/2013:14:50:59 -0700] conn=3516 op=2 DEL dn="uid=jmeter25,dc=dev,dc=id,dc=ubc,dc=ca" [10/Jun/2013:14:50:59 -0700] conn=3516 op=2 RESULT err=32 tag=107 nentries=0 etime=0 [10/Jun/2013:14:50:59 -0700] conn=3516 op=3 UNBIND [10/Jun/2013:14:50:59 -0700] conn=3516 op=3 fd=107 closed - U1 [10/Jun/2013:14:51:00 -0700] conn=3533 SSL 256-bit AES [10/Jun/2013:14:51:00 -0700] conn=3533 op=0 BIND dn="cn=Directory Manager" method=128 version=3 [10/Jun/2013:14:51:00 -0700] conn=3533 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=directory manager" [10/Jun/2013:14:51:00 -0700] conn=3533 op=1 ADD dn="uid=jmeter57,dc=dev,dc=id,dc=ubc,dc=ca" [10/Jun/2013:14:51:00 -0700] conn=3533 op=1 RESULT err=1 tag=105 nentries=0 etime=0 csn=51b64a46004d01f50000 [10/Jun/2013:14:51:00 -0700] conn=3533 op=2 DEL dn="uid=jmeter57,dc=dev,dc=id,dc=ubc,dc=ca" [10/Jun/2013:14:51:00 -0700] conn=3533 op=2 RESULT err=32 tag=107 nentries=0 etime=0 [10/Jun/2013:14:51:00 -0700] conn=3533 op=3 UNBIND [10/Jun/2013:14:51:00 -0700] conn=3533 op=3 fd=110 closed - U1
Just an update on this. Using a suggestion from Thierry Bordaz here (https://lists.fedoraproject.org/pipermail/389-users/2013-June/016001.html), I increased the nsslapd-cachememsize and nsslapd-dbcachesize from their default values of about 10 megabytes to 1 gigabyte. The database was very small (about 400 entries), but doing this seems to have resolved the issue and my test does not trigger the error anymore.
Upstream ticket: https://fedorahosted.org/389/ticket/47392
*** Bug 979170 has been marked as a duplicate of this bug. ***
1). Configured 2 way MMR with the latest 389-ds-base. 2). Enabled DNA plugin, AutoMembers and managedEntry plugins. Added DNA entries like this...to M1 dn: cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-pluginEnabled nsslapd-pluginEnabled: On dn: cn=Posix IDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config changetype: add objectClass: top objectClass: extensibleObject cn: Posix IDs dnafilter: (|(objectclass=posixAccount)(objectClass=posixGroup)) dnamagicregen: 999 dnamaxvalue: 4294967295 dnanextvalue: 131073 dnascope: dc=passsync,dc=com dnasharedcfgdn: cn=posix-ids,cn=dna,cn=plugins,cn=configuration,ou=ELDAP,ou=Services,dc=passsync,dc=com dnathreshold: 1000 dnatype: uidNumber dnatype: gidNumber 3).Added the following entry to M2... dn: cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-pluginEnabled nsslapd-pluginEnabled: On dn: cn=Posix IDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config changetype: add objectClass: top objectClass: extensibleObject cn: Posix IDs dnafilter: (|(objectclass=posixAccount)(objectClass=posixGroup)) dnamagicregen: 999 dnamaxvalue: 0 dnanextvalue: 0 dnascope: dc=passsync,dc=com dnasharedcfgdn: cn=posix-ids,cn=dna,cn=plugins,cn=configuration,ou=ELDAP,ou=Services,dc=passsync,dc=com dnathreshold: 1000 dnatype: uidNumber dnatype: gidNumber 4). Restarted both M1 and M2. 5). Simultaneous add/delete/search of user entries to M1. dn: uid=bug975250new7666,ou=people,dc=passsync,dc=com telephoneNumber: 989898197666 mail: bug975250new7666@redhat.com">bug975250new7666@redhat.com uid: bug975250new7666 givenName: bug975250new7666 objectClass: top objectClass: person objectClass: organizationalPerson objectClass: inetorgperson objectclass: posixAccount sn: bug975250new7666 cn: bug975250new7666 homeDirectory: /home/bug975250new7666 loginShell: /bin/bash userPassword: {SSHA}aBIV4atRWyMZqiWucSiZgYGVEw1bJa7V uidNumber: 999 gidNumber: 999 6) No error messages encountered while running the tests. Hence, marking the bug as Verified. Checked both M1 and M2 error logs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1653.html