I encountered a crash in the indexing code while doing some memberOf stress testing. I was using a local build with the current ldapserver code from CVS and the current freeIPA memberOf plug-in code. My test procedure consists of setting up 2 masters replicating to each other with a fractional agreement that excludes the memberOf attribute. I run load scripts against both masters at the same time which perform various memberOf operations on the same 3 entries. These entries are created, deleted, and modified very frequently. After some time under this load, one of the masters crashed with this stack trace:

(gdb) bt
#0  0x00136962 in slapi_attr_value_cmp (a=0x0, v1=0x9dd50e8, v2=0xa002718) at ../threadlocal/ldap/servers/slapd/attr.c:526
#1  0x001a8a17 in slapi_value_compare (a=0x0, v1=0x9dd50e8, v2=0xa002718) at ../threadlocal/ldap/servers/slapd/value.c:486
#2  0x001a91b2 in valuearray_find (a=0x0, va=0x9dd5188, v=0x9dd50e8) at ../threadlocal/ldap/servers/slapd/valueset.c:364
#3  0x009a5bd3 in index_add_mods (be=0x99db7c8, mods=0x9e9d970, olde=0x98841fa0, newe=0x9ca6180, txn=0xa853a0f4) at ../threadlocal/ldap/servers/slapd/back-ldbm/index.c:657
#4  0x009ba3bc in ldbm_back_modify (pb=0xa00f468) at ../threadlocal/ldap/servers/slapd/back-ldbm/ldbm_modify.c:401
#5  0x0017138d in op_shared_modify (pb=0xa00f468, pw_change=0, old_pw=0x0) at ../threadlocal/ldap/servers/slapd/modify.c:789
#6  0x00170424 in do_modify (pb=0xa00f468) at ../threadlocal/ldap/servers/slapd/modify.c:341
#7  0x0805678d in connection_dispatch_operation (conn=0xadf4ccb8, op=0x9e84440, pb=0xa00f468) at ../threadlocal/ldap/servers/slapd/connection.c:504
#8  0x08057dc2 in connection_threadmain () at ../threadlocal/ldap/servers/slapd/connection.c:2163
#9  0x024eaf51 in ?? () from /lib/libnspr4.so
#10 0x008e332f in start_thread () from /lib/libpthread.so.0
#11 0x0081e27e in clone () from /lib/libc.so.6

Note that a=0x0 in frames 0 through 2: a NULL attribute pointer is being passed down from index_add_mods(). The code from frame 3 shows that we are not checking whether the call to slapi_entry_attr_find() was successful before attempting to use the Slapi_Attr it returns upon success. It seems that we are assuming that the old copy of the entry will contain the attribute we are looking for.

Inspection of the mod we are processing and the old entry produces some interesting findings. The attribute that we are looking for is the "member" attribute. The operation is deleting a specific value from the entry. The old copy of the entry doesn't have a "member" attribute present; it does, however, have the "member" attribute in its deleted attributes list. Another interesting thing is that the old copy of the entry has an "nsds5ReplConflict" attribute value present that indicates that there is a namingConflict.
I took a look at the new entry copy in the code where this crashes, and it's a conflict entry (dn: nsuniqueid=<uuid>, <olddn>). It seems that the indexing code shouldn't assume that the attribute will be present, since the replication URP code may find conflicts.

The code where this fails in index_add_mods() is specific to a modify operation where an attribute value to delete is specified. We want to check if the value being deleted is present in the entry for a subtype of the attribute whose value is being deleted (for example, we're trying to delete "member: foo", but we want to see if something like "member;blah: foo" exists). We do this check so we know whether or not we should delete the equality index for this value. Here is the code I'm referring to:

    /* If the same value doesn't exist in a subtype, set
     * BE_INDEX_EQUALITY flag so the equality index is
     * removed.
     */
    slapi_entry_attr_find( olde->ep_entry, mods[i]->mod_type, &curr_attr);
    for (j = 0; mods_valueArray[j] != NULL; j++ ) {
        if ( valuearray_find(curr_attr, evals, mods_valueArray[j]) == -1 ) {
            if (!(flags & BE_INDEX_EQUALITY)) {
                flags |= BE_INDEX_EQUALITY;
            }
        }
    }

I see two things wrong with this code. The first is that we need to check if curr_attr is NULL before diving into the for loop and attempting to pass it to valuearray_find(). If the attribute doesn't exist in the entry, we can assume that we should be getting rid of the equality index. The second thing that seems wrong is that we are performing this check against the copy of the old entry (olde). It seems to me that the proper thing to do would be to perform the check against the new copy of the entry (newe).
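To illustrate the first point, here is a minimal, self-contained sketch of the NULL guard. It is not the attached patch and it does not use the real slapd types: mock_attr, mock_value_find(), and equality_flags_for_delete() are hypothetical stand-ins invented for this example, and the numeric value given to BE_INDEX_EQUALITY is arbitrary. The only thing it is meant to show is the control flow: a missing attribute is treated as "value not present anywhere, so remove the equality index" instead of being dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Arbitrary value chosen for this sketch; not taken from the slapd headers. */
#define BE_INDEX_EQUALITY 0x20

/* Hypothetical stand-in for Slapi_Attr: a NULL-terminated list of values. */
typedef struct {
    const char **values;
} mock_attr;

/* Hypothetical stand-in for valuearray_find(): returns the index of v in
 * the attribute's values, or -1 if it is not present. The caller must not
 * pass a NULL attribute. */
static int mock_value_find(const mock_attr *a, const char *v)
{
    for (int i = 0; a->values[i] != NULL; i++) {
        if (strcmp(a->values[i], v) == 0) {
            return i;
        }
    }
    return -1;
}

/* Decide whether the equality index should be removed for the values being
 * deleted. curr_attr is the result of the attribute lookup and may be NULL
 * when the entry no longer has the attribute (for example, a conflict entry). */
static int equality_flags_for_delete(const mock_attr *curr_attr,
                                     const char **mod_values, int flags)
{
    assert(mod_values != NULL);

    /* Guard: attribute absent, so no subtype can still hold the value. */
    if (curr_attr == NULL) {
        return flags | BE_INDEX_EQUALITY;
    }

    for (int j = 0; mod_values[j] != NULL; j++) {
        if (mock_value_find(curr_attr, mod_values[j]) == -1) {
            flags |= BE_INDEX_EQUALITY;
        }
    }
    return flags;
}
```

In terms of the second point, the caller in this sketch would pass the attribute looked up in the new copy of the entry rather than the old one; the helper itself does not care which copy the attribute came from.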
Created attachment 309879: CVS Diffs
Your fixes look good!
Checked into ldapserver (HEAD). Thanks to Noriko for her review!
Checking in index.c;
/cvs/dirsec/ldapserver/ldap/servers/slapd/back-ldbm/index.c,v  <--  index.c
new revision: 1.14; previous revision: 1.13
done
Checked into Directory71RtmBranch.
Checking in slapd/back-ldbm/index.c;
/cvs/dirsec/ldapserver/ldap/servers/slapd/back-ldbm/index.c,v  <--  index.c
new revision: 1.5.2.3; previous revision: 1.5.2.2
done
Checked into Directory_Server_8_0_Branch.
Checking in ldap/servers/slapd/back-ldbm/index.c;
/cvs/dirsec/ldapserver/ldap/servers/slapd/back-ldbm/index.c,v  <--  index.c
new revision: 1.13.2.1; previous revision: 1.13
done
With the DS80/errata build I am unable to reproduce the crash with Nathan's scripts against RHEL or Chandra's simple scripts against Sol9 and HPUX.
Created attachment 314595: tarball with scripts
Test tarball attached. It has simple scripts that do add, modrdn, and del on a 2-master MMR setup.
Fix verified on 7.1 and 8.0 on RHEL4, RHEL5, Solaris, and HPUX.
With ds71sp7, per the test attached in comment #11, I did see crashes on RHEL4 and SunOS9, but not on HPUX. That bug has been reported separately as bug 459433. I was not able to observe the crash reported in this bug with ds71sp7. Hence verified.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-0602.html