Bug 452169 - Crash in indexing code under heavy memberOf load with replication
Status: CLOSED ERRATA
Product: 389
Classification: Community
Component: Database - Indexes/Searches
Version: 1.1.1
Hardware: All Linux
Priority: low, Severity: low
Assigned To: Nathan Kinder
QA Contact: Chandrasekar Kannan
Depends On:
Blocks: 249650 FDS112 453229
 
Reported: 2008-06-19 15:24 EDT by Nathan Kinder
Modified: 2015-01-04 18:32 EST
Doc Type: Bug Fix
Last Closed: 2008-08-27 16:39:29 EDT
Attachments
CVS Diffs (1.83 KB, patch)
2008-06-19 16:32 EDT, Nathan Kinder
tarball with scripts (10.00 KB, application/octet-stream)
2008-08-19 18:48 EDT, Chandrasekar Kannan

Description Nathan Kinder 2008-06-19 15:24:45 EDT
I encountered a crash in the indexing code while doing some memberOf stress
testing.  I was using a local build with the current ldapserver code from CVS
and the current freeIPA memberOf plug-in code.

My test procedure consists of setting up 2 masters replicating to each other
with a fractional agreement that excludes the memberOf attribute.  I run some
load scripts against both masters at the same time which do various memberOf
operations on the same 3 entries.  These entries are created, deleted, and
modified very, very often.  After some time under this load, one of the masters
crashed with this stack trace:

(gdb) bt
#0  0x00136962 in slapi_attr_value_cmp (a=0x0, v1=0x9dd50e8, v2=0xa002718)
    at ../threadlocal/ldap/servers/slapd/attr.c:526
#1  0x001a8a17 in slapi_value_compare (a=0x0, v1=0x9dd50e8, v2=0xa002718)
    at ../threadlocal/ldap/servers/slapd/value.c:486
#2  0x001a91b2 in valuearray_find (a=0x0, va=0x9dd5188, v=0x9dd50e8)
    at ../threadlocal/ldap/servers/slapd/valueset.c:364
#3  0x009a5bd3 in index_add_mods (be=0x99db7c8, mods=0x9e9d970, olde=0x98841fa0, newe=0x9ca6180, txn=0xa853a0f4)
    at ../threadlocal/ldap/servers/slapd/back-ldbm/index.c:657
#4  0x009ba3bc in ldbm_back_modify (pb=0xa00f468)
    at ../threadlocal/ldap/servers/slapd/back-ldbm/ldbm_modify.c:401
#5  0x0017138d in op_shared_modify (pb=0xa00f468, pw_change=0, old_pw=0x0)
    at ../threadlocal/ldap/servers/slapd/modify.c:789
#6  0x00170424 in do_modify (pb=0xa00f468)
    at ../threadlocal/ldap/servers/slapd/modify.c:341
#7  0x0805678d in connection_dispatch_operation (conn=0xadf4ccb8, op=0x9e84440, pb=0xa00f468)
    at ../threadlocal/ldap/servers/slapd/connection.c:504
#8  0x08057dc2 in connection_threadmain ()
    at ../threadlocal/ldap/servers/slapd/connection.c:2163
#9  0x024eaf51 in ?? () from /lib/libnspr4.so
#10 0x008e332f in start_thread () from /lib/libpthread.so.0
#11 0x0081e27e in clone () from /lib/libc.so.6

The code from frame 3 shows that we are not checking if the call to
slapi_entry_attr_find() was successful before attempting to use the Slapi_Attr
it returns upon success.  It seems that we are assuming that the old copy of the
entry will contain the attribute we are looking for.

Inspection of the mod we are processing and the old entry produces some
interesting findings.  The attribute that we are looking for is the "member"
attribute.  The operation is deleting a specific value from the entry.  The old
copy of the entry doesn't have a "member" attribute present; it does, however,
have the "member" attribute in its deleted attributes list.  Another
interesting thing is that the old copy of the entry has a "nsds5ReplConflict"
attribute value present that indicates that there is a namingConflict.
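
The entry state described above can be modelled in a few lines of C.  This is
only a sketch under stated assumptions: demo_attr, demo_entry,
entry_attr_find() and describe_lookup() are hypothetical stand-ins, not the
real Slapi_Attr, Slapi_Entry or slapi_entry_attr_find() definitions, and it
assumes the lookup consults only the present attributes, which is what the
observed state suggests.  An attribute that survives only in the deleted list
is then reported as missing and the output pointer stays NULL, so the caller
has to test the return code before using it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real Slapi_Attr / Slapi_Entry layouts. */
typedef struct demo_attr {
    const char *type;
    struct demo_attr *next;
} demo_attr;

typedef struct demo_entry {
    demo_attr *present;  /* attributes currently on the entry */
    demo_attr *deleted;  /* attributes kept only as replication state */
} demo_entry;

/* Search only the present list (assumption).  Returns 0 and sets *out on
 * success; returns -1 and sets *out to NULL when the type is not present.
 * strcmp() is a simplification: real attribute types compare
 * case-insensitively. */
int entry_attr_find(const demo_entry *e, const char *type, demo_attr **out)
{
    demo_attr *a;

    assert(e != NULL && type != NULL && out != NULL);
    for (a = e->present; a != NULL; a = a->next) {
        if (strcmp(a->type, type) == 0) {
            *out = a;
            return 0;
        }
    }
    *out = NULL;
    return -1;
}

/* Caller-side pattern: check the return code before touching the result. */
const char *describe_lookup(const demo_entry *e, const char *type)
{
    demo_attr *found = NULL;

    if (entry_attr_find(e, type, &found) != 0) {
        return "absent";
    }
    return found->type;
}
```

With an entry whose "member" attribute sits only in the deleted list,
describe_lookup() takes the "absent" branch instead of dereferencing a NULL
pointer.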
Comment 1 Nathan Kinder 2008-06-19 16:13:59 EDT
I took a look at the new entry copy in the code where this crashes, and it's a
conflict entry (dn: nsuniqueid=<uuid>, <olddn>).  It seems that the indexing code
shouldn't assume that the attribute will be present since the replication URP
code may find conflicts.

The code where this fails in index_add_mods() is specific to a modify operation
where an attribute value to delete is specified.  We want to check if the value
being deleted is present in the entry for a subtype of the attribute whose value
is being deleted (for example, we're trying to delete "member: foo", but we want
to see if something like "member;blah: foo" exists).  We do this check so we know
whether or not we should delete the equality index for this value.  Here is the
code I'm referring to:

 /* If the same value doesn't exist in a subtype, set
  * BE_INDEX_EQUALITY flag so the equality index is
  * removed.
  */
 slapi_entry_attr_find( olde->ep_entry, mods[i]->mod_type, &curr_attr);
 for (j = 0; mods_valueArray[j] != NULL; j++ ) {
     if ( valuearray_find(curr_attr, evals, mods_valueArray[j]) == -1 ) {
         if (!(flags & BE_INDEX_EQUALITY)) {
             flags |= BE_INDEX_EQUALITY;
         }
     }
 }

I see two things wrong with this code.  The first is that we need to check if
curr_attr is NULL before diving into the for loop and attempting to pass it to
valuearray_find.  If the attribute doesn't exist in the entry, we can assume
that we should be getting rid of the equality index.

The second thing that seems wrong is that we are performing this check against
the copy of the old entry (olde).  It seems to me that the proper thing to do
would be to perform the check against the new copy of the entry (newe).
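
The first point can be sketched as a small self-contained C function.  This is
an illustration under assumptions, not the attached patch: idx_attr,
value_find(), equality_flags() and INDEX_EQUALITY are hypothetical stand-ins
for Slapi_Attr, valuearray_find(), the loop in index_add_mods() and
BE_INDEX_EQUALITY.  The caller is assumed to have looked the attribute up in
the new copy of the entry and to pass NULL when the lookup failed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INDEX_EQUALITY 0x1u  /* hypothetical stand-in for BE_INDEX_EQUALITY */

/* Hypothetical stand-in for an attribute: it only supplies the comparison
 * rule used to match values. */
typedef struct idx_attr {
    int (*cmp)(const char *a, const char *b);
} idx_attr;

/* Returns the index of v in the NULL-terminated array, or -1 if absent.
 * Like the crashing frame in the backtrace, it dereferences the attribute,
 * so callers must never pass NULL. */
int value_find(const idx_attr *a, const char **values, const char *v)
{
    int i;

    assert(a != NULL && values != NULL && v != NULL);
    for (i = 0; values[i] != NULL; i++) {
        if (a->cmp(values[i], v) == 0) {
            return i;
        }
    }
    return -1;
}

/* curr_attr:  attribute found in the post-modify entry, or NULL if missing
 * evals:      values the entry still holds for the attribute and its subtypes
 * mod_values: values the modify operation is deleting */
unsigned equality_flags(const idx_attr *curr_attr, const char **evals,
                        const char **mod_values, unsigned flags)
{
    int j;

    assert(mod_values != NULL);
    if (curr_attr == NULL) {
        /* Attribute absent from the entry: following the reasoning above,
         * the equality index should be removed. */
        return flags | INDEX_EQUALITY;
    }
    for (j = 0; mod_values[j] != NULL; j++) {
        if (value_find(curr_attr, evals, mod_values[j]) == -1) {
            flags |= INDEX_EQUALITY;
        }
    }
    return flags;
}
```

When the attribute is missing the flag is set without touching the pointer;
when a subtype still holds the value being deleted the flag stays clear.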
Comment 2 Nathan Kinder 2008-06-19 16:32:39 EDT
Created attachment 309879
CVS Diffs
Comment 3 Noriko Hosoi 2008-06-19 16:38:54 EDT
Your fixes look good!
Comment 4 Nathan Kinder 2008-06-20 11:10:39 EDT
Checked into ldapserver (HEAD).  Thanks to Noriko for her review!

Checking in index.c;
/cvs/dirsec/ldapserver/ldap/servers/slapd/back-ldbm/index.c,v  <--  index.c
new revision: 1.14; previous revision: 1.13
done
Comment 5 Nathan Kinder 2008-07-09 13:04:03 EDT
Checked into Directory71RtmBranch.

Checking in slapd/back-ldbm/index.c;
/cvs/dirsec/ldapserver/ldap/servers/slapd/back-ldbm/index.c,v  <--  index.c
new revision: 1.5.2.3; previous revision: 1.5.2.2
done
Comment 6 Nathan Kinder 2008-07-10 18:47:49 EDT
Checked into Directory_Server_8_0_Branch.

Checking in ldap/servers/slapd/back-ldbm/index.c;
/cvs/dirsec/ldapserver/ldap/servers/slapd/back-ldbm/index.c,v  <--  index.c
new revision: 1.13.2.1; previous revision: 1.13
done
Comment 10 Jenny Galipeau 2008-08-18 14:43:05 EDT
With the DS80 errata build, I am unable to reproduce the crash with Nathan's scripts against RHEL or with Chandra's simple scripts against Sol9 and HPUX.
Comment 11 Chandrasekar Kannan 2008-08-19 18:48:04 EDT
Created attachment 314595
tarball with scripts

Test tarball attached.  It has simple scripts that do add, modrdn, and del operations on a 2-master MMR setup.
Comment 12 Jenny Galipeau 2008-08-21 13:54:38 EDT
Fix verified for 7.1 and 8.0 on RHEL4, RHEL5, Solaris, and HPUX.
Comment 13 Chandrasekar Kannan 2008-08-21 14:19:23 EDT
With ds71sp7, using the test attached in comment #11, I did see crashes on RHEL4 and SunOS9, but not on HPUX.  That crash has been reported separately as bug 459433.

I was not able to observe the crash reported in this bug with ds71sp7. 
Hence verified.
Comment 15 errata-xmlrpc 2008-08-27 16:39:29 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0602.html
