Bug 982326

Summary: rhds90 hotfix crash on tombstone search
Product: Red Hat Enterprise Linux 6
Reporter: Marc Sauton <msauton>
Component: 389-ds-base
Assignee: Rich Megginson <rmeggins>
Status: CLOSED DUPLICATE
QA Contact: Sankar Ramalingam <sramling>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.4
CC: jgalipea, nhosoi, nkinder
Target Milestone: rc
Target Release: 6.5
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-10 16:16:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description | Flags
sf.00899573.gdb.stacktrace.core.23423.1373300613.txt | none
sf.00899573.gdb.stacktrace.core.8807.1373300620.txt | none

Description Marc Sauton 2013-07-08 16:56:10 UTC
Description of problem:

For SF case number 00899573.

Customer with RHEL 6, RHDS 9, 4xMMR: crash on 4 nodes using hotfix
389-ds-base-1.2.11.15-14.2.el6.x86_64.rpm
The issue likely exists in other released bits as well and is probably not specific to the hotfix in use.


Version-Release number of selected component (if applicable):
hotfix
389-ds-base-1.2.11.15-14.2.el6.x86_64.rpm


How reproducible:
Unknown; so far it seems very difficult to reproduce.


Steps to Reproduce:
Unknown so far. A simple scenario may exist,
but the customer has high LDAP traffic.



Actual results:
4 nodes crash


Expected results:
?


Additional info:
There are 4 core files on a Beaker system.
2 did not read well, but the other 2 have good info:


hp-dl140g2-01.lab.eng.rdu.redhat.com
root: redhat

I added a stack trace for each of the 4 core files and am about to open a BZ.

Rich M. has already taken a quick look at the core files and indicated that the issue may be related to a search on ou=* or cn=* that accesses a tombstone. Engineering (likely Noriko or Rich) is working on a new hotfix; no ETA yet.

Some details below for core.23423, from file sf.00899573.gdb.stacktrace.core.23423.1373300613.txt:

Thread 1 (Thread 0x7f60045e6700 (LWP 1860)):
#0  slapi_sdn_get_ndn (sdn=0x30) at ldap/servers/slapd/dn.c:2274
No locals.
#1  0x00007f6039e79fd1 in entry_same_dn (e=<value optimized out>, k=0x2b1f5c0) at ldap/servers/slapd/back-ldbm/cache.c:167
        be = <value optimized out>
        ndn = <value optimized out>
#2  0x00007f6039e78df5 in add_hash (ht=0x1c9bc50, key=0x2b1f5c0, keylen=<value optimized out>, entry=0x2993510, alt=0x7f60045defd0) at ldap/servers/slapd/back-ldbm/cache.c:217
        val = <value optimized out>
        slot = 1915
        e = 0x7f5fe82e3c10
#3  0x00007f6039e796eb in entrycache_add_int (cache=0x1c80ae8, e=0x2993510, state=0, alt=0x7f60045df0d8) at ldap/servers/slapd/back-ldbm/cache.c:1270
        eflush = 0x0
        eflushtemp = 0x0
        ndn = 0x2b1f5c0 "nsuniqueid=993ddf81-8b3611e2-a255acde-ab26a899,nsuniqueid=993ddf81-8b3611e2-a255acde-ab26a899,ou=e3,ou=bems,ou=servers,dc=edited"
...
#10 0x00000000004264d3 in do_search (pb=<value optimized out>) at ldap/servers/slapd/search.c:400
        operation = 0x29b2220
        ber = 0x30af3d0
        i = <value optimized out>
        err = <value optimized out>
        attrsonly = 0
        scope = 1
        deref = 0
        sizelimit = 0
        timelimit = 0
        rawbase = 0x31227b0 "ou=BEMS,ou=Servers,dc=edited"
        fstr = 0x3122720 "(|(ou=*)(cn=*))"
...
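
One quick way to triage a backtrace like the one above is to look for pointer arguments that are too small to be valid addresses. The sketch below is illustrative only and is not part of 389-ds-base: the names FRAME_RE, ARG_RE, PAGE_SIZE and suspicious_args are hypothetical, and the 0x1000 threshold is an assumption about the page size. A value such as sdn=0x30 would be consistent with a member address computed from a NULL structure pointer, but this report does not confirm that.

```python
import re

# Matches one-line gdb backtrace frames, with or without a return address, e.g.
#   "#0  slapi_sdn_get_ndn (sdn=0x30) at ldap/servers/slapd/dn.c:2274"
#   "#1  0x00007f6039e79fd1 in entry_same_dn (e=..., k=0x2b1f5c0) at .../cache.c:167"
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\w+) "
    r"\((?P<args>.*)\) at (?P<file>\S+):(?P<line>\d+)$"
)

# Matches only arguments printed as hexadecimal values (name=0x...).
ARG_RE = re.compile(r"(\w+)=(0x[0-9a-f]+)")

# Assumption: no valid pointer lives in the first page of the address space.
PAGE_SIZE = 0x1000


def suspicious_args(frame_line):
    """Return (function, argument, value) for hex arguments below PAGE_SIZE."""
    match = FRAME_RE.match(frame_line.strip())
    if match is None:
        return []  # not a frame line (locals, "No locals.", "...", etc.)
    hits = []
    for name, value in ARG_RE.findall(match.group("args")):
        address = int(value, 16)
        if address < PAGE_SIZE:
            hits.append((match.group("func"), name, address))
    return hits
```

Applied to frame #0 of the core.23423 trace, this flags the sdn argument (0x30 is 48 decimal); frames #1 to #3 of the same trace yield nothing, since their pointer arguments are all well above the threshold.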




The same nsuniqueid appears in the other core file:

core.8807
file sf.00899573.gdb.stacktrace.core.8807.1373300620.txt
...
Thread 1 (Thread 0x7f9fb4bed700 (LWP 8845)):
#0  __strcmp_ssse3 () at ../sysdeps/x86_64/strcmp.S:213
No locals.
#1  0x00007f9fe68b6fdc in entry_same_dn (e=<value optimized out>, k=0x7f9fcc59a1c0) at ldap/servers/slapd/back-ldbm/cache.c:169
        be = <value optimized out>
        ndn = <value optimized out>
#2  0x00007f9fe68b5df5 in add_hash (ht=0x10eed30, key=0x7f9fcc59a1c0, keylen=<value optimized out>, entry=0x7f9fcc2067f0, alt=0x7f9fb4be5fd0) at ldap/servers/slapd/back-ldbm/cache.c:217
        val = <value optimized out>
        slot = 1915
        e = 0x7f9f9814a380
#3  0x00007f9fe68b66eb in entrycache_add_int (cache=0x100d448, e=0x7f9fcc2067f0, state=0, alt=0x7f9fb4be60d8) at ldap/servers/slapd/back-ldbm/cache.c:1270
        eflush = 0x0
        eflushtemp = 0x0
        ndn = 0x7f9fcc59a1c0 "nsuniqueid=993ddf81-8b3611e2-a255acde-ab26a899,nsuniqueid=993ddf81-8b3611e2-a255acde-ab26a899,ou=e3,ou=bems,ou=servers,dc=edited"
        my_alt = <value optimized out>
...
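
Both traces show the same normalized DN in frame #3, and its leading nsuniqueid RDN appears twice. The sketch below shows one way to flag that pattern in a DN string. It is a hypothetical helper, not code from 389-ds-base: split_rdns and has_repeated_leading_rdn are made-up names, and the plain comma split assumes the DN contains no escaped commas. Whether the doubled RDN is related to the crash is not established in this report.

```python
def split_rdns(dn):
    """Split a DN into RDNs. Naive: assumes no escaped commas in the DN."""
    return [rdn.strip() for rdn in dn.split(",") if rdn.strip()]


def has_repeated_leading_rdn(dn):
    """Return True when the first two RDNs of the DN are identical (case-insensitive)."""
    rdns = split_rdns(dn)
    return len(rdns) >= 2 and rdns[0].lower() == rdns[1].lower()


# The ndn value printed in frame #3 of both core files.
NDN_FROM_TRACE = (
    "nsuniqueid=993ddf81-8b3611e2-a255acde-ab26a899,"
    "nsuniqueid=993ddf81-8b3611e2-a255acde-ab26a899,"
    "ou=e3,ou=bems,ou=servers,dc=edited"
)

if __name__ == "__main__":
    print(has_repeated_leading_rdn(NDN_FROM_TRACE))
```

A check like this could be run over DNs collected from an export or a log to see how many entries carry the same doubled prefix.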



core.23423 and core.8807 did not seem to read anything, files
sf.00899573.gdb.stacktrace.core.23423.1373300613.txt
sf.00899573.gdb.stacktrace.core.8807.1373300620.txt

Comment 1 Marc Sauton 2013-07-08 17:05:59 UTC
Created attachment 770585
sf.00899573.gdb.stacktrace.core.23423.1373300613.txt

Comment 2 Marc Sauton 2013-07-08 17:06:48 UTC
Created attachment 770586
sf.00899573.gdb.stacktrace.core.8807.1373300620.txt

Comment 8 Rich Megginson 2013-07-10 13:20:52 UTC
Can we close this bug as a dup of https://bugzilla.redhat.com/show_bug.cgi?id=947583 ?

Comment 9 Noriko Hosoi 2013-07-10 15:53:01 UTC
(In reply to Rich Megginson from comment #8)
> Can we close this bug as a dup of
> https://bugzilla.redhat.com/show_bug.cgi?id=947583 ?

+1

Comment 10 Rich Megginson 2013-07-10 16:16:26 UTC

*** This bug has been marked as a duplicate of bug 947583 ***