Bug 1261218 - replication and ns-slapd crash in csnset_dup in ipa context
Status: CLOSED DUPLICATE of bug 1243970
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Hardware: Unspecified  OS: Unspecified
Priority: unspecified  Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Noriko Hosoi
QA Contact: Viktor Ashirov
Depends On:
Reported: 2015-09-08 20:05 EDT by Marc Sauton
Modified: 2015-09-10 15:38 EDT

Doc Type: Bug Fix
Last Closed: 2015-09-10 15:38:44 EDT
Type: Bug

Description Marc Sauton 2015-09-08 20:05:02 EDT
Description of problem:

ns-slapd crashes in an IPA context on several replicas, multiple times, with the exact same stack trace signature. The crashes occur in the customer environment and have not been reproduced in house.

The topology in the customer report looks like this:
                /      |      \
          m2.1     m1.2 - m1.3
                \      /
At the time of this report, the crashes happened on the masters I call m1.2 and m1.3.

The stack trace shows a replication MOD operation on a host group entry, on one of the member entries, possibly during a host enrollment.

Program terminated with signal 11, Segmentation fault.
#0  csnset_dup (csnset=<optimized out>) at ldap/servers/slapd/csnset.c:381
381                     csnset_add_csn(curnode,n->type,&n->csn);

(gdb) list
376             CSNSet *newcsnset= NULL;
377             CSNSet **curnode = &newcsnset;
378             const CSNSet *n= csnset;
379             while(n!=NULL)  
380             {
381                     csnset_add_csn(curnode,n->type,&n->csn);
382                     n= n->next;
383                     curnode = &((*curnode)->next);
384             }
385             return newcsnset;
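
The loop in the listing can be modelled with a minimal, self-contained sketch. The toy_node type and the toy_add, toy_dup and toy_free helpers are hypothetical stand-ins, not the 389-ds-base CSNSet API; they only mirror the shape of the loop at csnset.c:376-385. Each iteration reads n->type and n->csn before doing anything else, so a source node reached through a garbage pointer would fault on exactly that read.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CSNSet node; field names are illustrative. */
typedef struct toy_node {
    int type;
    unsigned long csn;
    struct toy_node *next;
} toy_node;

/* Allocate a node and store it in the slot that tail points to. */
static void toy_add(toy_node **tail, int type, unsigned long csn)
{
    toy_node *node = calloc(1, sizeof(*node));
    assert(node != NULL);
    node->type = type;
    node->csn = csn;
    *tail = node;
}

/* Copy a list node by node, using the same loop shape as the gdb listing. */
toy_node *toy_dup(const toy_node *src)
{
    toy_node *copy = NULL;
    toy_node **cur = &copy;
    const toy_node *n = src;

    while (n != NULL) {
        /* n is dereferenced here; a corrupted pointer faults on this read. */
        toy_add(cur, n->type, n->csn);
        n = n->next;
        cur = &((*cur)->next);
    }
    return copy;
}

/* Release every node of a list built by toy_dup. */
void toy_free(toy_node *head)
{
    while (head != NULL) {
        toy_node *next = head->next;
        free(head);
        head = next;
    }
}
```

The duplication code itself has no way to detect a bad source pointer, which fits the picture of the list having been damaged before csnset_dup was called.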

The value of "n" looks unusually large, as if it were corrupted:

Thread 1 (Thread 0x7fc0227f4700 (LWP 41887)):
#0  csnset_dup (csnset=<optimized out>) at ldap/servers/slapd/csnset.c:381
        newcsnset = 0x7fbfcc028f50
        curnode = 0x7fbfcc028f68
        n = 0x6e632c736e696775

(gdb) p curnode
$6 = (CSNSet **) 0x7f5b7000ccd8
(gdb) p *curnode
$7 = (CSNSet *) 0x0

(gdb) p n
$1 = (const CSNSet *) 0x6e632c736e696775

(gdb) p n->type
Cannot access memory at address 0x6e632c736e696775
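
One way to read the suspicious value is to decode its bytes as ASCII. The sketch below assumes the crashed host is little-endian (x86_64 is an assumption here, inferred from the /lib64 library paths in the backtrace); decode_le_ascii is a hypothetical helper written for this note, not part of gdb or 389-ds-base. For 0x6e632c736e696775 it yields the text "ugins,cn", which would be consistent with string data, perhaps a fragment of a DN, having overwritten the pointer, though that is only a guess from the byte pattern.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a 64-bit value as the 8 bytes it would occupy in little-endian
 * memory, lowest byte first. Non-printable bytes are shown as '.'.
 * out must have room for 8 characters plus the terminating NUL.
 */
void decode_le_ascii(uint64_t value, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)((value >> (8 * i)) & 0xff);
        out[i] = isprint(byte) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

Calling decode_le_ascii(0x6e632c736e696775ULL, buf) with a 9-byte buf leaves the string "ugins,cn" in buf.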

#0  csnset_dup (csnset=<optimized out>) at ldap/servers/slapd/csnset.c:381
#1  0x00007fc05f6891ce in slapi_value_dup (v=0x7fbfcc023d90) at ldap/servers/slapd/value.c:173
#2  0x00007fc05f68b269 in valueset_set_valueset (vs1=0x7fbfcc027fb8, vs2=0x7fbfcc022dc8) at ldap/servers/slapd/valueset.c:1205
#3  0x00007fc05f5ffc67 in slapi_attr_dup (attr=attr@entry=0x7fbfcc022dc0) at ldap/servers/slapd/attr.c:440
#4  0x00007fc05f615cc8 in slapi_entry_dup (e=0x7fbfcc011f00) at ldap/servers/slapd/entry.c:2161
#5  0x00007fc053b669ef in backentry_dup (e=0x7fbfcc011e90) at ldap/servers/slapd/back-ldbm/backentry.c:114
#6  0x00007fc053bad484 in ldbm_back_modify (pb=<optimized out>) at ldap/servers/slapd/back-ldbm/ldbm_modify.c:669
#7  0x00007fc05f6430e1 in op_shared_modify (pb=pb@entry=0x7fc0227f3ae0, pw_change=pw_change@entry=0, old_pw=0x0) at ldap/servers/slapd/modify.c:1086
#8  0x00007fc05f64442f in do_modify (pb=pb@entry=0x7fc0227f3ae0) at ldap/servers/slapd/modify.c:419
#9  0x00007fc05fb24361 in connection_dispatch_operation (pb=0x7fc0227f3ae0, op=0x7fc066361740, conn=0x7fc03c359410) at ldap/servers/slapd/connection.c:660
#10 connection_threadmain () at ldap/servers/slapd/connection.c:2534
#11 0x00007fc05da4b9db in _pt_root () from /lib64/libnspr4.so
#12 0x00007fc05d3ecdf5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007fc05d11a1ad in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):
RHEL 7.1

How reproducible:
Not reproduced in house, but multiple crashes in the customer environment.

Steps to Reproduce:
1. N/A

Actual results:

Expected results:

Additional info:
Comment 4 Marc Sauton 2015-09-08 21:34:15 EDT
actually, the "newer" 389-ds-base version is 

This is the version that was in use when the crashes happened.
We have several sosreports, and I picked an old version number in the initial description of bz 1261218.
Comment 11 Marc Sauton 2015-09-10 15:38:44 EDT
Closing bz 1261218 as a duplicate of bz 1243970, per dev review and request.
bz 1243970 has all acks and is in the 7.2 errata.
A cloned bz will be created to backport the 48226 patch to 7.1 and include it in 7.1.z.

*** This bug has been marked as a duplicate of bug 1243970 ***
