Bug 680555 - ns-slapd segfaults if I have more than 100 DBs
Summary: ns-slapd segfaults if I have more than 100 DBs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: 389
Classification: Retired
Component: Directory Server
Version: 1.2.8
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Rich Megginson
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Duplicates: 709968
Depends On:
Blocks: 639035 389_1.2.8 681379
Reported: 2011-02-25 21:40 UTC by Diego Woitasen
Modified: 2015-12-07 16:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 681379 (view as bug list)
Environment:
Last Closed: 2015-12-07 16:52:19 UTC
Embargoed:


Attachments
0001-Bug-680555-ns-slapd-segfaults-if-I-have-more-than-10.patch (3.44 KB, patch)
2011-03-01 22:40 UTC, Rich Megginson
nhosoi: review+
nkinder: review+

Description Diego Woitasen 2011-02-25 21:40:15 UTC
Description of problem:
If you have more than 100 sub suffixes with their databases, 389 DS segfaults.


Version-Release number of selected component (if applicable):
1.2.8.a2


How reproducible:
Run the following from the command line 150 times, with a different value for $1 each time:
ldapadd -x -D uid=superuser,ou=People,dc=domain,dc=ar -w superpass << EOF
dn: cn=ou=$1\,dc=domain\,dc=ar,cn=mapping tree,cn=config
objectclass: top
objectclass: extensibleObject
objectclass: nsMappingTree
nsslapd-state: backend
nsslapd-backend: $1
nsslapd-parent-suffix: dc=domain,dc=ar
cn: ou=$1,dc=domain,dc=ar

dn: cn=$1,cn=ldbm database,cn=plugins,cn=config
objectclass: extensibleObject
objectclass: nsBackendInstance
nsslapd-suffix: ou=$1,dc=domain,dc=ar

dn: ou=$1,dc=domain,dc=ar
objectClass: organizationalUnit
objectClass: top
ou: $1
description: $1
EOF

  
Actual results:
Running ns-slapd from gdb:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff55c11c4 in PR_RWLock_Unlock () from /usr/lib/libnspr4.so.0d
(gdb) bt
#0  0x00007ffff55c11c4 in PR_RWLock_Unlock () from /usr/lib/libnspr4.so.0d
#1  0x00007ffff7b5cb60 in slapi_mapping_tree_free_all (be_list=<value optimized out>,
    referral_list=0x7fffffffd2e0) at ldap/servers/slapd/mapping_tree.c:2272
#2  0x00007ffff7b68710 in op_shared_search (pb=0x4787410,
    send_result=<value optimized out>) at ldap/servers/slapd/opshared.c:872
#3  0x00007ffff7b72609 in search_internal_callback_pb (pb=0x4787410,
    callback_data=<value optimized out>, prc=<value optimized out>,
    psec=0x7fffed0ab760 <views_dn_views_cb>, prec=0)
    at ldap/servers/slapd/plugin_internal_op.c:761
#4  0x00007fffed0ac3ea in views_cache_add_dn_views ()
    at ldap/servers/plugins/views/views.c:1307
#5  views_cache_build_view_list () at ldap/servers/plugins/views/views.c:1177
#6  views_cache_create () at ldap/servers/plugins/views/views.c:439
#7  0x00007fffed0ac554 in views_start (pb=<value optimized out>)
    at ldap/servers/plugins/views/views.c:254
#8  0x00007ffff7b6f6cd in plugin_call_func (list=0x99a500, operation=212, pb=0x8398f0,
    call_one=1) at ldap/servers/slapd/plugin.c:1428
#9  0x00007ffff7b70046 in plugin_call_one (argc=9, argv=0x7fffffffe5e8,
    errmsg=<value optimized out>, operation=<value optimized out>)
    at ldap/servers/slapd/plugin.c:1396
#10 plugin_dependency_startall (argc=9, argv=0x7fffffffe5e8,
    errmsg=<value optimized out>, operation=<value optimized out>)
    at ldap/servers/slapd/plugin.c:1187
#11 0x000000000041dfaa in main (argc=9, argv=0x7fffffffe5e8)


I found that DS works if I set BE_LIST_SIZE to 300 in ldap/servers/slapd/slap.h.

The problem is also in ldap/servers/slapd/mapping_tree.c:slapi_mapping_tree_select_all().

Line 2181 should be: "while ((node) &&(index < BE_LIST_SIZE -1))" to leave room for the NULL value assigned after the loop. A warning when there are more than BE_LIST_SIZE databases would be great.
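
The bound proposed above can be sketched roughly as follows. This only illustrates the reporter's suggestion and is not the committed patch: struct node, collect_backends(), and the value 100 for BE_LIST_SIZE (assumed from the bug title) are hypothetical stand-ins, not the actual 389 DS definitions.

```c
#include <stddef.h>
#include <stdio.h>

#define BE_LIST_SIZE 100 /* assumed from the bug title; not taken from slap.h */

/* Hypothetical stand-in for a mapping tree node. */
struct node {
    const char *be_name;
    struct node *next;
};

/*
 * Copy backend names into be_list, which has exactly BE_LIST_SIZE slots.
 * The loop stops at BE_LIST_SIZE - 1 entries so the NULL terminator written
 * after the loop still lands inside the array.
 * Returns the number of names stored; *truncated is set to 1 if some nodes
 * did not fit, and a warning is printed in that case.
 */
int collect_backends(const struct node *node,
                     const char *be_list[BE_LIST_SIZE],
                     int *truncated)
{
    int index = 0;

    while ((node != NULL) && (index < BE_LIST_SIZE - 1)) {
        be_list[index++] = node->be_name;
        node = node->next;
    }
    be_list[index] = NULL; /* highest possible index is BE_LIST_SIZE - 1 */

    *truncated = (node != NULL);
    if (*truncated) {
        fprintf(stderr, "warning: more than %d backends, list truncated\n",
                BE_LIST_SIZE - 1);
    }
    return index;
}
```

With this bound the list silently holds one entry fewer than BE_LIST_SIZE, which is why a warning on truncation is part of the suggestion.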

Comment 1 Rich Megginson 2011-03-01 22:40:58 UTC
Created attachment 481729
0001-Bug-680555-ns-slapd-segfaults-if-I-have-more-than-10.patch

Comment 2 Rich Megginson 2011-03-01 23:41:45 UTC
To ssh://git.fedorahosted.org/git/389/ds.git
   e3c72d0..6c4eac9  master -> master
commit 6c4eac9ca642b99d7664d3a6b04067c3091f5694
Author: Rich Megginson <rmeggins>
Date:   Tue Mar 1 15:36:11 2011 -0700
    Reviewed by: nhosoi, nkinder (Thanks!)
    Branch: master
    Fix Description: 1) slapi_mapping_tree_select_all() does
    be_list[BE_LIST_SIZE] = NULL
    so be_list must be of size BE_LIST_SIZE+1
    2) loop counter should check be_index, not index, to see if the loop is
    completed
    3) if the search is going to hit more backends than we can process, just
    return ADMINLIMIT_EXCEEDED with an explanatory error message
    4) increase the BE_LIST_SIZE to 1000
    Platforms tested: RHEL6 x86_64
    Flag Day: no
    Doc impact: no
To ssh://git.fedorahosted.org/git/389/ds.git
   1ba8420..ef1cb3d  389-ds-base-1.2.8 -> 389-ds-base-1.2.8
commit ef1cb3d053888274a8b7d0f59c8392427b01e783
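
As a rough illustration of the four points in the fix description, the sketch below sizes the list at BE_LIST_SIZE+1, bounds the loop on be_index, and reports an error when more backends exist than fit. It is not the attached patch: struct be_node, select_all_backends(), and the ERR_ADMINLIMIT_EXCEEDED macro are hypothetical names introduced here. The value 11 is the LDAP adminLimitExceeded result code, and 1000 is the new BE_LIST_SIZE named above.

```c
#include <stddef.h>

#define BE_LIST_SIZE 1000          /* value named in the fix description */
#define ERR_ADMINLIMIT_EXCEEDED 11 /* LDAP adminLimitExceeded result code */

/* Hypothetical stand-in for a mapping tree node. */
struct be_node {
    const char *be_name;
    struct be_node *next;
};

/*
 * Fill be_list with up to BE_LIST_SIZE backend names and NULL-terminate it.
 * The caller must provide BE_LIST_SIZE + 1 slots so that the terminator at
 * be_list[BE_LIST_SIZE] is a valid element.
 * Returns 0 on success, or ERR_ADMINLIMIT_EXCEEDED if more backends remain
 * than the list can hold.
 */
int select_all_backends(const struct be_node *node,
                        const char *be_list[BE_LIST_SIZE + 1])
{
    int be_index = 0;

    /* The loop is bounded by be_index, the position actually written to. */
    while ((node != NULL) && (be_index < BE_LIST_SIZE)) {
        be_list[be_index++] = node->be_name;
        node = node->next;
    }
    be_list[be_index] = NULL;

    return (node != NULL) ? ERR_ADMINLIMIT_EXCEEDED : 0;
}
```

A caller would declare the list as const char *be_list[BE_LIST_SIZE + 1] and treat a non-zero return as a reason to stop the search and send an explanatory error to the client.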

Comment 3 Diego Woitasen 2011-04-25 23:27:17 UTC
I confirm that the patch fixed the bug for me. 1.2.8.2 is running with 148 DBs without any problem.

Comment 4 Rich Megginson 2011-06-02 14:10:23 UTC
*** Bug 709968 has been marked as a duplicate of this bug. ***

Comment 6 Rich Megginson 2011-06-08 15:48:55 UTC
Use a script - in shell script, something like this:

ii=1
while [ $ii -le 999 ] ; do
  suffix="cn=suffix$ii"
  bename="be$ii"
  # add the backend and suffix entries
  sed -e "s/%ds_bename%/$bename/g" -e "s/%ds_suffix%/$suffix/g" /usr/share/dirsrv/data/template-suffix-db.ldif | ldapmodify -x -D "cn=directory manager" -w thepassword -a
  # add the root level entry for the suffix
  ldapmodify -x -D "cn=directory manager" -w thepassword -a <<EOF
dn: $suffix
objectclass: top
objectclass: extensibleObject
description: this is suffix $suffix
EOF
  # increment ii
  ii=$((ii + 1))
done

Comment 7 Amita Sharma 2011-06-09 10:22:29 UTC
I am incrementing ii
I am creating Sub-Suffix17
I am calling AddSuffix
adding new entry "cn=sub_exam_db17,cn=ldbm database,cn=plugins,cn=config"

adding new entry "cn="dc=sub_exam17,dc=exam1,dc=com",cn=mapping tree,cn=config"

adding new entry "dc=sub_exam17,dc=exam1,dc=com"

adding new entry "ou=people,dc=sub_exam17,dc=exam1,dc=com"

I am incrementing ii
I am creating Sub-Suffix18
I am calling AddSuffix
adding new entry "cn=sub_exam_db18,cn=ldbm database,cn=plugins,cn=config"

adding new entry "cn="dc=sub_exam18,dc=exam1,dc=com",cn=mapping tree,cn=config"

adding new entry "dc=sub_exam18,dc=exam1,dc=com"

adding new entry "ou=people,dc=sub_exam18,dc=exam1,dc=com"

I am incrementing ii
I am creating Sub-Suffix19
I am calling AddSuffix
adding new entry "cn=sub_exam_db19,cn=ldbm database,cn=plugins,cn=config"

adding new entry "cn="dc=sub_exam19,dc=exam1,dc=com",cn=mapping tree,cn=config"

adding new entry "dc=sub_exam19,dc=exam1,dc=com"

adding new entry "ou=people,dc=sub_exam19,dc=exam1,dc=com"

I am incrementing ii
I am creating Sub-Suffix20
I am calling AddSuffix
adding new entry "cn=sub_exam_db20,cn=ldbm database,cn=plugins,cn=config"

adding new entry "cn="dc=sub_exam20,dc=exam1,dc=com",cn=mapping tree,cn=config"

adding new entry "dc=sub_exam20,dc=exam1,dc=com"
ldap_add: Operations error (1)

I am incrementing ii
I am creating Sub-Suffix21
I am calling AddSuffix
adding new entry "cn=sub_exam_db21,cn=ldbm database,cn=plugins,cn=config"

adding new entry "cn="dc=sub_exam21,dc=exam1,dc=com",cn=mapping tree,cn=config"
ldap_result: Can't contact LDAP server (-1)

I am incrementing ii
I am creating Sub-Suffix22
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix23
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix24
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix25
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix26
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix27
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix28
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix29
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix30
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix31
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix32
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix33
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix34
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix35
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix36
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix37
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix38
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix39
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix40
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix41
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix42
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix43
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix44
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix45
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix46
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix47
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix48
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix49
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix50
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix51
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix52
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix53
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix54
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix55
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix56
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix57
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix58
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix59
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix60
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix61
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix62
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix63
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix64
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix65
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix66
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix67
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix68
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix69
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix70
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix71
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix72
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix73
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix74
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix75
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix76
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix77
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix78
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix79
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix80
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix81
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix82
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix83
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix84
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix85
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix86
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix87
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix88
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix89
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix90
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix91
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix92
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix93
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix94
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix95
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix96
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix97
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix98
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix99
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii


Error Log
=========
[09/Jun/2011:15:42:08 +051800] - libdb: /var/lib/dirsrv/slapd-testvm/db/log.0000000032: log file unreadable: Too many open files
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: Too many open files
[09/Jun/2011:15:42:08 +051800] - libdb: DB_TXN->abort: log undo failed for LSN: 32 5857733: DB_NOTFOUND: No matching key/data pair found
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: DB_NOTFOUND: No matching key/data pair found
[09/Jun/2011:15:42:08 +051800] - libdb: /var/lib/dirsrv/slapd-testvm/db/log.0000000032: log file unreadable: Too many open files
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: Too many open files
[09/Jun/2011:15:42:08 +051800] - libdb: DB_TXN->abort: log undo failed for LSN: 32 5857733: DB_NOTFOUND: No matching key/data pair found
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: DB_NOTFOUND: No matching key/data pair found
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:08 +051800] - <= index_read NULL (could not open index attr objectClass)
[09/Jun/2011:15:42:08 +051800] - database index operation failed BAD 1210, err=24 Too many open files
[09/Jun/2011:15:42:08 +051800] - database index operation failed BAD 1030, err=24 Too many open files
[09/Jun/2011:15:42:08 +051800] - add: attempt to index 1 failed
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:08 +051800] - Serious Error---Failed in dblayer_txn_abort, err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:08 +051800] - Serious Error---Failed to trickle, err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:08 +051800] - PR_Accept() failed, Netscape Portable Runtime error -5971 (Process open FD table is full.)
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:08 +051800] - Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:08 +051800] - PR_Accept() failed, Netscape Portable Runtime error -5971 (Process open FD table is full.)
[09/Jun/2011:15:42:08 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:09 +051800] - Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:09 +051800] dse - Cannot open temporary DSE file "/etc/dirsrv/slapd-testvm/dse.ldif.tmp" for update: OS error 24 (Too many open files)
[09/Jun/2011:15:42:09 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:09 +051800] - Serious Error---Failed to trickle, err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:09 +051800] dse - Cannot open temporary DSE file "/etc/dirsrv/slapd-testvm/dse.ldif.tmp" for update: OS error 24 (Too many open files)
[09/Jun/2011:15:42:09 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:09 +051800] - Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:09 +051800] dse - Cannot open temporary DSE file "/etc/dirsrv/slapd-testvm/dse.ldif.tmp" for update: OS error 24 (Too many open files)
[09/Jun/2011:15:42:09 +051800] dse - Cannot open temporary DSE file "/etc/dirsrv/slapd-testvm/dse.ldif.tmp" for update: OS error 24 (Too many open files)
[09/Jun/2011:15:42:09 +051800] dse - Cannot open temporary DSE file "/etc/dirsrv/slapd-testvm/dse.ldif.tmp" for update: OS error 24 (Too many open files)
[09/Jun/2011:15:42:09 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:09 +051800] - Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:09 +051800] dse - Cannot open temporary DSE file "/etc/dirsrv/slapd-testvm/dse.ldif.tmp" for update: OS error 24 (Too many open files)
[09/Jun/2011:15:42:09 +051800] - libdb: PANIC: fatal region error detected; run recovery

[root@testvm scripts]# tail -f /var/log/dirsrv/slapd-testvm/errors
[09/Jun/2011:15:42:09 +051800] - Serious Error---Failed to trickle, err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:10 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:10 +051800] - Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[09/Jun/2011:15:42:10 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:10 +051800] NSMMReplicationPlugin - replica_get_replica_from_dn: failed to locate replication extension of mapping tree node for dc=sub_exam21,dc=exam1,dc=com
[09/Jun/2011:15:42:10 +051800] entryrdn-index - entryrdn_index_read: Failed to make a cursor: DB_RUNRECOVERY: Fatal error, run database recovery(-30974)
[09/Jun/2011:15:42:10 +051800] - dn2entry: Failed to get id for dc=sub_exam20,dc=exam1,dc=com from entryrdn index (-30974)
[09/Jun/2011:15:42:10 +051800] - libdb: PANIC: fatal region error detected; run recovery
[09/Jun/2011:15:42:10 +051800] - FATAL ERROR at idl_new.c (1); server stopping as database recovery needed.
[09/Jun/2011:15:42:10 +051800] - libdb: PANIC: fatal region error detected; run recovery

Comment 8 Sankar Ramalingam 2011-06-09 11:09:07 UTC
adding new entry "cn=subSuffN135,cn=ldbm database,cn=plugins,cn=config"

adding new entry "cn="dc=subsuffN135,dc=basesuffixN,dc=com",cn=mapping tree,cn=config"

adding new entry "dc=subsuffN135,dc=basesuffixN,dc=com"

adding new entry "cn=subSuffN136,cn=ldbm database,cn=plugins,cn=config"
adding new entry "cn="dc=subsuffN136,dc=basesuffixN,dc=com",cn=mapping tree,cn=config"

adding new entry "dc=subsuffN136,dc=basesuffixN,dc=com"
ldap_result: Can't contact LDAP server (-1)

ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

Result: I could successfully add 135 sub suffixes.

Comment 9 Rich Megginson 2011-06-09 14:15:30 UTC
(In reply to comment #8)
> adding new entry "cn=subSuffN135,cn=ldbm database,cn=plugins,cn=config"
> 
> adding new entry "cn="dc=subsuffN135,dc=basesuffixN,dc=com",cn=mapping
> tree,cn=config"
> 
> adding new entry "dc=subsuffN135,dc=basesuffixN,dc=com"
> 
> adding new entry "cn=subSuffN136,cn=ldbm database,cn=plugins,cn=config"
> adding new entry "cn="dc=subsuffN136,dc=basesuffixN,dc=com",cn=mapping
> tree,cn=config"
> 
> adding new entry "dc=subsuffN136,dc=basesuffixN,dc=com"
> ldap_result: Can't contact LDAP server (-1)
> 
> ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
> ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
> 
> Result: I could successfully add 135 sub suffixes.

Can you attach your errors log?  Do you have errors similar to Amita's?

Comment 10 Rich Megginson 2011-06-09 14:55:26 UTC
The problem is that you are running out of file descriptors for the process.  999 is too many - just use
 ii=1
 while [ $ii -le 102 ] ; do ... rest of script
in order to verify the bug.  See http://directory.fedoraproject.org/wiki/Performance_Tuning#Linux for information about how to increase the number of file descriptors for the slapd process.
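
For reference, the per-process descriptor limit discussed here can be read with getrlimit(RLIMIT_NOFILE). The helper below is a minimal sketch, and nofile_soft_limit() is a hypothetical name. It reports the limit of the process that calls it, so it only reflects what ns-slapd sees if it runs in the same environment. This per-process limit is separate from the system-wide ceiling in /proc/sys/fs/file-max.

```c
#include <limits.h>
#include <sys/resource.h>

/*
 * Return the soft limit on open file descriptors for the calling process.
 * Returns LONG_MAX if the limit is unlimited or does not fit in a long,
 * and -1 if getrlimit() fails.
 */
long nofile_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX) {
        return LONG_MAX;
    }
    return (long)rl.rlim_cur;
}
```

Printing the returned value before and after a tuning change is a quick way to confirm that the new limit actually applies to the process.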

Comment 11 Amita Sharma 2011-06-10 10:42:31 UTC
Yes, I was using 99 instead of 999.
I did this - echo "64000" > /proc/sys/fs/file-max

After this I was able to create sub suffixes up to 94, which is better than before.
The error I am hitting now is the one Rich pointed out: "running out of file descriptors for the process".

I am incrementing ii
I am creating Sub-Suffix94
I am calling AddSuffix
adding new entry "cn=sub_exam_db94,cn=ldbm database,cn=plugins,cn=config"

adding new entry "cn="dc=sub_exam94,dc=exam1,dc=com",cn=mapping tree,cn=config"
ldap_result: Can't contact LDAP server (-1)

I am incrementing ii
I am creating Sub-Suffix95
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix96
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii
I am creating Sub-Suffix97
I am calling AddSuffix
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
I am incrementing ii


Error Logs:
===============
[10/Jun/2011:12:53:16 +051800] - libdb: /var/lib/dirsrv/slapd-testvm/db/log.0000000050: log file unreadable: Too many open files
[10/Jun/2011:12:53:16 +051800] - libdb: PANIC: Too many open files
[10/Jun/2011:12:53:16 +051800] - libdb: DB_TXN->abort: log undo failed for LSN: 50 5385980: DB_NOTFOUND: No matching key/data pair found
[10/Jun/2011:12:53:16 +051800] - libdb: PANIC: DB_NOTFOUND: No matching key/data pair found
[10/Jun/2011:12:53:16 +051800] - libdb: /var/lib/dirsrv/slapd-testvm/db/log.0000000050: log file unreadable: Too many open files
[10/Jun/2011:12:53:16 +051800] - libdb: PANIC: Too many open files
[10/Jun/2011:12:53:16 +051800] - libdb: DB_TXN->abort: log undo failed for LSN: 50 5385980: DB_NOTFOUND: No matching key/data pair found
[10/Jun/2011:12:53:16 +051800] - libdb: PANIC: DB_NOTFOUND: No matching key/data pair found
[10/Jun/2011:12:53:16 +051800] - libdb: PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
[10/Jun/2011:12:53:16 +051800] - dbp->open("sub_exam_db94/id2entry.db4") failed: Too many open files (24)
[10/Jun/2011:12:53:16 +051800] - Could not open file "/var/lib/dirsrv/slapd-testvm/db/sub_exam_db94/DBVERSION" for writing Netscape Portable Runtime -5971 (Process open FD table is full.)

But I am not hitting the original bug, so I am marking this bug as VERIFIED.

