Description of problem:

This was found while investigating Bug 508792 - Ldap: search operation is slow after ldif2db of about 130K entries. I am copying the comment here:

Most likely, db2index "all" has some bugs related to multiple db backends. It runs on multiple db backends simultaneously, but it borrows the import code, which runs on one backend only. While debugging, I found this odd entry (see dn and ndn; each belongs to a different db backend):

$ p *newEntry->ep_entry
$18 = {e_sdn = {flag = 6 '\006',
    dn = 0x1f430fd0 "cn=TUS Agents,ou=Groups,dc=switch.dsdev.sjc.redhat.com-rhpki-tps",
    ndn = 0x1fa87120 "cn=1,ou=ca,ou=requests,dc=switch.dsdev.sjc.redhat.com-rhpki-ca",
    ndn_len = 64},
  e_uniqueid = 0x1f432030 "08af15bf-672011de-aa10877e-796b1883",
  e_dncsnset = 0x0, e_maxcsn = 0x0, e_attrs = 0x1f4310a0,
  e_deleted_attrs = 0x0, e_virtual_attrs = 0x0, e_virtual_watermark = 0,
  e_virtual_lock = 0x1f430e40, e_extension = 0x0, e_flags = 0 '\0'}

Since DS in CS has this many db backends, the bug was easily revealed, I think.

NetscapeRoot
switch.dsdev.sjc.redhat.com-rhpki-ca
switch.dsdev.sjc.redhat.com-rhpki-kra
switch.dsdev.sjc.redhat.com-rhpki-ocsp
switch.dsdev.sjc.redhat.com-rhpki-tks
switch.dsdev.sjc.redhat.com-rhpki-tps
userRoot

Probably, we should eliminate db2index (all) and support db2index -n <instance> | -s <suffix>, as ldif2db does, for the correct behaviour.
Created attachment 356581
git patch file

[Files]
ldap/servers/slapd/back-ldbm/dblayer.c
ldap/servers/slapd/back-ldbm/import-threads.c
ldap/servers/slapd/back-ldbm/import.c
ldap/servers/slapd/back-ldbm/import.h

[Problem Description]
db2index all (internally called upgradedb) reads through the main db id2entry.db# and reindexes all the associated indexed attributes. The reindex borrows the import code, where entry ids are newly assigned. Those newly assigned entry ids are consecutive. The entry ids of the entries in the db being reindexed, on the other hand, are not. The borrowed import code assumes that the entry id and the index of the fifo are tightly coupled, and the timing of writes to and reads from the fifo is calculated based on that assumption.

[Fix Description]
The assumption has been revised: the entry id up to which entries are available is kept in ready_EID in the job structure, and the entry id of each entry (entry->ep_id) is compared with ready_EID instead of ready_ID, which holds the sequential number. Additionally, I eliminated the unused variable "shift" from import_fifo_fetch. Also, _dblayer_delete_instance_dir cleans up files and directories recursively.
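The difference between comparing against ready_ID and against ready_EID can be illustrated with a minimal sketch. This is not the code from the patch: the struct and the sketch_* helpers are hypothetical, simplified stand-ins, and only the names ready_ID and ready_EID are taken from the fix description. The sketch assumes a producer that queues entries in increasing entry id order and a consumer that asks whether a given entry id has been queued yet.

```c
#include <assert.h>

typedef unsigned int ID;

/* Hypothetical, simplified stand-in for the import job bookkeeping. */
struct sketch_job {
    ID ready_ID;   /* sequential count of entries queued so far */
    ID ready_EID;  /* entry id of the most recently queued entry */
};

/* Producer side: record that the entry with id entry_id has been queued.
 * Assumption: entries are read from id2entry in increasing id order. */
static void sketch_publish(struct sketch_job *job, ID entry_id)
{
    assert(entry_id > job->ready_EID);
    job->ready_ID += 1;
    job->ready_EID = entry_id;
}

/* Old-style check: correct only if entry ids are consecutive, so that
 * the entry id always equals the sequence number. */
static int sketch_is_ready_old(const struct sketch_job *job, ID entry_id)
{
    return entry_id <= job->ready_ID;
}

/* Revised check: compare the entry's own id with the highest id queued. */
static int sketch_is_ready_new(const struct sketch_job *job, ID entry_id)
{
    return entry_id <= job->ready_EID;
}
```

With entry ids 1, 2 and 5 queued (a gap such as one left by deleted entries), the sequence counter is 3 while the highest queued id is 5. The old-style check then reports entry 5 as not yet available even though it has been queued, whereas the revised check reports it as available.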
Thanks to Nathan for the reviews! Pushed to master; TET was also checked in to HEAD.

$ git merge db2index
Updating 0565e8c..a26ba73
Fast forward
 ldap/servers/slapd/back-ldbm/dblayer.c        |    2 +-
 ldap/servers/slapd/back-ldbm/import-threads.c |   37 ++++++++++++++-----------
 ldap/servers/slapd/back-ldbm/import.c         |   15 +++++-----
 ldap/servers/slapd/back-ldbm/import.h         |    3 +-
 4 files changed, 31 insertions(+), 26 deletions(-)

$ git push
Counting objects: 19, done.
Delta compression using 4 threads.
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 1.70 KiB, done.
Total 10 (delta 8), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   0565e8c..a26ba73  master -> master
Created attachment 358139
additional git patch file (import.c)

File: ldap/servers/slapd/back-ldbm/import.c

Fix Description: The previous commit put the assertion in the wrong place. It should be applied just for the worker thread.
Thanks to Rich for reviewing the change. Pushed to master:

$ git merge db2index
Updating 1cc186a..74093d6
Fast forward
 ldap/servers/slapd/back-ldbm/import.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

$ git push
Counting objects: 13, done.
Delta compression using 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 783 bytes, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   1cc186a..74093d6  master -> master
It is not really clear how to verify this bug. Can you please add clear steps to reproduce with Directory Server only? Thanks.
(In reply to comment #5)
> It is not really clear how to verify this bug, can you please add clear steps
> to reproduce with directory server only? Thanks

Steps to verify:
1. Install DS.
2. Create multiple backends, e.g., userRoot, backend1, backend2, backend3.
3. Run stop-slapd.
4. Import data to each backend.
5. Run the reindex command line utility: /usr/lib[64]/dirsrv/slapd-ID/db2index (db2index reindexes all backends if no option is given).
6. Check the errors log to see whether any errors were logged.
7. Run start-slapd and run a search with a filter containing an indexed type (e.g., cn, mail, etc.). If it returns the expected result, the reindex worked fine.
Verified - RHEL 4
version: redhat-ds-base-8.2.0-2010052404.el4dsrv

1. Created 5 new backends via DS console.
2. Stopped slapd.
3. Generated data ldifs for all backends with dbgen.pl.
4. Imported the data into the backends using ldif2db.
5. Reindexed all backends.
   * No errors
6. Started slapd and ran searches with a cn filter against all backends.
   * All searches returned expected results