Bug 1749595 - Extremely slow LDIF import with ldif2db
Summary: Extremely slow LDIF import with ldif2db
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Version: 7.3
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: mreynolds
QA Contact: RHDS QE
Docs Contact: Marc Muehlfeld
URL:
Whiteboard:
Depends On:
Blocks: 1763622 1801696
Reported: 2019-09-06 00:26 UTC by joel
Modified: 2020-09-13 22:12 UTC

Fixed In Version: 389-ds-base-1.3.10.1-3.el7
Doc Type: Bug Fix
Doc Text:
.Importing large LDIF files to Directory Server databases with many nested sub-trees is now significantly faster
Previously, if the Directory Server database contained many nested sub-trees, importing a large LDIF file using the `ldif2db` and `ldif2db.pl` utilities was slow. With this update, Directory Server adds the `ancestorid` index after all entries have been added. As a result, importing LDIF files to a database with many nested sub-trees is now significantly faster.
Clone Of:
Clones: 1763622 1801696
Environment:
Last Closed: 2020-03-31 19:46:15 UTC
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | 389ds 389-ds-base issues 2909 | 0 | None | closed | ldbm_get_nonleaf_ids() painfully slow for databases with many non-leaf entries | 2021-01-21 16:22:33 UTC
Red Hat Product Errata | RHBA-2020:1064 | 0 | None | None | None | 2020-03-31 19:46:55 UTC

Description joel 2019-09-06 00:26:03 UTC
Description of problem:
I'm trying to import a big LDIF file (~36 million entries) and I'm seeing really bad performance during the "Gathering ancestorid" phase.
Importing this LDIF file using ldif2db or ldif2db.pl takes around 1 day.

Version-Release number of selected component (if applicable):
389-ds-base-1.3.5.10-18.el7_3.x86_64

How reproducible:
very

Steps to Reproduce:
1. Import a very large database.

Actual results:
The import can take a day for 30M entries.

Expected results:
faster

Additional info:
- from logs
[04/Sep/2019:10:38:20.790500639 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 82% (ID count 14900000)
[04/Sep/2019:10:57:44.853250820 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15000000)
[04/Sep/2019:11:17:49.509786966 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15100000)
[04/Sep/2019:11:38:00.980343788 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15200000)
[04/Sep/2019:11:57:17.381658686 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15300000)
[04/Sep/2019:12:16:30.000451563 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 85% (ID count 15400000)
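
To put a number on the slowdown, the log excerpt above can be turned into a throughput figure. The following is a minimal sketch, not part of the original report: the regular expression and the `throughput` helper are hypothetical names, and it assumes every progress line carries a timestamp in the format shown plus an `(ID count N)` suffix.

```python
import re
from datetime import datetime

# The six progress lines quoted in the report, copied verbatim.
LOG_LINES = [
    "[04/Sep/2019:10:38:20.790500639 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 82% (ID count 14900000)",
    "[04/Sep/2019:10:57:44.853250820 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15000000)",
    "[04/Sep/2019:11:17:49.509786966 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15100000)",
    "[04/Sep/2019:11:38:00.980343788 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15200000)",
    "[04/Sep/2019:11:57:17.381658686 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15300000)",
    "[04/Sep/2019:12:16:30.000451563 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 85% (ID count 15400000)",
]

# Hypothetical pattern: bracketed timestamp, then the trailing ID count.
PATTERN = re.compile(r"^\[(?P<ts>[^\]]+)\].*\(ID count (?P<count>\d+)\)")


def parse_line(line):
    """Return (timestamp, id_count) for a progress line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    stamp, zone = match.group("ts").split(" ")
    date_part, nanos = stamp.split(".")
    # strptime's %f accepts at most 6 digits, so truncate nanoseconds to microseconds.
    when = datetime.strptime(
        f"{date_part}.{nanos[:6]} {zone}", "%d/%b/%Y:%H:%M:%S.%f %z"
    )
    return when, int(match.group("count"))


def throughput(lines):
    """Return (ids_processed, elapsed_seconds, ids_per_second) between first and last line."""
    samples = [s for s in map(parse_line, lines) if s is not None]
    (start, first_count), (end, last_count) = samples[0], samples[-1]
    ids = last_count - first_count
    seconds = (end - start).total_seconds()
    return ids, seconds, ids / seconds


if __name__ == "__main__":
    ids, seconds, rate = throughput(LOG_LINES)
    print(f"{ids} IDs in {seconds:.0f} s, about {rate:.1f} IDs/s")
```

For this excerpt the result is roughly 85 IDs per second, or about 20 minutes per 100,000 IDs, which matches the spacing of the timestamps.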

- upstream bug
https://pagure.io/389-ds-base/issue/49850
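
For readers unfamiliar with the phase named in the log lines, here is a rough illustration of what an ancestor index holds. It is a sketch under stated assumptions, not the 389-ds-base implementation: it assumes the `ancestorid` index maps each non-leaf entry ID to the IDs of all entries beneath it, and `build_ancestorid` and the `parent_of` mapping are hypothetical names. It builds the whole index in one pass after every entry's parent is known.

```python
from collections import defaultdict


def build_ancestorid(parent_of):
    """Build an ancestor -> descendants map from a child -> parent map.

    parent_of maps each entry ID to its parent's entry ID; the suffix
    entry (the root of the tree) maps to None.
    """
    descendants = defaultdict(list)
    for entry_id in sorted(parent_of):
        # Walk up from the entry to the root, recording the entry
        # under every ancestor met on the way.
        ancestor = parent_of[entry_id]
        while ancestor is not None:
            descendants[ancestor].append(entry_id)
            ancestor = parent_of.get(ancestor)
    return dict(descendants)


if __name__ == "__main__":
    # Toy tree: 1 is the suffix, 2 and 5 are its children, 3 and 4 sit under 2.
    parent_of = {1: None, 2: 1, 3: 2, 4: 2, 5: 1}
    index = build_ancestorid(parent_of)
    print(sorted(index))  # the non-leaf IDs
    print(index)
```

Only entries that have children appear as keys, which is why the import log counts "non-leaf IDs" during this phase.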

Comment 11 errata-xmlrpc 2020-03-31 19:46:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1064

