Bug 1749595

Summary: Extremely slow LDIF import with ldif2db
Product: Red Hat Enterprise Linux 7
Reporter: joel <jwooten>
Component: 389-ds-base
Assignee: mreynolds
Status: CLOSED ERRATA
QA Contact: RHDS QE <ds-qe-bugs>
Severity: high
Docs Contact: Marc Muehlfeld <mmuehlfe>
Priority: high
Version: 7.3
CC: afarley, Egarciad, jwooten, lkrispen, mreynolds, msauton, nkinder, pasik, rsevilla, sgouvern, spichugi, tbordaz, tmihinto, vashirov
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: 389-ds-base-1.3.10.1-3.el7
Doc Type: Bug Fix
Doc Text:
.Importing large LDIF files to Directory Server databases with many nested sub-trees is now significantly faster
Previously, if the Directory Server database contained many nested sub-trees, importing a large LDIF file using the `ldif2db` and `ldif2db.pl` utilities was slow. With this update, Directory Server adds the `ancestorid` index after all entries have been added. As a result, importing LDIF files to a database with many nested sub-trees is now significantly faster.
Story Points: ---
Clone Of:
: 1763622 1801696
Environment:
Last Closed: 2020-03-31 19:46:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1763622, 1801696    

Description joel 2019-09-06 00:26:03 UTC
Description of problem:
I'm trying to import a big LDIF (~36 million entries) and I'm seeing very poor performance during the "Gathering ancestorid" phase.
Importing this LDIF using ldif2db or ldif2db.pl takes around 1 day.

Version-Release number of selected component (if applicable):
389-ds-base-1.3.5.10-18.el7_3.x86_64

How reproducible:
very

Steps to Reproduce:
1. Import a very large LDIF file (about 36 million entries in this case) using ldif2db or ldif2db.pl.

Actual results:
The import can take a day for 30M entries.

Expected results:
faster

Additional info:
- from logs
[04/Sep/2019:10:38:20.790500639 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 82% (ID count 14900000)
[04/Sep/2019:10:57:44.853250820 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15000000)
[04/Sep/2019:11:17:49.509786966 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15100000)
[04/Sep/2019:11:38:00.980343788 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15200000)
[04/Sep/2019:11:57:17.381658686 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15300000)
[04/Sep/2019:12:16:30.000451563 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 85% (ID count 15400000)
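
To put a number on the slowdown, the excerpt above can be turned into a throughput figure. The following is a rough sketch only, not part of the original report: the regular expression and the helper names (`parse_line`, `ids_per_second`) are assumptions made for illustration, the nanosecond timestamps are truncated to microseconds because Python's `%f` accepts at most six digits, and `%b` assumes an English/C locale for the month name.

```python
import re
from datetime import datetime

# The six log lines quoted above, copied verbatim.
LOG_LINES = [
    "[04/Sep/2019:10:38:20.790500639 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 82% (ID count 14900000)",
    "[04/Sep/2019:10:57:44.853250820 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15000000)",
    "[04/Sep/2019:11:17:49.509786966 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 83% (ID count 15100000)",
    "[04/Sep/2019:11:38:00.980343788 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15200000)",
    "[04/Sep/2019:11:57:17.381658686 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 84% (ID count 15300000)",
    "[04/Sep/2019:12:16:30.000451563 +0000] import bbvaES: Gathering ancestorid non-leaf IDs: processed 85% (ID count 15400000)",
]

# Hypothetical pattern: timestamp, fractional seconds, UTC offset, ID count.
PATTERN = re.compile(
    r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})\.(\d+) ([+-]\d{4})\]"
    r".*\(ID count (\d+)\)"
)


def parse_line(line):
    """Return (timestamp, id_count) for a progress line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    stamp, fraction, offset, count = match.groups()
    # Keep only the first six fractional digits (microseconds).
    when = datetime.strptime(
        f"{stamp}.{fraction[:6]} {offset}", "%d/%b/%Y:%H:%M:%S.%f %z"
    )
    return when, int(count)


def ids_per_second(lines):
    """Average number of IDs processed per second between the first and last sample."""
    samples = [s for s in (parse_line(line) for line in lines) if s is not None]
    (start, first_count), (end, last_count) = samples[0], samples[-1]
    return (last_count - first_count) / (end - start).total_seconds()


if __name__ == "__main__":
    print(f"{ids_per_second(LOG_LINES):.1f} IDs/second")
```

On these six lines the result is about 85 IDs per second, that is, roughly 20 minutes per 100,000 IDs in this phase.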

- upstream bug
https://pagure.io/389-ds-base/issue/49850

Comment 11 errata-xmlrpc 2020-03-31 19:46:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1064