Bug 2229999

Summary: ns-slapd crashes when lmdb import fails or is aborted
Product: Red Hat Enterprise Linux 9
Component: 389-ds-base
Version: 9.3
Reporter: Pierre Rogier <progier>
Assignee: Pierre Rogier <progier>
QA Contact: LDAP QA Team <idm-ds-qe-bugs>
CC: idm-ds-dev-bugs, tbordaz, vashirov
Status: CLOSED MIGRATED
Severity: unspecified
Priority: high
Keywords: MigratedToJIRA, Triaged
Flags: pm-rhel: mirror+
Target Milestone: rc
Target Release: 9.4
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2023-09-19 14:19:45 UTC

Description Pierre Rogier 2023-08-08 12:59:10 UTC
Description of problem:
The following issue was found while trying to test BZ 2116948 (LMDB import is too slow), but with the database size configured too small to host the users.

After adding a new test case:
diff --git a/dirsrvtests/tests/suites/import/import_test.py b/dirsrvtests/tests/suites/import/import_test.py
index 84c8cf290..ee71e0bea 100644
--- a/dirsrvtests/tests/suites/import/import_test.py
+++ b/dirsrvtests/tests/suites/import/import_test.py
@@ -22,6 +22,7 @@ from lib389.tasks import ImportTask
 from lib389.index import Indexes
 from lib389.monitor import Monitor
 from lib389.backend import Backends
+from lib389.config import LMDB_LDBMConfig
 from lib389.config import LDBMConfig
 from lib389.utils import ds_is_newer, get_default_db_lib
 from lib389.idm.user import UserAccount
@@ -550,6 +551,15 @@ def test_import_wrong_file_path(topo):
         dbtasks_ldif2db(topo.standalone, log, args)
     assert "The LDIF file does not exist" in str(e.value)

+def test_crash_on_ldif2db_with_lmdb(topo, _import_clean):
+    BIG_MAP_SIZE = 20 * 1024 * 1024 * 1024
+    if get_default_db_lib() == "mdb":
+        handler = LMDB_LDBMConfig(topo.standalone)
+        mapsize = BIG_MAP_SIZE
+        log.info(f'Set lmdb map size to {mapsize}.')
+        handler.replace('nsslapd-mdb-max-size', str(mapsize))
+        topo.standalone.restart()
+    _import_offline(topo, 10_000_000)

 if __name__ == '__main__':
     # Run isolated


When running the test with mdb, the server crashes:
NSSLAPD_DB_LIB=mdb py.test -v import_test.py::test_crash_on_ldif2db_with_lmdb

A core file was not generated automatically, so I attached gdb during the test and then generated the core.

Thread 7 "ns-slapd" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f9269bfa640 (LWP 3924)]
__strncmp_avx2_rtm () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:284
284             VMOVU   (%rdi), %ymm0
(gdb) bt
#0  __strncmp_avx2_rtm () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:284
#1  0x00007f976d18fb7d in dbmdb_import_prepare_worker_entry (wqelmnt=0x55cd73989a40)
    at ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c:1347
#2  0x00007f976d1958ce in dbmdb_import_worker (param=<optimized out>)
    at ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c:3191
#3  0x00007f9770ab4c34 in _pt_root (arg=0x55cd73973dc0) at pthreads/../../../../nspr/pr/src/pthreads/ptthread.c:201
#4  0x00007f977089f822 in start_thread (arg=<optimized out>) at pthread_create.c:443
#5  0x00007f977083f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

The full backtrace is attached; the core is available at https://drive.google.com/file/d/1bGYpP0JuifLKVGdXRVMYj-SkuWCvlnXZ/view?usp=sharing


Version-Release number of selected component (if applicable):

How reproducible:
  Always

Steps to Reproduce:
1. See test case in the description

Actual results:
ns-slapd crashes


Expected results:
ns-slapd should fail without crashing.

Additional info:
The database size should be at least 50 GB for 10,000,000 users.
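As a rough cross-check (a sketch only; the ~5 KiB-per-entry average is an assumption back-derived from the 50 GB / 10,000,000 users figure above, not a measured 389-ds-base constant), the 20 GB map size set by the test is well below what the import needs:

/* Back-of-the-envelope map-size check; the 5 KiB/entry average is an
 * assumption derived from the "50 GB for 10,000,000 users" figure. */
#include <stdio.h>

int main(void)
{
    const unsigned long long users = 10000000ULL;
    const unsigned long long bytes_per_entry = 5ULL * 1024;  /* assumed */
    const unsigned long long needed = users * bytes_per_entry;
    const unsigned long long configured = 20ULL << 30;       /* BIG_MAP_SIZE */
    printf("needed ~%.1f GB, configured %.1f GB -> import is expected to fail\n",
           needed / 1e9, configured / 1e9);
    return 0;
}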

The crash is caused by a double free when freeing the import pipeline resources.
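For illustration, a minimal sketch of the pattern (hypothetical names, not the actual mdb_import_threads.c code): two cleanup paths free the same worker-queue buffers, the second free corrupts the heap, and a later strncmp() on a stale pointer faults, matching the backtrace above. Freeing once and clearing the pointer makes any later cleanup pass harmless:

/* Hypothetical illustration of the double-free pattern; the struct and
 * function names are invented for this sketch. */
#include <stdlib.h>

struct work_item {
    char *dn;      /* entry DN parsed from the LDIF */
    char *entry;   /* raw entry data */
};

/* Buggy abort path: frees the buffers but leaves the pointers dangling,
 * so the pipeline teardown that runs later frees them a second time. */
static void worker_abort(struct work_item *wi)
{
    free(wi->dn);
    free(wi->entry);
}

/* Fixed abort path: free exactly once and NULL the pointers, so a later
 * cleanup pass is a no-op (free(NULL) is defined to do nothing). */
static void worker_abort_fixed(struct work_item *wi)
{
    free(wi->dn);
    wi->dn = NULL;
    free(wi->entry);
    wi->entry = NULL;
}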

Comment 1 RHEL Program Management 2023-09-19 14:19:25 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 2 RHEL Program Management 2023-09-19 14:19:45 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.