Bug 1934067

Summary: dbscan crash due to a segmentation fault when looking up key "=ffffffff-ffffffff-ffffffff-ffffffff"
Product: Red Hat Directory Server Reporter: Têko Mihinto <tmihinto>
Component: 389-ds-base Assignee: Pierre Rogier <progier>
Status: CLOSED ERRATA QA Contact: RHDS QE <ds-qe-bugs>
Severity: high Docs Contact: Marc Muehlfeld <mmuehlfe>
Priority: medium    
Version: 11.3 CC: afarley, bsmejkal, gkimetto, ldap-maint, mreynolds, progier, tbordaz
Target Milestone: --- Keywords: Triaged
Target Release: dirsrv-12.1   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: redhat-ds-12-9010020220729141346.12_1_910 Doc Type: Bug Fix
Doc Text:
Cause: The search key was not properly reset when it was not found in the database.
Consequence: During the final cleanup (that is, just before exiting), dbscan attempted to free data that was not allocated on the heap, which generated a core dump (SIGSEGV).
Fix: The key data is now reset.
Result: dbscan no longer dumps core in this case.
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-12-06 15:44:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Têko Mihinto 2021-03-02 12:36:50 UTC
Description of problem:
dbscan crashes when looking up the specific key "=ffffffff-ffffffff-ffffffff-ffffffff".
The crash happens whether replication is enabled or not.

Version-Release number of selected component (if applicable):
# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.3 (Ootpa)
#
# rpm -qa | grep ^389-ds
389-ds-base-debugsource-1.4.3.13-1.module+el8dsrv+8334+69a46a2e.x86_64
389-ds-base-1.4.3.13-1.module+el8dsrv+8334+69a46a2e.x86_64
389-ds-base-libs-1.4.3.13-1.module+el8dsrv+8334+69a46a2e.x86_64
389-ds-base-debuginfo-1.4.3.13-1.module+el8dsrv+8334+69a46a2e.x86_64
#

How reproducible:
Always.

Steps to Reproduce:
For instance:
# dbscan -f /var/lib/dirsrv/slapd-alps9/db/userroot/nsuniqueid.db -k =ffffffff-ffffffff-ffffffff-ffffffff -r =ffffffff-ffffffff-ffffffff-ffffffff

Actual results:
dbscan crashed.

Expected results:
Get the requested information.

Additional info:

# more core_backtrace
{   "signal": 11
,   "executable": "/usr/bin/dbscan"
,   "only_crash_thread": true
,   "stacktrace":
      [ {   "crash_thread": true
        ,   "frames":
              [ {   "address": 140446569548067
                ,   "build_id": "d10b1fe6e4b5cd05ec7461fb44b06713ba81337a"
                ,   "build_id_offset": 551203
                ,   "function_name": "__libc_free"
                ,   "file_name": "/usr/lib64/libc-2.28.so"
                }
              , {   "address": 93831412266707
                ,   "build_id": "1c61a51641c2902b1c56361660219abf1689292e"
                ,   "build_id_offset": 5843
                ,   "function_name": "main"
                ,   "file_name": "/usr/bin/dbscan"
                } ]
        } ]
}
#

# gdb /usr/bin/dbscan ./coredump
...
Core was generated by `dbscan -f /var/lib/dirsrv/slapd-alps9/db/userroot/nsuniqueid.db -k =ffffffff-ff'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fbc43e34923 in __GI___libc_free (mem=<optimized out>) at malloc.c:3131
3131      ar_ptr = arena_for_chunk (p);
(gdb)
(gdb) where
#0  0x00007fbc43e34923 in __GI___libc_free (mem=<optimized out>) at malloc.c:3131
#1  0x00005556d3ff36d3 in main (argc=<optimized out>, argv=<optimized out>) at ldap/servers/slapd/tools/dbscan.c:1358
(gdb) bt full
#0  0x00007fbc43e34923 in __GI___libc_free (mem=<optimized out>) at malloc.c:3131
        ar_ptr = <optimized out>
        p = <optimized out>
        hook = <optimized out>
#1  0x00005556d3ff36d3 in main (argc=<optimized out>, argv=<optimized out>) at ldap/servers/slapd/tools/dbscan.c:1358
        env = 0x5556d60762a0
        db = 0x5556d60781d0
        cursor = 0x5556d6078d80
        filename = 0x7ffed256b6a4 "/var/lib/dirsrv/slapd-alps9/db/userroot/nsuniqueid.db"
        key = {data = 0x7ffed256b6dd, size = 36, ulen = 0, dlen = 0, doff = 0, app_data = 0x0, flags = 128}
        data = {data = 0x5556d60797d0, size = 4, ulen = 0, dlen = 0, doff = 0, app_data = 0x0, flags = 128}
        ret = 1
        find_key = <optimized out>
        entry_id = 4294967295
        c = <optimized out>
(gdb)
#

Comment 6 Pierre Rogier 2021-08-12 17:24:11 UTC
I looked at the dbscan code in the 1.4.3 branch and my suspicion was right:
   we should clear the key (to avoid freeing it) while handling the user-specified key error case.

  if (ret != 0) {
                printf("Can't find key '%s'\n", find_key);
                ret = 1;
+               key->data = NULL;
                goto done;
  }
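
For illustration, the effect of the added line can be sketched outside dbscan. This is a minimal, self-contained sketch under stated assumptions, not the dbscan.c code: sketch_dbt, sketch_get and sketch_lookup are hypothetical names, and the database lookup is faked. In the backtrace above, key.data (0x7ffed256b6dd) points into the argv strings right after filename, so the key buffer is not heap memory. The sketch keeps the key as a local struct and therefore writes key.data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the Berkeley DB DBT struct (illustration only). */
typedef struct {
    void *data;
    size_t size;
} sketch_dbt;

/* Fake lookup: pretends the database contains only the key "known". */
static int sketch_get(const sketch_dbt *key)
{
    return (key->size == 5 && memcmp(key->data, "known", 5) == 0) ? 0 : -1;
}

/* Returns 0 if find_key is found, 1 otherwise. */
int sketch_lookup(char *find_key)
{
    sketch_dbt key = {0};
    int ret;

    assert(find_key != NULL);
    key.data = find_key;            /* borrowed pointer, e.g. into argv */
    key.size = strlen(find_key);

    ret = sketch_get(&key);
    if (ret != 0) {
        ret = 1;
        key.data = NULL;            /* the reset: cleanup must not free it */
        goto done;
    }
    key.data = NULL;                /* found: the sketch simply drops the borrowed key */

done:
    free(key.data);                 /* free(NULL) is a no-op */
    return ret;
}
```

With the reset in place, the cleanup at done only ever sees NULL for a borrowed key. Without it, free() would receive a pointer that malloc never returned, which is undefined behavior and matches the crash in __libc_free.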

Note: The bug was fixed upstream a few months ago while preparing the lmdb migration,
 because the bdb data structs are no longer used directly but are encapsulated by dbimpl API structs
 (which track whether the data must be freed or not).
Since dbimpl is a very large change, we should not try to backport it but rather use the above solution instead.
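
The idea of a wrapper that knows whether its data must be freed can be sketched as follows. This is only an illustration of the ownership-flag technique, not the real dbimpl API: owned_buf and its three functions are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper (NOT the dbimpl struct): the owned flag records
 * whether data was allocated by the wrapper and must be freed. */
typedef struct {
    void *data;
    size_t size;
    int owned;
} owned_buf;

/* Frees data only if the wrapper owns it, then clears the wrapper. */
void owned_buf_release(owned_buf *b)
{
    assert(b != NULL);
    if (b->owned) {
        free(b->data);
    }
    b->data = NULL;
    b->size = 0;
    b->owned = 0;
}

/* Points the wrapper at caller-owned memory (stack, argv, ...). */
void owned_buf_borrow(owned_buf *b, void *data, size_t size)
{
    owned_buf_release(b);
    b->data = data;
    b->size = size;
    b->owned = 0;
}

/* Stores a heap copy of data; returns 0 on success, -1 if malloc fails. */
int owned_buf_copy(owned_buf *b, const void *data, size_t size)
{
    void *copy = malloc(size ? size : 1);
    if (copy == NULL) {
        return -1;
    }
    if (size != 0) {
        memcpy(copy, data, size);
    }
    owned_buf_release(b);
    b->data = copy;
    b->size = size;
    b->owned = 1;
    return 0;
}
```

With such a wrapper, the cleanup path can call owned_buf_release unconditionally, and a key borrowed from the command line is never passed to free().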

The good news is that the crash occurs during the final cleanup just before exiting, so all data is displayed correctly.
The bad news is that the key is missing from the db, which is not expected according to the initial issue error message ...

And to answer the last customer question:
 The fact that ns-slapd is running is OK and should not be an issue.

Comment 11 errata-xmlrpc 2022-12-06 15:44:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (redhat-ds:12 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8836