1538994 – [abrt] [faf] 389-ds-base: tc_calloc(): /usr/sbin/ns-slapd killed by 11

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1538994 - [abrt] [faf] 389-ds-base: tc_calloc(): /usr/sbin/ns-slapd killed by 11

Summary: [abrt] [faf] 389-ds-base: tc_calloc(): /usr/sbin/ns-slapd killed by 11

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	389-ds-base
Sub Component:
Version:	8.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	pre-dev-freeze
Target Release:	8.2
Assignee:	mreynolds
QA Contact:	RHDS QE
Docs Contact:
URL:	http://faf.lab.eng.brq.redhat.com/faf...
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2018-01-26 10:45 UTC by Viktor Ashirov
Modified:	2019-11-15 00:30 UTC (History)
CC List:	8 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-11-15 00:30:10 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)

Description Viktor Ashirov 2018-01-26 10:45:08 UTC

This bug has been created based on an anonymous crash report requested by the package maintainer.

Report URL: http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/5b7ed1bc20ce308e93fe9ab96e4d90bca09f6832/

Comment 1 thierry bordaz 2018-01-26 10:51:56 UTC

backtrace is 

(gdb) where
#0  0x00007f84f621d0df in tc_calloc () at /lib64/libtcmalloc.so.4
#1  0x00007f84f73cafe4 in slapi_ch_calloc (nelem=1, size=232) at ldap/servers/slapd/ch_malloc.c:180
#2  0x00007f84f76baece in new_ns_job (tp=tp@entry=0x5559930c8cc0, fd=fd@entry=0x555994eed980, job_type=job_type@entry=276, func=func@entry=0x555991a5bfc0 <ns_handle_pr_read_ready>, data=data@entry=0x555993b9b290) at src/nunc-stans/ns/ns_thrpool.c:695
#3  0x00007f84f76bb1c8 in alloc_io_context (data=0x555993b9b290, func=0x555991a5bfc0 <ns_handle_pr_read_ready>, job_type=276, fd=0x555994eed980, tp=0x5559930c8cc0) at src/nunc-stans/ns/ns_thrpool.c:726
#4  0x00007f84f76bb1c8 in ns_add_io_timeout_job (tp=0x5559930c8cc0, fd=0x555994eed980, tv=tv@entry=0x7f84de543b80, job_type=job_type@entry=260, func=func@entry=0x555991a5bfc0 <ns_handle_pr_read_ready>, data=data@entry=0x555993b9b290, job=job@entry=0x555993b9b3f0)
    at src/nunc-stans/ns/ns_thrpool.c:954
#5  0x0000555991a5bdf6 in ns_connection_post_io_or_closing (conn=0x555993b9b290) at ldap/servers/slapd/daemon.c:1751
#6  0x0000555991a5d230 in ns_handle_new_connection (job=0x555993b49a40) at ldap/servers/slapd/daemon.c:2470
#7  0x00007f84f76bac89 in work_job_execute (job=0x555993b49a40) at src/nunc-stans/ns/ns_thrpool.c:291
#8  0x00007f84f76bbbe3 in event_cb (fd=<optimized out>, event=<optimized out>, arg=<optimized out>)
    at src/nunc-stans/ns/ns_event_fw_event.c:118
#9  0x00007f84f4710a14 in event_base_loop () at /lib64/libevent-2.0.so.5
#10 0x00007f84f76bbeae in ns_event_fw_loop (ns_event_fw_ctx=<optimized out>) at src/nunc-stans/ns/ns_event_fw_event.c:308
#11 0x00007f84f76baac9 in event_loop_thread_func (arg=0x5559930c8cc0) at src/nunc-stans/ns/ns_thrpool.c:581
#12 0x00007f84f4b7fdd5 in start_thread () at /lib64/libpthread.so.0
#13 0x00007f84f422daed in clone () at /lib64/libc.so.6


Something specific to this crash (compare to similar one https://bugzilla.redhat.com/show_bug.cgi?id=1517383) is that it is in the middle of ns_handle_new_connection

Retesting with ASAN would help because it looks a heap corruption

Comment 2 thierry bordaz 2018-01-26 16:55:51 UTC

According to the latest operation in the core. The crash occurred at just after 25/Jan/2018:17:02:35.488798428.
The error logs are showing others operation after the crash that are likely due to ipa installer that was continuing

Comment 3 thierry bordaz 2018-01-26 16:59:03 UTC

Last operations just before the crash were:
...
[25/Jan/2018:17:02:35.153276363 -0500] conn=63 fd=70 slot=70 connection from 10.16.46.53 to 10.16.46.53
[25/Jan/2018:17:02:35.161447480 -0500] conn=63 op=0 BIND dn="cn=Directory Manager" method=128 version=2
[25/Jan/2018:17:02:35.161534444 -0500] conn=63 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0008006553 dn="cn=directory manager"
[25/Jan/2018:17:02:35.163161348 -0500] conn=64 fd=74 slot=74 connection from 10.16.46.53 to 10.16.46.53
[25/Jan/2018:17:02:35.165369347 -0500] conn=64 op=0 BIND dn="cn=Directory Manager" method=128 version=3
[25/Jan/2018:17:02:35.165450276 -0500] conn=64 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0000249121 dn="cn=directory manager"
[25/Jan/2018:17:02:35.166722817 -0500] conn=64 op=1 SRCH base="ou=certificateProfiles,ou=ca,o=ipaca" scope=0 filter="(objectClass=*)" attrs=ALL
[25/Jan/2018:17:02:35.166862624 -0500] conn=64 op=1 RESULT err=0 tag=101 nentries=1 etime=0.0000212450
[25/Jan/2018:17:02:35.167547072 -0500] conn=64 op=2 SRCH base="ou=certificateProfiles,ou=ca,o=ipaca" scope=2 filter="(objectClass=*)" attrs="* entryusn nsUniqueId numSubordinates aci" options=persistent
[25/Jan/2018:17:02:35.478741848 -0500] conn=63 op=1 UNBIND
[25/Jan/2018:17:02:35.478774334 -0500] conn=63 op=1 fd=70 closed - U1
[25/Jan/2018:17:02:35.479422386 -0500] conn=65 fd=70 slot=70 connection from 10.16.46.53 to 10.16.46.53
[25/Jan/2018:17:02:35.479804147 -0500] conn=65 op=0 BIND dn="cn=Directory Manager" method=128 version=2
[25/Jan/2018:17:02:35.479893600 -0500] conn=65 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0000226436 dn="cn=directory manager"
[25/Jan/2018:17:02:35.480454340 -0500] conn=66 fd=75 slot=75 connection from 10.16.46.53 to 10.16.46.53
[25/Jan/2018:17:02:35.480862636 -0500] conn=66 op=0 BIND dn="cn=Directory Manager" method=128 version=3
[25/Jan/2018:17:02:35.480917975 -0500] conn=66 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0000236591 dn="cn=directory manager"
[25/Jan/2018:17:02:35.487832755 -0500] conn=67 fd=76 slot=76 connection from 10.16.46.53 to 10.16.46.53
[25/Jan/2018:17:02:35.488300505 -0500] conn=68 fd=77 slot=77 connection from 10.16.46.53 to 10.16.46.53
[25/Jan/2018:17:02:35.488798428 -0500] conn=69 fd=78 slot=78 connection from 10.16.46.53 to 10.16.46.53

Comment 17 Lukas Slebodnik 2018-02-19 15:26:39 UTC

Another crash:
(gdb) bt
#0  0x00007ff839a6b29b in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int) () at /lib64/libtcmalloc.so.4
#1  0x00007ff839a6b68c in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned int) () at /lib64/libtcmalloc.so.4
#2  0x00007ff83ac283a1 in slapi_ch_free (ptr=ptr@entry=0x7ffeaaf3ad08) at ldap/servers/slapd/ch_malloc.c:270
#3  0x0000555b65a7ad63 in nunc_stans_free (ptr=0x555b73093860) at ldap/servers/slapd/main.c:176
#4  0x00007ff83af17bff in ns_job_done (job=0x555b73093860) at src/nunc-stans/ns/ns_thrpool.c:802
#5  0x0000555b65a74cd7 in ns_connection_post_io_or_closing (conn=conn@entry=0x555b7186bd28) at ldap/servers/slapd/daemon.c:1693
#6  0x0000555b65a70b04 in disconnect_server_nomutex_ext (conn=conn@entry=0x555b7186bd28, opconnid=<optimized out>, opid=opid@entry=-1, reason=reason@entry=-5000, error=error@entry=125, schedule_closure_job=schedule_closure_job@entry=1) at ldap/servers/slapd/connection.c:2301
#7  0x0000555b65a713cb in disconnect_server_nomutex (conn=conn@entry=0x555b7186bd28, opconnid=<optimized out>, opid=opid@entry=-1, reason=reason@entry=-5000, error=error@entry=125)
    at ldap/servers/slapd/connection.c:2314
#8  0x0000555b65a727a4 in connection_table_disconnect_all (ct=0x555b6c720e00) at ldap/servers/slapd/conntable.c:101
#9  0x0000555b65a76cfe in slapd_daemon (ports=ports@entry=0x7ffeaaf3b0f0, tp=tp@entry=0x555b67526300) at ldap/servers/slapd/daemon.c:1190
#10 0x0000555b65a6880f in main (argc=<optimized out>, argv=0x7ffeaaf3b568) at ldap/servers/slapd/main.c:1200

Comment 20 Viktor Ashirov 2018-02-20 16:37:23 UTC

Looking at the dates of reports and versions of 389-ds-base, it looks like the problem started to appear in build 389-ds-base-1.3.7.5-13.el7 and higher. It might be another regression introduced by the coverity fixes, that this version contains.

Comment 21 Viktor Ashirov 2018-02-20 16:41:43 UTC

Further looking at the code, it seems that this assertion was added in 1418fc327245a9b214f107ae3ba2e192b48d16d3, upstream ticket: https://pagure.io/389-ds-base/issue/48184, bz1517383. Thierry already pointed out that it might be related.

Comment 25 Lukas Slebodnik 2018-02-28 11:34:30 UTC

Another crash:
(gdb) bt full
#0  0x00007fee88ef929b in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int) () at /lib64/libtcmalloc.so.4
#1  0x00007fee88ef968c in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned int) () at /lib64/libtcmalloc.so.4
#2  0x00007fee8a0b63a1 in slapi_ch_free (ptr=ptr@entry=0x7ffcbdc2be18) at ldap/servers/slapd/ch_malloc.c:270
#3  0x000055aa9c13dd63 in nunc_stans_free (ptr=0x55aaaaa87590) at ldap/servers/slapd/main.c:176
#4  0x00007fee8a3a5bff in ns_job_done (job=0x55aaaaa87590) at src/nunc-stans/ns/ns_thrpool.c:802
        shutdown_state = 1
#5  0x000055aa9c137cd7 in ns_connection_post_io_or_closing (conn=conn@entry=0x55aaa8572480) at ldap/servers/slapd/daemon.c:1693
        tv = {tv_sec = 94191457084544, tv_usec = 112}
#6  0x000055aa9c133b04 in disconnect_server_nomutex_ext (conn=conn@entry=0x55aaa8572480, opconnid=<optimized out>, opid=opid@entry=-1, reason=reason@entry=-5000, error=error@entry=125, schedule_closure_job=schedule_closure_job@entry=1) at ldap/servers/slapd/connection.c:2301
#7  0x000055aa9c1343cb in disconnect_server_nomutex (conn=conn@entry=0x55aaa8572480, opconnid=<optimized out>, opid=opid@entry=-1, reason=reason@entry=-5000, error=error@entry=125)
    at ldap/servers/slapd/connection.c:2314
#8  0x000055aa9c1357a4 in connection_table_disconnect_all (ct=0x55aaa7820e00) at ldap/servers/slapd/conntable.c:101
        c = 0x55aaa8572480
        i = 112
#9  0x000055aa9c139cfe in slapd_daemon (ports=ports@entry=0x7ffcbdc2c200, tp=tp@entry=0x55aa9dbf0300) at ldap/servers/slapd/daemon.c:1190
        n_tcps = 0x0
        s_tcps = 0x0
        i_unix = 0x0
        fdesp = <optimized out>
        num_poll = 3
        pr_timeout = 250
        threads = <optimized out>
        in_referral_mode = 0
#10 0x000055aa9c12b80f in main (argc=<optimized out>, argv=0x7ffcbdc2c678) at ldap/servers/slapd/main.c:1200
        return_value = <optimized out>
        mcfg = 
          {extraname = 0x0, slapd_exemode = 1, n_port = 389, i_port = 1, s_port = 636, myname = 0x55aa9da62060 "ns-slapd", ldif_file = 0x0, ldif_files = 0, cmd_line_instance_name = 0x0, cmd_line_instance_names = 0x0, skip_db_protect_check = 0, db2ldif_include = 0x0, db2ldif_exclude = 0x0, ldif2db_removedupvals = 1, ldif2db_noattrindexes = 0, db2index_attrs = 0x0, ldif_printkey = 5, archive_name = 0x0, db2ldif_dump_replica = 0, db2ldif_dump_uniqueid = 1, ldif2db_generate_uniqueid = 1, ldif2db_namespaceid = 0x0, importexport_encrypt = 0, upgradedb_flags = 0, upgradednformat_dryrun = 0, is_quiet = 0, dbverify_verbose = 0, dbverify_dbdir = 0x0}
        slapdFrontendConfig = <optimized out>
        ports_info = 
          {n_port = 389, s_port = 636, n_listenaddr = 0x0, s_listenaddr = 0x0, n_socket = 0x55aa9dc300d0, i_listenaddr = 0x0, i_port = 1, i_socket = 0x55aa9de5bfe0, s_socket = 0x55aa9dc300b0}
        tp = 0x55aa9dbf0300
        m = <optimized out>
        notify = <optimized out>
(gdb)

Comment 41 mreynolds 2019-11-15 00:30:10 UTC

Since this crash occurs in nunc-stans, and nunc-stans has been abandoned, I am going to close this bug as won't fix.

Note You need to log in before you can comment on or make changes to this bug.