RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1783190 - [abrt] [faf] sssd: raise(): /usr/libexec/sssd/sssd_autofs killed by 6
Summary: [abrt] [faf] sssd: raise(): /usr/libexec/sssd/sssd_autofs killed by 6
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: sssd
Version: 8.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: 8.2
Assignee: Alexey Tikhonov
QA Contact: sssd-qe
URL: http://faf.lab.eng.brq.redhat.com/faf...
Whiteboard: sync-to-jira
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-12-13 09:20 UTC by Steeve Goveas
Modified: 2023-04-20 15:24 UTC (History)
8 users (show)

Fixed In Version: sssd-2.2.3-9.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 16:56:29 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github SSSD sssd issues 5096 0 None closed util/sss_ptr_hash.c: potential double free in `sss_ptr_hash_delete_cb()` 2020-08-17 17:38:25 UTC
Red Hat Issue Tracker RHELPLAN-32144 0 None None None 2023-04-20 15:24:33 UTC
Red Hat Issue Tracker SSSD-1828 0 None None None 2023-04-20 15:24:58 UTC
Red Hat Product Errata RHBA-2020:1863 0 None None None 2020-04-28 16:56:40 UTC

Description Steeve Goveas 2019-12-13 09:20:56 UTC
This bug has been created based on an anonymous crash report requested by the package maintainer.

Report URL: http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/fa78a2abc202ce1ff059042775915deb61d9e069/

time:           Thu 12 Dec 2019 07:53:30 AM EST
package:        sssd-common-2.2.3-3.el8
reason:         sssd_autofs killed by SIGABRT
crash_function: talloc_abort
cmdline:        /usr/libexec/sssd/sssd_autofs --uid 0 --gid 0 --logger=files
executable:     /usr/libexec/sssd/sssd_autofs
component:      sssd
uid:            0
username:       root
hostname:       ci-vm-10-0-154-155.hosted.upshift.rdu2.redhat.com
os_release:     Red Hat Enterprise Linux release 8.2 Beta (Ootpa)
architecture:   x86_64
pwd:            /
kernel:         4.18.0-161.el8.x86_64
abrt_version:   2.10.9

Reports:
uReport: BTHASH=fa78a2abc202ce1ff059042775915deb61d9e069
ABRT Server: URL=http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/fa78a2abc202ce1ff059042775915deb61d9e069
ABRT Server: URL=http://faf.lab.eng.brq.redhat.com/faf/reports/14853/
CI Job: https://idm-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/sgoveas-dev-pipeline/job/automount82/6/

Full Backtrace:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/libexec/sssd/sssd_autofs-2.2.3-3.el8.x86_64.debug
[New LWP 29128]
Error while reading shared library symbols for /usr/lib64/sssd/libsss_util.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/sssd/libsss_util.so-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libbasicobjects.so.0:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libbasicobjects.so.0.1.0-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libref_array.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libref_array.so.1.2.1-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libcollection.so.4:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libcollection.so.4.1.1-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /usr/lib64/sssd/libsss_cert.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/sssd/libsss_cert.so-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /usr/lib64/sssd/libsss_crypt.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/sssd/libsss_crypt.so-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /usr/lib64/sssd/libsss_child.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/sssd/libsss_child.so-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libsss_certmap.so.0:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libsss_certmap.so.0.1.0-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /usr/lib64/sssd/libsss_debug.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/sssd/libsss_debug.so-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libsystemd.so.0:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libsystemd.so.0.23.0-239-21.el8.x86_64.debug
Error while reading shared library symbols for /usr/lib64/sssd/libsss_iface.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/sssd/libsss_iface.so-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /usr/lib64/sssd/libsss_sbus.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/sssd/libsss_sbus.so-2.2.3-3.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libdhash.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libdhash.so.1.1.0-0.6.1-39.el8.x86_64.debug
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Error while reading shared library symbols for /lib64/libpath_utils.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libpath_utils.so.1.0.1-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/liblz4.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/liblz4.so.1.8.1-1.8.1.2-4.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libmount.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libmount.so.1.1.0-2.32.1-17.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libgcc_s.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/lib64/libgcc_s-8-20191121.so.1-8.3.1-5.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libblkid.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libblkid.so.1.1.0-2.32.1-17.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libuuid.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libuuid.so.1.3.0-2.32.1-17.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libsasl2.so.3:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libsasl2.so.3.0.0-2.1.27-1.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libk5crypto.so.3:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libk5crypto.so.3.1-1.17-17.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libcom_err.so.2:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libcom_err.so.2.1-1.44.6-3.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libkrb5support.so.0:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libkrb5support.so.0.1-1.17-17.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libkeyutils.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libkeyutils.so.1.6-1.5.10-6.el8.x86_64.debug
Error while reading shared library symbols for /usr/lib64/ldb/modules/ldb/memberof.so:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/ldb/modules/ldb/memberof.so-2.2.3-3.el8.x86_64.debug
Core was generated by `/usr/libexec/sssd/sssd_autofs --uid 0 --gid 0 --logger=files'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

Thread 1 (Thread 0x7f8ac8c50b40 (LWP 29128)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
       set = {__val = {6786, 94046636117600, 140234044167343, 94046636110576, 2333820375184795732, 3689919478405018161, 94042607072058, 674131205115182592, 94046636078016, 94046636109776, 140234044167343, 2162335232, 1, 304394, 94046635903744, 140234005728470}}
       pid = <optimized out>
       tid = <optimized out>
       ret = <optimized out>
#1  0x00007f8ac4aa9b25 in __GI_abort () at abort.c:79
       save_stage = 1
       act = {__sigaction_handler = {sa_handler = 0xf0571c20, sa_sigaction = 0xf0571c20}, sa_mask = {__val = {94046635761680, 94046635761778, 5, 0, 94046636102928, 94046636102832, 0, 0, 0, 0, 674131205115182592, 1576155200, 40, 94046636102256, 140233991290053, 0}}, sa_flags = -262802240, sa_restorer = 0x5588f055f4c0}
       sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007f8ac53b9de0 in talloc_abort (reason=0x7f8ac53c7a88 "Bad talloc magic value - unknown value") at ../../talloc.c:505
No locals.
#3  0x00007f8ac53b9d91 in talloc_abort_unknown_value () at ../../talloc.c:534
No locals.
#4  talloc_chunk_from_ptr (ptr=0x5588f0567350) at ../../talloc.c:534
       pp = 0x5588f0567350 ""
       tc = 0x5588f05672f0
       pp = <optimized out>
       tc = <optimized out>
#5  _talloc_free (ptr=0x5588f0567350, location=0x7f8ac86354af "src/util/sss_ptr_hash.c:163") at ../../talloc.c:1767
       tc = <optimized out>
#6  0x00007f8ac861a5ea in ?? () from /usr/lib64/sssd/libsss_util.so
No symbol table info available.
#7  0x0000000000000000 in ?? ()
No symbol table info available.
From                To                  Syms Read   Shared Object Library
0x00007f8ac8848090  0x00007f8ac8848e89  Yes         /lib64/libdl.so.2


Crash is seen in ldap_provider/automount test

2019-12-12T12:51:29 [ci-vm-10-0-154-155.h] ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

2019-12-12T12:51:29 [ci-vm-10-0-154-155.h] ::   Set entry cache autofs timeout value

2019-12-12T12:51:29 [ci-vm-10-0-154-155.h] ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

2019-12-12T12:51:40 [ci-vm-10-0-153-171.h] *** Current Time: Thu Dec 12 07:51:40 2019 Localwatchdog at: Fri Dec 13 07:47:39 2019

2019-12-12T12:51:49 [ci-vm-10-0-154-155.h] Redirecting to /bin/systemctl restart autofs.service

2019-12-12T12:51:50 [ci-vm-10-0-154-155.h] :: [ 07:51:50 ] :: [  BEGIN   ] :: Running 'cd /share2/key1;cd -'

2019-12-12T12:51:51 [ci-vm-10-0-154-155.h] /mnt/tests/gitlab.cee.redhat.com/sgoveas/sssd-qe/-/archive/automount82/sssd-qe-automount82.tar.gz/client/ldap_provider/automount

2019-12-12T12:51:51 [ci-vm-10-0-154-155.h] :: [ 07:51:50 ] :: [   PASS   ] :: Command 'cd /share2/key1;cd -' (Expected 0, got 0)

2019-12-12T12:51:51 [ci-vm-10-0-154-155.h] :: [ 07:51:50 ] :: [  BEGIN   ] :: Running 'mount | grep /share2/key1 | grep ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1'

2019-12-12T12:51:51 [ci-vm-10-0-154-155.h] ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1 on /share2/key1 type nfs4 (rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.154.155,local_lock=none,addr=10.0.153.171)

2019-12-12T12:51:51 [ci-vm-10-0-154-155.h] :: [ 07:51:50 ] :: [   PASS   ] :: Command 'mount | grep /share2/key1 | grep ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1' (Expected 0, got 0)

2019-12-12T12:51:51 [ci-vm-10-0-154-155.h] :: [ 07:51:50 ] :: [   PASS   ] :: File '/var/log/sssd/sssd_LDAP.log' should contain 'Searching for automount map entries with base \[ou=mount,dc=example,dc=com\]' 

2019-12-12T12:51:51 [ci-vm-10-0-154-155.h] :: [ 07:51:50 ] :: [   LOG    ] :: Sleeping for 12 seconds

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] :: [ 07:52:02 ] :: [  BEGIN   ] :: Running 'cd /share2/key1;cd -'

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] /mnt/tests/gitlab.cee.redhat.com/sgoveas/sssd-qe/-/archive/automount82/sssd-qe-automount82.tar.gz/client/ldap_provider/automount

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] :: [ 07:52:02 ] :: [   PASS   ] :: Command 'cd /share2/key1;cd -' (Expected 0, got 0)

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] :: [ 07:52:02 ] :: [  BEGIN   ] :: Running 'mount | grep /share2/key1 | grep ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1'

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1 on /share2/key1 type nfs4 (rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.154.155,local_lock=none,addr=10.0.153.171)

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] :: [ 07:52:02 ] :: [   PASS   ] :: Command 'mount | grep /share2/key1 | grep ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1' (Expected 0, got 0)

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] :: [ 07:52:02 ] :: [   PASS   ] :: File '/var/log/sssd/sssd_LDAP.log' should not contain 'Searching for automount map entries with base \[ou=mount,dc=example,dc=com\]' 

2019-12-12T12:52:03 [ci-vm-10-0-154-155.h] :: [ 07:52:02 ] :: [   LOG    ] :: Sleeping for 110 seconds

2019-12-12T12:52:15 [ci-vm-10-0-154-155.h] *** Current Time: Thu Dec 12 07:52:15 2019 Localwatchdog at: Fri Dec 13 07:48:15 2019

2019-12-12T12:52:40 [ci-vm-10-0-153-171.h] *** Current Time: Thu Dec 12 07:52:40 2019 Localwatchdog at: Fri Dec 13 07:47:39 2019

2019-12-12T12:53:15 [ci-vm-10-0-154-155.h] *** Current Time: Thu Dec 12 07:53:15 2019 Localwatchdog at: Fri Dec 13 07:48:15 2019

2019-12-12T12:53:40 [ci-vm-10-0-153-171.h] *** Current Time: Thu Dec 12 07:53:40 2019 Localwatchdog at: Fri Dec 13 07:47:39 2019

2019-12-12T12:53:53 [ci-vm-10-0-154-155.h] :: [ 07:53:52 ] :: [  BEGIN   ] :: Running 'cd /share2/key1;cd -'

2019-12-12T12:53:53 [ci-vm-10-0-154-155.h] /mnt/tests/gitlab.cee.redhat.com/sgoveas/sssd-qe/-/archive/automount82/sssd-qe-automount82.tar.gz/client/ldap_provider/automount

2019-12-12T12:53:53 [ci-vm-10-0-154-155.h] :: [ 07:53:53 ] :: [   PASS   ] :: Command 'cd /share2/key1;cd -' (Expected 0, got 0)

2019-12-12T12:53:53 [ci-vm-10-0-154-155.h] :: [ 07:53:53 ] :: [  BEGIN   ] :: Running 'mount | grep /share2/key1 | grep ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1'

2019-12-12T12:53:53 [ci-vm-10-0-154-155.h] ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1 on /share2/key1 type nfs4 (rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.154.155,local_lock=none,addr=10.0.153.171)

2019-12-12T12:53:53 [ci-vm-10-0-154-155.h] :: [ 07:53:53 ] :: [   PASS   ] :: Command 'mount | grep /share2/key1 | grep ci-vm-10-0-153-171.hosted.upshift.rdu2.redhat.com:/export/shared2/key1' (Expected 0, got 0)

2019-12-12T12:53:54 [ci-vm-10-0-154-155.h] :: [ 07:53:53 ] :: [   FAIL   ] :: File '/var/log/sssd/sssd_LDAP.log' should contain 'Searching for automount map entries with base \[ou=mount,dc=example,dc=com\]' 

2019-12-12T12:53:54 [ci-vm-10-0-154-155.h] :: [ 07:53:53 ] :: [   LOG    ] :: sleep 86400

Comment 1 Alexey Tikhonov 2019-12-17 18:51:10 UTC
...
#3  0x00007fad87794d91 in talloc_abort_unknown_value () at ../../talloc.c:534
#4  talloc_chunk_from_ptr (ptr=0x55977819ea00) at ../../talloc.c:534
#5  _talloc_free (ptr=0x55977819ea00, location=0x7fad8aa104af "src/util/sss_ptr_hash.c:163") at ../../talloc.c:1767
#6  0x00007fad8a9f55ea in sss_ptr_hash_delete_cb (item=0x559778196f70, deltype=HASH_ENTRY_DESTROY, pvt=<optimized out>)
    at src/util/sss_ptr_hash.c:163
#7  0x00007fad87bbcd7d in hash_delete (table=table@entry=0x55977818f4c0, key=key@entry=0x7ffce9a9b9d0) at dhash/dhash.c:1077
#8  0x00007fad8a9f5d46 in sss_ptr_hash_delete (free_value=false, key=0x55977819e980 "auto.master", table=0x55977818f4c0)
    at src/util/sss_ptr_hash.c:348
#9  sss_ptr_hash_delete (table=0x55977818f4c0, key=0x55977819e980 "auto.master", free_value=<optimized out>)
    at src/util/sss_ptr_hash.c:323
#10 0x00007fad879b33b9 in tevent_common_invoke_timer_handler (te=te@entry=0x5597781989d0, current_time=..., 
    removed=removed@entry=0x0) at ../../tevent_timed.c:370
#11 0x00007fad879b355e in tevent_common_loop_timer_delay (ev=0x559778164b00) at ../../tevent_timed.c:442
#12 0x00007fad879b4a99 in epoll_event_loop (tvalp=0x7ffce9a9bab0, epoll_ev=0x559778164d90) at ../../tevent_epoll.c:667
...


(gdb) frame 7
#7  0x00007fad87bbcd7d in hash_delete (table=table@entry=0x55977818f4c0, key=key@entry=0x7ffce9a9b9d0) at dhash/dhash.c:1077
1077	        hdelete_callback(table, HASH_ENTRY_DESTROY, &element->entry);
(gdb) p *key
$3 = {type = HASH_KEY_STRING, {str = 0x55977819e980 "auto.master"...}}
(gdb) p *((struct talloc_chunk *)((char *)element->entry.value.ptr - 96))
$1 = {flags = 2014977200, next = 0x0, prev = 0x0, parent = 0x0, child = 0x0, refs = 0x0, destructor = 0x0, 
  name = 0x7fad8aa104af "src/util/sss_ptr_hash.c:163", size = 16...}

[where 96 == TC_HDR_SIZE == TC_ALIGN16(sizeof(struct talloc_chunk))]

=> libdhash tries to delete entry with key=="auto.master" which was already freed with `sss_ptr_hash_delete_cb()`
so when it get to `sss_ptr_hash_delete_cb()` again in frame #6
#6  0x00007fad8a9f55ea in sss_ptr_hash_delete_cb (item=0x559778196f70, deltype=HASH_ENTRY_DESTROY, pvt=<optimized out>)
    at src/util/sss_ptr_hash.c:163
163	    talloc_free(value);
  --  it is double free => raise


(gdb) frame 11
#11 0x00007fad879b355e in tevent_common_loop_timer_delay (ev=0x559778164b00) at ../../tevent_timed.c:442
442		ret = tevent_common_invoke_timer_handler(te, current_time, NULL);
(gdb) p te
$4 = (struct tevent_timer *) 0x5597781989d0
(gdb) p ev->timer_events 
$5 = (struct tevent_timer *) 0x55977817ee40

but https://git.samba.org/samba.git/?p=samba.git;a=blob;f=lib/tevent/tevent_timed.c#l409 :
struct tevent_timer *te = ev->timer_events;

And the moment it is not clear to me how could it happen that te != ev->timer_events

(gdb) p *te
$6 = {prev = 0x0, next = 0x0, event_ctx = 0x0, wrapper = 0x0, busy = true, destroyed = true, next_event = {tv_sec = 1575917185, 
    tv_usec = 379978}, handler = 0x559777be26e0 <autofs_enumctx_lifetime_timeout>, private_data = 0x559778195510, 
  handler_name = 0x559777bf9e38 "autofs_enumctx_lifetime_timeout", location = 0x559777bf9e08 "src/responder/autofs/autofssrv_cmd.c:167"...}
(gdb) p *((struct autofs_enum_ctx *)(te->private_data))
$10 = {result = 0x55977819fe30, found = true, ready = true, key = 0x55977819e980 "auto.master", table = 0x55977818f4c0, notify_list = 0x0}
(gdb) p *((struct talloc_chunk *)((char *)te - 96))
$7 = {flags = 6406304, next = 0x0, prev = 0x0, parent = 0x0, child = 0x0, refs = 0x0, 
  destructor = 0x7fad879b2c60 <tevent_common_timed_destructor>, name = 0x7fad879b5e33 "struct tevent_timer", size = 96, limit = 0x0, pool = 0x0}

 => this `te` was already freed and most probably executed before

What should actually be executed here is:
(gdb) p *ev->timer_events 
$8 = {prev = 0x559778198520, next = 0x559778198520, event_ctx = 0x559778164b00, wrapper = 0x0, busy = false, destroyed = false, next_event = {
    tv_sec = 1575917185, tv_usec = 406319}, handler = 0x559777be26e0 <autofs_enumctx_lifetime_timeout>, private_data = 0x559778186800, 
  handler_name = 0x559777bf9e38 "autofs_enumctx_lifetime_timeout", location = 0x559777bf9e08 "src/responder/autofs/autofssrv_cmd.c:167",...}
(gdb) p *((struct autofs_enum_ctx *)(ev->timer_events->private_data))
$9 = {result = 0x5597781a64c0, found = true, ready = true, key = 0x5597781a1530 "auto.direct", table = 0x55977818f4c0, notify_list = 0x0}


And also strange thing is:

(gdb) frame
#7  0x00007fad87bbcd7d in hash_delete (table=table@entry=0x55977818f4c0, key=key@entry=0x7ffce9a9b9d0) at dhash/dhash.c:1077
1077	        hdelete_callback(table, HASH_ENTRY_DESTROY, &element->entry);
(gdb) p *key
$61 = {type = HASH_KEY_STRING, {str = 0x55977819e980 "auto.master", c_str = 0x55977819e980 "auto.master", ul = 94109043386752}}
(gdb) p *table
$53 = {p = 0, maxp = 4, entry_count = 1, bucket_count = 4, segment_count = 1, min_load_factor = 1, max_load_factor = 5, directory_size = 4, 
  directory_size_shift = 2, segment_size = 4, segment_size_shift = 2, delete_callback = 0x7fad8a9f5550 <sss_ptr_hash_delete_cb>..., statistics = {hash_accesses = 12, hash_collisions = 0, table_expansions = 0, table_contractions = 0}}

  -- entry_count = 1

And this single entry is:
(gdb) p *table->directory[0][1]
$49 = {entry = {key = {type = HASH_KEY_STRING, {str = 0x5597781a0d00 "auto.direct"...}}, 
       value = {type = HASH_VALUE_PTR, {ptr = 0x5597781a14b0...}}}, next = 0x0}

how is it possible
(gdb) p *element
$56 = {entry = {key = {type = HASH_KEY_STRING, {str = 0x55977818bf70 "auto.master"...}}, 
       value = {type = HASH_VALUE_PTR, {ptr = 0x55977819ea00...}}}, next = 0x0}



So at this moment I have two questions:
(1) how does it happen libtevent tries to execute already destroyed timer event instead of event contained in tevent_context?
(2) even so, how does libdhash find entry with key=="auto.master" if it was already removed from the hash table?


Steeve, do you have a reproducer?

Comment 2 Alexey Tikhonov 2019-12-18 16:01:22 UTC
(In reply to Alexey Tikhonov from comment #1)
> (gdb) p *te
> $6 = {prev = 0x0, next = 0x0, event_ctx = 0x0, wrapper = 0x0, busy = true,
> destroyed = true, next_event = {tv_sec = 1575917185, tv_usec = 379978}, handler = 0x559777be26e0
> <autofs_enumctx_lifetime_timeout>, private_data = 0x559778195510, 
> handler_name = 0x559777bf9e38 "autofs_enumctx_lifetime_timeout", location
> = 0x559777bf9e08 "src/responder/autofs/autofssrv_cmd.c:167"...}

> (gdb) p *((struct talloc_chunk *)((char *)te - 96))
> $7 = {flags = 6406304, next = 0x0, prev = 0x0, parent = 0x0, child = 0x0,
> refs = 0x0, destructor = 0x7fad879b2c60 <tevent_common_timed_destructor>, name =
> 0x7fad879b5e33 "struct tevent_timer", size = 96, limit = 0x0, pool = 0x0}
> 
>  => this `te` was already freed and most probably executed before

Actually event was destroyed (destroyed=true) but not yet freed (name in talloc_chunk header still has type name) - d-tor failed due to busy=true.

Comment 5 Alexey Tikhonov 2020-01-06 18:39:39 UTC
(In reply to Alexey Tikhonov from comment #1)

> And the moment it is not clear to me how could it happen that te !=
> ev->timer_events

This is actually clear:
`tevent_common_invoke_timer_handler()` calls `DLIST_REMOVE(te->event_ctx->timer_events, te)`
so `te` that is being processed was removed from queue (located in the heap) already.

> (1) how does it happen libtevent tries to execute already destroyed timer
> event instead of event contained in tevent_context?

So this statement is just wrong.

Comment 6 Alexey Tikhonov 2020-01-06 20:01:17 UTC
(In reply to Alexey Tikhonov from comment #1)

> (2) even so, how does libdhash find entry with key=="auto.master" if it was
> already removed from the hash table?

This is something I still do not understand:

#7  0x00007fad87bbcd7d in hash_delete (table=table@entry=0x55977818f4c0, key=key@entry=0x7ffce9a9b9d0) at dhash/dhash.c:1077
1077	        hdelete_callback(table, HASH_ENTRY_DESTROY, &element->entry);

sss_ptr_hash_delete_cb(element->entry) -> 
     autofs_maps_delete_cb() ->
          talloc_unlink(autofs_ctx->maps, enum_ctx = element->entry->value.ptr->ptr)  --  enum_ctx doesn't have talloc d-tor
     _talloc_free(element->entry->value.ptr) -> ... abort
--  this execution chain looks like it doesn't touch `element` and `element->entry.key.str` but:

(gdb) p element->entry.key.str 
$78 = 0x55977818bf70 "auto.master"
(gdb) p *((struct talloc_chunk *)((char *)element->entry.key.str - 96))
$79 = {flags = 2014964112, next = 0x0, prev = 0x0, parent = 0x0, child = 0x0, refs = 0x0, destructor = 0x0, name = 0x7fad8aa0a7a0 "src/util/util.c:379", size = 12, limit = 0x0, pool = 0x0}

and

(gdb) p *((struct talloc_chunk *)((char *)element - 96))
$75 = {flags = 3927370867, next = 0x0, prev = 0x0, parent = 0x0, child = 0x0, refs = 0x0, destructor = 0x0, name = 0x7fad8aa0a7a0 "src/util/util.c:379", size = 40, limit = 0x0, pool = 0x0}

https://pagure.io/SSSD/sssd/blob/master/f/src/util/util.c#_379 : `talloc_free(ptr)`  --  those elements are freed in the heap.
And hash `table` contained in the heap on the momemnt of crash doesn't contain entry with "auto.master" as well.

So looks like element/element->entry.key/element->entry.value are deleted and removed from the table after `hash_delete()->lookup()` found it in hash table but before `hdelete_callback()` returned...

Comment 7 Alexey Tikhonov 2020-01-07 12:27:43 UTC
(In reply to Alexey Tikhonov from comment #6)
> So looks like element/element->entry.key/element->entry.value are deleted
> and removed from the table after `hash_delete()->lookup()` found it in hash
> table but before `hdelete_callback()` returned...

Exactly.

Thanks to Pavel's hint that `sss_ptr_hash_spy` is linked to memory context of actually stored data, I figured out what is happening:

hash_delete() -> hdelete_callback [sss_ptr_hash_delete_cb] -> data->callback [autofs_maps_delete_cb] -> talloc_unlink(autofs_ctx->maps, enum_ctx) -> spy is freed -> sss_ptr_hash_spy_destructor() -> sss_ptr_hash_delete() -> hash_delete() called recursively and frees element, its key and value -> stack unwinds back to sss_ptr_hash_delete_cb::talloc_free(value)  --  double free

I guess this is quite old bug in `sss_ptr_hash` that was only revealed by recent autofs changes.

It is also present in sssd-1-16 so I think this bz is worth cloning for RHEL7.

Comment 9 Alexey Tikhonov 2020-01-11 15:25:30 UTC
(In reply to Alexey Tikhonov from comment #7)
> I guess this is quite old bug in `sss_ptr_hash` that was only revealed by
> recent autofs changes.

No. Actually bug was intoduced recently in f95db37aa8486304d0569d12a876b1c74ee1b0d1 and affects only 2.x branch.

> It is also present in sssd-1-16 so I think this bz is worth cloning for
> RHEL7.

So it doesn't affect RHEL7.

Comment 12 Pavel Březina 2020-01-14 10:08:02 UTC
* `master`
    * 26e33b1984cce3549df170f58f8221201ad54cfd - util/sss_ptr_hash: fixed double free in sss_ptr_hash_delete_cb()

Comment 16 errata-xmlrpc 2020-04-28 16:56:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1863


Note You need to log in before you can comment on or make changes to this bug.