Bug 2097014 - SSSD -> sssd_be and sssd_ifp coredump [rhel-7.9.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sssd
Version: 7.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Alexey Tikhonov
QA Contact: Anuj Borah
URL:
Whiteboard: sync-to-jira
Depends On:
Blocks:
 
Reported: 2022-06-14 17:31 UTC by Chino Soliard
Modified: 2022-12-13 11:19 UTC
CC List: 11 users

Fixed In Version: sssd-1.16.5-10.el7_9.14
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-12-13 11:19:42 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github SSSD sssd issues 6269 0 None open [1-16 only] `responder_idle_timeout` doesn't work for `sssd_ifp` 2022-07-19 11:38:35 UTC
Github SSSD sssd issues 6270 0 None open [1-16 only] `sss_dp_callback_destructor()` can crash if invoked while responder is being shutdown 2022-07-19 11:38:35 UTC
Github SSSD sssd pull 6272 0 None open RESPONDER: don't execute sss_dp_callback_destructor() 2022-07-19 11:39:39 UTC
Github SSSD sssd pull 6273 0 None open IFP: reset idle timer on any DBUS activity 2022-07-19 11:39:39 UTC
Github SSSD sssd pull 6277 0 None open RESPONDER: fix responder_idle_handler() 2022-07-19 12:39:45 UTC
Red Hat Issue Tracker RHELPLAN-125266 0 None None None 2022-06-14 17:36:32 UTC
Red Hat Issue Tracker SSSD-4891 0 None None None 2022-07-18 17:22:36 UTC
Red Hat Product Errata RHBA-2022:8950 0 None None None 2022-12-13 11:19:46 UTC

Description Chino Soliard 2022-06-14 17:31:06 UTC
--------------------------------------------------------------------------------
Description of problem:
--------------------------------------------------------------------------------

Looks like SSSD is working fine, but coredumps are being generated frequently.
The issue is happening on four different servers: two QA and two production.
AD login is not used directly. These particular systems are Apache servers
configured to use SSSD to check the user's Windows Kerberos ticket and their
AD groups.

User lookups are tested by running dbus-send, e.g.,

    dbus-send --print-reply --system --dest=org.freedesktop.sssd.infopipe /org/freedesktop/sssd/infopipe org.freedesktop.sssd.infopipe.GetUserAttr string:hostname@domain array:string:gecos,name,firstname,lastname

or

    dbus-send --print-reply --system --dest=org.freedesktop.sssd.infopipe /org/freedesktop/sssd/infopipe org.freedesktop.sssd.infopipe.GetUserGroups string:hostname@domain
____________


When you run "file" against the dump files, you get:

    core.11154: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/usr/libexec/sssd/sssd_ifp --uid 0 --gid 0 --dbus-activated --logger=files', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/usr/libexec/sssd/sssd_ifp', platform: 'x86_64'

or

    core.994:   ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/usr/libexec/sssd/sssd_be --domain hostname.domain --uid 0 --gid 0 --logger=files', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/usr/libexec/sssd/sssd_be', platform: 'x86_64'

____________

sssd.conf:

    [sssd]
    domains = example.com
    config_file_version = 2
    services = nss, pam

    [ifp]
    allowed_uids = root, apache
    user_attributes = +firstname, +lastname

    [pam]

    [domain/example.com]
    ad_domain = example.com
    krb5_realm = EXAMPLE.COM
    ad_site = EXAMPLE
    ad_server = adserver01.example.com, adserver03.example.com
    ad_backup_server = adserver02.example.com, adserver04.example.com
    realmd_tags = manages-system joined-with-samba 
    cache_credentials = True
    id_provider = ad
    ad_enabled_domains = example.com
    subdomains_provider = none
    krb5_store_password_if_offline = True
    default_shell = /bin/bash
    ldap_id_mapping = True
    fallback_homedir = /home/%u
    access_provider = simple
    simple_allow_groups = aaa,bbb,...
    dyndns_update = false
    ldap_user_extra_attrs = firstname:givenname, lastname:sn
    filter_users = apache, ...
    filter_groups = apache, ...
____________

The following is a backtrace from a sssd_ifp coredump:

    Core was generated by `/usr/libexec/sssd/sssd_ifp --uid 0 --gid 0 --dbus-activated --logger=files'.
    Program terminated with signal 11, Segmentation fault.

    (gdb) bt full
    #0  0x000055be731809a4 in sss_dp_callback_destructor (ptr=ptr@entry=0x55be75c1dae0) at src/responder/common/responder_dp.c:61
            cb = 0x55be75c1dae0
    #1  0x00007faf5a2bf9fb in _tc_free_internal (location=0x7faf5ab1c965 "../../tevent_req.c:291", tc=0x55be75c1da80) at ../../talloc.c:1137
            d = <optimized out>
            ptr_to_free = <optimized out>
            ptr = 0x55be75c1dae0
    #2  _tc_free_children_internal (location=0x7faf5ab1c965 "../../tevent_req.c:291", ptr=0x55be75c393a0, tc=0x55be75c39340) at ../../talloc.c:1646
            child = 0x55be75c1dae0
            new_parent = <optimized out>
    #3  _tc_free_internal (tc=0x55be75c39340, location=0x7faf5ab1c965 "../../tevent_req.c:291") at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be75c393a0
    #4  0x00007faf5ab169a0 in tevent_req_received (req=req@entry=0x55be75c391f0) at ../../tevent_req.c:291
    No locals.
    #5  0x00007faf5ab169d9 in tevent_req_destructor (req=req@entry=0x55be75c391f0) at ../../tevent_req.c:128
    No locals.
    #6  0x00007faf5a2bf83b in _tc_free_internal (tc=0x55be75c39190, location=0x7faf5ab1c965 "../../tevent_req.c:291") at ../../talloc.c:1137
            d = 0x7faf5ab169d0 <tevent_req_destructor>
            ptr_to_free = <optimized out>
            ptr = 0x55be75c391f0
    #7  0x00007faf5a2bf18f in _tc_free_children_internal (location=0x7faf5ab1c965 "../../tevent_req.c:291", ptr=0x55be7523d0d0, tc=0x55be7523d070) at ../../talloc.c:1646
            child = 0x55be75c391f0
            new_parent = 0x0
    #8  _tc_free_internal (location=0x7faf5ab1c965 "../../tevent_req.c:291", tc=0x55be7523d070) at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be7523d0d0
    #9  _tc_free_children_internal (location=0x7faf5ab1c965 "../../tevent_req.c:291", ptr=0x55be7523d030, tc=0x55be7523cfd0) at ../../talloc.c:1646
            child = <optimized out>
            new_parent = <optimized out>
    #10 _tc_free_internal (tc=0x55be7523cfd0, location=0x7faf5ab1c965 "../../tevent_req.c:291") at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be7523d030
    #11 0x00007faf5ab169a0 in tevent_req_received (req=req@entry=0x55be7523ce80) at ../../tevent_req.c:291
    No locals.
    #12 0x00007faf5ab169d9 in tevent_req_destructor (req=req@entry=0x55be7523ce80) at ../../tevent_req.c:128
    No locals.
    #13 0x00007faf5a2bf9fb in _tc_free_internal (location=0x7faf5ab1c965 "../../tevent_req.c:291", tc=0x55be7523ce20) at ../../talloc.c:1137
            d = <optimized out>
            ptr_to_free = <optimized out>
            ptr = 0x55be7523ce80
    #14 _tc_free_children_internal (location=0x7faf5ab1c965 "../../tevent_req.c:291", ptr=0x55be75233d10, tc=0x55be75233cb0) at ../../talloc.c:1646
            child = 0x55be7523ce80
            new_parent = <optimized out>
    #15 _tc_free_internal (tc=0x55be75233cb0, location=0x7faf5ab1c965 "../../tevent_req.c:291") at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be75233d10
    #16 0x00007faf5ab169a0 in tevent_req_received (req=req@entry=0x55be75233b60) at ../../tevent_req.c:291
    No locals.
    #17 0x00007faf5ab169d9 in tevent_req_destructor (req=req@entry=0x55be75233b60) at ../../tevent_req.c:128
    No locals.
    #18 0x00007faf5a2bf9fb in _tc_free_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", tc=0x55be75233b00) at ../../talloc.c:1137
            d = <optimized out>
            ptr_to_free = <optimized out>
            ptr = 0x55be75233b60
    #19 _tc_free_children_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", ptr=0x55be7521bff0, tc=0x55be7521bf90) at ../../talloc.c:1646
            child = 0x55be75233b60
            new_parent = <optimized out>
    #20 _tc_free_internal (tc=0x55be7521bf90, location=0x55be73195440 "src/responder/common/responder_common.c:382") at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be7521bff0
    #21 0x00007faf5a2bf18f in _tc_free_children_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", ptr=0x55be752299f0, tc=0x55be75229990) at ../../talloc.c:1646
            child = 0x55be7521bff0
            new_parent = 0x0
    #22 _tc_free_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", tc=0x55be75229990) at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be752299f0
    #23 _tc_free_children_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", ptr=0x55be75223be0, tc=0x55be75223b80) at ../../talloc.c:1646
            child = <optimized out>
            new_parent = <optimized out>
    #24 _tc_free_internal (tc=0x55be75223b80, location=0x55be73195440 "src/responder/common/responder_common.c:382") at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be75223be0
    #25 0x00007faf5a2bf18f in _tc_free_children_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", ptr=0x55be7522bca0, tc=0x55be7522bc40) at ../../talloc.c:1646
            child = 0x55be75223be0
            new_parent = 0x0
    #26 _tc_free_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", tc=0x55be7522bc40) at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be7522bca0
    #27 _tc_free_children_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", ptr=0x55be75228e40, tc=0x55be75228de0) at ../../talloc.c:1646
            child = <optimized out>
            new_parent = <optimized out>
    #28 _tc_free_internal (tc=0x55be75228de0, location=0x55be73195440 "src/responder/common/responder_common.c:382") at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be75228e40
    #29 0x00007faf5a2bf18f in _tc_free_children_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", ptr=0x55be75227640, tc=0x55be752275e0) at ../../talloc.c:1646
            child = 0x55be75228e40
            new_parent = 0x0
    #30 _tc_free_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", tc=0x55be752275e0) at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be75227640
    #31 _tc_free_children_internal (location=0x55be73195440 "src/responder/common/responder_common.c:382", ptr=0x55be7521b8f0, tc=0x55be7521b890) at ../../talloc.c:1646
            child = <optimized out>
            new_parent = <optimized out>
    #32 _tc_free_internal (tc=0x55be7521b890, location=0x55be73195440 "src/responder/common/responder_common.c:382") at ../../talloc.c:1163
            ptr_to_free = <optimized out>
            ptr = 0x55be7521b8f0
    #33 0x000055be7317d282 in responder_idle_handler (ev=<optimized out>, te=<optimized out>, current_time=..., data=<optimized out>) at src/responder/common/responder_common.c:382
            rctx = 0x55be7521b8f0
            now = <optimized out>
            __FUNCTION__ = "responder_idle_handler"
    #34 0x00007faf5ab1aabf in tevent_common_invoke_timer_handler (te=te@entry=0x55be752220a0, current_time=..., removed=removed@entry=0x0) at ../../tevent_timed.c:370
            handler_ev = 0x55be75212a90
    #35 0x00007faf5ab1ac39 in tevent_common_loop_timer_delay (ev=ev@entry=0x55be75212a90) at ../../tevent_timed.c:442
            current_time = {tv_sec = 1647522759, tv_usec = 93048}
            te = 0x55be752220a0
            ret = <optimized out>
    #36 0x00007faf5ab1be49 in epoll_event_loop_once (ev=0x55be75212a90, location=<optimized out>) at ../../tevent_epoll.c:922
            epoll_ev = 0x55be75212d20
            tval = {tv_sec = 0, tv_usec = 0}
            panic_triggered = false
    #37 0x00007faf5ab1a057 in std_event_loop_once (ev=0x55be75212a90, location=0x7faf5d48eb87 "src/util/server.c:725") at ../../tevent_standard.c:110
            glue_ptr = <optimized out>
            glue = 0x55be75212bd0
            ret = <optimized out>
    #38 0x00007faf5ab1525d in _tevent_loop_once (ev=ev@entry=0x55be75212a90, location=location@entry=0x7faf5d48eb87 "src/util/server.c:725") at ../../tevent.c:772
            ret = <optimized out>
            nesting_stack_ptr = 0x0
    #39 0x00007faf5ab154bb in tevent_common_loop_wait (ev=0x55be75212a90, location=0x7faf5d48eb87 "src/util/server.c:725") at ../../tevent.c:895
            ret = <optimized out>
    #40 0x00007faf5ab19ff7 in std_event_loop_wait (ev=0x55be75212a90, location=0x7faf5d48eb87 "src/util/server.c:725") at ../../tevent_standard.c:141
            glue_ptr = <optimized out>
            glue = 0x55be75212bd0
            ret = <optimized out>
    #41 0x00007faf5d46a6a3 in server_loop (main_ctx=0x55be75213f60) at src/util/server.c:725
    No locals.
    #42 0x000055be7316df82 in main (argc=7, argv=<optimized out>) at src/responder/ifp/ifpsrv.c:415
            opt = <optimized out>
            pc = 0x55be75211810
            opt_logger = 0x55be7520cfd0 "files"
            main_ctx = 0x55be75213f60
            ret = <optimized out>
            uid = 0
            gid = 0
            long_options = {{longName = 0x0, shortName = 0 '\000', argInfo = 4, arg = 0x7faf5d20b320 <poptHelpOptions>, val = 0, descrip = 0x55be7318fe39 "Help options:", argDescrip = 0x0}, {longName = 0x55be7318fe47 "debug-level", shortName = 100 'd', argInfo = 2, 
                arg = 0x7faf5b58f168 <debug_level>, val = 0, descrip = 0x55be7318fe53 "Debug level", argDescrip = 0x0}, {longName = 0x55be7318fe5f "debug-to-files", shortName = 102 'f', argInfo = 1073741824, arg = 0x7faf5b58f164 <debug_to_file>, val = 0, 
                descrip = 0x55be731902f0 "Send the debug output to files instead of stderr", argDescrip = 0x0}, {longName = 0x55be7318fe6e "debug-to-stderr", shortName = 0 '\000', argInfo = 1073741824, arg = 0x7faf5b58f160 <debug_to_stderr>, val = 0, 
                descrip = 0x55be73190328 "Send the debug output to stderr directly.", argDescrip = 0x0}, {longName = 0x55be7318fe7e "debug-timestamps", shortName = 0 '\000', argInfo = 2, arg = 0x7faf5b58f104 <debug_timestamps>, val = 0, 
                descrip = 0x55be7318fe8f "Add debug timestamps", argDescrip = 0x0}, {longName = 0x55be7318fea4 "debug-microseconds", shortName = 0 '\000', argInfo = 2, arg = 0x7faf5b58f100 <debug_microseconds>, val = 0, 
                descrip = 0x55be73190358 "Show timestamps with microseconds", argDescrip = 0x0}, {longName = 0x55be7318febb "logger", shortName = 0 '\000', argInfo = 1, arg = 0x7ffcf2c5dfd0, val = 0, descrip = 0x55be7318feb7 "Set logger", 
                argDescrip = 0x55be7318fec2 "stderr|files|journald"}, {longName = 0x55be7318fed8 "uid", shortName = 0 '\000', argInfo = 2, arg = 0x7ffcf2c5dfc8, val = 0, descrip = 0x55be73190380 "The user ID to run the server as", argDescrip = 0x0}, {
                longName = 0x55be7318fedc "gid", shortName = 0 '\000', argInfo = 2, arg = 0x7ffcf2c5dfcc, val = 0, descrip = 0x55be731903a8 "The group ID to run the server as", argDescrip = 0x0}, {longName = 0x55be7318fee0 "socket-activated", shortName = 0 '\000', 
                argInfo = 0, arg = 0x7faf5d6a5108 <socket_activated>, val = 0, descrip = 0x55be731903d0 "Informs that the responder has been socket-activated", argDescrip = 0x0}, {longName = 0x55be7318fef1 "dbus-activated", shortName = 0 '\000', argInfo = 0, 
                arg = 0x7faf5d6a5104 <dbus_activated>, val = 0, descrip = 0x55be73190408 "Informs that the responder has been dbus-activated", argDescrip = 0x0}, {longName = 0x0, shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 0, descrip = 0x0, argDescrip = 0x0}}
            __FUNCTION__ = "main"
____________

--------------------------------------------------------------------------------
Version-Release number of selected component (if applicable):
--------------------------------------------------------------------------------

    sssd-1.16.5-10.el7_9.11.x86_64
    sssd-ad-1.16.5-10.el7_9.11.x86_64
    sssd-client-1.16.5-10.el7_9.11.x86_64
    sssd-common-1.16.5-10.el7_9.11.x86_64
    sssd-common-pac-1.16.5-10.el7_9.11.x86_64
    sssd-dbus-1.16.5-10.el7_9.11.x86_64
    sssd-ipa-1.16.5-10.el7_9.11.x86_64
    sssd-krb5-1.16.5-10.el7_9.11.x86_64
    sssd-krb5-common-1.16.5-10.el7_9.11.x86_64
    sssd-ldap-1.16.5-10.el7_9.11.x86_64
    sssd-proxy-1.16.5-10.el7_9.11.x86_64

--------------------------------------------------------------------------------
How reproducible:
--------------------------------------------------------------------------------

- The issue started happening suddenly.

Actual results:

- SSSD (sssd_be and sssd_ifp) is coredumping

--------------------------------------------------------------------------------
Additional info:
--------------------------------------------------------------------------------

Comment 5 Alexey Tikhonov 2022-06-14 17:56:46 UTC
(In reply to Chino Soliard from comment #0)
> 
> The following is a backtrace from a sssd_ifp coredump:
> 
>     #0  0x000055be731809a4 in sss_dp_callback_destructor
>             ...
>     #33 0x000055be7317d282 in responder_idle_handler

As a possible workaround for the sssd_ifp crash, you could try setting `responder_idle_timeout = 0` in the [ifp] section of sssd.conf.

This disables automatic termination on idle (which seems to trigger the issue).

But it needs a proper investigation, of course.
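
As a sketch, the workaround from this comment would look like this in sssd.conf (the [ifp] section shown in the report, plus the new option):

```ini
[ifp]
allowed_uids = root, apache
user_attributes = +firstname, +lastname
# 0 disables automatic termination of the responder when idle
responder_idle_timeout = 0
```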

Comment 8 Alexey Tikhonov 2022-07-18 16:45:52 UTC
(gdb) frame 0
#0  0x000056501ec109a4 in sss_dp_callback_destructor (ptr=ptr@entry=0x565020f91640) at src/responder/common/responder_dp.c:61
61	    DLIST_REMOVE(cb->sdp_req->cb_list, cb);
(gdb) p *cb
$2 = {prev = 0x0, next = 0x0, req = 0x565020f98740, sdp_req = 0x565021317600}
(gdb) p *cb->req
$3 = {async = {fn = 0x56501ec19f70 <cache_req_search_done>, private_data = 0x5650205a0dc0}, data = 0x565020f988f0, private_print = 0x0, 
  private_cancel = 0x0, private_cleanup = {fn = 0x0, state = TEVENT_REQ_INIT}, internal = {private_type = 0x56501ec270b1 "struct sss_dp_req_state", 
    create_location = 0x56501ec27ab8 "src/responder/common/responder_dp.c:497", finish_location = 0x0, cancel_location = 0x0, state = TEVENT_REQ_RECEIVED, 
    error = 0, trigger = 0x0, defer_callback_ev = 0x0, timer = 0x0, profile = 0x0}}
(gdb) p *cb->sdp_req
Cannot access memory at address 0x565021317600


(gdb) frame 33
#33 0x000056501ec0d282 in responder_idle_handler (ev=<optimized out>, te=<optimized out>, current_time=..., data=<optimized out>)
    at src/responder/common/responder_common.c:382
382	        talloc_free(rctx);
(gdb) list
377	    if ((now - rctx->last_request_time) > rctx->idle_timeout) {
378	        /* This responder is idle. Terminate it */
379	        DEBUG(SSSDBG_TRACE_INTERNAL,
380	              "Terminating idle responder [%p]\n", rctx);
381	
382	        talloc_free(rctx);
383	
384	        orderly_shutdown(0);
385	    }
386	
(gdb) p now
$4 = <optimized out>   --   but reading log it is "13:53:45): [ifp] [responder_idle_handler] (0x2000): Terminating idle responder"
(gdb) p rctx->last_request_time
$5 = 1654710524        --   :48:44  --  (53:45 - 48:44) ~ 5 mins  --  matches default value of `responder_idle_timeout`
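
The minutes:seconds of that epoch value can be double-checked with a quick sketch (independent of timezone for whole-hour offsets):

```python
# last_request_time printed by gdb is a Unix epoch timestamp; its
# minutes:seconds should match the ":48:44" read off the log above.
ts = 1654710524
minutes, seconds = (ts % 3600) // 60, ts % 60
print(f":{minutes:02d}:{seconds:02d}")   # -> :48:44
# 13:53:45 - 13:48:44 ~ 5 minutes, the default responder_idle_timeout (300 s)
```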

So, there are 2 bugs:
 - `rctx->last_request_time` wasn't updated properly (despite "...13:53:45): [ifp] [sss_dp_issue_request] (0x0400): Issuing request for...")
 - access after free in `sss_dp_callback_destructor()` when `cb->sdp_req` was already freed

Comment 9 Alexey Tikhonov 2022-07-18 17:50:59 UTC
(In reply to Alexey Tikhonov from comment #8)
> 
> So, there are 2 bugs:
>  - `rctx->last_request_time` wasn't updated properly


`last_request_time` is updated:
 - once in `sss_process_init()` -> `responder_setup_idle_timeout_config()` -> `setup_responder_idle_timer()`
 - on every socket activity in `sss_client_fd_handler()`; but the issue is that `sssd_ifp` doesn't use a "client fd", it uses the system dbus for communication with clients

2.x branch seems to be affected the same way.


>  - access after free in `sss_dp_callback_destructor()` when `cb->sdp_req` was already freed

Everything is created in the context of `rctx`.
It seems that when libtalloc gets to freeing `sss_dp_callback`, the `sdp_req` it references has already been freed.
I think the simplest way to "fix" it is to use the same guard as in `sss_dp_req_destructor()`:
```
    /* Do not call callbacks if the responder is shutting down, because
     * the top level responder context (pam_ctx, sudo_ctx, ...) may be
     * already semi-freed and we may end up accessing freed memory.
     */
    if (sdp_req->rctx->shutting_down) {
        return 0;
    }
```
(https://github.com/SSSD/sssd/blob/ddfc7e99e96ee732586c07342900d287d2378802/src/responder/common/responder_dp.c#L79)

2.x branch has this code re-written, so shouldn't be affected (at least not exactly the same way).

Comment 10 Alexey Tikhonov 2022-07-18 20:08:57 UTC
Upstream PRs:
 - https://github.com/SSSD/sssd/pull/6271
 - https://github.com/SSSD/sssd/pull/6272

Comment 11 Alexey Tikhonov 2022-07-19 09:29:51 UTC
(In reply to Alexey Tikhonov from comment #9)
> (In reply to Alexey Tikhonov from comment #8)
> > 
> > So, there are 2 bugs:
> >  - `rctx->last_request_time` wasn't updated properly
> 
> 
> `last_request_time` is updated:
>  - once in `sss_process_init()` -> `responder_setup_idle_timeout_config()`
> -> `setup_responder_idle_timer()`
>  - on ever socket activity in `sss_client_fd_handler()`, but the issue is
> `sssd_ifp` doesn't use "client fd", it uses system dbus for communication
> with clients


No, there actually is an idle-time update in the sbus code, in `sbus_message_handler()`:
https://github.com/SSSD/sssd/blob/ddfc7e99e96ee732586c07342900d287d2378802/src/sbus/sssd_dbus_interface.c#L1004

Nonetheless, it doesn't work as expected in the provided log:
```
(2022-06-08 13:53:44): [ifp] [sbus_dispatch] (0x4000): dbus conn: 0x56502046cfc0
(2022-06-08 13:53:44): [ifp] [sbus_dispatch] (0x4000): Dispatching.
(2022-06-08 13:53:44): [ifp] [sbus_message_handler] (0x2000): Received SBUS method org.freedesktop.sssd.infopipe.GetUserGroups on path /org/freedesktop/sssd/infopipe
(2022-06-08 13:53:44): [ifp] [sbus_get_sender_id_send] (0x2000): Looking for identity of sender [:1.45894]
(2022-06-08 13:53:44): [ifp] [sbus_add_timeout] (0x2000): 0x5650204673d0
(2022-06-08 13:53:44): [ifp] [sbus_remove_timeout] (0x2000): 0x5650204673d0
(2022-06-08 13:53:44): [ifp] [sbus_dispatch] (0x4000): dbus conn: 0x56502046cfc0
(2022-06-08 13:53:44): [ifp] [sbus_dispatch] (0x4000): Dispatching.
(2022-06-08 13:53:44): [ifp] [ifp_user_get_groups] (0x0200): Looking up groups of user [...] on behalf of ...
(2022-06-08 13:53:44): [ifp] [cache_req_set_plugin] (0x2000): CR #6: Setting "Initgroups by name" plugin
...
(2022-06-08 13:53:44): [ifp] [cache_req_create_and_add_result] (0x0400): CR #6: Found 149 entries in domain ...
(2022-06-08 13:53:44): [ifp] [cache_req_done] (0x0400): CR #6: Finished: Success
(2022-06-08 13:53:44): [ifp] [cache_req_set_plugin] (0x2000): CR #7: Setting "Group by ID" plugin
...
(2022-06-08 13:53:45): [ifp] [cache_req_search_send] (0x0400): CR #7: Object found, but needs to be refreshed.
(2022-06-08 13:53:45): [ifp] [cache_req_search_dp] (0x0400): CR #7: Looking up [...] in data provider
(2022-06-08 13:53:45): [ifp] [sss_dp_issue_request] (0x0400): Issuing request for [...]
(2022-06-08 13:53:45): [ifp] [sss_dp_get_account_msg] (0x0400): Creating request for [...][0x2][BE_REQ_GROUP][...]
(2022-06-08 13:53:45): [ifp] [sbus_add_timeout] (0x2000): 0x565020464380
(2022-06-08 13:53:45): [ifp] [sss_dp_internal_get_send] (0x0400): Entering request [...]
(2022-06-08 13:53:45): [ifp] [responder_idle_handler] (0x2000): Terminating idle responder [0x56502045dda0]
(2022-06-08 13:53:45): [ifp] [sss_responder_ctx_destructor] (0x0400): Responder is being shut down
```

Comment 12 Alexey Tikhonov 2022-07-19 10:14:25 UTC
(In reply to Alexey Tikhonov from comment #11)
> > > 
> > > So, there are 2 bugs:
> > >  - `rctx->last_request_time` wasn't updated properly
> > 
> > `last_request_time` is updated:
> >  - once in `sss_process_init()` -> `responder_setup_idle_timeout_config()`
> > -> `setup_responder_idle_timer()`
> >  - on ever socket activity in `sss_client_fd_handler()`, but the issue is
> > `sssd_ifp` doesn't use "client fd", it uses system dbus for communication
> > with clients
> 
> 
> No, there is actually an idle time update is sbus code in
> `sbus_message_handler()`:
> https://github.com/SSSD/sssd/blob/ddfc7e99e96ee732586c07342900d287d2378802/
> src/sbus/sssd_dbus_interface.c#L1004
> 
> Nonetheless, it doesn't work as expected in the provided log:

This seems to work as expected in the 2.x branch (can't reproduce on RHEL 9), but not in 1-16 (can reproduce on RHEL 7).
It looks like `conn->last_request_time` is only set for the monitor connection in the 1-16 branch.

Comment 13 Alexey Tikhonov 2022-07-19 10:54:10 UTC
(In reply to Alexey Tikhonov from comment #10)
> Upstream PRs:
>  - https://github.com/SSSD/sssd/pull/6271

#6271 replaced by https://github.com/SSSD/sssd/pull/6273

>  - https://github.com/SSSD/sssd/pull/6272

Comment 14 Alexey Tikhonov 2022-07-19 11:04:54 UTC
There is a 3rd minor issue that affects both 1-16 (RHEL 7) and 2.x (RHEL 8 & 9): the idle timer fires every `timeout / 2` and checks for `(now > idle + timeout)`.
As a result the actual timeout can be as long as `3/2 * timeout`. I think the condition should be fixed to `>=`.
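
A quick sketch of that arithmetic (hypothetical values, not SSSD code): with the strict `>` check, the tick that fires exactly at `timeout` does not terminate the responder, so shutdown slips to the next tick at `3/2 * timeout`:

```python
timeout = 300           # e.g. the default responder_idle_timeout, in seconds
last_request_time = 0   # last activity at t = 0

def idle_check(now, strict):
    """Timer body: terminate if the responder has been idle long enough."""
    idle = now - last_request_time
    return idle > timeout if strict else idle >= timeout

ticks = [timeout // 2 * k for k in range(1, 4)]   # timer fires at 150, 300, 450

shutdown_strict = next(t for t in ticks if idle_check(t, strict=True))
shutdown_fixed = next(t for t in ticks if idle_check(t, strict=False))
print(shutdown_strict, shutdown_fixed)   # -> 450 300
```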

Comment 15 Alexey Tikhonov 2022-07-19 12:38:25 UTC
(In reply to Alexey Tikhonov from comment #14)
> There is a 3rd minor issue, that affects both 1-16 (RHEL7) and 2.x (RHEL8 &
> 9): idle timer fires every `timeout / 2` and checks for (now > idle +
> timeout).
> As a result actual timeout can be as long as 3/2*timeout. I think condition
> should be fixed to `>=`.

Upstream PR: https://github.com/SSSD/sssd/pull/6277

Comment 23 Alexey Tikhonov 2022-07-25 16:02:18 UTC
Pushed PR: https://github.com/SSSD/sssd/pull/6272

* `sssd-1-16`
    * ecbc3329788de254346d7ea84770e09cc22dbc9c - RESPONDER: don't execute sss_dp_callback_destructor() if responder is being shut down.


Pushed PR: https://github.com/SSSD/sssd/pull/6273

* `sssd-1-16`
    * 352544558cef26c9ef4a7393cbf98776155efcdb - IFP: reset idle timer on any DBUS activity


Pushed PR: https://github.com/SSSD/sssd/pull/6277

* `master`
    * 473752e261d8d166b16f95fe4ddc8d320ae0faf4 - RESPONDER: fixed condition in responder_idle_handler()
* `sssd-1-16`
    * f37e15e9134ff22d03c4997fd1c76754b67d3539 - RESPONDER: fixed condition in responder_idle_handler()

Comment 33 errata-xmlrpc 2022-12-13 11:19:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sssd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8950

