Bug 1796044

Summary: [8.2 regression] sssd_be and sssd crash with SIGABRT in sss_ptr_hash_check_type()
Product: Red Hat Enterprise Linux 8
Component: sssd
Version: 8.2
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Reporter: Martin Pitt <mpitt>
Assignee: Alexey Tikhonov <atikhono>
QA Contact: sssd-qe <sssd-qe>
CC: atikhono, grajaiya, jhrozek, lslebodn, mzidek, pbrezina, tscherf
Target Milestone: rc
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-01-29 13:57:25 UTC
Type: Bug
Attachments:
Description                       Flags
journal with crash information    none

Description Martin Pitt 2020-01-29 13:32:01 UTC
Created attachment 1656252
journal with crash information

Description of problem: Since the recent RHEL 8.2 image refresh [1] in Cockpit's CI (about one week ago), the tests detect a crash of sssd and sssd_be:


dbus-daemon[759]: [system] Activating via systemd: service name='org.freedesktop.sssd.infopipe' unit='sssd-ifp.service' requested by ':1.145' (uid=0 pid=3027 comm="/usr/libexec/cockpit-session localhost " label="system_u:system_r:cockpit_session_t:s0")
dbus-daemon[759]: [system] Activation via systemd failed for unit 'sssd-ifp.service': Unit sssd-ifp.service is masked.
cockpit-session[3027]: pam_cockpit_cert: Failed to map certificate to user: [org.freedesktop.systemd1.UnitMasked] Unit sssd-ifp.service is masked.
cockpit-session[3027]: pam_sepermit(cockpit:auth): Cannot determine the user's name
cockpit-session[3027]: pam_succeed_if(cockpit:auth): error retrieving user name: Conversation error
cockpit-session[3027]: pam_succeed_if(cockpit:auth): error retrieving user name: Conversation error
systemd[1]: sssd.service: Main process exited, code=dumped, status=6/ABRT
sssd[nss][1629]: Shutting down (status = 0)
sssd[pam][1630]: Shutting down (status = 0)
sssd[ssh][1631]: Shutting down (status = 0)
sssd[sudo][1632]: Shutting down (status = 0)
sssd[pac][1633]: Shutting down (status = 0)
systemd[1]: sssd.service: Failed with result 'core-dump'.
systemd-coredump[2941]: Process 1626 (sssd) of user 0 dumped core.


                        Stack trace of thread 1626:
                        #0  0x00007f9c2e37470f raise (libc.so.6)
                        #1  0x00007f9c2e35eb25 abort (libc.so.6)
                        #2  0x00007f9c2ec6ede0 talloc_abort.cold.20 (libtalloc.so.2)
                        #3  0x00007f9c2ec6ef2a talloc_check_name.cold.26 (libtalloc.so.2)
                        #4  0x00007f9c31ecf731 sss_ptr_hash_check_type (libsss_util.so)
                        #5  0x00007f9c31ecf80d sss_ptr_hash_lookup_internal (libsss_util.so)
                        #6  0x00007f9c31ecfc62 _sss_ptr_hash_lookup (libsss_util.so)
                        #7  0x00007f9c2f2c133f sbus_server_matchmaker (libsss_sbus.so)
                        #8  0x00007f9c2f2c169f sbus_server_name_owner_changed (libsss_sbus.so)
                        #9  0x00007f9c31ecf62f sss_ptr_hash_delete_cb (libsss_util.so)
                        #10 0x00007f9c2f096d7d hash_delete (libdhash.so.1)
                        #11 0x00007f9c31ecfd86 sss_ptr_hash_delete (libsss_util.so)
                        #12 0x00007f9c31ecfe31 sss_ptr_hash_spy_destructor (libsss_util.so)
                        #13 0x00007f9c2ec75c50 _tc_free_children_internal (libtalloc.so.2)
                        #14 0x00007f9c2ec71034 _talloc_free (libtalloc.so.2)
                        #15 0x00007f9c2ee8d3b9 tevent_common_invoke_timer_handler (libtevent.so.0)
                        #16 0x00007f9c2ee8d55e tevent_common_loop_timer_delay (libtevent.so.0)
                        #17 0x00007f9c2ee8a82f poll_event_loop_once (libtevent.so.0)
                        #18 0x00007f9c2ee87b15 _tevent_loop_once (libtevent.so.0)
                        #19 0x00007f9c2ee87dbb tevent_common_loop_wait (libtevent.so.0)
                        #20 0x00007f9c31ec3927 server_loop (libsss_util.so)
                        #21 0x000055ab6524875e main (sssd)
                        #22 0x00007f9c2e3606a3 __libc_start_main (libc.so.6)
                        #23 0x000055ab652488ae _start (sssd)
systemd[1]: sssd.service: Service RestartSec=100ms expired, scheduling restart.
systemd[1]: sssd.service: Scheduled restart job, restart counter is at 1.
systemd[1]: Stopped System Security Services Daemon.
systemd[1]: Starting System Security Services Daemon...
systemd-coredump[2943]: Process 1627 (sssd_be) of user 0 dumped core.

                        Stack trace of thread 1627:
                        #0  0x00007f3f573a770f raise (libc.so.6)
                        #1  0x00007f3f57391b25 abort (libc.so.6)
                        #2  0x00007f3f57ca1de0 talloc_abort.cold.20 (libtalloc.so.2)
                        #3  0x00007f3f57ca1f2a talloc_check_name.cold.26 (libtalloc.so.2)
                        #4  0x00007f3f5af02731 sss_ptr_hash_check_type (libsss_util.so)
                        #5  0x00007f3f5af0280d sss_ptr_hash_lookup_internal (libsss_util.so)
                        #6  0x00007f3f5af02c62 _sss_ptr_hash_lookup (libsss_util.so)
                        #7  0x00007f3f582f433f sbus_server_matchmaker (libsss_sbus.so)
                        #8  0x00007f3f582f469f sbus_server_name_owner_changed (libsss_sbus.so)
                        #9  0x00007f3f5af0262f sss_ptr_hash_delete_cb (libsss_util.so)
                        #10 0x00007f3f580c9d7d hash_delete (libdhash.so.1)
                        #11 0x00007f3f5af02d86 sss_ptr_hash_delete (libsss_util.so)
                        #12 0x00007f3f5af02e31 sss_ptr_hash_spy_destructor (libsss_util.so)
                        #13 0x00007f3f57ca8c50 _tc_free_children_internal (libtalloc.so.2)
                        #14 0x00007f3f57ca4034 _talloc_free (libtalloc.so.2)
                        #15 0x00007f3f57ec03b9 tevent_common_invoke_timer_handler (libtevent.so.0)
                        #16 0x00007f3f57ec055e tevent_common_loop_timer_delay (libtevent.so.0)
                        #17 0x00007f3f57ec17ab epoll_event_loop_once (libtevent.so.0)
                        #18 0x00007f3f57ebf99b std_event_loop_once (libtevent.so.0)
                        #19 0x00007f3f57ebab15 _tevent_loop_once (libtevent.so.0)
                        #20 0x00007f3f57ebadbb tevent_common_loop_wait (libtevent.so.0)
                        #21 0x00007f3f57ebf92b std_event_loop_wait (libtevent.so.0)
                        #22 0x00007f3f5aef6927 server_loop (libsss_util.so)
                        #23 0x00005635ddaee62b main (sssd_be)
                        #24 0x00007f3f573936a3 __libc_start_main (libc.so.6)
                        #25 0x00005635ddaee7ee _start (sssd_be)
sssd[3034]: Starting up
sssd[be[implicit_files]][3042]: Starting up
sssd[be[cockpit.lan]][3043]: Starting up
sssd[sudo][3047]: Starting up
sssd[ssh][3046]: Starting up
sssd[pam][3045]: Starting up
sssd[pac][3048]: Starting up
sssd[nss][3044]: Starting up
systemd[1]: Started System Security Services Daemon.
sssd_be[3043]: GSSAPI client step 1


The test itself finishes, so the sssd-ifp lookup for mapping a certificate to a user actually works. The test just fails because of the unexpected error message in the journal.
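
A minimal sketch of how such a journal excerpt could be scanned for the crash reported here. The function names `dumped_processes` and `has_ptr_hash_abort` are hypothetical helpers, not part of cockpit's test suite or of sssd; the patterns are taken from the log lines quoted above.

```shell
#!/bin/sh
# Hypothetical helpers; both read journal text on stdin.

# Print "<name> <pid>" for every core dump reported by systemd-coredump.
dumped_processes() {
    sed -n 's/.*systemd-coredump\[[0-9]*\]: Process \([0-9]*\) (\([^)]*\)) of user [0-9]* dumped core\..*/\2 \1/p'
}

# Succeed if the text contains the stack frame where the abort is raised
# in this report.
has_ptr_hash_abort() {
    grep -q 'sss_ptr_hash_check_type (libsss_util.so)'
}
```

For example, `journalctl -b --no-pager | dumped_processes` lists the crashed processes of the current boot, one per line.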

Version-Release number of selected component (if applicable):

sssd-common-2.2.3-11.el8.x86_64
ipa-client-4.8.4-2.module+el8.2.0+5271+3e37a50a.x86_64


How reproducible: Always

I don't have a standalone reproducer yet; I will work on that next.


[1] https://github.com/cockpit-project/bots/pull/469

Comment 1 Martin Pitt 2020-01-29 13:52:16 UTC
Indeed this happens in the part of the test that disables ifp. The reproducer is trivial:

    systemctl mask sssd-ifp && systemctl stop sssd-ifp

It's not necessary to join the machine to a domain or anything; this reproduces straight after booting a pristine RHEL 8.2 install.
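
The reproducer can be wrapped in a small script that also checks the journal for the abort. This is only a sketch: the function names and the 5-second wait are assumptions (the report does not say how long the crash takes to show up), and it must run as root on a disposable machine.

```shell
#!/bin/sh
# Succeed if journal text on stdin shows sssd.service aborting.
sssd_aborted() {
    grep -q 'sssd\.service: Main process exited, code=dumped, status=6/ABRT'
}

reproduce() {
    start=$(date '+%Y-%m-%d %H:%M:%S')
    systemctl mask sssd-ifp && systemctl stop sssd-ifp
    sleep 5   # assumed grace period for the crash to be logged
    if journalctl --since "$start" --no-pager | sssd_aborted; then
        echo "sssd crashed"
    else
        echo "no crash seen"
    fi
}

# Only touch the system when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
    reproduce
fi
```

Run it with `--run` to execute the reproducer; `systemctl unmask sssd-ifp` restores the unit afterwards.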

Comment 2 Alexey Tikhonov 2020-01-29 13:57:25 UTC
This is a duplicate of bz 1792331.

Upstream PR: https://github.com/SSSD/sssd/pull/977

*** This bug has been marked as a duplicate of bug 1792331 ***