
Bug 1672584

Summary: [abrt] [faf] sssd: __pthread_rwlock_wrlock(): /usr/libexec/sssd/sssd_be killed by 11
Product: Red Hat Enterprise Linux 8
Reporter: Steeve Goveas <sgoveas>
Component: sssd
Assignee: Tomas Halman <thalman>
Status: CLOSED ERRATA
QA Contact: sssd-qe <sssd-qe>
Severity: unspecified
Priority: unspecified
Version: 8.0
CC: atikhono, dbula, grajaiya, jhrozek, lslebodn, mupadhye, mzidek, pbrezina, sbose, tscherf, wchadwic
Target Milestone: rc
Flags: dbula: mirror+
Target Release: 8.2
Hardware: Unspecified
OS: Unspecified
URL: http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/8fa878897513ed0df8703f60876adc2c64eef26b/
Whiteboard: sync-to-jira
Fixed In Version: sssd-2.2.3-2.el8
Doc Type: If docs needed, set a value
Last Closed: 2020-04-28 16:55:59 UTC
Type: Bug
Bug Depends On: 1682305

Description Steeve Goveas 2019-02-05 11:32:44 UTC
This bug has been created based on an anonymous crash report requested by the package maintainer.

Report URL: http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/8fa878897513ed0df8703f60876adc2c64eef26b/

The crash starts at this test case from the ad_provider/ldap_krb5 test suite:
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   Enumerate user belonging to multiple groups
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ 06:10:13 ] :: [  BEGIN   ] :: Running 'getent -s sss group lkgroup011-22741'
lkgroup011-22741:*:1122741:lkuser01-22741
:: [ 06:10:13 ] :: [   PASS   ] :: Command 'getent -s sss group lkgroup011-22741' (Expected 0, got 0)
:: [ 06:10:13 ] :: [  BEGIN   ] :: Running 'id lkuser01-22741 | grep lkgroup01-22741 | grep lkgroup011-22741'
uid=1022741(lkuser01-22741) gid=1022741(lkgroup01-22741) groups=1022741(lkgroup01-22741),1122741(lkgroup011-22741),1222741(lkgroup012-22741)
:: [ 06:10:15 ] :: [   PASS   ] :: Command 'id lkuser01-22741 | grep lkgroup01-22741 | grep lkgroup011-22741' (Expected 0, got 0)
:: [ 06:10:15 ] :: [  BEGIN   ] :: Running 'id -g lkuser01-22741 | grep 1022741'
1022741
:: [ 06:10:15 ] :: [   PASS   ] :: Command 'id -g lkuser01-22741 | grep 1022741' (Expected 0, got 0)
:: [ 06:10:15 ] :: [  BEGIN   ] :: Running 'su_success lkuser01-22741 Secret123'
spawn su --shell /bin/sh nobody -- -c su --shell /bin/true -- "$1" -- lkuser01-22741
Password:
:: [ 06:10:17 ] :: [   PASS   ] :: Command 'su_success lkuser01-22741 Secret123' (Expected 0, got 0)
:: [ 06:10:28 ] :: [  BEGIN   ] :: Running 'id lkuser01-22741 | grep lkgroup01-22741 | grep lkgroup011-22741'
uid=1022741(lkuser01-22741) gid=1022741(lkgroup01-22741) groups=1022741(lkgroup01-22741),1122741(lkgroup011-22741),1222741(lkgroup012-22741)
:: [ 06:10:31 ] :: [   PASS   ] :: Command 'id lkuser01-22741 | grep lkgroup01-22741 | grep lkgroup011-22741' (Expected 0, got 0)
:: [ 06:10:31 ] :: [  BEGIN   ] :: Running 'id -g lkuser01-22741 | grep 1022741'
1022741
:: [ 06:10:31 ] :: [   PASS   ] :: Command 'id -g lkuser01-22741 | grep 1022741' (Expected 0, got 0)
:: [ 06:10:31 ] :: [  BEGIN   ] :: Running 'su_success lkuser01-22741 Secret123'
spawn su --shell /bin/sh nobody -- -c su --shell /bin/true -- "$1" -- lkuser01-22741
Password:
:: [ 06:10:33 ] :: [   PASS   ] :: Command 'su_success lkuser01-22741 Secret123' (Expected 0, got 0)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   Duration: 31s
::   Assertions: 7 good, 0 bad
::   RESULT: PASS
** Enumerate-user-belonging-to-multiple-groups PASS Score:0
use_pty:FALSE /usr/share/restraint/plugins/run_task_plugins


Crash report received in email
time:           Tue 05 Feb 2019 06:10:23 AM EST
package:        sssd-common-2.0.0-38.el8
reason:         sssd_be killed by SIGSEGV
crash_function: orderly_shutdown
cmdline:        /usr/libexec/sssd/sssd_be --domain AD --uid 0 --gid 0 --logger=files
executable:     /usr/libexec/sssd/sssd_be
component:      sssd
uid:            0
username:       root
hostname:       host-8-242-67.host.centralci.eng.rdu2.redhat.com
os_release:     Red Hat Enterprise Linux release 8.0 Beta (Ootpa)
architecture:   x86_64
pwd:            /
kernel:         4.18.0-64.el8.x86_64
abrt_version:   2.10.9

Reports:
uReport: BTHASH=8fa878897513ed0df8703f60876adc2c64eef26b
ABRT Server: URL=http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/8fa878897513ed0df8703f60876adc2c64eef26b
ABRT Server: URL=http://faf.lab.eng.brq.redhat.com/faf/reports/9745/
CI Job: https://platform-stg-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/sssd-rhel-8.0.0-candidate-runtest-ad-provider-ldap_krb5-win2k12r2/40/

Full Backtrace:
[New LWP 15889]
Error while reading shared library symbols for /lib64/libbasicobjects.so.0:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libbasicobjects.so.0.1.0-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libref_array.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libref_array.so.1.2.1-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libcollection.so.4:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libcollection.so.4.1.1-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libsystemd.so.0:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libsystemd.so.0.23.0-239-11.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libdhash.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libdhash.so.1.1.0-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libdbus-1.so.3:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libdbus-1.so.3.19.7-1.12.8-7.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libaudit.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/lib64/libaudit.so.1.0.0-3.0-0.10.20180831git0047a6c.el8.x86_64.debug
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Error while reading shared library symbols for /lib64/libpath_utils.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libpath_utils.so.1.0.1-0.6.1-39.el8.x86_64.debug
Error while reading shared library symbols for /lib64/liblz4.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/liblz4.so.1.8.1-1.8.1.2-4.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libmount.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libmount.so.1.1.0-2.32.1-8.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libgcc_s.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/lib64/libgcc_s-8-20180905.so.1-8.2.1-3.5.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libblkid.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/lib64/libblkid.so.1.1.0-2.32.1-8.el8.x86_64.debug
Error while reading shared library symbols for /lib64/libuuid.so.1:
could not find '.gnu_debugaltlink' file for /var/cache/abrt-di/usr/lib/debug/usr/

Comment 1 Alexey Tikhonov 2019-02-07 17:36:59 UTC
So, the backtrace boils down to the following:

sssd::util/server.c:orderly_shutdown() ->
libc::exit() -> libc::on_exit() -> ... -> 
libtalloc::talloc_lib_atexit() -> [a bunch of libtalloc::_tc_free_*() leading to] -> 
sssd::providers/ldap/sdap_async.c:sdap_handle_destructor() [d-tor of "sdap_handle"] -> (inlined)sdap_handle_release() ->
(inlined)libldap::ldap_unbind_ext() ->ldap_ld_free() -> ...
libcrypto::RAND_get_rand_method()(*) -> CRYPTO_THREAD_write_lock()


RAND_get_rand_method() calls CRYPTO_THREAD_write_lock(rand_meth_lock)
with "rand_meth_lock" being "static CRYPTO_RWLOCK *rand_meth_lock;"
[ https://github.com/openssl/openssl/blob/master/crypto/rand/rand_lib.c ]

Note that this is the libc::on_exit() path.

Now, https://github.com/openssl/openssl/blob/master/crypto/init.c#L140 :
atexit(OPENSSL_cleanup)

OPENSSL_cleanup() -> rand_cleanup_int() -> CRYPTO_THREAD_lock_free(rand_meth_lock);
[ https://github.com/openssl/openssl/blob/master/crypto/rand/rand_lib.c#L357 ]


So, I am not sure about the order in which the "on_exit" handlers are executed, but it seems possible that the OpenSSL handler runs first, so the lock has already been freed by the time sssd's "on_exit" handler tries to take it...
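
The ordering question above can be explored with a small stand-alone sketch. It is not sssd or OpenSSL code: `library_cleanup`, `app_cleanup`, `lock_alive`, and `trace_exit_order` are hypothetical names chosen for illustration. The one fact it relies on is that the C standard requires handlers registered with atexit() to run in reverse order of registration, so whichever handler was registered last runs first. Assuming the library's handler happens to be registered after the application's, the application's handler finds the lock already gone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a library-owned lock such as rand_meth_lock. */
static int lock_alive = 0;
static int report_fd = -1;

/* Append a marker to the pipe read by the parent process. */
static void report(const char *msg)
{
    ssize_t ignored = write(report_fd, msg, strlen(msg));
    (void)ignored;
}

/* Plays the role of the library's exit handler: it frees the lock. */
static void library_cleanup(void)
{
    lock_alive = 0;
    report("library_cleanup;");
}

/* Plays the role of the application's exit-time destructor: it needs the lock. */
static void app_cleanup(void)
{
    report(lock_alive ? "app_cleanup:lock-alive;" : "app_cleanup:lock-freed;");
}

/*
 * Fork a child, register both handlers in the requested order, call exit(),
 * and copy the order in which the handlers ran into `out`.
 * Returns 0 on success, -1 on failure.
 */
int trace_exit_order(int library_registers_last, char *out, size_t out_len)
{
    int fds[2];
    int status = 0;
    size_t used = 0;
    ssize_t n;
    pid_t pid;

    assert(out != NULL && out_len > 0);
    if (pipe(fds) != 0) return -1;
    fflush(NULL);                     /* avoid duplicated stdio output in the child */
    pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {                   /* child: this is the process that "shuts down" */
        close(fds[0]);
        report_fd = fds[1];
        lock_alive = 1;
        if (library_registers_last) {
            atexit(app_cleanup);
            atexit(library_cleanup);  /* registered last, so it runs first */
        } else {
            atexit(library_cleanup);
            atexit(app_cleanup);      /* registered last, so it runs first */
        }
        exit(0);
    }

    close(fds[1]);                    /* parent: read until the child exits */
    while (used < out_len - 1 &&
           (n = read(fds[0], out + used, out_len - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling `trace_exit_order(1, buf, sizeof buf)` from a `main()` and printing `buf` shows the library handler running before the application handler, which is the situation suspected in this crash. Passing `0` reverses the registration order and the application handler still sees the lock. Which order applies in a real sssd_be process depends on when each library first registers its handler, which this sketch does not model.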

Comment 2 Jakub Hrozek 2019-02-07 21:10:14 UTC
Thank you very much for the analysis, Alexey.

Given that this is "just" a crash on exit, I don't think it is worth a blocker flag for 8.0. It can wait for 8.0.z, if we do a z-stream update for another reason, or even for 8.1.

Comment 3 Jakub Hrozek 2019-04-05 13:08:23 UTC
A potential place to look at: Sumit found that there is an sdap_finalize that calls orderly_shutdown; perhaps we could unbind there?
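
The idea of unbinding before shutdown can be sketched in isolation. This is only an illustration of the general pattern, not the change that was eventually committed: `release_handle`, `exit_time_sweep`, `handle_open`, and `trace_shutdown` are hypothetical names, and the real sdap_finalize and orderly_shutdown are not reproduced here. The point is that a resource released explicitly before exit() leaves nothing for exit-time handlers to tear down, so the teardown no longer depends on which other handlers have already run.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int trace_fd = -1;
static int handle_open = 0;   /* hypothetical stand-in for a live connection handle */

/* Append a marker to the pipe read by the parent process. */
static void emit(const char *msg)
{
    ssize_t ignored = write(trace_fd, msg, strlen(msg));
    (void)ignored;
}

/* Hypothetical stand-in for the unbind performed when the handle is destroyed. */
static void release_handle(void)
{
    if (!handle_open) return;
    handle_open = 0;
    emit("unbind;");
}

/* Plays the role of an allocator's exit-time sweep over leftover objects. */
static void exit_time_sweep(void)
{
    if (handle_open) {
        emit("sweep:");
        release_handle();     /* teardown happens inside exit(), among other handlers */
    } else {
        emit("sweep:nothing-left;");
    }
}

/*
 * Fork a child that opens a handle and shuts down, optionally releasing the
 * handle before calling exit(). The sequence of events is copied into `out`.
 * Returns 0 on success, -1 on failure.
 */
int trace_shutdown(int release_before_exit, char *out, size_t out_len)
{
    int fds[2];
    int status = 0;
    size_t used = 0;
    ssize_t n;
    pid_t pid;

    assert(out != NULL && out_len > 0);
    if (pipe(fds) != 0) return -1;
    fflush(NULL);                     /* avoid duplicated stdio output in the child */
    pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {                   /* child: the process being shut down */
        close(fds[0]);
        trace_fd = fds[1];
        handle_open = 1;
        atexit(exit_time_sweep);
        if (release_before_exit)
            release_handle();         /* explicit teardown, before any exit handler runs */
        emit("exit;");
        exit(0);
    }

    close(fds[1]);                    /* parent: read until the child exits */
    while (used < out_len - 1 &&
           (n = read(fds[0], out + used, out_len - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

With `release_before_exit` set, the trace shows the unbind happening ahead of `exit;`, and the sweep finds nothing to do. Without it, the unbind shows up after `exit;`, inside the exit handlers, which is where the crash in this report occurred.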

Comment 19 Sumit Bose 2019-08-23 15:44:43 UTC
Master:
 - f19f8e6b917e77d5d2bfdedc78e5669b522ea265

Comment 20 Sumit Bose 2019-08-23 19:51:10 UTC
(In reply to Sumit Bose from comment #19)
> Master:
>  - f19f8e6b917e77d5d2bfdedc78e5669b522ea265

Sorry, this commit had issues.

Comment 22 Sumit Bose 2019-08-29 14:36:01 UTC
Master:
 - a9669683de3a1c39dc4e47dd2aca0a9f99b652a9

Comment 25 Alexey Tikhonov 2019-11-15 18:04:20 UTC
*** Bug 1771852 has been marked as a duplicate of this bug. ***

Comment 30 errata-xmlrpc 2020-04-28 16:55:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1863