Bug 1472954 - Hang while waiting for mutex lock
Status: CLOSED DUPLICATE of bug 1470352
Product: Fedora
Classification: Fedora
Component: nss
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Kai Engert (:kaie)
QA Contact: Fedora Extras Quality Assurance

Reported: 2017-07-19 12:27 EDT by Jonathan Lebon
Modified: 2017-07-19 12:47 EDT
Last Closed: 2017-07-19 12:47:53 EDT
Type: Bug

Description Jonathan Lebon 2017-07-19 12:27:48 EDT
Description of problem:

rpm-ostree hangs while waiting for repos to be updated. The backtrace shows that libnss is trying to lock a mutex that is already locked:

(gdb) bt
#0  0x00007f0fc8e3dfad in __lll_lock_wait () from /host/lib64/libpthread.so.0
#1  0x00007f0fc8e36f44 in pthread_mutex_lock () from /host/lib64/libpthread.so.0
#2  0x00007f0fc1fc51b9 in PR_Lock (lock=0x7f0fb45e32c0) at ../../../nspr/pr/src/pthreads/ptsynch.c:177
#3  0x00007f0fc4e3ccb5 in nssSlot_IsTokenPresent () from /host/lib64/libnss3.so
#4  0x00007f0fc4e3cee6 in nssSlot_GetToken () from /host/lib64/libnss3.so
#5  0x00007f0fc4e36d4d in nssTrustDomain_FindCertificatesBySubject () from /host/lib64/libnss3.so
#6  0x00007f0fc4e35b07 in nssCertificate_BuildChain () from /host/lib64/libnss3.so
#7  0x00007f0fc4deec36 in CERT_FindCertIssuer () from /host/lib64/libnss3.so
#8  0x00007f0fc4def13f in cert_VerifyCertChain () from /host/lib64/libnss3.so
#9  0x00007f0fc4defac9 in CERT_VerifyCertChain () from /host/lib64/libnss3.so
#10 0x00007f0fc4df0909 in cert_VerifyCertWithFlags () from /host/lib64/libnss3.so
#11 0x00007f0fc4df0bc2 in CERT_VerifyCert () from /host/lib64/libnss3.so
#12 0x00007f0fc2a5f55d in SSL_AuthCertificate () from /host/lib64/libssl3.so
#13 0x00007f0fc2a56950 in ssl3_AuthCertificate () from /host/lib64/libssl3.so
#14 0x00007f0fc2a57028 in ssl3_CompleteHandleCertificate () from /host/lib64/libssl3.so
#15 0x00007f0fc2a595b6 in ssl3_HandleHandshakeMessage () from /host/lib64/libssl3.so
#16 0x00007f0fc2a5cd0a in ssl3_HandleRecord () from /host/lib64/libssl3.so
#17 0x00007f0fc2a5ea00 in ssl3_GatherCompleteHandshake () from /host/lib64/libssl3.so
#18 0x00007f0fc2a64e69 in SSL_ForceHandshake () from /host/lib64/libssl3.so
#19 0x00007f0fc63fdef5 in nss_connect_common () from /host/lib64/libcurl.so.4
#20 0x00007f0fc63fa550 in Curl_ssl_connect_nonblocking () from /host/lib64/libcurl.so.4
#21 0x00007f0fc63aefd2 in https_connecting () from /host/lib64/libcurl.so.4
#22 0x00007f0fc63d5f36 in multi_runsingle () from /host/lib64/libcurl.so.4
#23 0x00007f0fc63d6fb3 in curl_multi_perform () from /host/lib64/libcurl.so.4
#24 0x00007f0fca4c56fe in lr_download () from /host/lib64/librepo.so.0
#25 0x00007f0fca4c5ce1 in lr_download_single_cb () from /host/lib64/librepo.so.0
#26 0x00007f0fca4d3b0a in lr_yum_perform () from /host/lib64/librepo.so.0
#27 0x00007f0fca4cabb9 in lr_handle_perform () from /host/lib64/librepo.so.0
#28 0x00007f0fcade5c4e in dnf_repo_update (repo=repo@entry=0x7f0fb422a6c0, flags=flags@entry=DNF_REPO_UPDATE_FLAG_FORCE, state=state@entry=0x7f0fb4525c50, error=error@entry=0x7f0fbcb20ca0)
    at /usr/src/debug/rpm-ostree-2017.7/libdnf/libdnf/dnf-repo.c:1609
#29 0x000055f0dcea473d in rpmostree_context_download_metadata (self=self@entry=0x7f0fb40048a0, cancellable=cancellable@entry=0x55f0de354480, error=error@entry=0x7f0fbcb20ca0)
    at src/libpriv/rpmostree-core.c:956
#30 0x000055f0dcea509e in rpmostree_context_prepare (self=self@entry=0x7f0fb40048a0, cancellable=cancellable@entry=0x55f0de354480, error=error@entry=0x7f0fbcb20ca0)
    at src/libpriv/rpmostree-core.c:1488
#31 0x000055f0dcec8940 in do_local_assembly (error=0x7f0fbcb20ca0, cancellable=0x55f0de354480, self=0x55f0de303ef0) at src/daemon/rpmostree-sysroot-upgrader.c:849
#32 maybe_do_local_assembly (error=0x7f0fbcb20ca0, cancellable=0x55f0de354480, self=0x55f0de303ef0) at src/daemon/rpmostree-sysroot-upgrader.c:971
#33 rpmostree_sysroot_upgrader_deploy (self=self@entry=0x55f0de303ef0, cancellable=cancellable@entry=0x55f0de354480, error=error@entry=0x7f0fbcb20ca0) at src/daemon/rpmostree-sysroot-upgrader.c:994
#34 0x000055f0dcec3464 in deploy_transaction_execute (transaction=0x55f0de3580a0, cancellable=0x55f0de354480, error=0x7f0fbcb20ca0) at src/daemon/rpmostreed-transaction-types.c:864
#35 0x000055f0dceba979 in transaction_execute_thread (task=0x55f0de30d380, source_object=<optimized out>, task_data=<optimized out>, cancellable=0x55f0de354480)
    at src/daemon/rpmostreed-transaction.c:296
#36 0x00007f0fc9ef8086 in g_task_thread_pool_thread () from /host/lib64/libgio-2.0.so.0
#37 0x00007f0fc997af00 in g_thread_pool_thread_proxy () from /host/lib64/libglib-2.0.so.0
#38 0x00007f0fc997a536 in g_thread_proxy () from /host/lib64/libglib-2.0.so.0
#39 0x00007f0fc8e3436d in start_thread () from /host/lib64/libpthread.so.0
#40 0x00007f0fc8b6cb8f in clone () from /host/lib64/libc.so.6

The problem is that the mutex is already held by this very same thread, so the thread deadlocks against itself. One can poke around pthread internals to verify this (see also https://stackoverflow.com/a/3491304/308136):

(gdb) frame 2
#2  0x00007f0fc1fc51b9 in PR_Lock (lock=0x7f0fb45e32c0) at ../../../nspr/pr/src/pthreads/ptsynch.c:177
177         rv = pthread_mutex_lock(&lock->mutex);
(gdb) print lock->mutex.__data.__owner
$6 = 9061
(gdb) thread find 9061
Thread 4 has target id 'LWP 9061'
(gdb) thread
[Current thread is 4 (LWP 9061)]
(gdb)
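
For reference, the same-thread relock pattern can be reproduced outside NSS with a small C sketch. This is not NSS or NSPR code: `relock_from_same_thread` is a hypothetical helper written only for illustration. It locks a mutex twice from one thread and returns the result of the second call. With PTHREAD_MUTEX_ERRORCHECK the second lock reports EDEADLK instead of blocking; with a PTHREAD_MUTEX_NORMAL mutex the second call would block forever, which matches the hang in frames #0 and #1 above, assuming the mutex behind PR_Lock is non-recursive.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper (illustration only, not part of NSS/NSPR).
 * Locks a mutex of the given type twice from the calling thread and
 * returns the result of the second pthread_mutex_lock() call.
 * Do not pass PTHREAD_MUTEX_NORMAL: the second lock would never return.
 */
int relock_from_same_thread(int mutex_type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, mutex_type);
    assert(rc == 0);
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    /* First lock: this thread becomes the owner. */
    rc = pthread_mutex_lock(&mutex);
    assert(rc == 0);

    /* Second lock by the owner: the outcome depends on the mutex type. */
    int second = pthread_mutex_lock(&mutex);

    /* A recursive mutex was acquired twice, so release the extra level. */
    if (second == 0)
        pthread_mutex_unlock(&mutex);
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);

    return second;
}
```

Calling `relock_from_same_thread(PTHREAD_MUTEX_ERRORCHECK)` should return EDEADLK, while `relock_from_same_thread(PTHREAD_MUTEX_RECURSIVE)` should return 0. Build with `-pthread`.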

I'm opening this against nss since it owns most of that backtrace. It may be that the issue is in nspr or libcurl. Let me know if so.

Version-Release number of selected component (if applicable):

# rpm -q rpm-ostree librepo libcurl nss nspr
rpm-ostree-2017.7-1.fc26.x86_64
librepo-1.7.20-3.fc26.x86_64
libcurl-7.53.1-7.fc26.x86_64
nss-3.31.0-1.0.fc26.x86_64
nspr-4.15.0-1.fc26.x86_64

How reproducible:

Sporadic.

Steps to Reproduce:
1. rpm-ostree install wget

Actual results:

Hangs.

Expected results:

Doesn't hang.

Additional info:

I can upload a core file somewhere if required.

Thanks!
Comment 1 Jonathan Lebon 2017-07-19 12:39:48 EDT
Core file available at https://jlebon.fedorapeople.org/core.1989.rhbz1472954 (507M).
Comment 2 Daiki Ueno 2017-07-19 12:44:18 EDT
(In reply to Jonathan Lebon from comment #0)

> nss-3.31.0-1.0.fc26.x86_64

Try -1.1 from:
https://bodhi.fedoraproject.org/updates/FEDORA-2017-244f799ac9
(see bug 1470352)
Comment 3 Jonathan Lebon 2017-07-19 12:47:53 EDT
Thanks! Marking as dupe.

*** This bug has been marked as a duplicate of bug 1470352 ***
