Bug 1214459 - Lock not released of an entry in the entry cache
Summary: Lock not released of an entry in the entry cache
Keywords:
Status: CLOSED DUPLICATE of bug 1273550
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Version: 7.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: pre-dev-freeze
: 7.3
Assignee: Noriko Hosoi
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-04-22 18:27 UTC by thierry bordaz
Modified: 2020-09-13 21:24 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-27 16:08:04 UTC
Target Upstream Version:


Attachments (Terms of Use)
error log (36.83 KB, text/plain)
2015-04-22 18:44 UTC, thierry bordaz
no flags Details
access log (188.81 KB, application/x-gzip)
2015-04-22 18:45 UTC, thierry bordaz
no flags Details
pstack (48.85 KB, text/plain)
2015-04-22 18:45 UTC, thierry bordaz
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Github 389ds 389-ds-base issues 1496 0 None None None 2020-09-13 21:24:00 UTC

Description thierry bordaz 2015-04-22 18:27:38 UTC
Description of problem:

A bind operation tries to update the target entry. To do this it triggers an internal modify that tries to lock the entry into the entry cache.

The entry lock, is already acquired (so the bind operation hang). But the owner thread is processing an other operation and can not own the lock

Version-Release number of selected component (if applicable):

master branch (d61a073 Tests to support ticket 48026)
+ ipa master branch


How reproducible:

TBD. (so far no clear test case)

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 thierry bordaz 2015-04-22 18:35:40 UTC
Hang description:

Thread 24 is trying to update an entry ("uid=stageadm,cn=users,cn=accounts,dc=idm,dc=lab,dc=eng,dc=brq,dc=redhat,dc=com") during a bind. The bind is successfull.

The owner of the entry (stageadm) in the entry cache is thread 15 that is doing a MOD on a completely different entry (fqdn=vm-yyy.xxx,cn=computers,cn=accounts,dc=idm,dc=lab,dc=eng,dc=brq,dc=redhat,dc=com"

It is like thread 25 did and completed and operation that did not release the entry lock.

(gdb) thread 24
[Switching to thread 24 (Thread 0x7f97a07f8700 (LWP 20310))]
#0  0x00007f97c4ae3590 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) where
#0  0x00007f97c4ae3590 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f97c51386b3 in PR_EnterMonitor (mon=0x7f97800cfea0) at ../../../nspr/pr/src/pthreads/ptsynch.c:592
#2  0x00007f97baf0ca5a in cache_lock_entry (cache=cache@entry=0x7f97c95a2b38, e=0x7f9780085f70)
    at ldap/servers/slapd/back-ldbm/cache.c:1548
#3  0x00007f97baf1dd65 in find_entry_internal_dn (flags=0, txn=0x7f97a07f3320, lock=1, sdn=0x7f97800d0450, be=0x7f97c96c1c30, 
    pb=0x7f97800cf9e0) at ldap/servers/slapd/back-ldbm/findentry.c:155
#4  find_entry_internal (pb=pb@entry=0x7f97800cf9e0, be=0x7f97c96c1c30, lock=lock@entry=1, txn=txn@entry=0x7f97a07f3320, flags=0, 
    addr=<optimized out>, addr=<optimized out>) at ldap/servers/slapd/back-ldbm/findentry.c:293
#5  0x00007f97baf1e03c in find_entry2modify (pb=pb@entry=0x7f97800cf9e0, be=<optimized out>, addr=<optimized out>, 
    txn=txn@entry=0x7f97a07f3320) at ldap/servers/slapd/back-ldbm/findentry.c:324
#6  0x00007f97baf51f86 in ldbm_back_modify (pb=<optimized out>) at ldap/servers/slapd/back-ldbm/ldbm_modify.c:576
#7  0x00007f97c6d45035 in op_shared_modify (pb=pb@entry=0x7f97800cf9e0, pw_change=pw_change@entry=0, old_pw=0x0)
    at ldap/servers/slapd/modify.c:1086
#8  0x00007f97c6d45ad7 in modify_internal_pb (pb=pb@entry=0x7f97800cf9e0) at ldap/servers/slapd/modify.c:631
#9  0x00007f97c6d465e3 in slapi_modify_internal_pb (pb=pb@entry=0x7f97800cf9e0) at ldap/servers/slapd/modify.c:486
#10 0x00007f97bc5f7bc3 in ipalockout_postop (pb=0x7f97a07f7b20) at ipa_lockout.c:630
#11 0x00007f97c6d56abf in plugin_call_func (list=0x7f97c950f930, operation=operation@entry=501, pb=pb@entry=0x7f97a07f7b20, 
    call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:1949
#12 0x00007f97c6d56de3 in plugin_call_list (pb=0x7f97a07f7b20, operation=501, list=<optimized out>)
    at ldap/servers/slapd/plugin.c:1893
#13 plugin_call_plugins (pb=pb@entry=0x7f97a07f7b20, whichfunction=whichfunction@entry=501) at ldap/servers/slapd/plugin.c:467
#14 0x00007f97c722328d in do_bind (pb=pb@entry=0x7f97a07f7b20) at ldap/servers/slapd/bind.c:424
#15 0x00007f97c72299d4 in connection_dispatch_operation (pb=0x7f97a07f7b20, op=0x7f97c9a44410, conn=0x7f97b404e130)
    at ldap/servers/slapd/connection.c:635
#16 connection_threadmain () at ldap/servers/slapd/connection.c:2534
#17 0x00007f97c513dcab in _pt_root (arg=0x7f97c9a8d0e0) at ../../../nspr/pr/src/pthreads/ptthread.c:212
#18 0x00007f97c4ade52a in start_thread () from /lib64/libpthread.so.0
#19 0x00007f97c481a79d in clone () from /lib64/libc.so.6

(gdb) print *pb->pb_op->o_params.p.p_modify.modify_mods[0]
$10 = {mod_op = 130, mod_type = 0x7f97800526b0 "krbLastSuccessfulAuth", mod_vals = {modv_strvals = 0x7f9780080090, 
    modv_bvals = 0x7f9780080090}}
(gdb) print **pb->pb_op->o_params.p.p_modify.modify_mods[0].mod_vals.modv_bvals
$16 = {bv_len = 15, bv_val = 0x7f9780086360 "20150422172417Z"}


(gdb) frame 2
#2  0x00007f97baf0ca5a in cache_lock_entry (cache=cache@entry=0x7f97c95a2b38, e=0x7f9780085f70)
    at ldap/servers/slapd/back-ldbm/cache.c:1548
1548	    PR_EnterMonitor(e->ep_mutexp);
(gdb) print e->ep_mutexp->owner
$20 = 140289133967104 -> 0x7f979bfef700

Thread 15 (Thread 0x7f979bfef700 (LWP 20319)):
#0  0x00007f97c4ae3590 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f97c51386b3 in PR_EnterMonitor (mon=0x7f97c95a4b80) at ../../../nspr/pr/src/pthreads/ptsynch.c:592
#2  0x00007f97baf11612 in dblayer_lock_backend (be=<optimized out>) at ldap/servers/slapd/back-ldbm/dblayer.c:3978
#3  0x00007f97baf165b6 in dblayer_txn_begin (be=0x7f97c96c1c30, parent_txn=0x0, txn=txn@entry=0x7f979bfea6f0) at ldap/servers/slapd/back-ldbm/dblayer.c:3697
#4  0x00007f97baf51995 in ldbm_back_modify (pb=<optimized out>) at ldap/servers/slapd/back-ldbm/ldbm_modify.c:560
#5  0x00007f97c6d45035 in op_shared_modify (pb=pb@entry=0x7f979bfeeb20, pw_change=pw_change@entry=0, old_pw=0x0) at ldap/servers/slapd/modify.c:1086
#6  0x00007f97c6d46337 in do_modify (pb=pb@entry=0x7f979bfeeb20) at ldap/servers/slapd/modify.c:419
#7  0x00007f97c7229985 in connection_dispatch_operation (pb=0x7f979bfeeb20, op=0x7f97c9b8f6d0, conn=0x7f97b4051310) at ldap/servers/slapd/connection.c:660
#8  connection_threadmain () at ldap/servers/slapd/connection.c:2534
#9  0x00007f97c513dcab in _pt_root (arg=0x7f97c9aaf5e0) at ../../../nspr/pr/src/pthreads/ptthread.c:212
#10 0x00007f97c4ade52a in start_thread () from /lib64/libpthread.so.0
#11 0x00007f97c481a79d in clone () from /lib64/libc.so.6

Comment 3 thierry bordaz 2015-04-22 18:44:21 UTC
Created attachment 1017562 [details]
error log

Comment 4 thierry bordaz 2015-04-22 18:45:18 UTC
Created attachment 1017563 [details]
access log

Comment 5 thierry bordaz 2015-04-22 18:45:50 UTC
Created attachment 1017564 [details]
pstack

Comment 6 Noriko Hosoi 2015-04-22 18:59:54 UTC
Hi Tierry, could you put the version of 389-ds-base?  Since you opened it for RHDS10, it is on RHEL-7?

Comment 7 thierry bordaz 2015-04-22 20:06:54 UTC
Humm. 

Hi Noriko,

I had a doubt, either open a ticket or a bz.
I did my test on F21 platform with a private build (389-ds-base-devel-1.3.4.a1-20150419103346.git1bca52e.fc21.x86_64) done with master branch (HEAD 'd61a073 Tests to support ticket 48026') + 1209573.

I understand the product/component/version does not match the test I did.
What should I change in this those settings ?

thanks
theirry

Comment 8 Noriko Hosoi 2015-04-22 21:03:17 UTC
Thanks for the clarification, Theirry.  Then, please open an upstream ticket.  We could talk about the target for the ticket, but you want to fix it in 1.3.4, don't you?  Regarding this bugzilla, I'll turn it into rhel-7.2.  So, could you open a ticket ane let me know the ticket number?  Thanks!

Comment 9 Noriko Hosoi 2015-08-12 20:06:33 UTC
Hi Thierry, do you think we still have this problem?
Thanks,
--noriko

Comment 10 thierry bordaz 2015-10-27 11:09:55 UTC
Hi Noriko,

sorry for the very long delay of my answer.
The upstream ticket 48165 is very likely a duplicate of 47978.
In both cases, a thread may forget to unlock an entry in the entry cache (if by "chance" the entry gets out of the cache between the betxn_postop and the end of the backend operation.

thanks
theirry

Comment 11 Noriko Hosoi 2015-10-27 16:08:04 UTC
(In reply to thierry bordaz from comment #10)
> Hi Noriko,
> 
> sorry for the very long delay of my answer.
> The upstream ticket 48165 is very likely a duplicate of 47978.
> In both cases, a thread may forget to unlock an entry in the entry cache (if
> by "chance" the entry gets out of the cache between the betxn_postop and the
> end of the backend operation.
> 
> thanks
> theirry

Thank you for the good news, Thierry!  One down. ;)

*** This bug has been marked as a duplicate of bug 1273550 ***


Note You need to log in before you can comment on or make changes to this bug.