Bug 1466724 - Suffix deletion hangs with namedPipeLog tests
Summary: Suffix deletion hangs with namedPipeLog tests
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: mreynolds
QA Contact: RHDS QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-06-30 10:31 UTC by Sankar Ramalingam
Modified: 2019-07-18 14:50 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-18 14:50:45 UTC
Target Upstream Version:
Embargoed:


Attachments

Comment 4 thierry bordaz 2017-06-30 13:13:52 UTC
The DS server hangs while processing:
[30/Jun/2017:06:43:48.711609117 +051800] conn=8 op=8 DEL dn="cn=aci,cn=index,cn=NamedPipeDB,cn=ldbm database,cn=plugins,cn=config"
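
For reference, a minimal sketch of how such a DEL could be issued through the OpenLDAP C API; only the target DN is taken from the access log line above, while the URI, bind DN, and password are illustrative placeholders:

/* Hypothetical reproduction sketch: delete the index entry whose DEL is
 * quoted above.  URI and credentials are placeholders, not from the bug. */
#include <ldap.h>

int delete_index_entry(void)
{
    LDAP *ld;
    int rc, version = LDAP_VERSION3;
    struct berval cred = { 8, "password" };   /* placeholder password */

    if ((rc = ldap_initialize(&ld, "ldap://localhost:389")) != LDAP_SUCCESS)
        return rc;
    ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION, &version);

    rc = ldap_sasl_bind_s(ld, "cn=Directory Manager", LDAP_SASL_SIMPLE,
                          &cred, NULL, NULL, NULL);
    if (rc == LDAP_SUCCESS)
        rc = ldap_delete_ext_s(ld,
            "cn=aci,cn=index,cn=NamedPipeDB,cn=ldbm database,cn=plugins,cn=config",
            NULL, NULL);

    ldap_unbind_ext_s(ld, NULL, NULL);
    return rc;
}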

The DS thread is hanging on a DB mutex:

Thread 10 (Thread 0x7f810e3d1700 (LWP 24803)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f8130d07403 in __db_pthread_mutex_condwait (env=0x5583130d0a00, mutex=140192266383520, timespec=0x0, mutexp=<optimized out>) at ../../src/mutex/mut_pthread.c:321
#2  __db_hybrid_mutex_suspend (env=env@entry=0x5583130d0a00, mutex=mutex@entry=19492, timespec=timespec@entry=0x0, exclusive=exclusive@entry=1) at ../../src/mutex/mut_pthread.c:577
#3  0x00007f8130d06750 in __db_tas_mutex_lock_int (nowait=0, timeout=0, mutex=<optimized out>, env=0x5583130d0a00) at ../../src/mutex/mut_tas.c:271
#4  __db_tas_mutex_lock (env=env@entry=0x5583130d0a00, mutex=19492, timeout=timeout@entry=0) at ../../src/mutex/mut_tas.c:302
#5  0x00007f8130db0e4a in __lock_get_internal (lt=lt@entry=0x5583132f1a20, sh_locker=sh_locker@entry=0x7f812a63ccc8, flags=flags@entry=0, obj=<optimized out>, lock_mode=<optimized out>, timeout=0, lock=lock@entry=0x7f810e3cee38) at ../../src/lock/lock.c:983
#6  0x00007f8130db21f8 in __lock_vec (env=0x5583130d0a00, sh_locker=sh_locker@entry=0x7f812a63ccc8, flags=<optimized out>, list=list@entry=0x7f810e3cedf0, nlist=nlist@entry=2, elistp=elistp@entry=0x7f810e3ced98) at ../../src/lock/lock.c:140
#7  0x00007f8130e0b0ab in __fop_lock_handle (env=<optimized out>, dbp=0x558313316600, locker=0x7f812a63ccc8, mode=<optimized out>, elockp=0x7f810e3ceec0, flags=<optimized out>) at ../../src/fileops/fop_util.c:144
#8  0x00007f8130e0d078 in __fop_remove_setup (dbp=dbp@entry=0x558313316600, txn=txn@entry=0x0, name=0x558313e2adc0 "/var/lib/dirsrv/slapd-deftestinst/db/NamedPipeDB/aci.db", flags=flags@entry=0) at ../../src/fileops/fop_util.c:1084
#9  0x00007f8130df05da in __db_remove_int (dbp=dbp@entry=0x558313316600, ip=0x0, txn=txn@entry=0x0, name=name@entry=0x7f810e3cf2f0 "/var/lib/dirsrv/slapd-deftestinst/db/NamedPipeDB/aci.db", subdb=subdb@entry=0x0, flags=flags@entry=0) at ../../src/db/db_remove.c:289
#10 0x00007f8130df107f in __db_remove (dbp=dbp@entry=0x558313316600, ip=<optimized out>, txn=txn@entry=0x0, name=name@entry=0x7f810e3cf2f0 "/var/lib/dirsrv/slapd-deftestinst/db/NamedPipeDB/aci.db", subdb=subdb@entry=0x0, flags=flags@entry=0) at ../../src/db/db_remove.c:220
#11 0x00007f8130df11e4 in __db_remove_pp (dbp=0x558313316600, name=0x7f810e3cf2f0 "/var/lib/dirsrv/slapd-deftestinst/db/NamedPipeDB/aci.db", subdb=0x0, flags=0) at ../../src/db/db_remove.c:194
#12 0x00007f812de9ea6d in dblayer_db_remove_ex (env=env@entry=0x5583134194c0, path=0x7f810e3cf2f0 "/var/lib/dirsrv/slapd-deftestinst/db/NamedPipeDB/aci.db", dbName=dbName@entry=0x0, use_lock=use_lock@entry=0) at ldap/servers/slapd/back-ldbm/dblayer.c:3155
#13 0x00007f812de9ecf6 in dblayer_erase_index_file_ex (be=<optimized out>, a=0x5583132bac40, use_lock=use_lock@entry=1, no_force_checkpoint=no_force_checkpoint@entry=0) at ldap/servers/slapd/back-ldbm/dblayer.c:3273
#14 0x00007f812de9ee1c in dblayer_erase_index_file (be=<optimized out>, a=<optimized out>, no_force_chkpt=no_force_chkpt@entry=0) at ldap/servers/slapd/back-ldbm/dblayer.c:3299
#15 0x00007f812ded5c0a in ldbm_instance_index_config_delete_callback (pb=0x7f810e3d0a50, e=0x558313402180, entryAfter=<optimized out>, returncode=0x7f810e3d0428, returntext=<optimized out>, arg=0x558313482820) at ldap/servers/slapd/back-ldbm/ldbm_index_config.c:173
#16 0x00007f813995e3e9 in dse_call_callback (pb=pb@entry=0x7f810e3d0a50, operation=operation@entry=32, flags=flags@entry=1, entryBefore=entryBefore@entry=0x558313402180, entryAfter=entryAfter@entry=0x0, returncode=returncode@entry=0x7f810e3d0428, returntext=returntext@entry=0x7f810e3d0470 "", pdse=<optimized out>) at ldap/servers/slapd/dse.c:2637
#17 0x00007f8139961828 in dse_delete (pb=0x7f810e3d0a50) at ldap/servers/slapd/dse.c:2499
#18 0x00007f8139957fd0 in op_shared_delete (pb=pb@entry=0x7f810e3d0a50) at ldap/servers/slapd/delete.c:331
#19 0x00007f8139958352 in do_delete (pb=pb@entry=0x7f810e3d0a50) at ldap/servers/slapd/delete.c:97
#20 0x000055831199d642 in connection_dispatch_operation (pb=0x7f810e3d0a50, op=0x558313483ee0, conn=0x558313f19cd0) at ldap/servers/slapd/connection.c:623
#21 connection_threadmain () at ldap/servers/slapd/connection.c:1772
#22 0x00007f8137d0a9bb in _pt_root (arg=0x5583130c7cc0) at ../../../nspr/pr/src/pthreads/ptthread.c:216
#23 0x00007f81376aae25 in start_thread (arg=0x7f810e3d1700) at pthread_create.c:308
#24 0x00007f8136f8c34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
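
Frames #11-#12 show dblayer_db_remove_ex() going through the public DB->remove() entry point to delete the index file. A minimal sketch of that kind of call (libdb 5.3; the path is the one from the backtrace, error handling trimmed):

/* Sketch only: remove a single index database file without a transaction.
 * DB->remove() takes the internal handle lock that frame #7 above is
 * waiting for, and the handle is consumed whether or not the call succeeds. */
#include <db.h>

int remove_index_file(DB_ENV *dbenv)
{
    DB *dbp;
    int ret;

    if ((ret = db_create(&dbp, dbenv, 0)) != 0)
        return ret;
    return dbp->remove(dbp,
        "/var/lib/dirsrv/slapd-deftestinst/db/NamedPipeDB/aci.db",
        NULL, 0);
}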

The DS process (24756) is hanging (thread 10) because another process (16411) acquired that mutex and never released it.

    21d8 dd= 0 locks held 0    write locks 0    pid/thread 24756/140192266393344 flags 0    priority 100
    21d8 WRITE         1 WAIT    NamedPipeDB/aci.db        handle        0
    2147 dd=47 locks held 1    write locks 0    pid/thread 16411/140050708789440 flags 10   priority 100
    2147 READ          1 HELD    NamedPipeDB/aci.db        handle        0
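
The listing above has the shape of Berkeley DB's lock-region dump. A sketch of how a comparable dump can be produced from an already opened environment handle (libdb 5.3; the handle and its message stream are assumed to be set up elsewhere):

/* Sketch only: print the lockers and lock objects of an open DB_ENV to
 * its message stream, which yields output similar to the listing above. */
#include <db.h>

void dump_lock_table(DB_ENV *dbenv)
{
    dbenv->lock_stat_print(dbenv, DB_STAT_LOCK_LOCKERS | DB_STAT_LOCK_OBJECTS);
}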


Process 16411 is no longer running, and nothing in the DS error or audit logs mentions that process.

No core was dumped, and the audit log does not record any crash at that time.
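
When a process that held environment locks exits without releasing them, Berkeley DB can reclaim them with its failchk mechanism. A minimal sketch under the assumption that the environment was created with thread tracking (DB_ENV->set_thread_count()) and an is_alive callback; the home path and thread count are illustrative, and this is not code from 389-ds-base:

/* Sketch only: release locks and mutexes owned by threads of processes
 * that have exited, using DB_ENV->failchk(). */
#include <signal.h>
#include <sys/types.h>
#include <db.h>

/* Return non-zero if the locker's process still exists. */
static int
locker_is_alive(DB_ENV *dbenv, pid_t pid, db_threadid_t tid, u_int32_t flags)
{
    (void)dbenv; (void)tid; (void)flags;
    return kill(pid, 0) == 0;
}

int check_stale_lockers(const char *db_home)
{
    DB_ENV *dbenv;
    int ret;

    if ((ret = db_env_create(&dbenv, 0)) != 0)
        return ret;
    dbenv->set_thread_count(dbenv, 64);        /* required before failchk */
    dbenv->set_isalive(dbenv, locker_is_alive);

    if ((ret = dbenv->open(dbenv, db_home,
                           DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
                           DB_INIT_MPOOL | DB_INIT_TXN | DB_THREAD, 0)) == 0)
        ret = dbenv->failchk(dbenv, 0);        /* clears dead lockers */

    dbenv->close(dbenv, 0);
    return ret;
}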

The fix for BZ https://bugzilla.redhat.com/show_bug.cgi?id=1277887 is already present in the running version, libdb-5.3.21-20.el7.x86_64.


Next steps

  - It would be great to find out which process 16411 was
  - .... :(

Comment 9 mreynolds 2017-10-19 20:47:21 UTC
This test passes for me on 1.3.7 for RHEL 7.5.  I ran the namedPipe tests 10 times without a hang.

Can someone please test the latest version of 1.3.7 from brew and verify the results?  Thanks!

Comment 10 Viktor Ashirov 2017-10-20 08:14:11 UTC
When these tests are executed standalone, they pass. But when they're executed as part of a full acceptance run, with the namedPipeLog tests running last, the hang occurs. I have a hunch that this bug is a side effect of bz1463204, but I need to confirm it.

I'll rerun the tests. Since it's a full run, it'll take some time, results should be available by Monday.

Comment 11 Viktor Ashirov 2017-10-26 14:42:00 UTC
Tests still hang when namedPipeLog is the last test to run. I'm working on a standalone reproducer.

Comment 12 mreynolds 2018-11-30 16:21:30 UTC
Viktor, was there any progress on a reproducer?  Should we push this to RHEL 8?  Thoughts?

Comment 13 Viktor Ashirov 2019-07-18 14:50:45 UTC
(In reply to mreynolds from comment #12)
> Viktor, was there any progress on a reproducer?  Should we push this to
> RHEL 8?  Thoughts?

I can't reproduce it anymore.

