Bug 1939607
Summary: | hang because of incorrect accounting of readers in vattr rwlock | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | thierry bordaz <tbordaz> | |
Component: | 389-ds-base | Assignee: | thierry bordaz <tbordaz> | |
Status: | CLOSED ERRATA | QA Contact: | RHDS QE <ds-qe-bugs> | |
Severity: | unspecified | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 8.4 | CC: | bsmejkal, ldap-maint, mreynolds, sgouvern | |
Target Milestone: | rc | Keywords: | Triaged | |
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | sync-to-jira | |||
Fixed In Version: | 389-ds-1.4-8050020210514191740-d5c171fc | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2018257 | Environment: | |
Last Closed: | 2021-11-09 18:11:20 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 2018257 |
Description
thierry bordaz
2021-03-16 17:06:35 UTC
The first analysis was wrong. There is no vattr lock leak. Actually, a pthread rwlock test program shows the exact same lock dump with:

- T1 reader holding the lock
- T2 writer waiting for T1
- T3 reader waiting for T2
- T4 reader waiting for T2

    (gdb) print *the_map->lock
    $41 = {__data = {__readers = 14, __writers = 0, __wrphase_futex = 2, __writers_futex = 1, __pad3 = 0, __pad4 = 0, __cur_writer = 0, __shared = 0, __rwelision = 0 '\000', __pad1 = "\000\000\000\000\000\000", __pad2 = 0, __flags = 2}, __size = "\016\000\000\000\000\000\000\000\002\000\000\000\001", '\000' , "\002\000\000\000\000\000\000", __align = 14}

The root cause (RC) of the hang is a deadlock involving three threads:

    [08/Mar/2021:18:09:15.255947668 +0000] conn=4 op=561 ADD dn="cn=FleetCommander Desktop Profile Administrators,cn=roles,cn=accounts,dc=ipa,dc=test"
    [08/Mar/2021:18:09:15.261133390 +0000] conn=4 op=562 SRCH base="cn=FleetCommander Desktop Profile Administrators,cn=privileges,cn=pbac,dc=ipa,dc=test" scope=0 filter="(objectClass=*)" attrs="objectClasses aci * attributeTypes"
    [08/Mar/2021:18:09:15.263940289 +0000] conn=4 op=563 ADD dn="cn=FleetCommander Desktop Profile Administrators,cn=privileges,cn=pbac,dc=ipa,dc=test"
    [08/Mar/2021:18:09:15.264024493 +0000] conn=4 op=562 RESULT err=32 tag=101 nentries=0 wtime=0.000045722 optime=0.002898152 etime=0.002941253
    [08/Mar/2021:18:09:15.261304370 +0000] conn=4 op=561 RESULT err=0 tag=105 nentries=0 wtime=0.000103907 optime=0.005360651 etime=0.005394639

Thread 14
conn=4 op=561 ADD "cn=FleetCommander Desktop Profile Administrators,cn=roles,cn=accounts,dc=ipa,dc=test"
Holds the vattr lock in read and waits for a DB page (WAIT userRoot/objectclass.db), which is held by Thread 20
    -> SIDGEN (post-op)
    -> internal SRCH -b "dc=ipa,dc=test" "(objectclass=ipantdomainattrs)"
       op_shared_search => hold vattr lock in read
    -> index read => DB page

Thread 20
conn=4 op=563 ADD "cn=FleetCommander Desktop Profile Administrators,cn=privileges,cn=pbac,dc=ipa,dc=test"
Holds a DB page (HOLD userRoot/objectclass.db) and waits for the vattr lock in read
    -> ADD
    -> memberof modify (txnbe post)
    -> DNA
    -> internal_search
    -> vattr_map_lookup => wait for vattr in read

Thread 6
On backend state change, it rebuilds the cos cache
Tries to acquire the vattr lock in write, which blocks new readers
Internal search SRCH -b "dc=ipa,dc=test" "(&(|(objectclass=cosSuperDefinition)(objectclass=cosDefinition))(objectclass=ldapsubentry))"
    -> cos_dn_defs_cb
    -> vattr_map_insert : wait vattr in write

Thread 6 is blocked by Thread 14
Thread 14 is blocked by Thread 20
Thread 20 is blocked by Thread 6

Fix pushed upstream => POST

Build tested: 389-ds-base-1.4.3.23-1.module+el8.5.0+11016+7e7e9011.x86_64

I ran the freeipa installation 141 times in a loop without a hang. The fix is in the build.
Marking as Verified:Tested, SanityOnly.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (389-ds-base bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4203
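The T1/T2/T3 pattern from the description can be reproduced with a small pthread program. The sketch below is not the original test program (which is not attached to this bug); it is a minimal illustration under two assumptions: that `__flags = 2` in the dump corresponds to glibc's `PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP` lock kind, and that the code runs on glibc, since `pthread_rwlockattr_setkind_np` is a non-portable extension. The names `demo_lock`, `writer_thread`, `reader_probe`, and `new_reader_blocked_by_waiting_writer` are hypothetical. A non-blocking `pthread_rwlock_tryrdlock` is used for the new reader so that the demonstration terminates instead of hanging.

```c
#define _GNU_SOURCE   /* exposes pthread_rwlockattr_setkind_np (glibc extension) */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for the vattr map lock. */
static pthread_rwlock_t demo_lock;

/* T2: writer. Blocks in wrlock until the initial reader (T1) releases. */
static void *writer_thread(void *arg)
{
    (void)arg;
    pthread_rwlock_wrlock(&demo_lock);
    pthread_rwlock_unlock(&demo_lock);
    return NULL;
}

/* T3: new reader. Polls with tryrdlock (for up to about 2 seconds) and
 * records whether it was refused with EBUSY while only a reader held the lock. */
static void *reader_probe(void *arg)
{
    int *blocked = arg;

    for (int i = 0; i < 200 && !*blocked; i++) {
        int rc = pthread_rwlock_tryrdlock(&demo_lock);
        if (rc == 0) {
            /* The writer is not queued yet: release and retry shortly. */
            pthread_rwlock_unlock(&demo_lock);
            usleep(10 * 1000);
        } else if (rc == EBUSY) {
            /* A waiting writer keeps new readers out. */
            *blocked = 1;
        }
    }
    return NULL;
}

/* Returns 1 if a new reader was refused while T1 held the lock in read
 * mode and T2 was waiting for it in write mode, 0 otherwise. */
int new_reader_blocked_by_waiting_writer(void)
{
    pthread_rwlockattr_t attr;
    pthread_t writer, reader;
    int blocked = 0;
    int rc;

    pthread_rwlockattr_init(&attr);
    /* Assumption: this is the lock kind behind "__flags = 2" in the dump. */
    pthread_rwlockattr_setkind_np(&attr,
                                  PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
    pthread_rwlock_init(&demo_lock, &attr);
    pthread_rwlockattr_destroy(&attr);

    pthread_rwlock_rdlock(&demo_lock);               /* T1 holds in read */

    rc = pthread_create(&writer, NULL, writer_thread, NULL);
    assert(rc == 0);
    rc = pthread_create(&reader, NULL, reader_probe, &blocked);
    assert(rc == 0);

    pthread_join(reader, NULL);                      /* wait for T3's verdict */
    pthread_rwlock_unlock(&demo_lock);               /* T1 releases; T2 proceeds */
    pthread_join(writer, NULL);
    pthread_rwlock_destroy(&demo_lock);
    return blocked;
}
```

Calling `new_reader_blocked_by_waiting_writer()` from a `main` function (built with `-pthread`) is expected to return 1 on glibc: no writer ever holds the lock, yet the new reader is refused. If the probe used a blocking `pthread_rwlock_rdlock` instead, it would wait behind the writer, which is the position Thread 20 is in with respect to Thread 6 above.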