1744623 – DB Deadlock on modrdn appears to corrupt database and entry cache

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	389-ds-base
Sub Component:
Version:	7.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	mreynolds
QA Contact:	RHDS QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1744146 1744662 1749289

Reported:	2019-08-22 14:40 UTC by mreynolds
Modified:	2020-09-13 22:09 UTC (History)
CC List:	7 users (show)
Fixed In Version:	389-ds-base-1.3.10.1-5.el7
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1744662 1749289 (view as bug list)
Environment:
Last Closed:	2020-03-31 19:46:15 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	389ds 389-ds-base issues 2683	0	None	closed	DB Deadlock on modrdn appears to corrupt database and entry cache	2021-02-03 19:41:57 UTC
Red Hat Product Errata	RHBA-2020:1064	0	None	None	None	2020-03-31 19:46:55 UTC

Description mreynolds 2019-08-22 14:40:36 UTC

This bug is created as a clone of upstream ticket:
https://pagure.io/389-ds-base/issue/49624

#### Issue Description

If a DB deadlock error occurs during a MODRDN, the operation is retried, but the second pass of that same operation goes wrong.
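
The retry behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the 389-ds-base code path: the `cache` dictionary, the `modrdn` function, the `Deadlock` exception, and the `restore_on_retry` switch are all hypothetical names. It shows one way a retried operation can fail with error 68 if in-memory state changed by the first pass is not undone before the second pass.

```python
LDAP_SUCCESS = 0
LDAP_BUSY = 51
LDAP_ALREADY_EXISTS = 68


class Deadlock(Exception):
    """Hypothetical stand-in for a database deadlock error."""


def modrdn(cache, old_dn, new_dn, deadlocks=1, restore_on_retry=True, max_retries=3):
    """Toy rename with a retry loop; returns an LDAP-style result code."""
    snapshot = dict(cache)  # state before the first pass
    for attempt in range(max_retries + 1):
        if attempt > 0 and restore_on_retry:
            # Undo whatever the failed pass changed before trying again.
            cache.clear()
            cache.update(snapshot)
        try:
            if new_dn in cache:
                return LDAP_ALREADY_EXISTS
            cache[new_dn] = cache[old_dn]  # in-memory update happens first
            if attempt < deadlocks:
                raise Deadlock()  # injected failure, one per configured deadlock
            del cache[old_dn]  # "commit": drop the old name
            return LDAP_SUCCESS
        except Deadlock:
            continue
    return LDAP_BUSY


# Without restoring state, the retry trips over the leftover from pass one.
stale = {"cn=a,ou=x": 6}
print(modrdn(stale, "cn=a,ou=x", "cn=a,ou=y", restore_on_retry=False))

# With the state restored, the retry succeeds.
clean = {"cn=a,ou=x": 6}
print(modrdn(clean, "cn=a,ou=x", "cn=a,ou=y", restore_on_retry=True))
```

In the first call the toy cache ends up holding both names for the same entry ID, which loosely mirrors the mismatch between cache and database reported here.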

To reproduce, do a modrdn that moves an entry to a new superior, then try to move it back to the original subtree. Note: I instrumented the code to always trigger a single deadlock error. When I try to move the entry back to the original subtree/superior, I get error 68 (entry already exists)!

ldapsearch shows the entry was not moved as expected since we got an error:

 ldapsearch -D cn=dm -w password -b "ou=groups,dc=example,dc=com" -s sub -xLLL cn=accoun* \* \+
dn: cn=Accounting Managers,ou=MyOU,ou=Groups,dc=example,dc=com
objectClass: top
objectClass: groupOfUniqueNames
cn: Accounting Managers
ou: groups
description: People who can manage accounting entries
uniqueMember: cn=dm
nsUniqueId: 5a508aab-2e9611e8-b333e893-f12dcd9f
creatorsName:
modifiersName: cn=dm
createTimestamp: 20180323123313Z
modifyTimestamp: 20180323130251Z
entryid: 6
parentid: 10
entrydn: cn=accounting managers,ou=myou,ou=groups,dc=example,dc=com

If I restart the server, the entry is now in the original subtree, even though the move back was reported as failed:

ldapsearch -D cn=dm -w password -b "ou=groups,dc=example,dc=com" -s sub -xLLL cn=accoun* \* \+

dn: cn=Accounting Managers,ou=Groups,dc=example,dc=com
objectClass: top
objectClass: groupOfUniqueNames
cn: Accounting Managers
...

Performing ldapsearch using various scopes also gives inconsistent results for this entry:

[root@localhost BUILD]# ldapsearch -D cn=dm -w password -b "ou=groups,dc=example,dc=com" -s one -xLLL cn=account*

---> no results

[root@localhost BUILD]# ldapsearch -D cn=dm -w password -b "ou=groups,dc=example,dc=com" -s sub -xLLL cn=account*
dn: cn=Accounting Managers,ou=Groups,dc=example,dc=com
objectClass: top
objectClass: groupOfUniqueNames
cn: Accounting Managers
ou: groups
description: People who can manage accounting entries
uniqueMember: cn=dm
nsUniqueId: 5a508aab-2e9611e8-b333e893-f12dcd9f
creatorsName:
modifiersName: cn=dm
createTimestamp: 20180323123313Z
modifyTimestamp: 20180323130251Z
entryid: 6
parentid: 10
entrydn: cn=accounting managers,ou=groups,dc=example,dc=com


dbscan shows that the entry's parentid still points to the old subtree:

	rdn: cn=Accounting Managers
	objectClass: top
	objectClass: groupOfUniqueNames
	cn: Accounting Managers
	ou: groups
	description: People who can manage accounting entries
	uniqueMember: cn=dm
	nsUniqueId: 5a508aab-2e9611e8-b333e893-f12dcd9f
	creatorsName:
	modifiersName: cn=dm
	createTimestamp: 20180323123313Z
	modifyTimestamp: 20180323130251Z
	entryid: 6
	parentid: 10

parentid should be 3 (not 10) in this case.  Perhaps that is messing up the scoped search?
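
The suspicion above can be made concrete with a toy model. It assumes, purely for illustration, that a one-level search is answered from the stored parentid while a subtree search is answered from the DN; the report does not confirm how the server resolves either scope. The `ENTRIES` table, the helper names, and entry IDs 1 and 10 are hypothetical; only entry 6 with parentid 10 (expected 3) comes from the output above.

```python
# Toy entry table keyed by entry ID. IDs 1 and 10 are assumed for illustration;
# the report only states that entry 6 has parentid 10 and should have 3.
ENTRIES = {
    1: {"dn": "dc=example,dc=com", "parentid": None},
    3: {"dn": "ou=groups,dc=example,dc=com", "parentid": 1},
    10: {"dn": "ou=myou,ou=groups,dc=example,dc=com", "parentid": 3},
    # DN says "directly under ou=groups", parentid still says "under ou=myou".
    6: {"dn": "cn=accounting managers,ou=groups,dc=example,dc=com", "parentid": 10},
}


def search_one(entries, base_dn, rdn_prefix):
    """One-level search modeled as a lookup on the stored parentid."""
    base_id = next(eid for eid, e in entries.items() if e["dn"] == base_dn)
    return [e["dn"] for e in entries.values()
            if e["parentid"] == base_id and e["dn"].startswith(rdn_prefix)]


def search_sub(entries, base_dn, rdn_prefix):
    """Subtree search modeled as a match on the DN suffix."""
    return [e["dn"] for e in entries.values()
            if (e["dn"] == base_dn or e["dn"].endswith("," + base_dn))
            and e["dn"].startswith(rdn_prefix)]


BASE = "ou=groups,dc=example,dc=com"
print(search_one(ENTRIES, BASE, "cn=account"))  # empty list: parentid is stale
print(search_sub(ENTRIES, BASE, "cn=account"))  # finds the entry by its DN
```

Under these assumptions the two scopes disagree in the same way as the ldapsearch results: nothing for `-s one`, one entry for `-s sub`.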

If I export and reimport the ldif, the parentid is adjusted to the correct value of 3, and the entry is found under the original subtree.
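
One plausible reason a reimport repairs the value is that the parent link is derived again from each entry's DN. The sketch below shows that idea on a toy table. It is an assumption about the general approach, not a description of the actual import code; `rebuild_parentids`, the table layout, and entry IDs 1 and 10 are hypothetical, and the DN split is naive (it ignores escaped commas).

```python
def rebuild_parentids(entries):
    """Recompute every parentid from the DN hierarchy (naive comma split)."""
    by_dn = {e["dn"]: eid for eid, e in entries.items()}
    for e in entries.values():
        # Everything after the first comma is treated as the parent DN.
        _, _, parent_dn = e["dn"].partition(",")
        e["parentid"] = by_dn.get(parent_dn)  # None when no parent entry exists
    return entries


entries = {
    1: {"dn": "dc=example,dc=com", "parentid": None},
    3: {"dn": "ou=groups,dc=example,dc=com", "parentid": 1},
    10: {"dn": "ou=myou,ou=groups,dc=example,dc=com", "parentid": 3},
    6: {"dn": "cn=accounting managers,ou=groups,dc=example,dc=com", "parentid": 10},
}
rebuild_parentids(entries)
print(entries[6]["parentid"])
```

After the rebuild, entry 6 points at entry 3 again, matching the value the report expects.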

So we are seeing database & entry cache corruption when a db deadlock occurs on modrdn operations.

Comment 2 mreynolds 2019-08-22 16:02:24 UTC

Cloned to RHEL 8:  

https://bugzilla.redhat.com/show_bug.cgi?id=1744662

Comment 7 Viktor Ashirov 2020-01-14 13:01:57 UTC

Build tested: 389-ds-base-1.3.10.1-4.el7

Using the automated reproducer from https://pagure.io/389-ds-base/pull-request/50821, I get error 68 on MODRDN. With an ASAN build I also get the following error:

=================================================================
==4427==ERROR: AddressSanitizer: heap-use-after-free on address 0x60400024f3a0 at pc 0x7f9b0b8093c3 bp 0x7f9aeb07e1b0 sp 0x7f9aeb07e1a0
READ of size 8 at 0x60400024f3a0 thread T20
    #0 0x7f9b0b8093c2 in slapi_sdn_get_dn (/usr/lib64/dirsrv/libslapd.so.0+0xeb3c2)
    #1 0x7f9b0b809728 in slapi_sdn_dup (/usr/lib64/dirsrv/libslapd.so.0+0xeb728)
    #2 0x7f9afcb4e457 in ldbm_back_modrdn ldap/servers/slapd/back-ldbm/ldbm_modrdn.c:254
    #3 0x7f9b0b895805 in op_shared_rename ldap/servers/slapd/modrdn.c:612
    #4 0x7f9b0b896deb in do_modrdn (/usr/lib64/dirsrv/libslapd.so.0+0x178deb)
    #5 0x5594c1e03c1a in connection_dispatch_operation ldap/servers/slapd/connection.c:620
    #6 0x5594c1e03c1a in connection_threadmain ldap/servers/slapd/connection.c:1791
    #7 0x7f9b09953bfa in _pt_root ../../../nspr/pr/src/pthreads/ptthread.c:201
    #8 0x7f9b092f3ea4 in start_thread /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:307
    #9 0x7f9b0899f8dc in __clone (/lib64/libc.so.6+0xfe8dc)

0x60400024f3a0 is located 16 bytes inside of 40-byte region [0x60400024f390,0x60400024f3b8)
freed by thread T20 here:
    #0 0x7f9b0bf79020 in __interceptor_free (/lib64/libasan.so.5+0xee020)
    #1 0x7f9b0b7f11e8 in slapi_ch_free (/usr/lib64/dirsrv/libslapd.so.0+0xd31e8)
    #2 0x7f9afcb4e40e in ldbm_back_modrdn ldap/servers/slapd/back-ldbm/ldbm_modrdn.c:252
    #3 0x7f9b0b895805 in op_shared_rename ldap/servers/slapd/modrdn.c:612
    #4 0x7f9b0b896deb in do_modrdn (/usr/lib64/dirsrv/libslapd.so.0+0x178deb)
    #5 0x5594c1e03c1a in connection_dispatch_operation ldap/servers/slapd/connection.c:620
    #6 0x5594c1e03c1a in connection_threadmain ldap/servers/slapd/connection.c:1791
    #7 0x7f9b09953bfa in _pt_root ../../../nspr/pr/src/pthreads/ptthread.c:201

previously allocated by thread T20 here:
    #0 0x7f9b0bf793e0 in malloc (/lib64/libasan.so.5+0xee3e0)
    #1 0x7f9b0b7f0a03 in slapi_ch_malloc (/usr/lib64/dirsrv/libslapd.so.0+0xd2a03)
    #2 0x7f9b0b807d52 in slapi_sdn_new (/usr/lib64/dirsrv/libslapd.so.0+0xe9d52)
    #3 0x7f9b0b808b7e in slapi_sdn_new_normdn_byval (/usr/lib64/dirsrv/libslapd.so.0+0xeab7e)
    #4 0x7f9afcb545a7 in ldbm_back_modrdn ldap/servers/slapd/back-ldbm/ldbm_modrdn.c:955
    #5 0x7f9b0b895805 in op_shared_rename ldap/servers/slapd/modrdn.c:612
    #6 0x7f9b0b896deb in do_modrdn (/usr/lib64/dirsrv/libslapd.so.0+0x178deb)
    #7 0x5594c1e03c1a in connection_dispatch_operation ldap/servers/slapd/connection.c:620
    #8 0x5594c1e03c1a in connection_threadmain ldap/servers/slapd/connection.c:1791
    #9 0x7f9b09953bfa in _pt_root ../../../nspr/pr/src/pthreads/ptthread.c:201

Thread T20 created by T0 here:
    #0 0x7f9b0bedce9f in pthread_create (/lib64/libasan.so.5+0x51e9f)
    #1 0x7f9b099538cb in _PR_CreateThread ../../../nspr/pr/src/pthreads/ptthread.c:433

SUMMARY: AddressSanitizer: heap-use-after-free (/usr/lib64/dirsrv/libslapd.so.0+0xeb3c2) in slapi_sdn_get_dn
Shadow bytes around the buggy address:
  0x0c0880041e20: fa fa 00 00 00 00 04 fa fa fa 00 00 00 00 04 fa
  0x0c0880041e30: fa fa 00 00 00 00 04 fa fa fa 00 00 00 00 06 fa
  0x0c0880041e40: fa fa 00 00 00 00 06 fa fa fa 00 00 00 00 06 fa
  0x0c0880041e50: fa fa 00 00 00 00 04 fa fa fa 00 00 00 00 04 fa
  0x0c0880041e60: fa fa 00 00 00 00 04 fa fa fa 00 00 00 00 04 fa
=>0x0c0880041e70: fa fa fd fd[fd]fd fd fa fa fa fd fd fd fd fd fa
  0x0c0880041e80: fa fa fd fd fd fd fd fd fa fa fd fd fd fd fd fa
  0x0c0880041e90: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c0880041ea0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c0880041eb0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c0880041ec0: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==4427==ABORTING

Marking as ASSIGNED.

Comment 9 Viktor Ashirov 2020-02-07 09:48:40 UTC

The patch is upstream: https://pagure.io/389-ds-base/c/7abd73c62cc04c38977c119b0d3254ec9e0d496f?branch=389-ds-base-1.3.10
I tested a scratch ASAN build with this patch, and the tests passed.

Comment 11 Viktor Ashirov 2020-02-10 10:53:55 UTC

Build tested:
389-ds-base-1.3.10.1-5.el7 with ASAN

No errors reported during the test. Marking as VERIFIED.

Comment 13 errata-xmlrpc 2020-03-31 19:46:15 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1064
