Bug 1086908

Summary: Performing deletes during tombstone purging results in operation errors
Product: Red Hat Enterprise Linux 7
Component: 389-ds-base
Version: 7.1
Status: CLOSED ERRATA
Severity: unspecified
Priority: medium
Reporter: Noriko Hosoi <nhosoi>
Assignee: Noriko Hosoi <nhosoi>
QA Contact: Viktor Ashirov <vashirov>
CC: nkinder, rmeggins, vashirov
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: 389-ds-base-1.3.3.1-1.el7
Doc Type: Bug Fix
Story Points: ---
Clone Of: 1086907
Last Closed: 2015-03-05 09:34:24 UTC
Type: Bug

Description Noriko Hosoi 2014-04-11 19:33:47 UTC
+++ This bug was initially created as a clone of Bug #1086907 +++

Description of problem:
There is a race condition when looking up the parent of a deleted entry: tombstone purging, which updates the parent's tombstonenumsubordinates attribute, can replace the cached parent entry. The cached entry is removed just after it is retrieved from the cache but before it can be locked (cache_lock_entry). Locking the entry that was just retrieved then fails, and the delete returns an operations error.

Comment 2 Jenny Severance 2014-10-23 12:31:57 UTC

Verification steps:

[1]  Create one instance, but enable/set up replication (no agreement needed)
[2]  Set the tombstone purging interval to 20 seconds

  example: 

dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
replace: nsDS5ReplicaTombstonePurgeInterval
nsDS5ReplicaTombstonePurgeInterval: 20

[3]  Import or add 500 entries
[4]  Delete those 500 entries
[5]  All the delete operations should pass and not report error 1 (operations error), or any other error for that matter.
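
The example in step [2] hard-codes the suffix dc=example,dc=com. As a rough sketch for other suffixes, the LDIF could be generated by a small shell function. The function name purge_interval_ldif is hypothetical and not part of 389-ds-base; it escapes only '=' and ',' in the suffix, exactly as the example above does, and assumes the replica entry lives under cn=mapping tree,cn=config as shown there.

```shell
# Hypothetical helper (not part of 389-ds-base): print the LDIF that sets
# nsDS5ReplicaTombstonePurgeInterval on the replica entry of a suffix.
# Usage: purge_interval_ldif <suffix> <interval-in-seconds>
purge_interval_ldif() {
    suffix=$1
    interval=$2
    # Escape '=' as \3D and ',' as \2C, mirroring the example DN above.
    # Assumption: the suffix contains no other characters that need escaping.
    escaped=$(printf '%s' "$suffix" | sed -e 's/=/\\3D/g' -e 's/,/\\2C/g')
    cat <<EOF
dn: cn=replica,cn=${escaped},cn=mapping tree,cn=config
changetype: modify
replace: nsDS5ReplicaTombstonePurgeInterval
nsDS5ReplicaTombstonePurgeInterval: ${interval}
EOF
}
```

The output can be piped to ldapmodify with whatever bind options the instance requires, for instance: purge_interval_ldif "dc=example,dc=com" 20 | ldapmodify -D "cn=Directory Manager" -w <password>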

Comment 3 Viktor Ashirov 2015-01-23 15:34:34 UTC
$ rpm -qa | grep 389
389-ds-base-debuginfo-1.3.3.1-11.el7.x86_64
389-ds-base-libs-1.3.3.1-11.el7.x86_64
389-ds-base-1.3.3.1-11.el7.x86_64

1. Create one instance, but enable/setup replication (no agreement needed)
2. Set the tombstone purging interval to 20 seconds

$ ldapmodify -D "cn=Directory Manager" -w Secret123  << EOF
dn: cn=changelog5,cn=config
changetype: add
objectclass: top
objectclass: extensibleObject
cn: changelog5
nsslapd-changelogdir: /var/lib/dirsrv/slapd-rhel7/changelogdb
nsslapd-changelogmaxage: 10d
EOF
adding new entry "cn=changelog5,cn=config"

$ ldapmodify -D "cn=Directory Manager" -w Secret123  << EOF
dn: cn=replica,cn=dc\=example\,dc\=com,cn=mapping tree,cn=config
changetype: add
objectclass: top
objectclass: nsds5replica
objectclass: extensibleObject
cn: replica
nsds5replicaroot: dc=example,dc=com
nsds5replicaid: 7
nsds5replicatype: 3
nsds5flags: 1
nsds5ReplicaPurgeDelay: 604800
nsds5ReplicaBindDN: cn=replication manager,cn=config
nsDS5ReplicaTombstonePurgeInterval: 20
EOF
adding new entry "cn=replica,cn=dc\=example\,dc\=com,cn=mapping tree,cn=config"

3. Import or add 500 entries
$ ldclt -h localhost -p 389 -D "cn=Directory Manager" -w Secret123 -f cn=MrXXXXX -b "ou=people,dc=example,dc=com" -e add,person,incr,noloop,commoncounter -r0 -R499 -n 1 -v
ldclt version 4.23
/usr/bin/ldclt-bin -h localhost -p 389 -D "cn=Directory Manager" -w Secret123 -f cn=MrXXXXX -b ou=people,dc=example,dc=com -e add,person,incr,noloop,commoncounter -r0 -R499 -n 1 -v
Process ID         = 5834
Host to connect    = localhost
Port number        = 389
Bind DN            = cn=Directory Manager
Passwd             = Secret123
Referral           = on
Base DN            = ou=people,dc=example,dc=com
Filter             = "cn=MrXXXXX"
Max times inactive = 3
Max allowed errors = 1000
Number of samples  = -1
Number of threads  = 1
Total op. req.     = -1
Running mode       = 0x0e040201
Running mode       = verbose incremental commoncounter noloop add class=person
LDAP oper. timeout = 30 sec
Sampling interval  = 10 sec
Values range       = [0 , 499]
Filter's head      = "cn=Mr"
Filter's tail      = ""
ldclt[5834]: Starting at Fri Jan 23 16:25:56 2015

ldclt[5834]: T000: Hit top incremental value
ldclt[5834]: T000: thread is dead.
ldclt[5834]: Average rate:  500.00/thr  (  50.00/sec), total:    500
ldclt[5834]: Average rate:    0.00/thr  (   0.00/sec), total:      0
ldclt[5834]: All threads are dead - exit.
ldclt[5834]: Global average rate:  500.00/thr  ( 25.00/sec), total:    500
ldclt[5834]: Global number times "no activity" reports: never
ldclt[5834]: Global number of dead threads: 1
ldclt[5834]: Global no error occurs during this session.
ldclt[5834]: Ending at Fri Jan 23 16:26:16 2015
ldclt[5834]: Exit status 0 - No problem during execution.

4. Delete those 500 entries
$ ldclt -h localhost -p 389 -D "cn=Directory Manager" -w Secret123 -f cn=MrXXXXX -b "ou=people,dc=example,dc=com" -e delete,person,incr,noloop,commoncounter -r0 -R499 -n 1 -v
ldclt version 4.23
/usr/bin/ldclt-bin -h localhost -p 389 -D "cn=Directory Manager" -w Secret123 -f cn=MrXXXXX -b ou=people,dc=example,dc=com -e delete,person,incr,noloop,commoncounter -r0 -R499 -n 1 -v
Process ID         = 5837
Host to connect    = localhost
Port number        = 389
Bind DN            = cn=Directory Manager
Passwd             = Secret123
Referral           = on
Base DN            = ou=people,dc=example,dc=com
Filter             = "cn=MrXXXXX"
Max times inactive = 3
Max allowed errors = 1000
Number of samples  = -1
Number of threads  = 1
Total op. req.     = -1
Running mode       = 0x0b040201
Running mode       = verbose incremental commoncounter noloop delete class=person
LDAP oper. timeout = 30 sec
Sampling interval  = 10 sec
Values range       = [0 , 499]
Filter's head      = "cn=Mr"
Filter's tail      = ""
ldclt[5837]: Starting at Fri Jan 23 16:26:16 2015

ldclt[5837]: T000: Hit top incremental value
ldclt[5837]: T000: thread is dead.
ldclt[5837]: Average rate:  500.00/thr  (  50.00/sec), total:    500
ldclt[5837]: Average rate:    0.00/thr  (   0.00/sec), total:      0
ldclt[5837]: All threads are dead - exit.
ldclt[5837]: Global average rate:  500.00/thr  ( 25.00/sec), total:    500
ldclt[5837]: Global number times "no activity" reports: never
ldclt[5837]: Global number of dead threads: 1
ldclt[5837]: Global no error occurs during this session.
ldclt[5837]: Ending at Fri Jan 23 16:26:36 2015
ldclt[5837]: Exit status 0 - No problem during execution.

No errors were returned. Marking as VERIFIED.
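
For repeated runs, the manual check of the ldclt summary could be scripted. The following is only a sketch: the function name ldclt_run_clean is hypothetical, it assumes the ldclt output has been captured and is fed on standard input, and it matches the two summary lines exactly as they appear in the logs above (other ldclt versions may word them differently).

```shell
# Hypothetical helper: succeed only if an ldclt log read from stdin contains
# both the "no error" summary line and the "Exit status 0" line seen above.
ldclt_run_clean() {
    log=$(cat)
    printf '%s\n' "$log" | grep -q 'Global no error occurs during this session' &&
        printf '%s\n' "$log" | grep -q 'Exit status 0 '
}
```

With the delete run saved to a file, say delete.log, the check would be: ldclt_run_clean < delete.log && echo VERIFIED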

Comment 5 errata-xmlrpc 2015-03-05 09:34:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0416.html