Environment:

  OS: RHEL 5.x
  389DS: 389-ds-base-1.2.8.3-1.el5
  Replication: Supplier*3, Consumer*3

In the environment above, leftover tombstone entries cannot be removed. I first thought this was related to Bug 696407 or Bug 684996, but that does not seem to be the case.

After I deleted the entries below, error messages were logged during the purge process.

  cn=111111100,ou=group,o=example,c=jp
  cn=222222200,cn=111111100,ou=group,o=example,c=jp

The error messages are:

[errors]
[13/Sep/2011:10:34:22 +0900] str2entry - Failed to convert DN cn=222222200 to RDN
[13/Sep/2011:10:34:22 +0900] id2entry - str2entry returned NULL for id 99152, string="rdn"

Here is the dbscan output:

[dbscan]
# dbscan -f /var/lib/dirsrv/slapd-wam-ldap01/db/example/id2entry.db4 -K 99152
id 99152
    rdn: nsuniqueid=4284f902-dd3911e0-883998b7-9c671d85,cn=222222200
    objectClass;vucsn-4e6df8ab0001044c0000: top
    objectClass;vucsn-4e6df8ab0001044c0000: groupOfUniqueNames
    objectClass;vucsn-4e6df8d10000044c0000: nsTombstone
    ou;vucsn-4e6df8ab0001044c0000: groups
    cn;vucsn-4e6df8ab0001044c0000;mdcsn-4e6df8ab0001044c0000: 222222200
    creatorsName;vucsn-4e6df8ab0001044c0000: cn=admin
    modifiersName;vucsn-4e6df8ab0001044c0000: cn=admin
    createTimestamp;vucsn-4e6df8ab0001044c0000: 20110912121849Z
    modifyTimestamp;vucsn-4e6df8ab0001044c0000: 20110912121849Z
    nsUniqueId: 4284f902-dd3911e0-883998b7-9c671d85
    parentid: 99151
    entryid: 99152
    nsParentUniqueId: 4284f901-dd3911e0-883998b7-9c671d85
    nscpEntryDN: cn=222222200,cn=111111100,ou=group,o=example,c=jp

# dbscan -f /var/lib/dirsrv/slapd-wam-ldap01/db/example/id2entry.db4 -K 99151
Can't set cursor to returned item: DB_NOTFOUND: No matching key/data pair found

Thanks for any help in advance.
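The dbscan output above shows a child tombstone (id 99152) whose parentid (99151) no longer exists in id2entry. As a rough illustration only, the following Python sketch parses dbscan-style text and lists entries whose parent is missing from the same dump. It assumes the "id N" plus "attribute: value" layout shown in this report; the function names `parse_dbscan_entries` and `find_orphans` are hypothetical helpers, not part of 389 DS. It only sees what is in the text it is given, so a parent that was simply not dumped would also be reported.

```python
import re


def parse_dbscan_entries(text):
    """Split dbscan-style text into {entry_id: {attribute: [values]}}."""
    entries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"^id (\d+)$", line)
        if match:
            # A new "id N" header starts a new entry.
            current = entries.setdefault(int(match.group(1)), {})
            continue
        if current is None or ":" not in line:
            continue
        name, _, value = line.partition(":")
        # Drop attribute options such as ";vucsn-..." and keep the base name.
        base = name.split(";", 1)[0].strip().lower()
        current.setdefault(base, []).append(value.strip())
    return entries


def find_orphans(entries):
    """Return ids of entries whose parentid is not present in the dump."""
    orphans = []
    for entry_id, attrs in sorted(entries.items()):
        for parent in attrs.get("parentid", []):
            if int(parent) not in entries:
                orphans.append(entry_id)
    return orphans


# Abbreviated sample taken from the dbscan output in this report.
SAMPLE = """\
id 99152
    rdn: nsuniqueid=4284f902-dd3911e0-883998b7-9c671d85,cn=222222200
    objectClass;vucsn-4e6df8d10000044c0000: nsTombstone
    parentid: 99151
    entryid: 99152
"""

if __name__ == "__main__":
    print(find_orphans(parse_dbscan_entries(SAMPLE)))
```

With the abbreviated sample, entry 99152 is reported because 99151 is absent, which matches the DB_NOTFOUND result for key 99151.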
Aoki-san,

I'm trying to reproduce the problem. So far, no luck. Could you share the steps to reproduce the bug?

What are the values of nsds5ReplicaPurgeDelay and nsds5ReplicaTombstonePurgeInterval in your replica entry?

You had a subtree like this:

  ou=group,o=example,c=jp
  cn=111111100,ou=group,o=example,c=jp
  cn=222222200,cn=111111100,ou=group,o=example,c=jp

How did you delete the subtree?

Does the problem occur every time you repeat the test? Or does it depend on the timing of the subtree deletion?
Thank you for the response.

> I'm trying to reproduce the problem. So far, no luck. Could you share the
> steps to reproduce the bug?
>
> What are the values of nsds5ReplicaPurgeDelay and
> nsds5ReplicaTombstonePurgeInterval in your replica entry?

The default values are used.

  nsds5ReplicaPurgeDelay: 604800
  nsDS5ReplicaTombstonePurgeInterval: 86400

> You had a subtree like this:
> ou=group,o=example,c=jp
> cn=111111100,ou=group,o=example,c=jp
> cn=222222200,cn=111111100,ou=group,o=example,c=jp
> How did you delete the subtree?

The problem occurred when the entries were deleted one at a time, the lower-level entry first (1 -> 2).

  1. ldapdelete -x -D cn=root -W cn=222222200,cn=111111100,ou=group,o=osakagas,c=jp
  2. ldapdelete -x -D cn=root -W cn=111111100,ou=group,o=osakagas,c=jp

> Does the problem occur every time you repeat the test? Or does it depend on
> the timing of the subtree deletion?

I retried this case. After restarting dirsrv or running db2ldif, the problem always occurred. (The problem may not occur if dirsrv is never restarted after it is started and delete operations 1 and 2 above are performed.)

Please try restarting dirsrv.

Thanks for any help in advance.
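For readers unfamiliar with the two settings quoted above, the short Python sketch below converts the default values into days. It uses a deliberately simplified model, which is an assumption and not a description of the server's actual scheduling: a tombstone becomes eligible once the purge delay has elapsed, and the purge routine runs once per interval, so removal would fall somewhere between `delay` and `delay + interval` after the delete. The function name `purge_window` is hypothetical.

```python
from datetime import timedelta

# Default values reported in this thread.
PURGE_DELAY_SECONDS = 604800     # nsds5ReplicaPurgeDelay
PURGE_INTERVAL_SECONDS = 86400   # nsds5ReplicaTombstonePurgeInterval


def purge_window(delay_s, interval_s):
    """Return (earliest, latest) age at which a tombstone would be purged.

    Simplified model (assumption): eligible after delay_s, and the purge
    routine wakes up every interval_s.
    """
    earliest = timedelta(seconds=delay_s)
    latest = timedelta(seconds=delay_s + interval_s)
    return earliest, latest


if __name__ == "__main__":
    earliest, latest = purge_window(PURGE_DELAY_SECONDS, PURGE_INTERVAL_SECONDS)
    print(earliest.days, latest.days)  # 7 8
```

Under this model the defaults mean a tombstone lives for roughly seven to eight days, which is why shorter values are convenient when testing.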
Thank you for the steps to reproduce the problem. I was able to reproduce it, and it looks like a duplicate of this bug:

  Bug 736431 - parent tombstone entry could be reaped even if its child tombstone entries still exist

Please take a look at that bug as well as this trac ticket.
https://fedorahosted.org/389/ticket/2

*** This bug has been marked as a duplicate of bug 736431 ***
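The title of Bug 736431 implies an ordering constraint: a parent tombstone should not be reaped while child tombstones still exist. The Python sketch below is a toy model of that constraint only. It is not the server's code and not the fix applied in 389 DS; `reapable`, `reap_all`, and the id 100 used for the live parent are hypothetical.

```python
def reapable(tombstones, parent_of):
    """Return tombstone ids that have no tombstone children left."""
    parents_with_children = {
        parent_of[t] for t in tombstones if parent_of.get(t) in tombstones
    }
    return sorted(t for t in tombstones if t not in parents_with_children)


def reap_all(tombstones, parent_of):
    """Reap in rounds so that children always go before their parents."""
    remaining = set(tombstones)
    order = []
    while remaining:
        batch = reapable(remaining, parent_of)
        if not batch:
            break  # guard against inconsistent (cyclic) parent data
        order.extend(batch)
        remaining.difference_update(batch)
    return order


if __name__ == "__main__":
    # 99152 is the child tombstone and 99151 its parent, as in this report.
    # 100 is a hypothetical id standing in for the live ou=group entry.
    parent_of = {99152: 99151, 99151: 100}
    print(reap_all({99151, 99152}, parent_of))  # [99152, 99151]
```

In the situation reported here the opposite happened: 99151 was gone while 99152 remained, leaving the child without a resolvable parent.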
[trac] https://fedorahosted.org/389/ticket/2
Created attachment 568133
The file is an LDIF that creates a nested entry structure.
I ran the test described in "Trac-Ticket2-test-scenario.txt" using the source code downloaded on 2012/2/17.

* test scenario
  https://fedorahosted.org/389/attachment/ticket/2/Trac-Ticket2-test-scenario.txt
* download command
  git clone http://git.fedorahosted.org/git/389/ds.git
  -> 389ds version is "1.2.11.a1.gitf7b882a"
* Entry deleted in this test, which has a nested structure below it (please refer to attached file #568133, 'import.ldif')
  # ldapdelete -x -D xxxxxxxx -W "uid=000000001,ou=people,o=example,c=jp" -r

As a result of the test, the following errors were not written to the log file.

  "_entry_set_tombstone_rdn - Failed to convert DN uid=... to RDN"
  "id2entry - str2entry returned NULL for id 30, string="rdn""

However, the deleted entries still seem to remain in the database.

  # dbscan -f /var/lib/dirsrv/slapd-xxxxxxxx/db/example/id2entry.db4 -K XX
  id XX
      rdn: nsuniqueid=a7c47f04-610c11e1-839ef58b-164ce082,uid=000000001

My environment is set up as follows.

  "nsds5ReplicaPurgeDelay:60"
  "nsds5ReplicaTombstonePurgeInterval: 60"

When is this entry deleted from the database?

I also found a new problem, different from the one above. If an entry is deleted while the "referential integrity postoperation" plug-in is enabled, the server process terminates unexpectedly. (When this plug-in is not enabled, the process does not terminate.)

The test steps are as follows.

1. Enable "referential integrity postoperation"
   cn: referential integrity postoperation
   nsslapd-pluginEnabled: on
2. Delete an entry that is referred to by other entries (please refer to attached file #568133, 'import.ldif')
   # ldapdelete -x -D xxxxxxxx -W "uid=000000001,ou=people,o=example,c=jp" -r
3. Check the "dirsrv" service
   # service dirsrv status
   dirsrv "instance" dead but pid file exists

Thanks for any help in advance.
Tabata-san,

> When is this entry deleted from the database?

The next time any update operation is made. So you may need to perform some update operation such as add, modify, or delete.

> 3. Check the "dirsrv" service
> # service dirsrv status
> dirsrv "instance" dead but pid file exists

Do you mean your server crashed? Do you have a core file? Or what happens if you attach gdb to ns-slapd and delete the entries?

Thanks!
> 3. Check the "dirsrv" service

I could not reproduce the problem.

$ service dirsrv status
dirsrv s1 (pid 3450) is running...
dirsrv s2 (pid 4013) is running...

$ ldapsearch -h localhost -p <port1> -LLLx -b "o=my.com" -D 'cn=directory manager' -w <pw> dn
dn: o=example,c=jp
dn: ou=people,o=example,c=jp
dn: ou=group,o=example,c=jp
dn: cn=000000003,ou=group,o=example,c=jp

$ ldapsearch -h localhost -p <port2> -LLLx -b "o=my.com" -D 'cn=directory manager' -w Secret123 dn
dn: o=example,c=jp
dn: ou=people,o=example,c=jp
dn: ou=group,o=example,c=jp
dn: cn=000000003,ou=group,o=example,c=jp

Also, could you try the same steps against 389-ds-base 1.2.10-3? That version should be more stable. A build from the git master could contain unexpected bugs.
Thank you for the response.

> At the next time any update operation is made. So, you may need to do some
> modification operation such as add, modify, delete.

After deleting other entries, I confirmed that the entries in question were deleted.

> Also, could you try the same steps against 389-ds-base 1.2.10-3? The version
> should be more stable. The build from the git master could contain the
> unexpected bugs.

I tried the same steps against 389-ds-base 1.2.10-3. As a result, process is not terminated in the case where "referential integrity postoperation" plug-in is set.

Thanks for any help in advance.
(In reply to comment #11)
> I tried the same steps against 389-ds-base 1.2.10-3. As a result, process is
> not terminated in the case where "referential integrity postoperation" plug-in
> is set.

When you write "process is not terminated", do you mean that the server hung when you tried to shut it down? Or was it some other symptom?

If your server is hanging, could you attach your server's stack trace to this bug? The page below is the how-to for the crash case. If the server is hanging, you need to attach gdb to the server directly to get the stack traces.
http://port389.org/wiki/FAQ#Debugging_Crashes
Thank you for the response.

> When you write "process is not terminated", does that mean "if you tried to
> shutdown the server, the server hung"? Or some other symptom?

It doesn't mean "the server hung". It is the result of the test against 389-ds-base 1.2.10-3, where the server process did not terminate unexpectedly.

Thanks for any help in advance.