Description of problem:
We have seen this issue at a customer site. Two different connections perform the same MODRDN on two different nodes at the same time. Both operations succeed (err=0), but the replicated operation from one node to the other fails to be applied and aborts the replication session. Please see the logs.

Node 01:
[07/Oct/2016:06:33:01 -0300] conn=26518 op=4799 MODRDN dn="cn=23175505199,ou=Nuevos,dc=abierto,dc=anses,dc=gov,dc=ar" newrdn="cn=23175505199" newsuperior="ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar"
[07/Oct/2016:06:33:01 -0300] conn=26518 op=4799 RESULT err=0 tag=109 nentries=0 etime=0 csn=57f76dd10002000a0000

Node 02:
[07/Oct/2016:06:33:02 -0300] conn=2680 op=4810 MODRDN dn="cn=23175505199,ou=Nuevos,dc=abierto,dc=anses,dc=gov,dc=ar" newrdn="cn=23175505199" newsuperior="ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar"
[07/Oct/2016:06:33:02 -0300] conn=2680 op=4810 RESULT err=0 tag=109 nentries=0 etime=0 csn=57f76dd2000100140000

On node 01 we see this message repeatedly:
[07/Oct/2016:06:33:03 -0300] NSMMReplicationPlugin - agmt="cn=MMR-ansesrhds02" (ansesrhds02:636): Consumer failed to replay change (uniqueid 9f680901-8c7011e6-99f9afaa-6e3226cf, CSN 57f76dd10002000a0000): Server is unwilling to perform (53). Will retry later.

And on node 02, repeatedly:
[07/Oct/2016:06:33:03 -0300] NSMMReplicationPlugin - process_postop: Failed to apply update (57f76dd10002000a0000) error (53). Aborting replication session(conn=3 op=15371)

The session is aborted because the entry is not found. Has the original entry been turned into a tombstone?

Also in the node 01 logs:
[07/Oct/2016:07:37:43 -0300] conn=23631 op=18437 csn=57f77cfa000100140000 - Failed to convert cn=27006139499 to RDN
[07/Oct/2016:07:37:43 -0300] NSMMReplicationPlugin - process_postop: Failed to apply update (57f77cfa000100140000) error (1). Aborting replication session(conn=23631 op=18437)

This happens because the customer, by mistake, is applying the same operations to both nodes all the time.
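To make it easier to see which node each change in the logs above came from, here is a minimal sketch that splits a CSN string into its fields. It assumes the usual 389 Directory Server CSN layout of 20 hex digits: 8 for the timestamp, 4 for the sequence number, 4 for the replica ID, and 4 for the subsequence number. The names CSN and decode_csn are made up for this illustration and are not part of 389-ds or lib389.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CSN:
    """Hypothetical container for the four assumed CSN fields."""
    timestamp: int    # seconds since the Unix epoch
    seqnum: int       # sequence number within that second
    replica_id: int   # ID of the replica that generated the change
    subseqnum: int    # subsequence number

    @property
    def time_utc(self) -> datetime:
        # Convert the epoch seconds to a timezone-aware UTC datetime.
        return datetime.fromtimestamp(self.timestamp, tz=timezone.utc)


def decode_csn(text: str) -> CSN:
    """Split a 20-hex-digit CSN string using the assumed 8/4/4/4 layout."""
    if not re.fullmatch(r"[0-9a-fA-F]{20}", text):
        raise ValueError(f"not a 20-hex-digit CSN: {text!r}")
    return CSN(
        timestamp=int(text[0:8], 16),
        seqnum=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )


# The two CSNs returned for the concurrent MODRDN operations in the logs.
for raw in ("57f76dd10002000a0000", "57f76dd2000100140000"):
    csn = decode_csn(raw)
    print(raw, csn.time_utc.isoformat(), "replica", csn.replica_id, "seq", csn.seqnum)
```

Under that assumed layout, the two MODRDN CSNs decode to replica IDs 10 and 20 with timestamps one second apart, and the update that fails to replay (57f76dd10002000a0000) is the one generated by replica 10.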
Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-74.el6.x86_64

How reproducible:
I have not tried yet. I will do so on Monday.
Created attachment 1210179: lib389 reproducible test case
With the attachment https://bugzilla.redhat.com/attachment.cgi?id=1210179, the test produces err=68, which looks correct to me and does not break the replication session:

[07/Oct/2016:19:29:09.235324915 +0200] conn=5 op=5 MODRDN dn="cn=new_account0,cn=subtree1,dc=example,dc=com" newrdn="cn=new_account0" newsuperior="cn=subtree2,dc=example,dc=com"
[07/Oct/2016:19:29:09.235546262 +0200] conn=5 op=5 RESULT err=68 tag=109 nentries=0 etime=0 csn=57f7db62000000020000
I think the problem is that the fix for 918687 is missing from 1.2.11.