Bug 1382784 - Replication cannot handle MODRDN conflict properly.
Summary: Replication cannot handle MODRDN conflict properly.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.8
Hardware: Unspecified
OS: Unspecified
Severity: urgent
Priority: urgent
Target Milestone: rc
Assignee: Noriko Hosoi
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Depends On:
Blocks: 1461138
 
Reported: 2016-10-07 17:54 UTC by German Parente
Modified: 2020-12-14 07:47 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-26 15:54:57 UTC
Target Upstream Version:


Attachments (Terms of Use)
lib389 reproducible test case (7.59 KB, text/plain)
2016-10-13 15:27 UTC, thierry bordaz


Links
System ID Private Priority Status Summary Last Updated
Github 389ds 389-ds-base issues 2067 0 None None None 2020-09-13 21:51:53 UTC

Description German Parente 2016-10-07 17:54:39 UTC
Description of problem:

We have seen this issue at a customer site. Two different connections perform the same MODRDN on two different nodes at the same time.

The operation succeeds on each node, but the replicated operation from one node to the other fails to be applied and aborts the replication session.

Please, see the logs:


Node 01:

[07/Oct/2016:06:33:01 -0300] conn=26518 op=4799 MODRDN dn="cn=23175505199,ou=Nuevos,dc=abierto,dc=anses,dc=gov,dc=ar" newrdn="cn=23175505199" newsuperior="ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar"
[07/Oct/2016:06:33:01 -0300] conn=26518 op=4799 RESULT err=0 tag=109 nentries=0 etime=0 csn=57f76dd10002000a0000

Node 02:

[07/Oct/2016:06:33:02 -0300] conn=2680 op=4810 MODRDN dn="cn=23175505199,ou=Nuevos,dc=abierto,dc=anses,dc=gov,dc=ar" newrdn="cn=23175505199" newsuperior="ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar"
[07/Oct/2016:06:33:02 -0300] conn=2680 op=4810 RESULT err=0 tag=109 nentries=0 etime=0 csn=57f76dd2000100140000
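The two conflicting MODRDNs can be ordered by their CSNs. A minimal sketch (not part of 389-ds-base) that decodes the CSNs from the access logs above; a 389-ds CSN is 20 hex digits: an 8-digit Unix timestamp, a 4-digit sequence number, a 4-digit replica ID, and a 4-digit subsequence number. The field names below are illustrative, not the server's own code.

```python
from collections import namedtuple

CSN = namedtuple("CSN", "timestamp seqnum replica_id subseq")

def parse_csn(csn):
    """Split a 20-hex-digit 389-ds CSN into its four fields."""
    assert len(csn) == 20
    return CSN(
        timestamp=int(csn[0:8], 16),   # seconds since the epoch
        seqnum=int(csn[8:12], 16),     # operation sequence within that second
        replica_id=int(csn[12:16], 16),
        subseq=int(csn[16:20], 16),
    )

node01 = parse_csn("57f76dd10002000a0000")  # MODRDN on node 01 (replica 10)
node02 = parse_csn("57f76dd2000100140000")  # MODRDN on node 02 (replica 20)

# Node 01's CSN is older, so under CSN-based conflict resolution its MODRDN
# should win; node 02's should become a per-entry conflict rather than
# aborting the whole replication session.
print(node01 < node02)  # namedtuples compare field by field, timestamp first
```

Here node 01's operation carries the smaller timestamp (0x57f76dd1 vs 0x57f76dd2), so it sorts first.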

In node 01 we see this message repeatedly:


[07/Oct/2016:06:33:03 -0300] NSMMReplicationPlugin - agmt="cn=MMR-ansesrhds02" (ansesrhds02:636): Consumer failed to replay change (uniqueid 9f680901-8c7011e6-99f9afaa-6e3226cf, CSN 57f76dd10002000a0000): Server is unwilling to perform (53). Will retry later.


And in 02 repeatedly:

[07/Oct/2016:06:33:03 -0300] NSMMReplicationPlugin - process_postop: Failed to apply update (57f76dd10002000a0000) error (53).  Aborting replication session(conn=3 op=15371)

because the entry is not found. Has the original entry been turned into a tombstone?

Also in 01 logs:


[07/Oct/2016:07:37:43 -0300] conn=23631 op=18437 csn=57f77cfa000100140000 - Failed to convert cn=27006139499 to RDN
[07/Oct/2016:07:37:43 -0300] NSMMReplicationPlugin - process_postop: Failed to apply update (57f77cfa000100140000) error (1).  Aborting replication session(conn=23631 op=18437)

This happens because the customer, by mistake, is applying the same operations to both nodes all the time.


Version-Release number of selected component (if applicable):

389-ds-base-1.2.11.15-74.el6.x86_64


How reproducible:

I have not tried yet. I will do it on Monday.

Comment 13 thierry bordaz 2016-10-13 15:27:25 UTC
Created attachment 1210179 [details]
lib389 reproducible test case

Comment 14 thierry bordaz 2016-10-13 15:28:57 UTC
With the attachment https://bugzilla.redhat.com/attachment.cgi?id=1210179

it produces the error err=68, which looks correct to me and does not break the replication session:
[07/Oct/2016:19:29:09.235324915 +0200] conn=5 op=5 MODRDN dn="cn=new_account0,cn=subtree1,dc=example,dc=com" newrdn="cn=new_account0" newsuperior="cn=subtree2,dc=example,dc=com"
[07/Oct/2016:19:29:09.235546262 +0200] conn=5 op=5 RESULT err=68 tag=109 nentries=0 etime=0 csn=57f7db62000000020000
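The err= values in this thread are standard LDAP result codes (RFC 4511). A quick reference sketch for the codes seen in this bug:

```python
# Standard LDAP result codes (RFC 4511) for the err= values in this bug.
LDAP_RESULT_CODES = {
    0: "success",
    1: "operationsError",
    53: "unwillingToPerform",
    68: "entryAlreadyExists",
}

# err=68 (entryAlreadyExists) is a clean, per-operation failure on the
# consumer; err=53 (unwillingToPerform) is what caused the whole
# replication session to abort in the customer logs.
print(LDAP_RESULT_CODES[68])  # entryAlreadyExists
```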

Comment 21 Ludwig 2016-11-04 07:55:29 UTC
I think the problem is the missing fix for 918687 in 1.2.11

