Bug 1013735 - CLEANALLRUV doesn't run across all replicas
Summary: CLEANALLRUV doesn't run across all replicas
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Assignee: Rich Megginson
QA Contact: Sankar Ramalingam
URL:
Whiteboard:
Depends On:
Blocks: 1013738
 
Reported: 2013-09-30 16:45 UTC by Nathan Kinder
Modified: 2013-11-21 21:12 UTC

Fixed In Version: 389-ds-base-1.2.11.15-27.el6
Doc Type: Bug Fix
Doc Text:
Cause: A server in the replication environment does not support the CLEANALLRUV task. Consequence: The task never finishes. Fix: Ignore replicas that do not support the task. Result: The CLEANALLRUV task completes once all the replicas that support the task have been cleaned.
Clone Of:
: 1013738 (view as bug list)
Environment:
Last Closed: 2013-11-21 21:12:30 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:1653 normal SHIPPED_LIVE 389-ds-base bug fix update 2013-11-20 21:53:19 UTC

Description Nathan Kinder 2013-09-30 16:45:58 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47509

While running CLEANALLRUV from the masters, it fails to execute across some replicas with the following errors:
CleanAllRUV Task: Replica cn=example-lx9078,cn=replica,cn=o\3Dexample.com,cn=mapping tree,cn=config does not support the CLEANALLRUV task.  Sending replica CLEANRUV task...
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to add CLEANRUV task replica (agmt="cn=example-lx9078" (example-lx9078:636)).  You will need to manually run the CLEANRUV task on this replica (example-lx9078.examplea.com) error (32)
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to send task to replica (agmt="cn=example-lx9078" (example-lx9078:636))
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas have received the cleanallruv extended op,
The error continues even after running the ABORT CLEANALLRUV task and manually running the CLEANRUV task on the replica.

Comment 1 Nathan Kinder 2013-10-01 02:37:21 UTC
To reproduce/verify:

----------------------------------------
- Set up replication with an older 389-ds-base instance that doesn't support CLEANALLRUV and two newer instances that do support it.  Use a 3-master full-mesh topology.

- Run remove-ds.pl to remove one of the newer instances.

- Remove any replication agreements that point to the deleted instance.

- Run the CLEANALLRUV task on the one remaining newer master to remove the RUV for the removed master.
----------------------------------------

The bug is that the task never completes and you can't abort the task.  With the fix, the task should not hang.
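
Since the description mentions that even the abort task did not help, it is worth noting that an abort is submitted the same way as the clean task, as an entry under cn=tasks. A minimal sketch, reusing the suffix and replica ID from the LDIF quoted in comment 4 purely as an illustration; the cn value is arbitrary, and the container name cn=abort cleanallruv,cn=tasks,cn=config should be verified against your 389-ds version:

```ldif
dn: cn=abort-1013735,cn=abort cleanallruv,cn=tasks,cn=config
objectclass: extensibleObject
cn: abort-1013735
replica-base-dn: dc=passsync,dc=com
replica-id: 1252
```

The entry would be added via ldapmodify against the master where the clean task was started, in the same way the clean task is added in comment 4.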

Comment 3 Sankar Ramalingam 2013-10-17 15:35:41 UTC
This cannot be automated in TET since it requires multiple machines to verify. Hence, removing the qe_test_coverage+ flag.

Comment 4 Sankar Ramalingam 2013-10-24 15:30:25 UTC
1). As per comment #1, I configured replication between 389-ds-base-1.2.11.15-29 (two masters) and 389-ds-base-1.2.10.2-15 (one master). Then removed M2 and initiated a cleanallruv task on M1 for replica ID 1252 (M2).

cat cleanruv.ldif 
dn: cn=1013735,cn=cleanallruv,cn=tasks,cn=config
cn: bug1013735
objectclass: extensibleObject
replica-base-dn: dc=passsync,dc=com
replica-id: 1252


2). ldapmodify -x -p 1189 -h localhost -D "cn=Directory Manager" -w Secret123 -avf /export/cleanruv.ldif 
ldap_initialize( ldap://localhost:1189 )
add cn:
	bug1013735
add objectclass:
	extensibleObject
add replica-base-dn:
	dc=passsync,dc=com
add replica-id:
	1252
adding new entry "cn=1013735,cn=cleanallruv,cn=tasks,cn=config"
modify complete

3). Though the job completes immediately, the retry loop keeps checking whether the replica is online.

Also, I could re-run the same task repeatedly. Is this a problem?
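
The looping behavior in step 3 shows up as repeated retry messages in the server's errors log. A self-contained sketch of how one might spot it, using the log lines quoted in the description as sample data; against a live instance you would grep the real log (typically /var/log/dirsrv/slapd-<instance>/errors, path is an assumption) instead of this heredoc:

```shell
# Sample lines copied from the description above; stand-in for the
# instance's errors log.
cat > /tmp/sample-errors <<'EOF'
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to send task to replica (agmt="cn=example-lx9078" (example-lx9078:636))
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas have received the cleanallruv extended op,
EOF

# A nonzero count means the task is still waiting on at least one replica.
grep -c "Not all replicas have received the cleanallruv extended op" /tmp/sample-errors
```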

Comment 5 Sankar Ramalingam 2013-10-24 15:55:32 UTC
(In reply to Sankar Ramalingam from comment #4)
> 1). As per the comment #1, I configured replication between
> 389-ds-base-1.2.11.15-29(Two Masters) and 389-ds-base-1.2.10.2-15(One
> Master). Then removed M2 and initiated a cleanallruv task on M1 for replica
> Id-1252(M2)
> 
> cat cleanruv.ldif 
> dn: cn=1013735,cn=cleanallruv,cn=tasks,cn=config
> cn: bug1013735
> objectclass: extensibleObject
> replica-base-dn: dc=passsync,dc=com
> replica-id: 1252
> 
> 
> 2). ldapmodify -x -p 1189 -h localhost -D "cn=Directory Manager" -w
> Secret123 -avf /export/cleanruv.ldif 
> ldap_initialize( ldap://localhost:1189 )
> add cn:
> 	bug1013735
> add objectclass:
> 	extensibleObject
> add replica-base-dn:
> 	dc=passsync,dc=com
> add replica-id:
> 	1252
> adding new entry "cn=1013735,cn=cleanallruv,cn=tasks,cn=config"
> modify complete
> 
> 3). Though, the job completes immediately, the retry keeps checking whether
> the replica is on-line.
Deleting the replication agreement resolves the problem. As per the design doc, the replication agreement should be removed before running the cleanallruv task.
> 
> Also, I could re-run the same task repeatedly. Is this a problem?
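
Comment 5's point can be made concrete: agreements pointing at the removed replica live under the replica entry in cn=config, so they can be located and deleted before the task is run. A hedged sketch; the agreement name and suffix below are taken from the error messages in the description, and the ldapsearch/ldapdelete invocations in the comments are hypothetical (bind DN, host, and port would come from your deployment):

```shell
# Agreement DN layout, as seen in the quoted errors:
#   cn=<agmt name>,cn=replica,cn=<escaped suffix>,cn=mapping tree,cn=config
# Note 389-ds stores the suffix RDN with '=' escaped as \3D.
SUFFIX='o\3Dexample.com'
AGMT='example-lx9078'
DN="cn=${AGMT},cn=replica,cn=${SUFFIX},cn=mapping tree,cn=config"
echo "$DN"

# Against a live server one would then run something like (hypothetical
# credentials):
#   ldapsearch -x -D "cn=Directory Manager" -W -b "cn=mapping tree,cn=config" \
#       "(objectclass=nsds5replicationagreement)" dn
#   ldapdelete -x -D "cn=Directory Manager" -W "$DN"
```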

Comment 6 mreynolds 2013-10-24 15:56:41 UTC
Revised steps to verify the fix:

[1]  Create two instances (1.2.11.x):  replica A & B
[2]  Create a third instance (1.2.10) on a different host:  replica C
[3]  Set up replication between all three
[4]  Make some updates on each replica to make sure everything is working
[5]  Remove all the agreements that point to replica B from replicas A & C
[6]  Remove replica B
[7]  Run the cleanallruv task against replica A

The task should complete.  It's OK if replica C was not cleaned; this fix was only to make sure the task does not loop endlessly.
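
Step [7] can be written as the same kind of task entry used in comment 4; a sketch, where the suffix is a placeholder for whatever the test topology uses and the replica ID must be the one that belonged to the removed replica B:

```ldif
dn: cn=cleanruv-test,cn=cleanallruv,cn=tasks,cn=config
objectclass: extensibleObject
cn: cleanruv-test
replica-base-dn: dc=example,dc=com
replica-id: <replica ID of removed replica B>
```

As in comment 4, the entry is added with ldapmodify against replica A; with the fix, the task should finish even though replica C does not support it.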

Comment 7 Sankar Ramalingam 2013-10-24 15:58:45 UTC
As per comments #5 and #6, I am marking the bug as Verified.

Comment 8 errata-xmlrpc 2013-11-21 21:12:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html

