Bug 1013735 - CLEANALLRUV doesn't run across all replicas
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified   OS: Unspecified
Priority: high   Severity: high
Target Milestone: rc
Assigned To: Rich Megginson
QA Contact: Sankar Ramalingam
Depends On:
Blocks: 1013738
Reported: 2013-09-30 12:45 EDT by Nathan Kinder
Modified: 2013-11-21 16:12 EST
CC: 5 users

See Also:
Fixed In Version: 389-ds-base-1.2.11.15-27.el6
Doc Type: Bug Fix
Doc Text:
Cause: A server in the replication environment does not support the CLEANALLRUV task. Consequence: The task never finishes. Fix: Replicas that do not support the task are ignored. Result: The CLEANALLRUV task completes once all the replicas that support the task have been cleaned.
Story Points: ---
Clone Of:
Clones: 1013738
Environment:
Last Closed: 2013-11-21 16:12:30 EST


Description Nathan Kinder 2013-09-30 12:45:58 EDT
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47509

While running CLEANALLRUV from the masters, it fails to execute across some replicas, with the following errors:
CleanAllRUV Task: Replica cn=example-lx9078,cn=replica,cn=o\3Dexample.com,cn=mapping tree,cn=config does not support the CLEANALLRUV task.  Sending replica CLEANRUV task...
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to add CLEANRUV task replica (agmt="cn=example-lx9078" (example-lx9078:636)).  You will need to manually run the CLEANRUV task on this replica (example-lx9078.examplea.com) error (32)
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to send task to replica (agmt="cn=example-lx9078" (example-lx9078:636))
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas have received the cleanallruv extended op,
The error continues even after running the ABORT CLEANALLRUV task and manually running the CLEANRUV task on the replica.
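
For reference, the abort and manual clean operations mentioned above are driven through LDAP entries. Below is a minimal, untested sketch assuming a hypothetical suffix dc=example,dc=com, replica ID 1252, and Directory Manager credentials; the real DNs, IDs, and ports will differ per deployment.

# abort-cleanallruv.ldif - abort a running CLEANALLRUV task (run against the master that started it)
dn: cn=abort 1252,cn=abort cleanallruv,cn=tasks,cn=config
cn: abort 1252
objectclass: extensibleObject
replica-base-dn: dc=example,dc=com
replica-id: 1252

# cleanruv.ldif - manually clean RID 1252 on a single replica by writing nsds5task
# to that replica's configuration entry (run directly against that replica)
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
replace: nsds5task
nsds5task: CLEANRUV1252

ldapmodify -a -x -D "cn=Directory Manager" -W -f abort-cleanallruv.ldif
ldapmodify -x -D "cn=Directory Manager" -W -f cleanruv.ldif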
Comment 1 Nathan Kinder 2013-09-30 22:37:21 EDT
To reproduce/verify:

----------------------------------------
- Set up replication with an older 389-ds-base instance that doesn't support CLEANALLRUV and two newer instances that do support CLEANALLRUV.  Use a three-master full-mesh topology.

- Run remove-ds.pl to remove one of the newer instances.

- Remove any replication agreements that point to the deleted instance (a sketch of this step follows the list).

- Run the CLEANALLRUV task on the one remaining newer master to remove the RUV for the removed master.
----------------------------------------
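
A rough sketch of the agreement cleanup and task kickoff described in the steps above, with a hypothetical agreement name, suffix, and replica ID (a full ldapmodify transcript for the task itself appears in comment #4):

# Delete the agreement on the surviving master that points at the removed instance
ldapdelete -x -D "cn=Directory Manager" -W \
  "cn=to-removed-master,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config"

# cleanallruv.ldif - clean the removed master's replica ID from the remaining servers
dn: cn=clean 1252,cn=cleanallruv,cn=tasks,cn=config
cn: clean 1252
objectclass: extensibleObject
replica-base-dn: dc=example,dc=com
replica-id: 1252

ldapmodify -a -x -D "cn=Directory Manager" -W -f cleanallruv.ldif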

The bug is that the task never completes and you can't abort the task.  With the fix, the task should not hang.
Comment 3 Sankar Ramalingam 2013-10-17 11:35:41 EDT
This cannot be automated in TET since it requires multiple machines to verify. Hence, removing the qe_test_coverage+ flag.
Comment 4 Sankar Ramalingam 2013-10-24 11:30:25 EDT
1). As per comment #1, I configured replication between 389-ds-base-1.2.11.15-29 (two masters) and 389-ds-base-1.2.10.2-15 (one master). Then I removed M2 and initiated a cleanallruv task on M1 for replica ID 1252 (M2):

cat cleanruv.ldif 
dn: cn=1013735,cn=cleanallruv,cn=tasks,cn=config
cn: bug1013735
objectclass: extensibleObject
replica-base-dn: dc=passsync,dc=com
replica-id: 1252


2). ldapmodify -x -p 1189 -h localhost -D "cn=Directory Manager" -w Secret123 -avf /export/cleanruv.ldif 
ldap_initialize( ldap://localhost:1189 )
add cn:
	bug1013735
add objectclass:
	extensibleObject
add replica-base-dn:
	dc=passsync,dc=com
add replica-id:
	1252
adding new entry "cn=1013735,cn=cleanallruv,cn=tasks,cn=config"
modify complete

3). Though the job completes immediately, the retry keeps checking whether the replica is online (a monitoring sketch follows below).

Also, I could re-run the same task repeatedly. Is this a problem?
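
For completeness, a sketch of how the retry behaviour can be watched, reusing the port and credentials from step 2 and assuming the default errors log location; the nsTaskLog/nsTaskStatus attribute names come from the generic 389-ds task framework, so treat them as an assumption:

# Read the task entry's progress attributes
ldapsearch -x -p 1189 -h localhost -D "cn=Directory Manager" -w Secret123 \
  -b "cn=1013735,cn=cleanallruv,cn=tasks,cn=config" nsTaskLog nsTaskStatus

# Watch the CleanAllRUV retry messages in the errors log
grep "CleanAllRUV" /var/log/dirsrv/slapd-*/errors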
Comment 5 Sankar Ramalingam 2013-10-24 11:55:32 EDT
(In reply to Sankar Ramalingam from comment #4)
> 3). Though the job completes immediately, the retry keeps checking whether
> the replica is online.
Deleting the replication agreement resolves the problem. As per the design document, the replication agreement should be removed before running the cleanallruv task (a sketch for locating leftover agreements follows below).
> 
> Also, I could re-run the same task repeatedly. Is this a problem?
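
To go with the note above, a sketch for locating any leftover agreements that still point at the removed master, reusing the port, suffix, and credentials from comment #4; the agreement name in the delete is hypothetical, so substitute whatever the search returns:

# List agreements and the host each one points to
ldapsearch -x -p 1189 -h localhost -D "cn=Directory Manager" -w Secret123 \
  -b "cn=mapping tree,cn=config" "(objectclass=nsds5replicationagreement)" nsDS5ReplicaHost

# Delete any agreement whose nsDS5ReplicaHost is the removed master
ldapdelete -x -p 1189 -h localhost -D "cn=Directory Manager" -w Secret123 \
  "cn=agreement-to-removed-master,cn=replica,cn=dc\3Dpasssync\2Cdc\3Dcom,cn=mapping tree,cn=config"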
Comment 6 mreynolds 2013-10-24 11:56:41 EDT
Revised steps to verify the fix:

[1]  Create two instances (1.2.11.x):  replica A & B
[2]  Create a third instance (1.2.10) on a different host:  replica C
[3]  Set up replication between all three
[4]  Make some updates on each replica to make sure everything is working
[5]  Remove all the agreements that point to replica B from replicas A & C
[6]  Remove replica B
[7]  Run the cleanallruv task against replica A

The task should complete.  It's OK if replica C was not cleaned; this fix was only to make sure the task does not loop endlessly.
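
As an optional check after step [7], the cleaned replica ID should no longer appear in replica A's RUV. A sketch, assuming the suffix dc=example,dc=com and Directory Manager credentials; the RUV lives on a special tombstone entry under the suffix:

# Dump replica A's RUV and confirm the removed replica ID is gone
ldapsearch -xLLL -D "cn=Directory Manager" -W -b "dc=example,dc=com" \
  "(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))" nsds50ruv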
Comment 7 Sankar Ramalingam 2013-10-24 11:58:45 EDT
As per comments #5 and #6, I am marking the bug as Verified.
Comment 8 errata-xmlrpc 2013-11-21 16:12:30 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html
