Bug 1013735 - CLEANALLRUV doesn't run across all replicas
Summary: CLEANALLRUV doesn't run across all replicas
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Assignee: Rich Megginson
QA Contact: Sankar Ramalingam
URL:
Whiteboard:
Depends On:
Blocks: 1013738
 
Reported: 2013-09-30 16:45 UTC by Nathan Kinder
Modified: 2013-11-21 21:12 UTC

Fixed In Version: 389-ds-base-1.2.11.15-27.el6
Doc Type: Bug Fix
Doc Text:
Cause: A server in the replication environment does not support the CLEANALLRUV task. Consequence: The task never finishes. Fix: Ignore replicas that do not support the task. Result: The CLEANALLRUV task completes once all the replicas that support the task have been cleaned.
Clone Of:
: 1013738 (view as bug list)
Environment:
Last Closed: 2013-11-21 21:12:30 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:1653 normal SHIPPED_LIVE 389-ds-base bug fix update 2013-11-20 21:53:19 UTC

Description Nathan Kinder 2013-09-30 16:45:58 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47509

While running CLEANALLRUV from the masters, it fails to execute across some replicas with the following errors:
CleanAllRUV Task: Replica cn=example-lx9078,cn=replica,cn=o\3Dexample.com,cn=mapping tree,cn=config does not support the CLEANALLRUV task.  Sending replica CLEANRUV task...
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to add CLEANRUV task replica (agmt="cn=example-lx9078" (example-lx9078:636)).  You will need to manually run the CLEANRUV task on this replica (example-lx9078.examplea.com) error (32)
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to send task to replica (agmt="cn=example-lx9078" (example-lx9078:636))
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas have received the cleanallruv extended op,
The error continues even after running the ABORT CLEANALLRUV task and manually running the CLEANRUV task on the replica.

Comment 1 Nathan Kinder 2013-10-01 02:37:21 UTC
To reproduce/verify:

----------------------------------------
- Set up replication with an older 389-ds-base instance that doesn't support CLEANALLRUV and two newer instances that do support it.  Use a 3-master full-mesh topology.

- Run remove-ds.pl to remove one of the newer instances.

- Remove any replication agreements that point to the deleted instance.

- Run the CLEANALLRUV task on the one remaining newer master to remove the RUV for the removed master.
----------------------------------------

The bug is that the task never completes and you can't abort the task.  With the fix, the task should not hang.
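
Since the description mentions that even the abort task did not help, it is worth noting that an abort is submitted the same way as the clean task, as an entry under cn=tasks. A minimal sketch, reusing the suffix and replica ID from the LDIF quoted in comment 4 purely as an illustration; the cn value is arbitrary, and the container name cn=abort cleanallruv,cn=tasks,cn=config should be verified against your 389-ds version:

```ldif
dn: cn=abort-1013735,cn=abort cleanallruv,cn=tasks,cn=config
objectclass: extensibleObject
cn: abort-1013735
replica-base-dn: dc=passsync,dc=com
replica-id: 1252
```

The entry would be added via ldapmodify against the master where the clean task was started, in the same way the clean task is added in comment 4.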

Comment 3 Sankar Ramalingam 2013-10-17 15:35:41 UTC
This cannot be automated in TET since it requires multiple machines to verify. Hence, removing the qe_test_coverage+ flag.

Comment 4 Sankar Ramalingam 2013-10-24 15:30:25 UTC
1). As per comment #1, I configured replication between 389-ds-base-1.2.11.15-29 (two masters) and 389-ds-base-1.2.10.2-15 (one master). Then removed M2 and initiated a cleanallruv task on M1 for replica ID 1252 (M2).

cat cleanruv.ldif 
dn: cn=1013735,cn=cleanallruv,cn=tasks,cn=config
cn: bug1013735
objectclass: extensibleObject
replica-base-dn: dc=passsync,dc=com
replica-id: 1252


2). ldapmodify -x -p 1189 -h localhost -D "cn=Directory Manager" -w Secret123 -avf /export/cleanruv.ldif 
ldap_initialize( ldap://localhost:1189 )
add cn:
	bug1013735
add objectclass:
	extensibleObject
add replica-base-dn:
	dc=passsync,dc=com
add replica-id:
	1252
adding new entry "cn=1013735,cn=cleanallruv,cn=tasks,cn=config"
modify complete

3). Though the job completes immediately, the retry loop keeps checking whether the replica is online.

Also, I could re-run the same task repeatedly. Is this a problem?
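
The looping behavior in step 3 shows up as repeated retry messages in the server's errors log. A self-contained sketch of how one might spot it, using the log lines quoted in the description as sample data; against a live instance you would grep the real log (typically /var/log/dirsrv/slapd-<instance>/errors, path is an assumption) instead of this heredoc:

```shell
# Sample lines copied from the description above; stand-in for the
# instance's errors log.
cat > /tmp/sample-errors <<'EOF'
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Failed to send task to replica (agmt="cn=example-lx9078" (example-lx9078:636))
[10/Sep/2013:12:19:37 +0000] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas have received the cleanallruv extended op,
EOF

# A nonzero count means the task is still waiting on at least one replica.
grep -c "Not all replicas have received the cleanallruv extended op" /tmp/sample-errors
```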

Comment 5 Sankar Ramalingam 2013-10-24 15:55:32 UTC
(In reply to Sankar Ramalingam from comment #4)
> 1). As per the comment #1, I configured replication between
> 389-ds-base-1.2.11.15-29(Two Masters) and 389-ds-base-1.2.10.2-15(One
> Master). Then removed M2 and initiated a cleanallruv task on M1 for replica
> Id-1252(M2)
> 
> cat cleanruv.ldif 
> dn: cn=1013735,cn=cleanallruv,cn=tasks,cn=config
> cn: bug1013735
> objectclass: extensibleObject
> replica-base-dn: dc=passsync,dc=com
> replica-id: 1252
> 
> 
> 2). ldapmodify -x -p 1189 -h localhost -D "cn=Directory Manager" -w
> Secret123 -avf /export/cleanruv.ldif 
> ldap_initialize( ldap://localhost:1189 )
> add cn:
> 	bug1013735
> add objectclass:
> 	extensibleObject
> add replica-base-dn:
> 	dc=passsync,dc=com
> add replica-id:
> 	1252
> adding new entry "cn=1013735,cn=cleanallruv,cn=tasks,cn=config"
> modify complete
> 
> 3). Though, the job completes immediately, the retry keeps checking whether
> the replica is on-line.
Deleting the replication agreement resolves the problem. As per the design doc, the replication agreement should be removed before running the cleanallruv task.
> 
> Also, I could re-run the same task repeatedly. Is this a problem?
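
Comment 5's point can be made concrete: agreements pointing at the removed replica live under the replica entry in cn=config, so they can be located and deleted before the task is run. A hedged sketch; the agreement name and suffix below are taken from the error messages in the description, and the ldapsearch/ldapdelete invocations in the comments are hypothetical (bind DN, host, and port would come from your deployment):

```shell
# Agreement DN layout, as seen in the quoted errors:
#   cn=<agmt name>,cn=replica,cn=<escaped suffix>,cn=mapping tree,cn=config
# Note 389-ds stores the suffix RDN with '=' escaped as \3D.
SUFFIX='o\3Dexample.com'
AGMT='example-lx9078'
DN="cn=${AGMT},cn=replica,cn=${SUFFIX},cn=mapping tree,cn=config"
echo "$DN"

# Against a live server one would then run something like (hypothetical
# credentials):
#   ldapsearch -x -D "cn=Directory Manager" -W -b "cn=mapping tree,cn=config" \
#       "(objectclass=nsds5replicationagreement)" dn
#   ldapdelete -x -D "cn=Directory Manager" -W "$DN"
```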

Comment 6 mreynolds 2013-10-24 15:56:41 UTC
Revised steps to verify the fix:

[1]  Create two instances (1.2.11.x):  replica A & B
[2]  Create a third instance (1.2.10) on a different host:  replica C
[3]  Set up replication between all three
[4]  Make some updates on each replica to make sure everything is working
[5]  Remove all the agreements that point to replica B from replicas A & C
[6]  Remove replica B
[7]  Run the cleanallruv task against replica A

The task should complete.  It's OK if replica C was not cleaned; this fix was only to make sure the task does not loop endlessly.
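
Step [7] can be written as the same kind of task entry used in comment 4; a sketch, where the suffix is a placeholder for whatever the test topology uses and the replica ID must be the one that belonged to the removed replica B:

```ldif
dn: cn=cleanruv-test,cn=cleanallruv,cn=tasks,cn=config
objectclass: extensibleObject
cn: cleanruv-test
replica-base-dn: dc=example,dc=com
replica-id: <replica ID of removed replica B>
```

As in comment 4, the entry is added with ldapmodify against replica A; with the fix, the task should finish even though replica C does not support it.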

Comment 7 Sankar Ramalingam 2013-10-24 15:58:45 UTC
As per comments #5 and #6, I am marking the bug as Verified.

Comment 8 errata-xmlrpc 2013-11-21 21:12:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html

