Bug 1013738
Summary: | CLEANALLRUV doesn't run across all replicas | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Nathan Kinder <nkinder> |
Component: | 389-ds-base | Assignee: | Rich Megginson <rmeggins> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Sankar Ramalingam <sramling> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.0 | CC: | jgalipea, mreynolds |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | 389-ds-base-1.3.1.6-5.el7 | Doc Type: | Bug Fix |
Doc Text: |
Cause: The CLEANALLRUV task is run in a replication environment where one of the replicas does not support the CLEANALLRUV task.
Consequence: The task never completes.
Fix: A replica that does not support the CLEANALLRUV task no longer prevents the task from completing.
Result: The CLEANALLRUV task completes after it cleans all the replicas that do support the task.
|
Story Points: | --- |
Clone Of: | 1013735 | Environment: | |
Last Closed: | 2014-06-13 10:21:15 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1013735 | ||
Bug Blocks: |
Description
Nathan Kinder
2013-09-30 16:52:12 UTC
To reproduce/verify:
----------------------------------------
- Set up replication with an older 389-ds-base instance that doesn't support CLEANALLRUV and two newer instances that do support CLEANALLRUV. Use a 3-master full-mesh topology.
- Run remove-ds.pl to remove one of the newer instances.
- Remove any replication agreements that point to the deleted instance.
- Run the CLEANALLRUV task on the one remaining newer master to remove the RUV for the removed master.
----------------------------------------
The bug is that the task never completes and you can't abort the task. With the fix, the task should not hang.

Comment 3, Sankar Ramalingam:

To verify the bugzilla, I followed these steps:
1. Set up a RHEL7 machine and created 2 masters/2 consumers.
2. Set up a RHEL6.3 machine with the older version, 389-ds-base-1.2.10.2-15.
3. Set up a new master on the RHEL6.3 machine to talk to M1 and M2 on RHEL7.
4. M1 (Replica ID 1231) and M2 (Replica ID 1232) are on RHEL7; M3 (Replica ID 1291) is on RHEL6.3.
5. Verified replication works fine; all entries synced from each master.
6. Removed the replication agreements for M1 from M2 and M3.
7. Removed M1 from the RHEL7 machine.
8. Ran the cleanallruv task from M2:

cat cleanallruv.ldif
dn: cn=1013735,cn=cleanallruv,cn=tasks,cn=config
cn: bug1013735
objectclass: extensibleObject
replica-base-dn: dc=passsync,dc=com
replica-id: 1231

9. The cleanallruv task completed. Some other issues were reported in the error logs, but they are not relevant to the cleanallruv task run for Replica ID 1231:

[20/Feb/2014:10:08:04 -0500] NSMMReplicationPlugin - agmt_delete: begin
[20/Feb/2014:10:09:44 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Initiating CleanAllRUV Task...
[20/Feb/2014:10:09:44 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Retrieving maxcsn...
[20/Feb/2014:10:09:44 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Found maxcsn (53060f1c000704cf0000)
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Cleaning rid (1231)...
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting to process all the updates from the deleted replica...
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to be online...
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to receive all the deleted replica updates...
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Sending cleanAllRUV task to all the replicas...
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Replica cn=1289_to_1616_on_hp-dl380pgen8-02-vm-4.lab.bos.redhat.com,cn=replica,cn=dc\3Dpasssync\2Cdc\3Dcom,cn=mapping tree,cn=config does not support the CLEANALLRUV task. Sending replica CLEANRUV task...
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Failed to add CLEANRUV task (cn=replica,cn=dc\3Dpasssync\2Cdc\3Dcom,cn=mapping tree,cn=config) to replica (agmt="cn=1289_to_1616_on_hp-dl380pgen8-02-vm-4.lab.bos.redhat.com" (hp-dl380pgen8-02-vm-4:1616)). You will need to manually run the CLEANRUV task on this replica (hp-dl380pgen8-02-vm-4.lab.bos.redhat.com) error (50)
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Cleaning local ruv's...
[20/Feb/2014:10:09:45 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to be cleaned...
[20/Feb/2014:10:09:46 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to finish cleaning...
[20/Feb/2014:10:09:46 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Successfully cleaned rid(1231).

Also, nsds50ruv has not been cleaned. I see these entries in all the other replication agreements. Is this something that is expected to be cleaned from the replication agreements?
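The description notes that, before the fix, the hung task could not even be aborted. For reference, 389-ds-base also accepts abort entries under a parallel task container, cn=abort cleanallruv,cn=tasks,cn=config. A minimal sketch of building such an entry; the task cn is made up, and the suffix and rid mirror the reproduction above (connection details are placeholders):

```shell
# Sketch: building an abort entry for a stuck CLEANALLRUV task.
# The cn (abort-1231) is a placeholder name; suffix and rid follow
# the verification steps above.
cat > abort-cleanallruv.ldif <<'EOF'
dn: cn=abort-1231,cn=abort cleanallruv,cn=tasks,cn=config
objectclass: extensibleObject
cn: abort-1231
replica-base-dn: dc=passsync,dc=com
replica-id: 1231
EOF

# Submit it to the master that launched the clean (placeholder host,
# port, and bind credentials):
# ldapadd -x -H ldap://localhost:389 -D "cn=Directory Manager" -W \
#         -f abort-cleanallruv.ldif
```

Aborting only stops the cleaning threads; the rid stays in the RUV until a clean actually completes.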
nsds50ruv: {replica 1231 ldap://ibm-hs23-01.rhts.eng.bos.redhat.com:3189} 53060eab000004cf0000 53060f1c000704cf0000
nsruvReplicaLastModified: {replica 1231 ldap://ibm-hs23-01.rhts.eng.bos.redhat.com:3189} 00000000

These error messages were observed on the RHEL6.3 master, but replication doesn't have any issues; I could sync entries from M3 to M1 and vice versa.

[20/Feb/2014:10:33:25 -0500] agmt="cn=1189_to_1626_on_ibm-hs23-01.rhts.eng.bos.redhat.com" (ibm-hs23-01:1626) - Can't locate CSN 53060eab000004cf0000 in the changelog (DB rc=-30988). The consumer may need to be reinitialized.
[20/Feb/2014:10:33:27 -0500] agmt="cn=1189_to_1626_on_ibm-hs23-01.rhts.eng.bos.redhat.com" (ibm-hs23-01:1626) - Can't locate CSN 53060eab000004cf0000 in the changelog (DB rc=-30988). The consumer may need to be reinitialized.
[20/Feb/2014:10:33:36 -0500] agmt="cn=1189_to_1626_on_ibm-hs23-01.rhts.eng.bos.redhat.com" (ibm-hs23-01:1626) - Can't locate CSN 53060eab000004cf0000 in the changelog (DB rc=-30988). The consumer may need to be reinitialized.
[20/Feb/2014:10:33:38 -0500] agmt="cn=1189_to_1626_on_ibm-hs23-01.rhts.eng.bos.redhat.com" (ibm-hs23-01:1626) - Can't locate CSN 53060eab000004cf0000 in the changelog (DB rc=-30988). The consumer may need to be reinitialized.

(In reply to Sankar Ramalingam from comment #3)
This all looks correct. The fix for this bug is that the task does not get stuck when it encounters a replica that does not support CLEANALLRUV. As we can see, the task moves on and finishes its processing.

As for the RUV not being cleaned: as stated in the logging, you need to manually run CLEANRUV on the replica that does not support CLEANALLRUV. So although the RUV was successfully cleaned, the older replica (which was not cleaned) polluted the RUV again. This is expected. Anyway, you have successfully verified this bug fix.

As per your above comments, marking the bug as Verified.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
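The reply above says the old replica has to be cleaned by hand. On pre-CLEANALLRUV servers, the legacy CLEANRUV task is triggered through the nsds5task attribute of the replica configuration entry. A minimal sketch, reusing the replica DN and rid from the log excerpt; host, port, and credentials are placeholders:

```shell
# Sketch: manually cleaning rid 1231 on the old replica that lacks
# CLEANALLRUV support, as the task log advises. Older 389-ds-base
# servers accept CLEANRUV<rid> via nsds5task on the replica entry.
cat > cleanruv.ldif <<'EOF'
dn: cn=replica,cn=dc\3Dpasssync\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
replace: nsds5task
nsds5task: CLEANRUV1231
EOF

# Run against the old replica (placeholder host/port/credentials):
# ldapmodify -x -H ldap://hp-dl380pgen8-02-vm-4.lab.bos.redhat.com:1616 \
#   -D "cn=Directory Manager" -W -f cleanruv.ldif

# Afterwards, the database RUV can be re-checked on each server; it is
# stored on the RUV tombstone entry under the suffix (placeholder
# connection details):
# ldapsearch -x -H ldap://localhost:389 -D "cn=Directory Manager" -W \
#   -b 'dc=passsync,dc=com' \
#   '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))' \
#   nsds50ruv
```

Running CLEANRUV on the old replica before it resyncs is what prevents the cleaned rid from being reintroduced into the other servers' RUVs.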