Bug 1031852

Summary: RHEL7 ipa-replica-manage hang waiting on CLEANALLRUV tasks
Product: Red Hat Enterprise Linux 7
Reporter: Scott Poore <spoore>
Component: ipa
Assignee: Martin Kosek <mkosek>
Status: CLOSED DUPLICATE
QA Contact: Namita Soman <nsoman>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 7.0
CC: dpal, jcholast, jgalipea, lkrispen, mkosek, mreynolds, msauton, rcritten, spoore
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1034832 (view as bug list)
Environment:
Last Closed: 2016-02-25 14:51:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1034832    
Bug Blocks:    

Description Scott Poore 2013-11-19 00:54:29 UTC
Description of problem:

I'm seeing ipa-replica-manage hang in a few cases, most often with ipa-replica-manage del in some environments.  It turned out to be hanging on CLEANALLRUV tasks:


[root@ipaqa64vmk ~]# ps -ef|grep ipa-replica-manage
root     18587 11341  0 19:47 pts/0    00:00:00 grep --color=auto ipa-replica-manage
root     21526  4389  0 18:42 ?        00:00:09 /usr/bin/python -E /usr/sbin/ipa-replica-manage -p Secret123 del qeblade6.testrelm.com -f

[root@ipaqa64vmk ~]# ipa-replica-manage list-ruv
ipaqa64vmk.testrelm.com:389: 6
ipaqa64vmb.testrelm.com:389: 5
ipaqavmd.testrelm.com:389: 4
qeblade6.testrelm.com:389: 8
ipaqa64vma.testrelm.com:389: 12

[root@ipaqa64vmk ~]# ipa-replica-manage list-clean-ruv
CLEANALLRUV tasks
RID 8: Not all replicas caught up, retrying in 2560 seconds

No abort CLEANALLRUV tasks running
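
For reference, the same task state can also be read directly over LDAP.  A rough python-ldap sketch (the cn=cleanallruv task container and the nsTaskStatus attribute are my understanding of the 389-ds task framework, and the host/credentials are just the test values used here):

import ldap

CLEAN_BASE = "cn=cleanallruv,cn=tasks,cn=config"
ABORT_BASE = "cn=abort cleanallruv,cn=tasks,cn=config"

def dump_ruv_tasks(uri="ldap://ipaqa64vmk.testrelm.com:389",
                   binddn="cn=Directory Manager", password="Secret123"):
    # Dump any CLEANALLRUV / abort CLEANALLRUV task entries and their status.
    conn = ldap.initialize(uri)
    conn.simple_bind_s(binddn, password)
    for base, label in ((CLEAN_BASE, "CLEANALLRUV"),
                        (ABORT_BASE, "Abort CLEANALLRUV")):
        try:
            entries = conn.search_s(base, ldap.SCOPE_ONELEVEL,
                                    "(objectclass=*)",
                                    ["replica-id", "nsTaskStatus"])
        except ldap.NO_SUCH_OBJECT:
            entries = []
        print("%s tasks: %d" % (label, len(entries)))
        for dn, attrs in entries:
            rid = attrs.get("replica-id", [b"?"])[0].decode()
            status = attrs.get("nsTaskStatus", [b""])[0].decode()
            print("  RID %s: %s" % (rid, status))
    conn.unbind_s()

if __name__ == "__main__":
    dump_ruv_tasks()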

Version-Release number of selected component (if applicable):
ipa-server-3.3.3-4.el7.x86_64
389-ds-base-1.3.1.6-10.el7.x86_64


How reproducible:
happening often in automated testing.

Steps to Reproduce:
1.  Set up an IPA environment with 5 nodes connected in a line: 1-2-3-4-5
2.  on node4: ipa-replica-manage -p $PASSWD del node5

It should be noted that I've seen hangs with other ipa-replica-manage commands elsewhere, but those have been very infrequent compared to the del one.

Actual results:

ipa-replica-manage command hangs

ipa-replica-manage list-clean-ruv shows "Not all replicas caught up"


Expected results:
No hang; the node is deleted from the replication agreement topology.

Additional info:


Env where I see the del hang:

   M
  / \
R1   R2
      \
       R3
        \
         R4

R3 dels R4.

Comment 4 Dmitri Pal 2013-11-19 19:56:28 UTC
Upstream ticket:
https://fedorahosted.org/freeipa/ticket/4036

Comment 5 Dmitri Pal 2013-11-26 15:12:33 UTC
Test blocker keyword is moved to the DS bug. There is no need to have the keyword on two bugs.

Comment 6 mreynolds 2013-11-27 14:26:24 UTC
I cannot reproduce the cleanallruv hang.  The reason cleanallruv "hung" before was that not all the replicas were in sync.  When I ran the test, everything was in sync, and cleanallruv ran fine.  

So maybe the automated test suite is removing the replica too quickly, before it can send out all its changes?  
Maybe there was a replication failure that prevented the updates from going out?  I don't know, but running the test manually worked fine.
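
If that is what is happening, one possible guard (in the test suite, or in ipa-replica-manage itself) would be to wait for the agreements on the replica that is about to be removed to go idle first.  A rough python-ldap sketch, using nsds5replicaUpdateInProgress as the idleness indicator (host and credentials are just the values from this bug; this is not existing IPA code):

import time
import ldap

def wait_until_idle(uri, binddn="cn=Directory Manager",
                    password="Secret123", timeout=300):
    # Return True once no replication agreement on this server reports an
    # update in progress, False if the timeout expires first.
    conn = ldap.initialize(uri)
    conn.simple_bind_s(binddn, password)
    deadline = time.time() + timeout
    try:
        while time.time() < deadline:
            agmts = conn.search_s("cn=mapping tree,cn=config",
                                  ldap.SCOPE_SUBTREE,
                                  "(objectclass=nsds5replicationagreement)",
                                  ["nsds5replicaUpdateInProgress"])
            busy = [dn for dn, attrs in agmts
                    if attrs.get("nsds5replicaUpdateInProgress",
                                 [b"FALSE"])[0].upper() == b"TRUE"]
            if not busy:
                return True      # all agreements idle, safe(r) to delete
            time.sleep(5)        # something is still replicating, re-check
        return False             # timed out, replication never went idle
    finally:
        conn.unbind_s()

# e.g. wait_until_idle("ldap://qeblade6.testrelm.com:389")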

This is what I saw when manually running the test:


nsds50ruv: {replica 3 ldap://ipaqavmb.testrelm.com:389} 5292bba3000000030000    5292c3ac000000030000
nsds50ruv: {replica 4 ldap://cloud-qe-3.testrelm.com:389} 5292bc21000000040000  5292c3a9000600040000
nsds50ruv: {replica 5 ldap://ipaqavmc.testrelm.com:389} 5292bdd3000000050000    5292c3a9000200050000
nsds50ruv: {replica 6 ldap://tigger.testrelm.com:389} 5292c051000000060000      52950cf3000000060000
nsds50ruv: {replica 7 ldap://apollo.testrelm.com:389} 5292c2b0000000070000      5292c3bc000000070000

On replica 6, deleting replica 7

[root@tigger ~]# ipa-replica-manage -p Secret123 del apollo.testrelm.com
Deleting a master is irreversible.
To reconnect to the remote master you will need to prepare a new replica file
and re-install.
Continue to delete? [no]: yes
Deleting replication agreements between apollo.testrelm.com and tigger.testrelm.com
ipa: INFO: Setting agreement cn=meTotigger.testrelm.com,cn=replica,cn=dc\=testrelm\,dc\=com,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meTotigger.testrelm.com,cn=replica,cn=dc\=testrelm\,dc\=com,cn=mapping tree,cn=config
ipa: INFO: Replication Update in progress: TRUE: status: 0 Replica acquired successfully: Incremental update started: start: 0: end: 0
ipa: INFO: Replication Update in progress: FALSE: status: 0 Replica acquired successfully: Incremental update succeeded: start: 0: end: 0
Deleted replication agreement from 'tigger.testrelm.com' to 'apollo.testrelm.com'
Background task created to clean replication data. This may take a while.
This may be safely interrupted with Ctrl+C

[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Initiating CleanAllRUV Task...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Retrieving maxcsn...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Found maxcsn (5292c3bc000000070000)
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Cleaning rid (7)...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting to process all the updates from the deleted replica...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to be online...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to receive all the deleted replica updates...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Sending cleanAllRUV task to all the replicas...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Cleaning local ruv's...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to be cleaned...
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Replica is not cleaned yet (agmt="cn=meToipaqavmc.testrelm.com" (ipaqavmc:389))
[26/Nov/2013:16:30:24 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Replicas have not been cleaned yet, retrying in 10 seconds
[26/Nov/2013:16:30:36 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to finish cleaning...
[26/Nov/2013:16:30:36 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas finished cleaning, retrying in 10 seconds
[26/Nov/2013:16:30:46 -0500] NSMMReplicationPlugin - CleanAllRUV Task: Successfully cleaned rid(7).
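
For reference, the "Background task created to clean replication data" message corresponds to an ordinary 389-ds task entry being added.  A rough sketch of creating one by hand (container DN and attribute names reflect my understanding of the CLEANALLRUV task format; suffix, host and credentials are placeholders):

import ldap
from ldap import modlist

def start_cleanallruv(uri, rid, suffix="dc=testrelm,dc=com",
                      binddn="cn=Directory Manager", password="Secret123"):
    # Add a CLEANALLRUV task entry; the server then cleans the given RID
    # and propagates the task to the other replicas, as in the log above.
    conn = ldap.initialize(uri)
    conn.simple_bind_s(binddn, password)
    task_dn = "cn=clean %d,cn=cleanallruv,cn=tasks,cn=config" % rid
    attrs = {
        "objectClass": [b"top", b"extensibleObject"],
        "cn": [("clean %d" % rid).encode()],
        "replica-base-dn": [suffix.encode()],
        "replica-id": [str(rid).encode()],
    }
    conn.add_s(task_dn, modlist.addModlist(attrs))
    conn.unbind_s()
    return task_dn

# e.g. start_cleanallruv("ldap://tigger.testrelm.com:389", rid=7)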

Comment 17 Martin Kosek 2014-01-13 08:09:08 UTC
As I discussed with Nathan, there was a problem with a wrong test procedure. When Scott fixed it, the problem went away. There is still the 389-ds-base freeze, but that is being investigated in Bug 1034832.

I am therefore closing this Bugzilla.

Comment 18 Scott Poore 2014-01-17 22:21:02 UTC
Martin,

I'm reopening this one for clarification, to find out whether there is anything that should be done in ipa-replica-manage.  I suspect this is simply procedural and not something that can be fixed in ipa-replica-manage.  I just want confirmation:

From Mark's explanation in bug #1034832, the CLEANALLRUV task was waiting on replication which never finished, because the re-initialize overwrote the changelog.  Now, that re-initialize wasn't necessary to begin with (in fact it was incorrect) and has since been removed, so my particular problem was alleviated.  However, I'm wondering if there's something ipa-replica-manage could do to help prevent that scenario.  This was Mark's explanation:

> It's not that you need to check the change log, but you need to wait for 
> replication to complete or be idle(e.g. by putting all the replicas in 
> read-only mode and checking the replication status of each agreement).  

Wasn't something put into ipa-replica-manage that locked the replicas for another bug?

So, is there something that should be done for ipa-replica-manage to check state before a re-initialize?  

Or is this simply a procedural issue where the user should check these things before attempting a re-initialize?

Thanks,
Scott

Comment 20 Martin Kosek 2014-01-20 08:12:40 UTC
This is a good question. If there is something we can do to make the re-initialize process on the replica more robust, I am open to it.

Mark, any recommendations? I am thinking that putting the replica and all its peers into read-only mode may not be what we want, as it would disrupt service on all connected replication peers, right?

This is what we do when re-initializing a replica:

1) enable the agreement from this host to the remote host (put nsds5ReplicaEnabled to ON)
2) enable the agreement from the remote host to local host (put nsds5ReplicaEnabled to ON)
3) Force synchronization from the remote host to the local host (play with nsDS5ReplicaUpdateSchedule)
4) Re-initialize the replication (change nsds5BeginReplicaRefresh to start)

I see no wait after the force sync action. I also see that we do not force sync with the other replication peers.

Should we proceed differently? Ideally without disruptions on the remote Directory Servers.
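
For reference, the sequence above corresponds roughly to LDAP modifications like the following (a sketch only, not the actual ipa-replica-manage code; the agreement DN format, the "2358-2359 0" schedule value and the credentials come from the log in comment 6, and which connection each change goes through is my reading of the steps):

import ldap

def current_reinit_sequence(local_uri, local_agmt_dn, remote_uri, remote_agmt_dn,
                            binddn="cn=Directory Manager", password="Secret123"):
    local = ldap.initialize(local_uri)
    local.simple_bind_s(binddn, password)
    remote = ldap.initialize(remote_uri)
    remote.simple_bind_s(binddn, password)

    # 1) + 2) enable the agreements in both directions
    local.modify_s(local_agmt_dn,
                   [(ldap.MOD_REPLACE, "nsds5ReplicaEnabled", b"on")])
    remote.modify_s(remote_agmt_dn,
                    [(ldap.MOD_REPLACE, "nsds5ReplicaEnabled", b"on")])

    # 3) force a sync from the remote host by setting and then removing
    #    a dummy update schedule on its agreement towards this host
    remote.modify_s(remote_agmt_dn,
                    [(ldap.MOD_REPLACE, "nsDS5ReplicaUpdateSchedule", b"2358-2359 0")])
    remote.modify_s(remote_agmt_dn,
                    [(ldap.MOD_DELETE, "nsDS5ReplicaUpdateSchedule", None)])

    # (note: as mentioned above, there is currently no wait at this point)

    # 4) start the online re-initialization of this host from the remote one
    remote.modify_s(remote_agmt_dn,
                    [(ldap.MOD_REPLACE, "nsds5BeginReplicaRefresh", b"start")])

    local.unbind_s()
    remote.unbind_s()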

Comment 21 mreynolds 2014-01-20 16:22:03 UTC
(In reply to Martin Kosek from comment #20)
> This is a good question, If there would be something we can do to make the
> re-initialize process on the replica more robust, I am open to it.
> 
> Mark, any recommendations? I am thinking that putting the replica and all
> it's peers to readonly mode may not be what we want as it would disrupt
> service on all connected replication peers, right?

Yes this would work, but you also need to make sure that replication gets caught up before doing the reinit.

Let me backtrack.  This is really a procedural type of issue.  In Scott's replication setup, most replicas are chained sequentially, not round-robin.  So instead of setting replicas read-only, you could reinit the first replica and then reinit all its child replicas as well:

    A
   / \
  B   C
       \
        D
         \
          E

If you reinit C, then you need to reinit D and E as well, since D and E might not have received all the updates from the old (pre-reinit) replica C changelog in time.  In a round-robin deployment, where every replica is connected to every other, this is not really an issue.

That being said, usually you only reinit a replica once replication is broken (or is being set up for the very first time), not while it is still working correctly and processing updates.  So it's simply problematic to reinit a replica that is already running correctly.

Again, setting read-only would work (with client disruptions), but then you need to make sure that replication is idle before doing the reinit.  Meaning, make sure replica C has sent out all its updates (all the agreements are idle/caught up), then reinit C.  This requires checking the RUVs in each agreement against the consumer replica's database RUV, etc.

But if you only did a reinit when replication was already broken, we would not see these issues; it's only when you reinit a working replica that problems can arise.  So the short story is: don't reinit working replicas :-)

Please let me know if you have any more questions.

PS - something to keep in mind for the future: if IPA deployments become very large (hundreds of thousands of entries, or more), online reinitializations become very expensive/disruptive.  They can even appear to hang the server.  This is why reinits are considered the most expensive/disruptive task replication can do, and they are usually avoided at all costs unless replication cannot recover from some serious failure.  So offline (db2ldif -r/ldif2db) reinitializations become the preferred choice once replication breaks, and even for the initial replication setup when dealing with large databases.

> 
> This is what we do when re-initializing a replica:
> 
> 1) enable the agreement from this host to the remote host (put
> nsds5ReplicaEnabled to ON)
> 2) enable the agreement from the remote host to local host (put
> nsds5ReplicaEnabled to ON)
> 3) Force synchronization from the remote host to the local host (play with
> nsDS5ReplicaUpdateSchedule)
> 4) Re-initialize the replication (change nsds5BeginReplicaRefresh to start)
> 
> I see no wait with the force sync action. I also see we do not force sync
> with other replication peers.
> 
> Should we proceed differently? Ideally without disruptions on the remote
> Directory Servers.

Comment 22 Martin Kosek 2014-01-21 08:47:30 UTC
Mark, thanks for the explanation. I am now thinking about which of the proposed improvements could be automated in ipa-replica-manage.

We could warn the user that they also have to re-initialize the other IPA masters when they reinitialize "C" as in your example. But for that, we would first need to be able to get a full graph of the IPA network: https://fedorahosted.org/freeipa/ticket/3058

As for other enhancements, I am thinking about the following update to the process:

1) enable the agreement from this host to the remote host (put nsds5ReplicaEnabled to ON)
2) enable the agreement from the remote host to local host (put nsds5ReplicaEnabled to ON)

FOR EACH replication peer:
    a) Force synchronization from the remote host to the local host (play with nsDS5ReplicaUpdateSchedule)
    b) Wait until replication is stale (nsds5replicaUpdateInProgress is false)

3) Re-initialize the replication (change nsds5BeginReplicaRefresh to start)

Would that improve the process? I was not sure what exactly you meant by "This requires checking RUVs in each agreement against the consumer replica database RUV, etc.", i.e. how I should check/compare that.

Comment 23 mreynolds 2014-01-21 21:40:22 UTC
(In reply to Martin Kosek from comment #22)
> Mark, thanks for explanation. I am now thinking what from the proposed
> improvements could be automated in ipa-replica-manage.
> 
> We could warn user that he has to re-initialize also other IPA masters in
> case he reinitializes "C" as in your example. 

I think there should always be some type of warning when doing an online reinit, stating something like: the remote database will be removed, its changelog invalidated, and the remote replica's peers might need to be reinited as well.

> But for that, we would first
> need to be able to get a full graph of the IPA network:
> https://fedorahosted.org/freeipa/ticket/3058
> 
> As for other enhancements, I am thinking about following update to the
> process:
> 
> 1) enable the agreement from this host to the remote host (put
> nsds5ReplicaEnabled to ON)
> 2) enable the agreement from the remote host to local host (put
> nsds5ReplicaEnabled to ON)
> 
> FOR EACH replication peer:
>     a) Force synchronization from the remote host to the local host (play
> with nsDS5ReplicaUpdateSchedule)
>     b) Wait until replication is stale (nsds5replicaUpdateInProgress is
> false)

This won't guarantee that replication is idle when you actually do the reinit.
You would need to:

 a) Set this server to read-only mode.
 b) Force synchronization from the remote host to the local host (play with nsDS5ReplicaUpdateSchedule).
 c) Then wait for nsds5replicaUpdateInProgress to be false. 
 d) Do the reinit on the remote replica.
 e) Finally, disable read-only mode.

While this is disruptive to clients/replicas, it should not be a commonly performed task.  If it needs to be run, then there are probably already disruptive problems occurring, or nothing was even set up yet (in which case it doesn't really matter).

> 
> 3) Re-initialize the replication (change nsds5BeginReplicaRefresh to start)
> 
> Would that improve the process? I was not sure what exactly do you mean by
> "This requires checking RUVs in each agreement against the consumer replica
> database RUV, etc.", i.e. how should I check/compare that.

I was referring to the "hard" way of determining if the replica was idle.  Checking nsds5replicaUpdateInProgress should be sufficient.
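
Putting this together with the steps from comment 20, a rough sketch of what ipa-replica-manage could do (the backend DN and the nsslapd-readonly attribute for "read-only mode" are assumptions on my side; the agreement attributes are the ones already discussed in this bug, and the code is illustrative, not an actual implementation):

import time
import ldap

BACKEND_DN = "cn=userRoot,cn=ldbm database,cn=plugins,cn=config"

def wait_idle(conn, agmt_dn, timeout=600):
    # c) poll the agreement until no replication update is in progress
    deadline = time.time() + timeout
    while time.time() < deadline:
        res = conn.search_s(agmt_dn, ldap.SCOPE_BASE,
                            attrlist=["nsds5replicaUpdateInProgress"])
        value = res[0][1].get("nsds5replicaUpdateInProgress", [b"FALSE"])[0]
        if value.upper() != b"TRUE":
            return True
        time.sleep(5)
    return False

def safe_reinit(local, local_agmt_dn, supplier, supplier_agmt_dn):
    # `local` is a bound connection to the server being frozen; `supplier`
    # is the side that will push the data during the re-initialization.
    # a) put this server's backend into read-only mode
    local.modify_s(BACKEND_DN, [(ldap.MOD_REPLACE, "nsslapd-readonly", b"on")])
    try:
        # b) force a synchronization by toggling the update schedule
        local.modify_s(local_agmt_dn,
                       [(ldap.MOD_REPLACE, "nsDS5ReplicaUpdateSchedule", b"2358-2359 0")])
        local.modify_s(local_agmt_dn,
                       [(ldap.MOD_DELETE, "nsDS5ReplicaUpdateSchedule", None)])
        # c) wait for replication to go idle before doing anything else
        wait_idle(local, local_agmt_dn)
        # d) start the online re-initialization from the supplier side
        supplier.modify_s(supplier_agmt_dn,
                          [(ldap.MOD_REPLACE, "nsds5BeginReplicaRefresh", b"start")])
    finally:
        # e) always re-enable writes, even if an earlier step failed
        local.modify_s(BACKEND_DN, [(ldap.MOD_REPLACE, "nsslapd-readonly", b"off")])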

Comment 24 Martin Kosek 2014-02-06 13:19:05 UTC
OK, thanks for the suggestions. I reopened the ticket; we will triage it and see what we can do with it upstream.

Comment 27 Martin Kosek 2016-02-24 11:38:32 UTC
Ludwig, will this Bugzilla be fixed with the latest RUV fixes that were done in Directory Server and FreeIPA?

Comment 28 Ludwig 2016-02-25 14:26:04 UTC
Yes, but the corresponding DS ticket #48218 is only committed in master.

Comment 29 Martin Kosek 2016-02-25 14:51:17 UTC
OK. Just for reference, this is the link to the DS ticket:
https://fedorahosted.org/389/ticket/48218
It should be used in FreeIPA 4.4 when RUVs are cleaned.

The DS and FreeIPA changes should get to RHEL-7.3 with the next considered rebase (Bug 1270020). The situation should also be much improved once
https://fedorahosted.org/freeipa/ticket/5411
is closed.

This should all be tested as part of the IdM topology feature (Bug 1298848), which will manage the agreements. The proposed enhancements should then be filed on top of the Topology feature, based on that experience. For now, I am thus closing this bug as a duplicate, and I will link the upstream ticket to the Topology feature so that we are aware of the request.

*** This bug has been marked as a duplicate of bug 1298848 ***