Description of problem:
We can see the following issue at some customer deployments:
ipa topologysuffix-verify domain
Replication topology of suffix "domain" contains errors.
Recommended maximum number of agreements per replica exceeded
Maximum number of agreements per replica: 4
Server "server1" has 5 agreements with servers:
So, we see server4 twice!
When inspecting cn=topology tree, we can see the following entries:
Even if this could be a valid setting, I think IPA should not allow having a "both" direction and a "left-right" or "right-left" direction between the same nodes simultaneously. What would be the point of having redundant replication agreements?
Note that the DN is different because in one case it is node1 -> node2 "both" and in the other node2 -> node1 "left-right".
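The redundancy can be seen mechanically: two segment entries connecting the same pair of nodes, one "both" and one one-directional. A minimal sketch of such a check (the tuple representation and function name are mine, not IPA code):

```python
def redundant_pairs(segments):
    """Group segments that connect the same two nodes, ignoring direction.

    Each segment is a (left, right, direction) tuple with direction in
    {"both", "left-right", "right-left"}.  Any node pair with more than
    one segment is redundant, e.g. a "both" plus a "left-right" segment.
    """
    by_nodes = {}
    for seg in segments:
        left, right, _direction = seg
        key = frozenset((left, right))   # same pair regardless of direction
        by_nodes.setdefault(key, []).append(seg)
    return [group for group in by_nodes.values() if len(group) > 1]

# The situation described above: node1 <-> node2 plus node2 -> node1.
segments = [
    ("node1", "node2", "both"),
    ("node2", "node1", "left-right"),
    ("node2", "node3", "both"),
]
print(redundant_pairs(segments))
```

With this input the only flagged group is the node1/node2 pair; the node2/node3 segment is unique and passes.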
This indeed looks like a wrong state. The segments should be merged by the topology plugin. CCing Ludwig.
The segments seem to be merged; we have a "both" segment. But the "left-right" one should have been deleted.
Can you provide the segment with the full entry state (nscpentrywsi)?
Do you know the sequence of actions leading to this state?
At this moment, the issue has been solved on the customer side.
I cannot answer your question, but since this customer had conflicts as well, we manipulated segments with the plugin disabled. I wonder whether the customer created segments before re-enabling the plugin.
In any case, is there any reason to allow creating left-right segments in IPA? Shouldn't segments be only "both" until we can allow read-only replicas?
The existence of left-right and both segments is a side effect of the "conflict problem", and I have seen them in my efforts to reproduce, so checking for these semi-duplicate segments is another part of the necessary cleanup.
For the other question: no, we cannot do this. When the topology plugin is enabled, e.g. after raising the domain level, it starts on each server independently and creates the segments from the existing agreements; at this time it does not know if there is an agreement in the other direction. In an existing topology an admin might have created one-directional agreements.
So, segments from agreements always start one-directional, and when they "meet" the other direction, one is transformed to "both" and the other is removed.
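The merge rule described above can be sketched as follows (a toy model assuming segments are (left, right, direction) tuples; this is my illustration, not the topology plugin's actual code):

```python
def merge_segments(segments):
    """Merge one-directional segments once the opposite direction appears.

    When both A -> B and B -> A exist, keep a single (A, B, "both")
    segment and drop the other, mimicking the behaviour described above.
    """
    merged = {}
    for left, right, direction in segments:
        key = frozenset((left, right))
        if key in merged:
            # The two directions "meet": collapse into one "both" segment.
            merged[key] = (min(left, right), max(left, right), "both")
        else:
            merged[key] = (left, right, direction)
    return list(merged.values())

# Bootstrapping from existing agreements starts one-directional:
bootstrap = [("A", "B", "left-right"), ("B", "A", "left-right")]
print(merge_segments(bootstrap))
```

If one of the servers never sees the counterpart (unreachable nodes, conflict entries), the leftover one-directional segment is exactly the semi-duplicate observed in this bug.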
Thanks for your answer Ludwig.
I didn't know we could create one-directional segments between two masters.
What would happen if I have:
A ===> B
between two masters and I do updates on B only ?
In traditional replication, this will cause the replicas to go out of sync.
The IPA CLI does not allow creating one-directional segments, but if we start creating segments by bootstrapping a topology from existing replication agreements, we start with one-directional segments. Your example does not work, but what prevented creating a topology like
A <==> B <==> C <==> D <==> A and add A ==> C and B ==> D.
We do not support creating this via segments, but if it exists, then B==>D will become a left-right-only segment.
"and add A ==> C and B ==> D" by which means ? If the CLI does not support it, we can only do this manually.
If there's a B==>D segment in A <==> B <==> C <==> D <==> A
and we decide later to disconnect, for instance, A <=> D and B <=> C, the graph is still connected, but updates made on D will never arrive at A or B.
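This can be illustrated with a small directed-reachability check (my own sketch; each "both" segment counts as two directed edges, B ==> D as one):

```python
def reachable(edges, start):
    """Return the set of nodes reachable from `start` following edge direction."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dst for src, dst in edges if src == node)
    return seen

# Remaining topology after dropping A <=> D and B <=> C:
# A <=> B and C <=> D ("both" segments), plus the one-directional B ==> D.
edges = [("A", "B"), ("B", "A"),
         ("C", "D"), ("D", "C"),
         ("B", "D")]

print(reachable(edges, "D"))   # updates made on D reach C only, never A or B
print(reachable(edges, "A"))   # while A's updates still reach everyone
```

The undirected graph is connected, yet replication from D is asymmetric: exactly the silent divergence described above.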
I would say this bug was provoked by a bootstrapping process that did not manage to read all the segments and turn them from "left-right" into "both" (either because not all the nodes were reachable at that moment or because of the conflicts issue).
If that's the case, then let's close this one.
Perhaps we could add a "merge" mechanism not only at bootstrap but at some other point (for instance, a sort of ipa topologysuffix-verify --repair), since the code is already present.
I would not close the bug. The situation can be reported by topologysuffix-verify command. Or also fixed with e.g. --repair, as suggested by German.
I see the resolution as:
1. Can we prevent the situation? If not then #2. If yes, then fix where we can prevent it.
2. Can topology plugin fix it automatically? If yes then fix. If not then #3.
3. Do the stuff in ipa topologysuffix-verify outlined above.
Ludwig, is #1 or #2 possible? Seems to me that comment 5 says yes.
(In reply to Petr Vobornik from comment #9)
> I would not close the bug. The situation can be reported by
> topologysuffix-verify command. Or also fixed with e.g. --repair, as
> suggested by German.
> I see the resolution as:
> 1. Can we prevent the situation? If not then #2. If yes, then fix where we
> can prevent it.
> 2. Can topology plugin fix it automatically? If yes then fix. If not then #3.
> 3. Do the stuff in ipa topologysuffix-verify outlined above.
> Ludwig, is #1 or #2 possible? Seems to me that comment 5 says yes.
I am working on #1
#2 would be hard or impossible, as the entries may be turned into conflict entries after the generation of segments.
#3 could be an option to report and resolve conflicts; we can evaluate what is needed when we have more results from testing #1.
And #1 is on the 389-ds side, right? So we should change the component, right?
(In reply to Petr Vobornik from comment #12)
> And #1 is on 389-ds side, right? So we should change component, right?
I would then close it as a duplicate of bug 1395848, the bug used to fix the conflicts.
I was thinking about it again, and there is something that could and should be done in IPA.
The fix in DS will prevent the creation of visible conflicts and so prevent follow-up errors, but in a deployment where the conflicts already exist, raising the domain level can cause the problems with segments.
So I suggest adding a check to the "ipa domainlevel-set" command: check if there are conflicts below cn=topology and reject raising the domain level.
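A sketch of what such a pre-flight check could look like (illustrative only: old-style 389-ds conflict entries carry the nsds5ReplConflict operational attribute, but the filter, function name, and sample entries below are my assumptions, not the actual ipa domainlevel-set code):

```python
# Hypothetical filter for old-style conflict entries below cn=topology.
CONFLICT_FILTER = "(nsds5ReplConflict=*)"

def has_conflicts(entries):
    """entries: iterable of (dn, attrs) pairs as an LDAP search would return.

    Flags any entry carrying the nsds5ReplConflict attribute; the caller
    would refuse to raise the domain level if this returns True.
    """
    return any("nsds5ReplConflict" in attrs for _dn, attrs in entries)

# Sample data: a clean entry and an old-style conflict entry (values made up).
entries = [
    ("cn=domain,cn=topology,cn=ipa,cn=etc,dc=example,dc=test", {}),
    ("nsuniqueid=66446001-...+cn=domain,cn=topology,cn=ipa,cn=etc,dc=example,dc=test",
     {"nsds5ReplConflict": ["namingConflict cn=domain,cn=topology,..."]}),
]

if has_conflicts(entries):
    print("conflict entries under cn=topology; refusing to raise domain level")
```

As noted below, the filter would need to match the old-style conflict entries (and optionally the new-style ones introduced by the fix for bug 1395848).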
Let's add Bug 1395848 as blocked by the IdM change. It would be indeed good to add this one check in the command to make sure the deployment does not get to a broken state after Domain Level upgrade.
Warning: when implementing this change, we will need to make sure we use the right filter for detecting collisions, given that their structure is being changed in Bug 1395848.
No, bug 1395848 is not blocked; it's only that the patch for 1395848 will not address the existing conflicts, so they are complementary.
And the check has to use the filter to find the old conflict entries; the conflict entries created after applying the fix for 1395848 should not be in the way.
But it would be ok to search for old AND new conflicts
the summary should say "before raising the domain level" not "before upgrading".
I don't understand how checking for replication conflicts in segment entries before raising domain level to 1 would help.
AFAIK on domain level 0 no segments exist, so there should not be any replication conflicts. Conflicts happen after the domain level is raised. Then the topology plugin is "activated" and starts to create segments, right? What am I missing? Or did you mean replication conflicts in suffix entries?
the problem is the existing conflict entries for cn=domain or cn=ca.
If there are conflict entries, it can happen that when the domain level is raised, some segments will be put under the "conflict" entries and some under the "real" entries; we can avoid this by checking before raising the domain level.
If a replica is upgraded to 4.3+, then cn=realm is deleted and cn=domain is added, and also cn=ca is added. If the upgrade is run in parallel on several replicas, this can create conflict entries.
A fix in DS can prevent/hide these, but if a deployment already upgraded to 7.3 and did not yet raise the domain level, the conflicts could already exist and the patch in DS would not change this; a check for conflicts before raising the domain level could prevent making it worse.
Marking the bug as VERIFIED as per steps mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1404338#c8.
Upgraded from 7.2.z to 7.4 while verifying this bug.
[root@ibm-x3650m4-01-vm-04 slapd-TESTRELM-TEST]# echo *** | kinit admin
Password for admin@TESTRELM.TEST:
[root@ibm-x3650m4-01-vm-04 slapd-TESTRELM-TEST]# klist -l
Principal name Cache name
[root@ibm-x3650m4-01-vm-04 slapd-TESTRELM-TEST]# ipa domainlevel-set 1
Current domain level: 1
[root@replica-01-vm-15 slapd-TESTRELM-TEST]# ipa domainlevel-set 1
ipa: ERROR: no modifications to be performed
[root@replica-01-vm-15 slapd-TESTRELM-TEST]# ipa domainlevel-get
Current domain level: 1
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.