Bug 1666843
| Summary: | ipa-replica-manage force-sync --from keeps prompting "No status yet" | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | anuja <amore> | ||||
| Component: | ipa | Assignee: | IPA Maintainers <ipa-maint> | ||||
| Status: | CLOSED ERRATA | QA Contact: | ipa-qe <ipa-qe> | ||||
| Severity: | unspecified | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 7.5 | CC: | afarley, amore, cheimes, frenaud, jomurphy, lkrispen, mhonek, mkosek, mreynolds, myusuf, ndehadra, nkinder, pvoborni, rcritten, rmeggins, spichugi, tbordaz, tscherf, vashirov | ||||
| Target Milestone: | rc | Keywords: | Regression | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | ipa-4.6.5-2.el7 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2019-08-06 13:09:30 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Description
anuja
2019-01-16 17:48:31 UTC
Created attachment 1521095 [details]
logs.
I was able to reproduce the same issue.
Moving issue to 389-ds as it seems to be a replication issue.
This is marked a regression, so what DS version was working? Looking at the changelog for 7.5 I do not see any replication changes that were made in a very long time.
I also don't see any problems in the DS logs. Not sure why the ipa tool is "misbehaving". Someone more familiar with IPA needs to look into this...
What ipa does in this case is it sets the nsds5replicaupdateschedule to 2358-2359 0 and then deletes that schedule. Then it reads the agreement and examines nsds5BeginReplicaRefresh and nsds5ReplicaLastInitStatus.
In this case there is no nsds5BeginReplicaRefresh and no nsds5ReplicaLastInitStatus so it is looping waiting for either to appear.
(In reply to Rob Crittenden from comment #7)
> What ipa does in this case is it sets the nsds5replicaupdateschedule to
> 2358-2359 0 and then deletes that schedule.
What is the purpose of this? To wake up the agreement?
> Then it reads the agreement and
> examines nsds5BeginReplicaRefresh and nsds5ReplicaLastInitStatus.
>
> In this case there is no nsds5BeginReplicaRefresh and no
> nsds5ReplicaLastInitStatus so it is looping waiting for either to appear.
In the errors log I do see 5 successful initializations. So it is working, but I am not sure why the status attributes are not updated.
Is there a system I can look at where this is currently happening? I did try to look at the beaker boxes listed in comment 4 but I don't know the root password.
(In reply to mreynolds from comment #8)
> What is the purpose of this? To wake up the agreement?
Exactly.
(In reply to mreynolds from comment #6)
> This is marked a regression, so what DS version was working? Looking at the
> changelog for 7.5 I do not see any replication changes that were made in a
> very long time.
>
> I also don't see any problems in the DS logs. Not sure why the ipa tool is
> "misbehaving". Someone more familiar with IPA needs to look into this...
There is a test which checks whether ipa-replica-manage force-sync works. This failure was randomly observed in the previous version during upgrade jobs. After rerunning, we got a successful run in the upgrade job.
http://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2019/01/32915/3291597/6397118/86858538/TESTOUT.log
Additional info:
Before running
ipa-replica-manage -p force-sync --from
instead of
ipa topologysegment-reinitialize domain --right
if we use:
ipa-replica-manage -p re-initialize --from
we got a successful run with no random failures.
This is a different command though. Re-initialize drops the current database and re-loads the entire database. force-sync just forces replication to wake up.
I could reproduce the issue, here are my results...
On the REPLICA I ran:
ipa topologysegment-reinitialize domain segment-name --right
This causes REPLICA to reinit MASTER, and this does set nsds5replicaLastInitStatus in REPLICA's replication agreement (but not on MASTER, of course).
Then when I run:
ipa-replica-manage -p Secret123 force-sync --from MASTER_HOSTNAME
I see the error message "No status yet", but the tool is querying MASTER - which was not initialized from the first command so it does not have the nsds5replicaLastInitStatus attribute. Now "force-sync" would not update nsds5replicaLastInitStatus anyway from my understanding (it just wakes the agreement up), so checking nsds5replicaLastInitStatus after doing "force-sync" is not correct.
Another observation I made is that after you restart DS the attribute nsds5replicaLastInitStatus is silently stripped from the replication agreement entries. I do not know if that is how it always worked or not (needs more testing to see if that is a change in behavior). If it is a change in behavior then that explains why the IPA tool was accidentally working before, but it was not giving you the actual status.
So there could be bugs in both DS and IPA on this one.
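The looping behaviour described in the comments above can be modelled with a short Python sketch. This is a simplified, hypothetical model and not the IPA source: the function names, the `max_polls` limit, the dictionary-based agreement entries and the sample attribute values are all assumptions made for illustration. Only the attribute names and the "No status yet" message come from this report.

```python
# Hypothetical, simplified model of an "init status" check; not the IPA source.
def check_init_status(agreement):
    """Return (done, has_error) for a dict of agreement attributes."""
    refresh = agreement.get("nsds5BeginReplicaRefresh")
    status = agreement.get("nsds5ReplicaLastInitStatus")

    if refresh:
        # A total update (re-initialization) is still running.
        return False, False
    if not status:
        # Neither attribute is present: nothing to conclude, caller retries.
        print("No status yet")
        return False, False
    if "Total update succeeded" in status:
        return True, False
    return True, True


def wait_for_init(agreement, max_polls=3):
    """Poll until done; max_polls is an assumption added so the demo ends."""
    for _ in range(max_polls):
        done, has_error = check_init_status(agreement)
        if done:
            return not has_error
    return None  # gave up: the status never appeared


# Agreement on the --from host that was never re-initialized, or whose
# in-memory status was lost after a restart (illustrative values).
never_initialized = {"cn": "meToreplica.testrelm.test"}

# Agreement that did receive a total update (illustrative values).
initialized = {
    "cn": "meTomaster.testrelm.test",
    "nsds5ReplicaLastInitStatus": "Error (0) Total update succeeded",
}

result_missing = wait_for_init(never_initialized)
result_present = wait_for_init(initialized)
```

With no last-init status on the queried agreement, the check can only keep retrying, which matches the repeated "No status yet" prompt in the summary.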
Follow-up to the DS issue: it turns out we do not, and never did, store the init and update status attributes in the agreement. They are in memory only, and that's why they reset after a restart. So now this just looks like a logic flaw in the IPA tool.
I don't dispute that there could be a bug in IPA but this code is more or less unchanged since 2013.
(In reply to Rob Crittenden from comment #19)
> I don't dispute that there could be a bug in IPA but this code is more or
> less unchanged since 2013.
But, like I said, if the server is restarted the last init attributes are removed. So if the test was run "before" master was restarted, the tool would have reported success (even though looking at the last init status attribute has nothing to do with forcing updates to be sent). So the issue here is the order in which things are done.
From looking at the IPA code it looks like "force-sync" should not call check_repl_init(), but instead call check_repl_update().
The issue appeared in RHEL 7.5 but was not present on RHEL 7.4, so it is indeed a regression.
ipa-replica-manage was modified when fixing https://pagure.io/freeipa/issue/7211, and the fix introduced a call to repl.wait_for_repl_init in the force_sync method of install/tools/ipa-replica-manage. wait_for_repl_init calls check_repl_init, which reads nsds5BeginReplicaRefresh, nsds5replicaUpdateInProgress, nsds5ReplicaLastInitStatus, nsds5ReplicaLastInitStart and nsds5ReplicaLastInitEnd on the --from node.
See commit https://pagure.io/freeipa/c/b83073d288292d2f2cc09d480bf90c7d5208111c on branch ipa-4-5, and commit https://pagure.io/freeipa/c/a3b0af387f8e3c67f1b223869d3f540989eb2f43 on branch ipa-4-6.
Hence moving the BZ to ipa.
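The difference between an init-style check and an update-style check after force-sync can be illustrated with a small Python sketch. It is hypothetical and does not reproduce check_repl_init() or check_repl_update() from IPA: the function names and the dictionary representation are assumptions. The attribute values are taken from the cli.log line quoted in this report, and nsds5ReplicaLastUpdateStatus is assumed to be the attribute that carries that status string.

```python
# Hypothetical sketch: names and structure are illustrative, not the IPA source.
def init_check_done(agreement):
    """Init-style check: needs a last-init status, set only by a total update."""
    return (not agreement.get("nsds5BeginReplicaRefresh")
            and bool(agreement.get("nsds5ReplicaLastInitStatus")))


def update_check_done(agreement):
    """Update-style check: done once no incremental update is in progress."""
    in_progress = agreement.get("nsds5replicaUpdateInProgress", "")
    return in_progress.upper() == "FALSE"


# State of the agreement on the --from host after force-sync woke it up.
# No init attributes exist because no total update was ever run from it.
after_force_sync = {
    "nsds5replicaUpdateInProgress": "FALSE",
    "nsds5ReplicaLastUpdateStatus":
        "Error (0) Replica acquired successfully: Incremental update succeeded",
}

init_done = init_check_done(after_force_sync)
update_done = update_check_done(after_force_sync)
```

Under these assumptions the update-style check completes while the init-style check never does, which is the mismatch the comment describes.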
Upstream ticket: https://pagure.io/freeipa/issue/7886
Fixed upstream master: https://pagure.io/freeipa/c/6e8d38caa8926deedc8cfc5dc113444431d94051
ipa-4-7: https://pagure.io/freeipa/c/6856b17fe9358a98f1bce8591d024847e4a97f21
Fixed upstream ipa-4-6: https://pagure.io/freeipa/c/36628f68fce364854f9789bb0339307e2e0d417a
Version:
Steps:
Install ipa-server with replica
On replica:
1. ipa topologysegment-find domain
2. ipa topologysegment-reinitialize domain segment-name --left
3. ipa-replica-manage -p Secret123 force-sync --from ipa-server
Actual results:
[root@master ~]# ipa topologysegment-reinitialize domain master.testrelm.test-to-replica.testrelm.test --right
-------------------------------------------------------------------------------------------
Replication refresh for segment: "master.testrelm.test-to-replica.testrelm.test" requested.
-------------------------------------------------------------------------------------------
[root@master ~]#
[root@master ~]# ipa-replica-manage -p Secret123 force-sync --from master.testrelm.test
[root@master ~]# echo $?
0
[root@master ~]# ipa topologysegment-reinitialize domain master.testrelm.test-to-replica.testrelm.test --left
-------------------------------------------------------------------------------------------
Replication refresh for segment: "master.testrelm.test-to-replica.testrelm.test" requested.
-------------------------------------------------------------------------------------------
[root@master ~]#
[root@master ~]# ipa-replica-manage -p Secret123 force-sync --from master.testrelm.test
[root@master ~]# echo $?
0
[root@master ~]# cat /var/log/ipa/cli.log
2019-05-15T08:35:07Z 7059 MainThread ipaserver.install.replication INFO Setting agreement cn=meToreplica.testrelm.test,cn=replica,cn=dc\=testrelm\,dc\=test,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
2019-05-15T08:35:08Z 7059 MainThread ipaserver.install.replication INFO Deleting schedule 2358-2359 0 from agreement cn=meToreplica.testrelm.test,cn=replica,cn=dc\=testrelm\,dc\=test,cn=mapping tree,cn=config
2019-05-15T08:35:09Z 7059 MainThread ipaserver.install.replication INFO Replication Update in progress: FALSE: status: Error (0) Replica acquired successfully: Incremental update succeeded: start: 0: end: 0
Command succeed, hence based on above observations, marking the bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:2241
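For anyone scripting a similar verification, the status string in the last cli.log line can be split into a numeric code and a message. The Python sketch below is only an illustration: the `Error (N) message` shape is inferred from the single log line in this report, other DS versions may format the status differently, and the helper name `parse_update_status` is hypothetical.

```python
import re

# Pattern inferred from the one status string quoted in this report (assumption).
STATUS_RE = re.compile(r"^Error \((-?\d+)\)\s*(.*)$")


def parse_update_status(status):
    """Split a status such as 'Error (0) ...' into (code, message)."""
    match = STATUS_RE.match(status)
    if match is None:
        raise ValueError("unrecognized status format: %r" % status)
    return int(match.group(1)), match.group(2)


code, message = parse_update_status(
    "Error (0) Replica acquired successfully: Incremental update succeeded"
)
# A code of 0 is treated here as success; that reading is an assumption
# based on the verified run above.
succeeded = (code == 0)
```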