| Summary: | Retain the changelog db file when the replica entry is modified during the promotion or demotion operation on a replica | ||
|---|---|---|---|
| Product: | [Retired] 389 | Reporter: | Jyoti ranjan das <jyoti-ranjan.das> |
| Component: | Replication - General | Assignee: | Rich Megginson <rmeggins> |
| Status: | CLOSED DEFERRED | QA Contact: | Ben Levenson <benl> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 1.2.10 | CC: | nhosoi |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-11-19 19:22:47 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Upstream ticket: https://fedorahosted.org/389/ticket/295

We have a new ticket tracking system for 389: http://directory.fedoraproject.org/wiki/Bugs

We have copied this bug to that system. Please create a Fedora account and add yourself to the CC of that ticket. If you have patches to contribute, please attach them to that ticket as attachments, and send an email to the 389-devel list asking for a review. Thanks.

Closing this bug since we moved to the trac ticket: https://fedorahosted.org/389/ticket/295
Description of problem:

This bug is opened with reference to bug id 750425.

During the promotion or demotion operation on a replica (cases like when a Master is demoted to play the Hub role, or a Hub is promoted to play the Master role), if we delete the replica entry for the suffix in order to recreate a new replica for the same suffix, it in turn deletes the changelog db file associated with that replica role. This causes data loss in a few scenarios, one of which is described below.

Scenario-1:
==========

Topology:

         Master (Replica ID: 1)
           |
          Hub
          / \
    Consumer1  Consumer2

Consider that in this topology everybody is in sync up to a specific CSN, say "CSN5". Now Consumer1 is taken out of the topology for some reason, while the Master remains open to receive updates. Later, everything in the topology is in sync up to "CSN10", whereas Consumer1 is still at "CSN5" because it is out of the topology.

Then a disaster happens on the Master, so to reduce the downtime the Hub is promoted to play the Master role in the absence of the original Master. The steps followed to promote a Hub to play the Master role are given below. After the Hub is promoted, it is open to receive changes. Now bring Consumer1 back into the topology without re-initializing it. In this case, Consumer1 will miss the changes from "CSN6" to "CSN10". This is one of the potential data loss issues.

Steps followed during Promotion:
================================

All these steps are performed while the "ns-slapd" process is running, i.e. the promotion is done while the Hub is up and running.

1: Delete the supplier bind DN (cn=<suppdn>,cn=config) from the Slave. This is the DN which the Master used to communicate with the Hub/Consumer; it is referenced when the replication agreement is created.

    EX:
    dn: cn=replication manager,cn=config
    objectClass: inetorgperson
    objectClass: top
    cn: replication manager
    sn: RM
    userPassword: password
    passwordExpirationTime: 20380119031407Z

    # /opt/dirsrv/bin/ldapmodify -h localhost -p 4601 -D "cn=directory manager" -w <xxxxx>
    dn: cn=replication manager,cn=config
    changetype: delete

2: Make a copy of the "nsDS5ReplicaName" attribute value from the replica entry.

3: Delete the cn=replica entry. This in turn deletes the changelog db file.

    EX:
    # /opt/dirsrv/bin/ldapmodify -h localhost -p 4601 -D "cn=directory manager" -w <xxxx>
    dn: cn=replica,cn="dc=ind, dc=hp, dc=com",cn=mapping tree,cn=config
    changetype: delete

    Note: After this step there won't be any changelog db file for the Hub.

4: Modify the "cn=<suffix>,cn=mapping tree,cn=config" entry.

    # /opt/dirsrv/bin/ldapmodify -h localhost -p 4601 -D "cn=directory manager" -w <xxxx>
    dn: cn="dc=ind, dc=hp, dc=com",cn=mapping tree,cn=config
    changetype: modify
    replace: nsslapd-state
    nsslapd-state: backend

    # /opt/dirsrv/bin/ldapmodify -h localhost -p 4601 -D "cn=directory manager" -w <xxxx>
    dn: cn="dc=ind, dc=hp, dc=com",cn=mapping tree,cn=config
    changetype: modify
    delete: nsslapd-referral

5: Re-create the cn=replica entry.

    # /opt/dirsrv/bin/ldapmodify -h localhost -p 4601 -D "cn=directory manager" -w <xxxx>
    dn: cn=replica,cn="dc=ind, dc=hp, dc=com",cn=mapping tree,cn=config
    changetype: add
    objectClass: nsDS5Replica
    objectClass: top
    nsDS5ReplicaRoot: dc=ind, dc=hp, dc=com
    nsDS5ReplicaType: 3
    nsDS5Flags: 1
    nsDS5ReplicaId: 1  < the same replica id which is being used by the Master >
    nsDS5ReplicaName: < same value which was copied in step 2 above >
    cn: replica
    nsds5ReplicaPurgeDelay: 1
    nsds5ReplicaTombstonePurgeInterval: -1

6: Restart the Hub.
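For verification, here is a minimal sketch (not part of the original report; it assumes an ldapsearch binary alongside the ldapmodify used above, with the same suffix and port) of how one could record the replica's RUV before and after the promotion to spot the CSN gap described in Scenario-1. 389 DS exposes the RUV on a special tombstone entry under the replicated suffix:

    # Dump the current RUV of the suffix; run once before step 1 and once
    # after step 6, then diff the two outputs. A max CSN that went backwards
    # indicates changes that were lost along with the deleted changelog db.
    /opt/dirsrv/bin/ldapsearch -h localhost -p 4601 \
      -D "cn=directory manager" -w <xxxx> \
      -b "dc=ind, dc=hp, dc=com" \
      "(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectClass=nsTombstone))" \
      nsds50ruv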
Now the Hub becomes the Master. There are a couple of other scenarios where we can see this type of issue.

Version-Release number of selected component (if applicable):

How reproducible:
Frequently

Steps to Reproduce:
Steps are given above to reproduce the issue.

Actual results:
Deleting the replica entry during promotion or demotion also deletes the changelog db file, so updates that have not yet been replicated to every consumer can be lost.

Expected results:
If the replica entry could be modified instead of deleted during the promotion or demotion operation, and the changelog db file retained when the replica entry is modified, it would resolve a few of the data loss scenarios in the topology. (A sketch of such a modify-in-place promotion is given after the Additional info section below.)

Additional info:
Please let me know if more details are necessary on this.
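As referenced under Expected results, here is a minimal sketch of what the requested modify-in-place promotion could look like. This is hypothetical, not an existing supported operation: the target values are taken from step 5 above, and it assumes the usual hub configuration of nsDS5ReplicaType: 2 with nsDS5Flags: 1 and nsDS5ReplicaId: 65535, so only the type and replica id need to change. Because the entry (and its nsDS5ReplicaName) is never deleted, the server could keep the existing changelog db file:

    # Promote the Hub to a Master by modifying the existing replica entry
    # in place instead of delete + add, retaining the changelog db file.
    # /opt/dirsrv/bin/ldapmodify -h localhost -p 4601 -D "cn=directory manager" -w <xxxx>
    dn: cn=replica,cn="dc=ind, dc=hp, dc=com",cn=mapping tree,cn=config
    changetype: modify
    replace: nsDS5ReplicaType
    nsDS5ReplicaType: 3
    -
    replace: nsDS5ReplicaId
    nsDS5ReplicaId: 1

The mapping tree changes from step 4 and the restart would still apply; only the delete/re-add of cn=replica (and hence the changelog deletion) would be avoided.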