Bug 1908553
| Summary: | LDAP replication halt, nsState and large offsets, CSN poisoning example | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Marc Sauton <msauton> |
| Component: | 389-ds-base | Assignee: | LDAP Maintainers <ldap-maint> |
| Status: | CLOSED DUPLICATE | QA Contact: | RHDS QE <ds-qe-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.7 | CC: | bsmejkal, ldap-maint, mreynolds, tbordaz |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-08-01 19:51:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||

Possibly fixed by Issue 4943 - Fix csn generator to limit time skew drift (#4946)

This was potentially fixed in two bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=2049812 ---> fixed in 389-ds-base-1.3.10.2-15.el7_9

Not fixed in RHEL 7.9 yet:
https://bugzilla.redhat.com/show_bug.cgi?id=2113056 ---> Import may break replication because changelog starting csn may not be created

A hotfix could be provided using this commit to see if it helps the issue:
https://github.com/389ds/389-ds-base/commit/2e4625fc533011a4214408612eb93eeb66a4ddb0

Since there is the 7.9 bug listed above, and the customer case is closed, I am going to close this bug as a duplicate of BZ#2113056.

*** This bug has been marked as a duplicate of bug 2113056 ***

Description of problem:
This is a report to track another example of an IPA LDAP replication halt with "CSN poisoning".
I speculate that the unproven root cause may be: an LDAP change was processed for replication while the system time was unstable, or while the system time was being changed, on one or more replicas.

Version-Release number of selected component (if applicable):
389-ds-base-1.3.9.1-12.el7_7.x86_64
ipa-server-4.6.5-11.el7_7.4.x86_64
redhat-release-server-7.7-10.el7.x86_64

How reproducible:
N/A

Steps to Reproduce:
1. N/A
2.
3.

Actual results:
IPA LDAP replication halts. The error logs are full of events like this sample:

[09/Dec/2020:21:47:06.908139155 +0000] - ERR - agmt="cn=edited-host1-to-edited-host2" (edited-host2:389) - clcache_load_buffer - Can't locate CSN ffffffffe05601dc0000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.

Expected results:
yes

Additional info:
The ugly facts:

Giant offsets in several replication agreement nsState values, edited example:

nsState is 3AEAAAAAAACvNdFfAAAAAAAAAAAAAAAAMapQowAAAAACAAAAAAAAAA==
Little Endian
For replica cn=replica,cn=dc\3Dedited\2Cdc\3Dedited,cn=mapping tree,cn=config
  fmtstr=[H6x3QH6x]
  size=40
  len of nsstate is 40
  CSN generator state:
  Replica ID    : 476
  Sampled Time  : 1607546287
  Gen as csn    : 5fd135af000204760000
  Time as str   : Wed Dec 9 20:38:07 2020
  Local Offset  : 0
  Remote Offset : 2739972657
  Seq. num      : 2
  System time   : Thu Dec 17 01:18:32 2020
  Diff in sec.  : 621625
  Day:sec diff  : 7:16825

nsState is qQsAAAAAAADujzJlAAAAAJe0EwAAAAAAhUwBAAAAAABcTwAAAAAAAA==
Little Endian
For replica cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config
  fmtstr=[H6x3QH6x]
  size=40
  len of nsstate is 40
  CSN generator state:
  Replica ID    : 2985
  Sampled Time  : 1697812462
  Gen as csn    : 65328fee2031629850000
  Time as str   : Fri Oct 20 14:34:22 2023
  Local Offset  : 1291415
  Remote Offset : 85125
  Seq. num      : 20316
  System time   : Thu Dec 17 01:18:32 2020
  Diff in sec.  : -89644550
  Day:sec diff  : -1038:38650

The RUVs in some of the replication agreements contain invalid CSNs, edited sample:

nsds50ruv: {replicageneration} 00278c29000000030000
nsds50ruv: {replica 1 ldap://edited-host1:389} 0000002d000001f50000 03201ac3000001f50000
nsds50ruv: {replica 2 ldap://edited-host2:389} 5b49720c000001db0000 03206fa0000001db0000
nsds50ruv: {replica 3 ldap://edited-host3:389} 5b496c72000001da0000 03206edf000001da0000
nsds50ruv: {replica 4 ldap://edited-host4:389} 5b4a1fd6000001e10000 5caf4aeb000501e10000
nsds50ruv: {replica 5 ldap://edited-host5:389} 5b49675f000001d90000 0320696d000001d90000
nsds50ruv: {replica 6 ldap://edited-host6:389} 5b4a6558000801eb0000 5caf500bdf4801eb0000
nsds50ruv: {replica 7 ldap://edited-host7:389} 00000027000002020000 02f3a7ec000102020000
nsds50ruv: {replica 8 ldap://edited-host8:389} 00000029000002070000 00ddbb78000002070000
nsds50ruv: {replica 9 ldap://edited-host9:389} 0000001c0000020e0000 00c1d7300003020e0000
nsds50ruv: {replica 10 ldap://edited-host10:389} 5ca3804d000001f10000 03206715000001f10000
nsds50ruv: {replica 11 ldap://edited-host11:389} 5b4a68c8000101ec0000 5caf561a000401ec0000
nsds50ruv: {replica 12 ldap://edited-host12:389} 5b497b17000101dc0000 ffffffffe05601dc0000
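
For anyone who wants to re-check the nsState and RUV values above by hand, here is a minimal Python sketch. It is not the tool that produced the output in this report. The nsState layout (`H6x3QH6x`, little endian, 40 bytes) is taken from that output; the CSN layout (8 hex digits of timestamp, then 4 each for sequence number, replica ID and sub-sequence number) is the usual 389-ds string form. The function names and the one-day `max_skew` threshold are hypothetical and only there for illustration.

```python
import base64
import string
import struct

# Layout from the report's tool output: fmtstr=[H6x3QH6x], little endian, 40 bytes.
NSSTATE_FORMAT = "<H6x3QH6x"


def decode_nsstate(b64_value):
    """Decode a base64 nsState value into the CSN generator fields."""
    raw = base64.b64decode(b64_value)
    expected = struct.calcsize(NSSTATE_FORMAT)
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    rid, sampled, local_off, remote_off, seq = struct.unpack(NSSTATE_FORMAT, raw)
    return {
        "rid": rid,
        "sampled_time": sampled,
        "local_offset": local_off,
        "remote_offset": remote_off,
        "seq": seq,
    }


def parse_csn(csn):
    """Split a 20-hex-digit CSN string into its four numeric fields."""
    if len(csn) != 20 or any(c not in string.hexdigits for c in csn):
        raise ValueError(f"not a 20-hex-digit CSN: {csn!r}")
    return {
        "timestamp": int(csn[0:8], 16),
        "seq": int(csn[8:12], 16),
        "rid": int(csn[12:16], 16),
        "subseq": int(csn[16:20], 16),
    }


def is_future_csn(csn, now, max_skew=86400):
    """True if the CSN timestamp is more than max_skew seconds after `now`.

    max_skew is an arbitrary illustration threshold, not a server setting.
    """
    return parse_csn(csn)["timestamp"] > now + max_skew


if __name__ == "__main__":
    print(decode_nsstate("3AEAAAAAAACvNdFfAAAAAAAAAAAAAAAAMapQowAAAAACAAAAAAAAAA=="))
    # "System time" from the report (Thu Dec 17 01:18:32 2020), read as UTC.
    report_time = 1608167912
    for csn in ("5b497b17000101dc0000", "ffffffffe05601dc0000"):
        print(csn, parse_csn(csn), is_future_csn(csn, report_time))
```

Run against the RUV sample, the last CSN of the final line decodes to a timestamp of 0xffffffff with replica ID 0x01dc (476 decimal), which matches the replica ID of the first nsState and the CSN named in the clcache_load_buffer error.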