Description of problem:
This report tracks another example of IPA LDAP replication halting with "CSN poisoning".
I speculate that the unproven root cause may be:
an LDAP change was processed for replication while the system time was unstable, or while the system time was being changed, on one or more replicas.
Version-Release number of selected component (if applicable):
389-ds-base-1.3.9.1-12.el7_7.x86_64
ipa-server-4.6.5-11.el7_7.4.x86_64
redhat-release-server-7.7-10.el7.x86_64
How reproducible:
N/A
Steps to Reproduce:
1. N/A
2.
3.
Actual results:
IPA LDAP replication halts.
The error logs are full of events like this sample:
[09/Dec/2020:21:47:06.908139155 +0000] - ERR - agmt="cn=edited-host1-to-edited-host2" (edited-host2:389) - clcache_load_buffer - Can't locate CSN ffffffffe05601dc0000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
Expected results:
yes
Additional info:
the ugly facts:
giant offsets in the nsState of several replication agreements, edited example:
nsState is 3AEAAAAAAACvNdFfAAAAAAAAAAAAAAAAMapQowAAAAACAAAAAAAAAA==
Little Endian
For replica cn=replica,cn=dc\3Dedited\2Cdc\3Dedited,cn=mapping tree,cn=config
fmtstr=[H6x3QH6x]
size=40
len of nsstate is 40
CSN generator state:
Replica ID : 476
Sampled Time : 1607546287
Gen as csn : 5fd135af000204760000
Time as str : Wed Dec 9 20:38:07 2020
Local Offset : 0
Remote Offset : 2739972657
Seq. num : 2
System time : Thu Dec 17 01:18:32 2020
Diff in sec. : 621625
Day:sec diff : 7:16825
nsState is qQsAAAAAAADujzJlAAAAAJe0EwAAAAAAhUwBAAAAAABcTwAAAAAAAA==
Little Endian
For replica cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config
fmtstr=[H6x3QH6x]
size=40
len of nsstate is 40
CSN generator state:
Replica ID : 2985
Sampled Time : 1697812462
Gen as csn : 65328fee2031629850000
Time as str : Fri Oct 20 14:34:22 2023
Local Offset : 1291415
Remote Offset : 85125
Seq. num : 20316
System time : Thu Dec 17 01:18:32 2020
Diff in sec. : -89644550
Day:sec diff : -1038:38650
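The nsState dumps above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the 40-byte little-endian layout reported in the output (fmtstr H6x3QH6x: replica ID, sampled time, local offset, remote offset, sequence number); other platforms may use a different layout. The function names decode_nsstate and as_csn are illustrative and do not come from any 389-ds tool.

```python
import base64
import struct

# Layout taken from the dump above: little endian, fmtstr H6x3QH6x, 40 bytes.
NSSTATE_FMT = "<H6x3QH6x"


def decode_nsstate(b64_value):
    """Decode a base64 nsState value into its CSN generator fields."""
    raw = base64.b64decode(b64_value)
    expected = struct.calcsize(NSSTATE_FMT)
    if len(raw) != expected:
        raise ValueError("nsState is %d bytes, expected %d" % (len(raw), expected))
    rid, sampled, local_off, remote_off, seq = struct.unpack(NSSTATE_FMT, raw)
    return {
        "rid": rid,
        "sampled_time": sampled,
        "local_offset": local_off,
        "remote_offset": remote_off,
        "seq_num": seq,
    }


def as_csn(state):
    """Render the state as a 20-character hex CSN (time, seq, rid, subseq)."""
    return "%08x%04x%04x%04x" % (
        state["sampled_time"], state["seq_num"], state["rid"], 0)


if __name__ == "__main__":
    state = decode_nsstate(
        "3AEAAAAAAACvNdFfAAAAAAAAAAAAAAAAMapQowAAAAACAAAAAAAAAA==")
    for key, value in state.items():
        print("%-14s: %d" % (key, value))
    print("%-14s: %s" % ("csn (hex)", as_csn(state)))
    print("%-14s: %.1f" % ("remote days", state["remote_offset"] / 86400))
```

For the first sample the remote offset of 2739972657 seconds is more than 31,000 days, which is what makes it stand out. Rendering the fields in hex gives 5fd135af000201dc0000 for that sample; the "Gen as csn" lines quoted above appear to print the sequence number and replica ID in decimal, which would explain why the second one is 21 characters long.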
The RUVs in some of the replication agreements contain invalid CSNs, edited sample:
nsds50ruv: {replicageneration} 00278c29000000030000
nsds50ruv: {replica 1 ldap://edited-host1:389} 0000002d000001f50000 03201ac3000001f50000
nsds50ruv: {replica 2 ldap://edited-host2:389} 5b49720c000001db0000 03206fa0000001db0000
nsds50ruv: {replica 3 ldap://edited-host3:389} 5b496c72000001da0000 03206edf000001da0000
nsds50ruv: {replica 4 ldap://edited-host4:389} 5b4a1fd6000001e10000 5caf4aeb000501e10000
nsds50ruv: {replica 5 ldap://edited-host5:389} 5b49675f000001d90000 0320696d000001d90000
nsds50ruv: {replica 6 ldap://edited-host6:389} 5b4a6558000801eb0000 5caf500bdf4801eb0000
nsds50ruv: {replica 7 ldap://edited-host7:389} 00000027000002020000 02f3a7ec000102020000
nsds50ruv: {replica 8 ldap://edited-host8:389} 00000029000002070000 00ddbb78000002070000
nsds50ruv: {replica 9 ldap://edited-host9:389} 0000001c0000020e0000 00c1d7300003020e0000
nsds50ruv: {replica 10 ldap://edited-host10:389} 5ca3804d000001f10000 03206715000001f10000
nsds50ruv: {replica 11 ldap://edited-host11:389} 5b4a68c8000101ec0000 5caf561a000401ec0000
nsds50ruv: {replica 12 ldap://edited-host12:389} 5b497b17000101dc0000 ffffffffe05601dc0000
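To make the invalid entries in the RUV above easier to spot, the sketch below splits each CSN into its fields (8 hex digits of timestamp, then 4 each of sequence number, replica ID, and sub-sequence number) and flags two conditions: a max CSN dated in the future, and a max CSN older than the min CSN. The helper names, the reference time, and the one-day tolerance are assumptions chosen for illustration, not values taken from 389-ds.

```python
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: int  # seconds since the epoch
    seq: int        # sequence number within that second
    rid: int        # replica ID
    subseq: int     # sub-sequence number


def parse_csn(text):
    """Split a 20-character hex CSN string into its four fields."""
    if len(text) != 20:
        raise ValueError("CSN must be 20 hex characters: %r" % text)
    return CSN(int(text[0:8], 16), int(text[8:12], 16),
               int(text[12:16], 16), int(text[16:20], 16))


def check_ruv_element(min_csn, max_csn, now, max_skew=86400):
    """Return a list of problems found in one RUV element.

    now is a reference time in epoch seconds; max_skew is an arbitrary
    tolerance picked for this sketch.
    """
    lo, hi = parse_csn(min_csn), parse_csn(max_csn)
    problems = []
    if hi.timestamp > now + max_skew:
        problems.append("max CSN is in the future")
    if hi < lo:  # tuple comparison: timestamp, then seq, rid, subseq
        problems.append("max CSN is older than min CSN")
    return problems


if __name__ == "__main__":
    # Reference time: the "System time" shown in the nsState dump
    # (Thu Dec 17 01:18:32 2020), assumed to be UTC.
    now = 1608167912
    sample = {
        2: ("5b49720c000001db0000", "03206fa0000001db0000"),
        4: ("5b4a1fd6000001e10000", "5caf4aeb000501e10000"),
        12: ("5b497b17000101dc0000", "ffffffffe05601dc0000"),
    }
    for replica, (lo, hi) in sample.items():
        print(replica, check_ruv_element(lo, hi, now) or "ok")
```

With these checks, replica 12 is flagged because its max CSN carries the timestamp ffffffff, the same CSN named in the clcache_load_buffer error, and replica 2 is flagged because its max CSN is older than its min CSN.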