Bug 1309962 - keep alive entries can break replication
Summary: keep alive entries can break replication
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.0
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Noriko Hosoi
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Depends On: 1307152
Blocks:
Reported: 2016-02-19 06:04 UTC by Jan Kurik
Modified: 2016-03-22 14:47 UTC
CC: 9 users

Fixed In Version: 389-ds-base-1.2.11.15-71.el6_7
Doc Type: Bug Fix
Doc Text:
Previously, a keep alive entry was being created at too many opportunities during replication, potentially causing a race condition when adding the entry to the replica changelog and resulting in operations being dropped from the replication. With this update, unnecessary keep alive entry creation has been eliminated, and missing replication no longer occurs.
Clone Of: 1307152
Environment:
Last Closed: 2016-03-22 14:47:21 UTC
Target Upstream Version:
Embargoed:


Attachments
M2_logs (501.83 KB, text/plain): 2016-03-07 07:36 UTC, Amita Sharma
M1 logs (859.28 KB, text/plain): 2016-03-07 07:36 UTC, Amita Sharma


Links
Red Hat Product Errata RHBA-2016:0468 (normal, SHIPPED_LIVE): 389-ds-base bug fix update, last updated 2016-03-22 18:46:51 UTC

Description Jan Kurik 2016-02-19 06:04:29 UTC
This bug has been copied from bug #1307152 and has been proposed
to be backported to 6.7 z-stream (EUS).

Comment 4 Amita Sharma 2016-03-04 13:33:12 UTC
Hi tbordaz,

I assume the fix verification for this bug is ::
=================================
1. Create master-consumer replication with fractional replication.
2. After total init, check on the consumer side that the keep alive entry has not been created.
3. Create a dummy entry on the master and modify it with updates that are filtered out by fractional replication, until the count reaches 100.
4. Now check on the consumer; the keep alive entry should be there.

Here is what I have done ::
=======================
[root@ipaqa64vml yum.repos.d]# rpm -qa | grep 389
389-ds-base-libs-1.2.11.15-71.el6_7.x86_64
389-ds-base-1.2.11.15-71.el6_7.x86_64


Create the master, and on the master ::

Enable replication logging
ldapmodify -x -h localhost -p 389 -D "cn=directory manager" -w Secret123 << EOF
> dn: cn=config
> changetype: modify
> replace: nsslapd-errorlog-level
> nsslapd-errorlog-level: 8192
> EOF
modifying entry "cn=config"

Create the replication manager entry
ldapmodify -x -a -h localhost -p 389 -D "cn=Directory Manager" -w Secret123 << EOF
dn: cn=replication manager,cn=config 
objectClass: inetorgperson
objectClass: person
objectClass: top
cn: replication manager
sn: RM
userPassword: Secret123
passwordExpirationTime: 20380119031407Z
nsIdleTimeout: 0
EOF

Enable Changelog
ldapmodify -D "cn=directory manager" -w Secret123 -p 389 -x -h localhost -v -a << EOF
dn: cn=changelog5,cn=config
changetype: add
objectclass: top
objectclass: extensibleObject
cn: changelog5
nsslapd-changelogdir: /var/lib/dirsrv/slapd-master/changelogdb
nsslapd-changelogmaxage: 10d
EOF

Configure Replica
ldapmodify -D "cn=directory manager" -w Secret123 -p 389 -x -h localhost -v -a << EOF
dn: cn=replica,cn=dc\=example\,dc\=com,cn=mapping tree,cn=config
changetype: add
objectclass: top
objectclass: nsds5replica
objectclass: extensibleObject
cn: replica
nsds5replicaroot: dc=example,dc=com
nsds5replicaid: 1
nsds5replicatype: 3
nsds5flags: 1
nsds5ReplicaPurgeDelay: 604800
nsds5ReplicaBindDN: cn=replication manager,cn=config
EOF

On the consumer (port 1389)
=========
Add the replication manager
ldapmodify -x -a -h localhost -p 1389 -D "cn=Directory Manager" -w Secret123 << EOF
dn: cn=replication manager,cn=config 
objectClass: inetorgperson
objectClass: person
objectClass: top
cn: replication manager
sn: RM
userPassword: Secret123
passwordExpirationTime: 20380119031407Z
nsIdleTimeout: 0
EOF

Create Replica
ldapmodify -D "cn=directory manager" -w Secret123 -p 1389 -x -h localhost -v -a << EOF
dn: cn=replica,cn=dc\=example\,dc\=com,cn=mapping tree,cn=config
changetype: add
objectclass: top
objectclass: nsds5replica
objectclass: extensibleObject
cn: replica
nsds5replicaroot: dc=example,dc=com
nsds5replicaid: 2
nsds5replicatype: 2
nsds5ReplicaBindDN: cn=replication manager,cn=config
nsds5flags: 0
EOF


On the master (port 389)
=========
Add the replication agreement
ldapmodify -v -h localhost -p 389 -D "cn=directory manager" -w Secret123 << EOF
dn: cn=ExampleAgreement,cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
changetype: add
objectclass: top
objectclass: nsds5replicationagreement
cn: ExampleAgreement
nsds5replicahost: localhost
nsds5replicaport: 1389
nsds5ReplicaBindDN: cn=replication manager,cn=config
nsds5replicabindmethod: SIMPLE
nsds5replicaroot: dc=example,dc=com
description: agreement between Master and Slave
nsds5replicaupdateschedule: 0000-0500 1
nsds5replicatedattributelist: (objectclass=*) $ EXCLUDE mail
nsds5replicacredentials: Secret123
nsds5BeginReplicaRefresh: start
EOF
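
A quick way to confirm the total init started (a sketch only; the instance name "slapd-master" is an assumption based on the nsslapd-changelogdir path configured above):

# Sketch: the error log path is assumed from the changelog dir above;
# adjust to your actual instance name.
grep -i "total update" /var/log/dirsrv/slapd-master/errors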

And as soon as I add the replication agreement, the keep alive entry is added on the consumer server.

Error logs ::
[04/Mar/2016:07:24:58 -0500] NSMMReplicationPlugin - agmt="cn=ExampleAgreement" (localhost:1389): Replica was successfully acquired.
[04/Mar/2016:07:24:58 -0500] NSMMReplicationPlugin - Beginning total update of replica "agmt="cn=ExampleAgreement" (localhost:1389)".
[04/Mar/2016:07:24:58 -0500] NSMMReplicationPlugin - Need to create replication keep alive entry <cn=repl keep alive 1,dc=example,dc=com>
[04/Mar/2016:07:24:58 -0500] NSMMReplicationPlugin - add dn: cn=repl keep alive 1,dc=example,dc=com
objectclass: top
objectclass: ldapsubentry
objectclass: extensibleObject
cn: repl keep alive 1
[04/Mar/2016:07:24:58 -0500] NSMMReplicationPlugin - ruv_add_csn_inprogress: successfully inserted csn 56d97e9a000100010000 into pending list

And ldapsearch on consumer ::
ldapsearch -x -D "cn=Directory Manager" -h localhost -p 389 -w Secret123 -b "cn=repl keep alive 1,dc=example,dc=com" -s base
# extended LDIF
#
# LDAPv3
# base <cn=repl keep alive 1,dc=example,dc=com> with scope baseObject
# filter: (objectclass=*)
# requesting: ALL
#

# repl keep alive 1, example.com
dn: cn=repl keep alive 1,dc=example,dc=com
objectClass: top
objectClass: ldapsubentry
objectClass: extensibleObject
cn: repl keep alive 1

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1
[root@ipaqa64vml yum.repos.d]# 


Please advise on the expected behavior and the verification steps.
Thanks,
Ami

Comment 5 thierry bordaz 2016-03-04 15:07:24 UTC
Hello Ami,

Steps 1 and 3 of the described 'fix verification' (https://bugzilla.redhat.com/show_bug.cgi?id=1309962#c4) are IMHO incorrect.

Each instance in the topology (master or consumer) may create a 'keep alive' entry. The 'cn=repl keep alive 1' entry is created by the instance with replicaId:1 (here the master). The consumer (replicaId:2) may create a 'cn=repl keep alive 2' entry if it needs to.

The bug was that the consumer (replicaId:2) created the entry 'cn=repl keep alive 2', but the entry was cleared from its changelog and so was not replicated; it basically existed only on the consumer.

Step 1: In IPA all replicas are actually masters, meaning they can be updated. In the deployment, make sure that both master and consumer have nsds5replicatype: 3 (masters).

Step 2: 'cn=repl keep alive 2' should not exist yet. The notions of master and consumer describe only the roles during initialization; the consumer is also a master (it can handle updates) but has been initialized by the 'master'.

Step 3: The dummy entry can be created on the master or the consumer, but the filtered updates must be applied on the consumer. After you have made more than 100 filtered updates on the consumer, that replica will create 'cn=repl keep alive 2'.

Step 4: ok
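
If it helps, a minimal check along these lines (a sketch only; the two localhost instances on ports 389 and 1389 and the Directory Manager credentials are assumptions carried over from comment 4):

# Sketch: look for both keep alive entries on both instances.
# Ports and credentials are assumed from the setup in comment 4.
for PORT in 389 1389; do
  for RID in 1 2; do
    echo "== port $PORT, cn=repl keep alive $RID =="
    ldapsearch -x -h localhost -p "$PORT" -D "cn=Directory Manager" -w Secret123 \
      -b "cn=repl keep alive $RID,dc=example,dc=com" -s base "(objectclass=*)" dn
  done
done

Before the 100 filtered updates of step 3, 'cn=repl keep alive 2' should be absent everywhere; afterwards it should appear on both instances.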

Hope it will help, do not hesitate to ping me.

thanks,
thierry

Comment 6 Amita Sharma 2016-03-07 07:26:42 UTC
Hi Thierry,

Thanks for the reply, here are the steps I have followed now::
M1 - 389
M2 - 1389

Both are masters, but one line, "nsds5BeginReplicaRefresh: start", is omitted from the M2 agreement so that only M1 initializes M2 and not vice versa.
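
(Should M2 ever need to initialize M1 later, a refresh can be triggered on the existing agreement; a sketch only, assuming M2's agreement uses the same DN as the M1 agreement shown below:)

ldapmodify -v -h localhost -p 1389 -D "cn=directory manager" -w Secret123 << EOF
dn: cn=ExampleAgreement,cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
changetype: modify
replace: nsds5BeginReplicaRefresh
nsds5BeginReplicaRefresh: start
EOF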

On M1 the agreement looks like ::
 ================================
ldapmodify -v -h localhost -p 389 -D "cn=directory manager" -w Secret123 << EOF
dn: cn=ExampleAgreement,cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
changetype: add
objectclass: top
objectclass: nsds5replicationagreement
cn: ExampleAgreement
nsds5replicahost: localhost
nsds5replicaport: 389
nsds5ReplicaBindDN: cn=replication manager,cn=config
nsds5replicabindmethod: SIMPLE
nsds5replicaroot: dc=example,dc=com
description: agreement between Master and Slave
nsds5replicatedattributelist: (objectclass=*) $ EXCLUDE mail
nsds5replicacredentials: Secret123
nsds5BeginReplicaRefresh: start

At this point, M1 has created its keep alive entry on M2, but M2 has not created any entry on M1 ::
=====================================================
M1 ::
[07/Mar/2016:01:57:58 -0500] NSMMReplicationPlugin - agmt="cn=ExampleAgreement" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56dd2670000100010000)

M2::
[07/Mar/2016:01:57:58 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.

[root@ipaqa64vml ~]# grep -li "cn=repl keep alive 1" /var/log/dirsrv/slapd-*/errors
/var/log/dirsrv/slapd-M1/errors
/var/log/dirsrv/slapd-M2/errors
[root@ipaqa64vml ~]# grep -li "cn=repl keep alive 2" /var/log/dirsrv/slapd-*/errors
[root@ipaqa64vml ~]# 

I have executed this script on M2 ::
================================
Entry added on M1 (port 389) ::
$ldapadd -x -h localhost -p 389 -D "cn=Directory Manager" -w Secret123  << EOF
dn: uid=ami,dc=example,dc=com
cn: ams
sn: ams
givenname: ams
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
uid: ami
mail: ams
userpassword: Secret123
EOF

[root@ipaqa64vml export]# cat modify.sh 
CPT=0
while [ $CPT -ne 100 ]
do
        # update 'mail', which is not replicated;
        # after more than 100 such updates the consumer creates its keep alive entry
        #
        /usr/bin/ldapmodify -x -h localhost -p 1389 -D "cn=Directory Manager" -w Secret123 << EOF
dn: uid=ami,dc=example,dc=com
changetype: modify
replace: mail
mail: aamii
EOF
	CPT=`expr $CPT + 1`
done


This performs 100 modifications of the mail attribute, which is skipped under fractional replication.
Still, I don't see the keep alive entry on M1.
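
For reference, here is how I am checking for it on M1 (a sketch; M1's port and error log path are assumptions from the setup above):

# Sketch: confirm whether 'cn=repl keep alive 2' reached M1.
grep -i "cn=repl keep alive 2" /var/log/dirsrv/slapd-M1/errors
ldapsearch -x -h localhost -p 389 -D "cn=Directory Manager" -w Secret123 \
  -b "cn=repl keep alive 2,dc=example,dc=com" -s base dn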

Please find attached the error logs for both servers.
Please let me know in case any step is missing or needs modification.

Thanks,
Ami

Comment 7 Amita Sharma 2016-03-07 07:36:27 UTC
Created attachment 1133644 [details]
M2_logs

Comment 8 Amita Sharma 2016-03-07 07:36:55 UTC
Created attachment 1133645 [details]
M1 logs

Comment 12 Amita Sharma 2016-03-07 17:16:04 UTC
Thanks Thierry for help with reproducing.

Two agreements, on M1 and M2; M2 is initialized by M1, but M1 is not initialized by M2.

1. dn: cn=M1_to_M2,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
objectClass: top
objectClass: nsds5replicationagreement
cn: ExampleAgreement
cn: M1_to_M2
nsDS5ReplicaHost: localhost
nsDS5ReplicaPort: 389
nsDS5ReplicaBindDN: cn=replication manager,cn=config
nsDS5ReplicaBindMethod: SIMPLE
nsDS5ReplicaRoot: dc=example,dc=com
description: agreement between Master and Slave
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE mail
nsds5ReplicaStripAttrs: modifiersname modifytimestamp internalmodifiersname internalmodifytimestamp

2. dn: cn=M2_to_M1,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
objectClass: top
objectClass: nsds5replicationagreement
cn: ExampleAgreement
cn: M1_to_M2
nsDS5ReplicaHost: localhost
nsDS5ReplicaPort: 1389
nsDS5ReplicaBindDN: cn=replication manager,cn=config
nsDS5ReplicaBindMethod: SIMPLE
nsDS5ReplicaRoot: dc=example,dc=com
description: agreement between Master and Slave
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE mail
nsds5ReplicaStripAttrs: modifiersname modifytimestamp internalmodifiersname internalmodifytimestamp

3. On M2, the keep alive entry was created at the time of initialization, but it was not present on M1.

4. After doing 100+ updates on the mail attribute on M2, the keep alive entry got created ::
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:11:42:02 -0500] NSMMReplicationPlugin - Need to create replication keep alive entry <cn=repl keep alive 1,dc=example,dc=com>
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:11:42:02 -0500] NSMMReplicationPlugin - add dn: cn=repl keep alive 1,dc=example,dc=com
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:11:42:07 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:11:47:11 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:11:50:09 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:11:53:43 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:11:58:46 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:12:03:51 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:12:08:56 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:12:09:11 -0500] NSMMReplicationPlugin - Purged state information from entry cn=repl keep alive 2,dc=example,dc=com up to CSN 56d47b15000100020000
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:12:09:13 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M1/errors:[07/Mar/2016:12:09:19 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:1389): replay_update: Sending add operation (dn="cn=repl keep alive 1,dc=example,dc=com" csn=56ddaf5a000100010000)
/var/log/dirsrv/slapd-M2/errors: - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors: - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:11:50:09 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:11:53:43 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:11:58:46 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:03:51 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:08:37 -0500] NSMMReplicationPlugin - Need to create replication keep alive entry <cn=repl keep alive 2,dc=example,dc=com>
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:08:37 -0500] NSMMReplicationPlugin - add dn: cn=repl keep alive 2,dc=example,dc=com
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:08:38 -0500] NSMMReplicationPlugin - Purged state information from entry cn=repl keep alive 2,dc=example,dc=com up to CSN 56d47b16000000020000
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:08:58 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:09:10 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:389): replay_update: Sending add operation (dn="cn=repl keep alive 2,dc=example,dc=com" csn=56ddb595000100020000)
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:09:11 -0500] NSMMReplicationPlugin - agmt="cn=M1_to_M2" (localhost:389): replay_update: Sending modify operation (dn="cn=repl keep alive 2,dc=example,dc=com" csn=56ddb596000100020000)
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:09:14 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
/var/log/dirsrv/slapd-M2/errors:[07/Mar/2016:12:09:21 -0500] - urp_add (cn=repl keep alive 1,dc=example,dc=com): an entry with this uniqueid already exists.
[root@ipaqa64vml export]# 

The threshold of 100 may not be exact here; possibly the entry creation just takes some time.

But as replication is no longer broken and the originally reported issue is solved, I am marking this bug as VERIFIED.

Comment 14 errata-xmlrpc 2016-03-22 14:47:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0468.html

