Bug 515329 - Multiple mods in one operation can result in an inconsistent replica
Summary: Multiple mods in one operation can result in an inconsistent replica
Status: CLOSED CURRENTRELEASE
Alias: None
Product: 389
Classification: Retired
Component: Replication - General
Version: 1.2.1
Hardware: All
OS: Linux
high
medium
Target Milestone: ---
Assignee: Rich Megginson
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 639035 389_1.2.8
TreeView+ depends on / blocked
 
Reported: 2009-08-03 18:33 UTC by Nathan Kinder
Modified: 2015-12-07 16:50 UTC (History)
4 users (show)

(edit)
Clone Of:
(edit)
Last Closed: 2015-12-07 16:50:58 UTC


Attachments (Terms of Use)
Proposed Patch (6.31 KB, patch)
2009-11-16 20:28 UTC, Nathan Kinder
nhosoi: review+
rmeggins: review+
Details | Diff
0001-Bug-515329-Multiple-mods-in-one-operation-can-result.patch (3.90 KB, patch)
2010-12-08 02:52 UTC, Rich Megginson
nhosoi: review+
nkinder: review+
Details | Diff

Description Nathan Kinder 2009-08-03 18:33:12 UTC
Performing multiple mods on the same attribute in a single modify operation can result in different sets of data on a replica.  The problem seems to be related to multi-valued attributes.

To reproduce:

- Setup replication with 2 masters.
- Add the following test entry to the first master:

    dn: uid=testuser,dc=example,dc=com
    objectclass: inetorgperson
    objectclass: person
    objectclass: top
    uid: testuser
    cn: 1
    sn: testuser

- Perform the following modify operation against the first master:

    dn: uid=testuser,dc=example,dc=com
    changetype: modify
    add: cn
    cn: 2
    -
    replace: cn
    cn: 3

At this point, the only cn value for testuser on the first master is "3", which is correct.  On the second master, values "2" and "3" will be present.

Comment 1 Rich Megginson 2009-09-29 02:49:47 UTC
This is due to the logic in resolve_attribute_state_multi_valued().  I don't quite understand what is going on, but it does not move the value from the present values to the deleted values list in the attribute because of this test at entrywsi.c:959:
		if(csn_compare(vucsn,deletedcsn)<0) /* check if the attribute or value was deleted after the value was last updated */
		{
	        if(!value_distinguished_at_csn(e, a, v, deletedcsn))
			{
				entry_present_value_to_deleted_value(a,v);
			}
		}

Since the vucsn == deletedcsn (deletedcsn is set in entry_delete_present_values_wsi() line 547 by the call to attr_set_deletion_csn()), the test fails, and the value is not moved to the deleted values list.  If we change the < to <=, it is moved, but then it is moved back to the present values list by the subsequent block of code below:
		if((csn_compare(vucsn,deletedcsn)>=0) || /* check if the attribute or value was deleted after the value was last updated */
	        value_distinguished_at_csn(e, a, v, deletedcsn))
		{
			entry_deleted_value_to_present_value(a,v);
		}

I don't know if we could just change the >= to a > - I don't really understand the code well enough.

Comment 2 Nathan Kinder 2009-11-12 01:44:49 UTC
(In reply to comment #1)
> I don't know if we could just change the >= to a > - I don't really understand
> the code well enough. 

Unfortunately, it is not this simple.  Doing this causes the newly added value of "3" to be removed as well on the replica.

The issue seems to be a fundamental problem in being able to determine the order of modify ops within the entire replicated operation.  The resolve_attribute_state_multi_valued() basically just knows the following (the CSN values are simplified in this example):

- Value "1" for "cn" was updated/added at CSN "1".
- Value "2" for "cn" was updated/added at CSN "2".
- Entire "cn" attribute was deleted at CSN "2" (due to the replace).
- Value "3" for "cn" was updated/added at CSN "2".

The problem is that the same CSN is used for the addition of value "2", the removal of the attribute, and the addition of value 3.  We have no way of knowing
the actual sequence here.  This means we either assume the delete happened before any of the new values (which is what the current code does), leaving values "2" and "3" present, or we assume the delete happens last (which is what the modified code in comment #1 does), which ends up removing values "2" and "3".

Comment 3 Nathan Kinder 2009-11-16 20:28:50 UTC
Created attachment 369777 [details]
Proposed Patch

The problem is that we use the CSNs from the attribute state data
to determine which values should remain after the operation (this is
done to merge with later occurring changes from other masters).  The
CSN for all mods within the same modify operation is exactly the same.
The old code was looking for attributes older than the deletion that
occurs as a part of the replace, then deleting those values.  This
would cause the value of "2" in the above example to remain.  Simply
changing this comparison to look for values with the same or older
CSN to delete would cause the new value of "3" to be removed as well
when we get around to resolving the attribute after the second half
of the replace operation.
    
The fix is to use a different CSN comparison when we are removing all
values of an attribute during attribute resolution (remove values with
the same or older CSN).  This is safe becuse the only present values
at this time are older values or values added in a previous mod in the
same modify operation.  When processing other mods that are not
removing all values of an attribute, we only want to remove values
with a CSN older that that of the current modify operation.  This
prevents us from removing a newly added value, such as "3" in the
example above.  This is safe since we resolve the attribute after
each mod in the modify operation.

Comment 4 Nathan Kinder 2009-11-16 21:51:59 UTC
Pushed to master.  Thanks to Rich and Noriko for their reviews!

Counting objects: 11, done.
Delta compression using 2 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 1.73 KiB, done.
Total 6 (delta 4), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   2b2c260..6319251  master -> master

Comment 5 Jenny Galipeau 2010-05-19 20:49:16 UTC
If I am verifying this correctly the bug still exists:

1.  User's original cn value
cn: Teena Vradmin

2. modify ldif:

dn: uid=TVradmin0, ou=Accounting, dc=example,dc=com
changetype: modify
add: cn
cn: 2
-
replace: cn
cn: 3

SERVER 1:
[root@jgalipea-rhel4 515329]#  ldapsearch -x -h jgalipea-rhel4.idm.lab.bos.redhat.com -p 1389 -D "cn=Directory Manager" -w Secret123 -b "uid=TVradmin0,ou=Accounting,dc=example,dc=com" | grep cn:
cn: 3


SERVER 2:
[root@jgalipea-rhel4 515329]#  ldapsearch -x -h jgalipea-rhel4.idm.lab.bos.redhat.com -p 2389 -D "cn=Directory Manager" -w Secret123 -b "uid=TVradmin0,ou=Accounting,dc=example,dc=com" | grep cn:
cn: 3
cn: 2

Comment 6 Rich Megginson 2010-05-19 21:23:15 UTC
(In reply to comment #5)
> If I am verifying this correctly the bug still exists:
> 
> 1.  User's original cn value
> cn: Teena Vradmin
> 
> 2. modify ldif:
> 
> dn: uid=TVradmin0, ou=Accounting, dc=example,dc=com
> changetype: modify
> add: cn
> cn: 2
> -
> replace: cn
> cn: 3
> 
> SERVER 1:
> [root@jgalipea-rhel4 515329]#  ldapsearch -x -h
> jgalipea-rhel4.idm.lab.bos.redhat.com -p 1389 -D "cn=Directory Manager" -w
> Secret123 -b "uid=TVradmin0,ou=Accounting,dc=example,dc=com" | grep cn:
> cn: 3
> 
> 
> SERVER 2:
> [root@jgalipea-rhel4 515329]#  ldapsearch -x -h
> jgalipea-rhel4.idm.lab.bos.redhat.com -p 2389 -D "cn=Directory Manager" -w
> Secret123 -b "uid=TVradmin0,ou=Accounting,dc=example,dc=com" | grep cn:
> cn: 3
> cn: 2    

On each server, do the ldapsearch like this:
ldapsearch -x -h jgalipea-rhel4.idm.lab.bos.redhat.com -p 1389 -D "cn=Directory Manager" -w Secret123 -b "uid=TVradmin0,ou=Accounting,dc=example,dc=com" nscpEntryWsi

And paste the entire thing in the bug report.

Comment 7 Jenny Galipeau 2010-05-20 13:00:32 UTC
SERVER 1:

# TVradmin0, Accounting, example.com
dn: uid=TVradmin0,ou=Accounting,dc=example,dc=com
nscpEntryWsi: dn: uid=TVradmin0,ou=Accounting,dc=example,dc=com
nscpEntryWsi: modifyTimestamp;adcsn-4bf44d95000000010000;vucsn-4bf44d950000000
 10000: 20100519204403Z
nscpEntryWsi: modifiersName;adcsn-4bf44d95000000010000;vucsn-4bf44d95000000010
 000: cn=directory manager
nscpEntryWsi: cn;adcsn-4bf44d95000000010000;vucsn-4bf44d95000000010000: 3
nscpEntryWsi: cn;vucsn-4bf44d72000000010000;deleted: Teena Vradmin
nscpEntryWsi: cn;vucsn-4bf44d95000000010000;deleted: 2
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: top
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: person
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: organizationalPerson
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: inetOrgPerson
nscpEntryWsi: sn;vucsn-4bf44d72000000010000: Vradmin
nscpEntryWsi: uid;vucsn-4bf44d72000000010000;mdcsn-4bf44d72000000010000: TVrad
 min0
nscpEntryWsi: givenName;vucsn-4bf44d72000000010000: Teena
nscpEntryWsi: description;vucsn-4bf44d72000000010000: 2;649;CN=Red Hat CS 71GA
  Demo,O=Red Hat CS 71GA Demo,C=US;CN=RHCS Agent - admin01,UID=admin01,O=redha
 t,C=US [1] This is Teena Vradmin's description.
nscpEntryWsi: departmentNumber;vucsn-4bf44d72000000010000: 2220
nscpEntryWsi: employeeType;vucsn-4bf44d72000000010000: Manager
nscpEntryWsi: homePhone;vucsn-4bf44d72000000010000: +1 510 551-9687
nscpEntryWsi: initials;vucsn-4bf44d72000000010000: T. V.
nscpEntryWsi: telephoneNumber;vucsn-4bf44d72000000010000: +1 303 703-2147
nscpEntryWsi: facsimileTelephoneNumber;vucsn-4bf44d72000000010000: +1 206 682-
 3534
nscpEntryWsi: mobile;vucsn-4bf44d72000000010000: +1 213 151-5816
nscpEntryWsi: pager;vucsn-4bf44d72000000010000: +1 804 769-1685
nscpEntryWsi: manager;vucsn-4bf44d72000000010000: cn=Olga Lake
nscpEntryWsi: secretary;vucsn-4bf44d72000000010000: cn=Silva Giamatteo
nscpEntryWsi: roomNumber;vucsn-4bf44d72000000010000: 7730
nscpEntryWsi: carLicense;vucsn-4bf44d72000000010000: ZSN6DM3
nscpEntryWsi: l;vucsn-4bf44d72000000010000: Milpitas
nscpEntryWsi: ou;vucsn-4bf44d72000000010000: Accounting
nscpEntryWsi: mail;vucsn-4bf44d72000000010000: Teena_Vradmin@example.com
nscpEntryWsi: postalAddress;vucsn-4bf44d72000000010000: 796,  Dept #58, Room#A
 ccounting
nscpEntryWsi: title;vucsn-4bf44d72000000010000: Senior Accounting Visionary
nscpEntryWsi: userPassword;vucsn-4bf44d72000000010000: {SSHA}Yqp20EBqvu7mZMToU
 5TINa2wySsCE1dWg50VSg==
nscpEntryWsi: creatorsName;vucsn-4bf44d72000000010000: cn=directory manager
nscpEntryWsi: createTimestamp;vucsn-4bf44d72000000010000: 20100519204328Z
nscpEntryWsi: nsUniqueId: 219bd606-1dd211b2-8e4df248-4ed20000
nscpEntryWsi: parentid: 11
nscpEntryWsi: entryid: 14
nscpEntryWsi: entrydn: uid=tvradmin0,ou=accounting,dc=example,dc=com


SERVER 2:

# TVradmin0, Accounting, example.com
dn: uid=TVradmin0,ou=Accounting,dc=example,dc=com
nscpEntryWsi: dn: uid=TVradmin0,ou=Accounting,dc=example,dc=com
nscpEntryWsi: modifyTimestamp;adcsn-4bf44d95000000010000;vucsn-4bf44d950000000
 10000: 20100519204403Z
nscpEntryWsi: modifiersName;adcsn-4bf44d95000000010000;vucsn-4bf44d95000000010
 000: cn=directory manager
nscpEntryWsi: cn;adcsn-4bf44d95000000010000;vucsn-4bf44d95000000010000: 3
nscpEntryWsi: cn;vucsn-4bf44d95000000010000: 2
nscpEntryWsi: cn;vucsn-4bf44d72000000010000;deleted: Teena Vradmin
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: top
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: person
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: organizationalPerson
nscpEntryWsi: objectClass;vucsn-4bf44d72000000010000: inetOrgPerson
nscpEntryWsi: sn;vucsn-4bf44d72000000010000: Vradmin
nscpEntryWsi: uid;vucsn-4bf44d72000000010000;mdcsn-4bf44d72000000010000: TVrad
 min0
nscpEntryWsi: givenName;vucsn-4bf44d72000000010000: Teena
nscpEntryWsi: description;vucsn-4bf44d72000000010000: 2;649;CN=Red Hat CS 71GA
  Demo,O=Red Hat CS 71GA Demo,C=US;CN=RHCS Agent - admin01,UID=admin01,O=redha
 t,C=US [1] This is Teena Vradmin's description.
nscpEntryWsi: departmentNumber;vucsn-4bf44d72000000010000: 2220
nscpEntryWsi: employeeType;vucsn-4bf44d72000000010000: Manager
nscpEntryWsi: homePhone;vucsn-4bf44d72000000010000: +1 510 551-9687
nscpEntryWsi: initials;vucsn-4bf44d72000000010000: T. V.
nscpEntryWsi: telephoneNumber;vucsn-4bf44d72000000010000: +1 303 703-2147
nscpEntryWsi: facsimileTelephoneNumber;vucsn-4bf44d72000000010000: +1 206 682-
 3534
nscpEntryWsi: mobile;vucsn-4bf44d72000000010000: +1 213 151-5816
nscpEntryWsi: pager;vucsn-4bf44d72000000010000: +1 804 769-1685
nscpEntryWsi: manager;vucsn-4bf44d72000000010000: cn=Olga Lake
nscpEntryWsi: secretary;vucsn-4bf44d72000000010000: cn=Silva Giamatteo
nscpEntryWsi: roomNumber;vucsn-4bf44d72000000010000: 7730
nscpEntryWsi: carLicense;vucsn-4bf44d72000000010000: ZSN6DM3
nscpEntryWsi: l;vucsn-4bf44d72000000010000: Milpitas
nscpEntryWsi: ou;vucsn-4bf44d72000000010000: Accounting
nscpEntryWsi: mail;vucsn-4bf44d72000000010000: Teena_Vradmin@example.com
nscpEntryWsi: postalAddress;vucsn-4bf44d72000000010000: 796,  Dept #58, Room#A
 ccounting
nscpEntryWsi: title;vucsn-4bf44d72000000010000: Senior Accounting Visionary
nscpEntryWsi: userPassword;vucsn-4bf44d72000000010000: {SSHA}Yqp20EBqvu7mZMToU
 5TINa2wySsCE1dWg50VSg==
nscpEntryWsi: creatorsName;vucsn-4bf44d72000000010000: cn=directory manager
nscpEntryWsi: createTimestamp;vucsn-4bf44d72000000010000: 20100519204328Z
nscpEntryWsi: nsUniqueId: 219bd606-1dd211b2-8e4df248-4ed20000
nscpEntryWsi: parentid: 11
nscpEntryWsi: entryid: 14
nscpEntryWsi: entrydn: uid=tvradmin0,ou=accounting,dc=example,dc=com

Comment 10 Rich Megginson 2010-11-20 01:00:51 UTC
This is closely related to bz578167

Comment 11 Rich Megginson 2010-12-08 02:52:00 UTC
Created attachment 467350 [details]
0001-Bug-515329-Multiple-mods-in-one-operation-can-result.patch

Comment 12 Rich Megginson 2010-12-15 17:13:52 UTC
commit fec8d9372b53122881dab491ce37c18068389c25
Author: Rich Megginson <rmeggins@redhat.com>
Date:   Tue Nov 23 14:16:13 2010 -0700

Reviewed by: nkinder, nhosoi (Thanks!)
Branch: master
Fix Description: When a master is processing a list of mod ops in a single
modify operation, all of which have the same CSN, the server processes them
in the order specified.  When a replica processes this same operation, and
has to merge this operation with other operations, it essentially processes
the operations in CSN order, and loses the original order of the ops within
each modify operation.  The function that initially receives the
replicated op, entry_apply_mods_wsi(), has the original order, and enforces
this order by incrementing the until now unused subsequence number in the
CSN for each mod op.  However, it only does this if the operation has
been replicated, and if the original CSN does not already have a
subsequence number.
One side effect is that the subsequence number is stored with the state
information in the database, which does not appear to affect anything else,
and things like state information purging still work correctly.
I've verfied that the subsequence number does not end up in in the changelog
nor the RUV or other externally used CSNs.
Platforms tested: RHEL6 x86_64
Flag Day: no
Doc impact: no

Comment 13 Amita Sharma 2011-06-02 10:57:08 UTC
[root@rhel61-ds90-amita slapd-c1]# ldapadd -x -h localhost -p 34204 -D "cn=Directory Manager" -w Secret123  << EOF
> dn: uid=testuser,o=airius.com
> objectclass: inetorgperson
> objectclass: person
> objectclass: top
> uid: testuser
> cn: 1
> sn: testuser
> EOF
adding new entry "uid=testuser,o=airius.com"

[root@rhel61-ds90-amita slapd-c1]# ldapadd -x -h localhost -p 34204 -D "cn=Directory Manager" -w Secret123  << EOF
> dn: uid=testuser,o=airius.com
> changetype: modify
> add: cn
> cn: 2
> -
> replace: cn
> cn: 3
> EOF
modifying entry "uid=testuser,o=airius.com"

[root@rhel61-ds90-amita slapd-c1]# ldapsearch -x -h localhost -p 34204 -D "cn=Directory Manager" -w Secret123 -b "uid=testuser,o=airius.com"
# extended LDIF
#
# LDAPv3
# base <uid=testuser,o=airius.com> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# testuser, airius.com
dn: uid=testuser,o=airius.com
cn: 3
objectClass: inetorgperson
objectClass: person
objectClass: top
objectClass: organizationalPerson
uid: testuser
sn: testuser

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

 ldapsearch -x -h localhost -p 34208 -D "cn=Directory Manager" -w Secret123 -b "uid=testuser,o=airius.com"
# extended LDIF
#
# LDAPv3
# base <uid=testuser,o=airius.com> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# testuser, airius.com
dn: uid=testuser,o=airius.com
cn: 3
objectClass: inetorgperson
objectClass: person
objectClass: top
objectClass: organizationalPerson
uid: testuser
sn: testuser

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1


Note You need to log in before you can comment on or make changes to this bug.