Bug 1021461 - In non-tx caches, write operations may not be atomic during rebalance
Summary: In non-tx caches, write operations may not be atomic during rebalance
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: JBoss Data Grid 6
Classification: JBoss
Component: Infinispan
Version: 6.2.0
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ER6
: 6.2.0
Assignee: Tristan Tarrant
QA Contact: Martin Gencur
URL:
Whiteboard:
Depends On:
Blocks: 1017190
TreeView+ depends on / blocked
 
Reported: 2013-10-21 10:56 UTC by Radim Vansa
Modified: 2025-02-10 03:28 UTC (History)
1 user (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2025-02-10 03:28:47 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker ISPN-3422 0 Critical Resolved In non-tx caches, write operations may not be atomic during rebalance 2016-08-15 08:47:14 UTC

Description Radim Vansa 2013-10-21 10:56:54 UTC
If the cache topology changes while a write command is running and before it has actually committed the entry to the data container, we retry the command (see ISPN-3366 and ISPN-3357). But before we detect the topology change, one or more of the backup owners may have already applied the modification.

Retrying the command re-acquires the key lock on the primary owner (even if the primary owner didn't change). That means another command could have modified the same key in the meantime, but the retried command is going to ignore any changes and is going to return the value before the first attempt.

I have observed a situation where two concurrent putIfAbsent operations are retried because of topology change, and the second retry has overwritten the first retry.

Comment 2 JBoss JIRA Server 2013-10-23 07:09:29 UTC
Radim Vansa <rvansa> made a comment on jira ISPN-3422

Are there any ideas how to solve this issue? I am rather curious about it.

Comment 3 JBoss JIRA Server 2013-10-24 10:23:57 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3422

Yes, we should be able to fix this by comparing the PutIfAbsentCommand's value with the value that exists in the data container even when ignorePreviousValue is set to true. If the value in the data container is either null or the same, then the command should succeed, otherwise it should fail.

I didn't implement this back when fixing ISPN-3357 because of 2 reasons:
1) Values would have to implement equals() for this to work. AFAICT we don't really support versioning in non-tx mode.
2) We should only do the comparison check on the primary, because we want the final value to be the same on all the owners even if there was an inconsistency before. So we'd need to inject a {{ClusteringDependentLogic}}  in all the conditional commands.

Comment 4 JBoss JIRA Server 2013-10-24 11:57:57 UTC
Radim Vansa <rvansa> made a comment on jira ISPN-3422

I have a feeling (already mentioned in ISPN-3600) that the ignorePreviousValue is rather abused, as even with the flag set we would do the comparison, and with the flag set to false on non-origin nodes the command is ignored at all in perform()... I know this is an implementation detail, but the flag name does not correspond to the use at all.

ad 1) I don't quite understand the problem - conditional commands already require equals implementation on values or defined valueEquivalence within the command

Comment 5 JBoss JIRA Server 2013-11-14 11:55:44 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3422

At point 1), I was thinking of {{putIfAbsent()}}, which only compares the values with {{null}} ATM. You're right about the other conditional commands, though.

Comment 6 JBoss JIRA Server 2013-12-04 11:12:36 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3422

These are the changes I made:
    * Replace the ignorePreviousValue flag with a ValueMatchingPolicy enum.
    * Set the matching policy to MATCH_EXPECTED_OR_NEW when retrying a
      putIfAbsent(k, value).
    * replace(k, expected, newValue) and remove(k, expected) will also be
      retried with MATCH_EXPECTED_OR_NEW.
    * Check the primary owner's response and update the status of the command.
      Only successful commands should be retried if the topology changes.

Regular put(k, value) still won't be atomic, because we don't keep the previous value around. Same as regular put(k, value), replace(k, newValue) and remove(k) won't be atomic.

Comment 7 JBoss JIRA Server 2013-12-09 14:33:34 UTC
William Burns <wburns> updated the status of jira ISPN-3422 to Resolved

Comment 8 Red Hat Bugzilla 2025-02-10 03:28:47 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.


Note You need to log in before you can comment on or make changes to this bug.