Bug 1021461

Summary: In non-tx caches, write operations may not be atomic during rebalance
Product: [JBoss] JBoss Data Grid 6 Reporter: Radim Vansa <rvansa>
Component: InfinispanAssignee: Tristan Tarrant <ttarrant>
Status: CLOSED UPSTREAM QA Contact: Martin Gencur <mgencur>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.2.0CC: jdg-bugs
Target Milestone: ER6   
Target Release: 6.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
In a non-transactional cache, conditional operations executed during cluster rebalance after a node joins or leaves may not test and set the value atomically - two competing commands may both override the value.
Story Points: ---
Clone Of: Environment:
Last Closed: 2025-02-10 03:28:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1017190    

Description Radim Vansa 2013-10-21 10:56:54 UTC
If the cache topology changes while a write command is running and before it has actually committed the entry to the data container, we retry the command (see ISPN-3366 and ISPN-3357). But before we detect the topology change, one or more of the backup owners may have already applied the modification.

Retrying the command re-acquires the key lock on the primary owner (even if the primary owner didn't change). That means another command could have modified the same key in the meantime, but the retried command is going to ignore any changes and is going to return the value before the first attempt.

I have observed a situation where two concurrent putIfAbsent operations are retried because of topology change, and the second retry has overwritten the first retry.

Comment 2 JBoss JIRA Server 2013-10-23 07:09:29 UTC
Radim Vansa <rvansa> made a comment on jira ISPN-3422

Are there any ideas how to solve this issue? I am rather curious about it.

Comment 3 JBoss JIRA Server 2013-10-24 10:23:57 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3422

Yes, we should be able to fix this by comparing the PutIfAbsentCommand's value with the value that exists in the data container even when ignorePreviousValue is set to true. If the value in the data container is either null or the same, then the command should succeed, otherwise it should fail.

I didn't implement this back when fixing ISPN-3357 because of 2 reasons:
1) Values would have to implement equals() for this to work. AFAICT we don't really support versioning in non-tx mode.
2) We should only do the comparison check on the primary, because we want the final value to be the same on all the owners even if there was an inconsistency before. So we'd need to inject a {{ClusteringDependentLogic}}  in all the conditional commands.

Comment 4 JBoss JIRA Server 2013-10-24 11:57:57 UTC
Radim Vansa <rvansa> made a comment on jira ISPN-3422

I have a feeling (already mentioned in ISPN-3600) that the ignorePreviousValue is rather abused, as even with the flag set we would do the comparison, and with the flag set to false on non-origin nodes the command is ignored at all in perform()... I know this is an implementation detail, but the flag name does not correspond to the use at all.

ad 1) I don't quite understand the problem - conditional commands already require equals implementation on values or defined valueEquivalence within the command

Comment 5 JBoss JIRA Server 2013-11-14 11:55:44 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3422

At point 1), I was thinking of {{putIfAbsent()}}, which only compares the values with {{null}} ATM. You're right about the other conditional commands, though.

Comment 6 JBoss JIRA Server 2013-12-04 11:12:36 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3422

These are the changes I made:
    * Replace the ignorePreviousValue flag with a ValueMatchingPolicy enum.
    * Set the matching policy to MATCH_EXPECTED_OR_NEW when retrying a
      putIfAbsent(k, value).
    * replace(k, expected, newValue) and remove(k, expected) will also be
      retried with MATCH_EXPECTED_OR_NEW.
    * Check the primary owner's response and update the status of the command.
      Only successful commands should be retried if the topology changes.

Regular put(k, value) still won't be atomic, because we don't keep the previous value around. Same as regular put(k, value), replace(k, newValue) and remove(k) won't be atomic.

Comment 7 JBoss JIRA Server 2013-12-09 14:33:34 UTC
William Burns <wburns> updated the status of jira ISPN-3422 to Resolved

Comment 8 Red Hat Bugzilla 2025-02-10 03:28:47 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.