If the cache topology changes while a write command is running and before it has actually committed the entry to the data container, we retry the command (see ISPN-3366 and ISPN-3357). But before we detect the topology change, one or more of the backup owners may have already applied the modification. Retrying the command re-acquires the key lock on the primary owner (even if the primary owner didn't change). That means another command could have modified the same key in the meantime, but the retried command is going to ignore any changes and is going to return the value before the first attempt. I have observed a situation where two concurrent putIfAbsent operations are retried because of topology change, and the second retry has overwritten the first retry.
Radim Vansa <rvansa> made a comment on jira ISPN-3422 Are there any ideas how to solve this issue? I am rather curious about it.
Dan Berindei <dberinde> made a comment on jira ISPN-3422 Yes, we should be able to fix this by comparing the PutIfAbsentCommand's value with the value that exists in the data container even when ignorePreviousValue is set to true. If the value in the data container is either null or the same, then the command should succeed, otherwise it should fail. I didn't implement this back when fixing ISPN-3357 because of 2 reasons: 1) Values would have to implement equals() for this to work. AFAICT we don't really support versioning in non-tx mode. 2) We should only do the comparison check on the primary, because we want the final value to be the same on all the owners even if there was an inconsistency before. So we'd need to inject a {{ClusteringDependentLogic}} in all the conditional commands.
Radim Vansa <rvansa> made a comment on jira ISPN-3422 I have a feeling (already mentioned in ISPN-3600) that the ignorePreviousValue is rather abused, as even with the flag set we would do the comparison, and with the flag set to false on non-origin nodes the command is ignored at all in perform()... I know this is an implementation detail, but the flag name does not correspond to the use at all. ad 1) I don't quite understand the problem - conditional commands already require equals implementation on values or defined valueEquivalence within the command
Dan Berindei <dberinde> made a comment on jira ISPN-3422 At point 1), I was thinking of {{putIfAbsent()}}, which only compares the values with {{null}} ATM. You're right about the other conditional commands, though.
Dan Berindei <dberinde> made a comment on jira ISPN-3422 These are the changes I made: * Replace the ignorePreviousValue flag with a ValueMatchingPolicy enum. * Set the matching policy to MATCH_EXPECTED_OR_NEW when retrying a putIfAbsent(k, value). * replace(k, expected, newValue) and remove(k, expected) will also be retried with MATCH_EXPECTED_OR_NEW. * Check the primary owner's response and update the status of the command. Only successful commands should be retried if the topology changes. Regular put(k, value) still won't be atomic, because we don't keep the previous value around. Same as regular put(k, value), replace(k, newValue) and remove(k) won't be atomic.
William Burns <wburns> updated the status of jira ISPN-3422 to Resolved
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.