Bug 1021362

Summary: Inconsistent L1 in tx distributed cache
Product: [JBoss] JBoss Data Grid 6 Reporter: Radim Vansa <rvansa>
Component: Infinispan Assignee: Tristan Tarrant <ttarrant>
Status: VERIFIED --- QA Contact: Martin Gencur <mgencur>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.2.0 CC: jdg-bugs, mgencur
Target Milestone: ER4   
Target Release: 6.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
With L1 enabled, a node may cache an entry that has already been overwritten. Further reads on this node will return the out-of-date value.
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:
Bug Depends On:    
Bug Blocks: 1017190    

Description Radim Vansa 2013-10-21 07:27:06 UTC
In the L1LastChance interceptor, the CommitCommand sends invalidations only for those keys for which the node is the primary owner. However, a key for which the node is only a backup owner may be read while the command is being replicated, and the resulting L1 entry does not get invalidated after the command finishes.
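
The interleaving described above can be sketched with a toy model. This is not Infinispan code: the `Node` class, its methods, and the key and value names are all hypothetical, and the only behaviour taken from the description is that the primary owner alone sends L1 invalidations on commit.

```python
class Node:
    """Toy cluster member; hypothetical, not an Infinispan class."""

    def __init__(self, name):
        self.name = name
        self.store = {}       # entries this node owns (primary or backup)
        self.l1 = {}          # L1 copies of entries owned elsewhere
        self.requestors = {}  # key -> nodes that fetched the key from this node

    def remote_get(self, key, requestor):
        # Remember who asked so the entry could be invalidated later.
        self.requestors.setdefault(key, []).append(requestor)
        return self.store[key]

    def apply_commit(self, key, value, is_primary):
        self.store[key] = value
        # Modelled on the description: only the primary owner
        # sends L1 invalidations for the key.
        if is_primary:
            for node in self.requestors.pop(key, []):
                node.l1.pop(key, None)


def stale_l1_scenario():
    primary, backup, reader = Node("A"), Node("B"), Node("C")
    primary.store["k"] = backup.store["k"] = "v1"

    # 1. The commit reaches the primary first. Nobody has read "k" yet,
    #    so there is nothing to invalidate.
    primary.apply_commit("k", "v2", is_primary=True)

    # 2. Before the commit reaches the backup, the non-owner reads "k".
    #    The backup answers with the old value, which lands in L1.
    reader.l1["k"] = backup.remote_get("k", reader)

    # 3. The backup applies the commit but sends no invalidation.
    backup.apply_commit("k", "v2", is_primary=False)

    return reader.l1.get("k"), primary.store["k"], backup.store["k"]


print(stale_l1_scenario())  # ('v1', 'v2', 'v2')
```

Both owners end up with the new value while the non-owner keeps serving the old one from L1, which matches the symptom in the Doc Text.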

Comment 1 Radim Vansa 2013-10-21 07:28:08 UTC
This is related to https://bugzilla.redhat.com/show_bug.cgi?id=1017796

Comment 3 JBoss JIRA Server 2013-10-21 21:47:08 UTC
William Burns <wburns> updated the status of jira ISPN-3648 to Coding In Progress

Comment 4 JBoss JIRA Server 2013-10-21 22:33:59 UTC
William Burns <wburns> made a comment on jira ISPN-3648

So, if I understand correctly, you are saying that the issue is that the non-owner requests a get but only the backup owner answers, and the get doesn't reach the primary until after the update occurs, causing the non-owner to cache an invalid value.

I think this may be because the tx L1 code only sends invalidations if the context is local. It sounds like, to be more consistent, we have to do it on a Commit command even if the context isn't local.

I will try to put together a test case covering both an implicit tx and an explicit tx that is started on a non-owner node.
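
A rough sketch of the idea in this comment, assuming a toy model rather than the real interceptor chain: the `run` function and its `invalidate_on_every_owner` flag are hypothetical names, and the report does not say whether the eventual fix took this shape.

```python
def run(invalidate_on_every_owner):
    """Replay the race for key "k" and return what the non-owner has in L1."""
    store = {"primary": "v1", "backup": "v1"}        # value of "k" on each owner
    requestors = {"primary": set(), "backup": set()}  # nodes that read "k" from each owner
    l1 = {}                                           # L1 cache of the non-owner

    def remote_get(owner):
        requestors[owner].add("non-owner")
        return store[owner]

    def apply_commit(owner, value):
        store[owner] = value
        # Hypothetical variant: every owner that applies the commit
        # invalidates the requestors it knows about, not just the primary.
        if owner == "primary" or invalidate_on_every_owner:
            if "non-owner" in requestors[owner]:
                l1.pop("k", None)
            requestors[owner].clear()

    apply_commit("primary", "v2")   # commit reaches the primary first
    l1["k"] = remote_get("backup")  # read answered by the not-yet-updated backup
    apply_commit("backup", "v2")    # commit reaches the backup afterwards
    return l1.get("k")


print(run(invalidate_on_every_owner=False))  # v1 (stale entry survives)
print(run(invalidate_on_every_owner=True))   # None (entry invalidated)
```

With the flag set, the stale entry is dropped when the backup applies the commit, so the next read on the non-owner would have to fetch the current value from an owner.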

Comment 5 JBoss JIRA Server 2013-10-21 22:43:34 UTC
William Burns <wburns> made a comment on jira ISPN-3648

It sounds like the issue is that the non-owner requests a get but only the backup owner answers, and the get doesn't reach the primary until after the update occurs, causing the non-owner to cache an invalid value.

I think this may be because the tx L1 code only sends invalidations if the context is local. It sounds like, to be more consistent, we have to do it on a Commit command even if the context isn't local.

I will try to put together a test case covering both an implicit tx and an explicit tx that is started on a non-owner node.

Comment 6 JBoss JIRA Server 2013-10-22 00:06:55 UTC
William Burns <wburns> made a comment on jira ISPN-3648

I can reproduce the issue. However, L1 in TX mode has always been invalidated by the owner node, which just won't work in this case. I will need to think over what I can do about this. I am guessing we only want to do this invalidation inside the LastChance interceptor and leave the normal L1Interceptor alone.

Comment 7 JBoss JIRA Server 2013-10-22 16:53:13 UTC
William Burns <wburns> made a comment on jira ISPN-3648

Actually this is also an exact match for ISPN-3426

Comment 8 Radim Vansa 2013-12-10 10:12:48 UTC
*** Bug 1024937 has been marked as a duplicate of this bug. ***