Replicated TX cache, nodes A, B, C:
0. A and B have topology 2, C already has topology 3
1. A sends prepare with topology 2 to B and C; both apply the prepare and respond
2. C forwards the prepare to B with topology 3
3. A sends commit with topology 2 to B and C; both commit and respond
4. again, C forwards the commit to B with topology 3
5. A and B receive the updated topology id
6. A executes another transaction on the same entry
7. the prepare and commit from the first transaction, with topology 3, arrive at B; B overwrites (or removes) the entry again
Result: B is left in an inconsistent state
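The replay in step 7 could be caught by remembering which transactions have already completed locally; a forwarded prepare/commit for a completed transaction would then be ignored instead of re-applied. A minimal sketch of that idea (the `CompletedTxRegistry` class and method names are hypothetical, not Infinispan's actual API):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: track transactions that have fully committed on this
// node, so a late forwarded prepare/commit for the same transaction is
// discarded rather than executed a second time.
class CompletedTxRegistry {
    private final Set<String> completed = ConcurrentHashMap.newKeySet();

    /** Record that the given global transaction has committed on this node. */
    void markCompleted(String gtx) {
        completed.add(gtx);
    }

    /** A forwarded command for an already-completed tx must not be replayed. */
    boolean shouldExecute(String gtx) {
        return !completed.contains(gtx);
    }
}
```

With such a registry, the forwarded commit in step 4/7 would find the transaction already completed on B and be dropped.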
Dan Berindei <dberinde> made a comment on jira ISPN-3745 [~rvansa] What's the cache configuration? The forwarding is always done synchronously, so node A couldn't receive the prepare response and send the commit until C finished its forwarding.
Radim Vansa <rvansa> made a comment on jira ISPN-3745 You're right, since I have a synchronous tx cache, the forwarding should be synchronous. Regrettably, I'm missing the logs from the forwarding node (they got truncated), but just to let you see what happened:
{code}
04:19:29,410 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (OOB-95,default,apex862-11617) Attempting to execute command: CommitCommand {gtx=GlobalTransaction:<apex861-22006>:164595: local, cacheName='testCache', topologyId=18} [sender=apex861-22006]
04:19:29,411 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-14) Calling perform() on CommitCommand {gtx=GlobalTransaction:<apex861-22006>:164595:remote, cacheName='testCache', topologyId=18}
04:19:29,412 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-14) About to send back response SuccessfulResponse{responseValue=null} for command CommitCommand {gtx=GlobalTransaction:<apex861-22006>:164595:remote, cacheName='testCache', topologyId=18}
04:19:31,301 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (OOB-78,default,apex862-11617) Attempting to execute command: PrepareCommand {modifications=[ ... ], onePhaseCommit=false, gtx=GlobalTransaction:<apex861-22006>:164595:local, cacheName='testCache', topologyId=19} [sender=apex863-20495]
{code}
Radim Vansa <rvansa> made a comment on jira ISPN-3745 Thinking about it once more, the broadcast optimization may be the villain here as well, because apex863 (the sender) had just joined. It received the prepare/commit because they were broadcast, but nobody waited for its response. It could then forward the commands to the old nodes, which executed them again.
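One way to see the hazard in this comment: if each node compared the command's topology id against its own installed topology before executing, a command carrying a stale topology would be rejected (forcing the originator to retry with fresh routing) instead of being silently forwarded by the joiner and executed twice. A minimal sketch of that check (the `TopologyGuard` class is hypothetical, not Infinispan's actual implementation):

```java
// Hypothetical sketch: guard command execution with a topology-id comparison.
// A command whose topologyId is older than the locally installed topology is
// rejected, so the originator must retry against the new topology rather than
// relying on the joiner to forward it.
class TopologyGuard {
    private volatile int currentTopologyId;

    TopologyGuard(int initialTopologyId) {
        this.currentTopologyId = initialTopologyId;
    }

    /** Install a newer topology; older ids are ignored. */
    void install(int newTopologyId) {
        if (newTopologyId > currentTopologyId) {
            currentTopologyId = newTopologyId;
        }
    }

    /** Accept only commands tagged with the current (or a newer) topology. */
    boolean accept(int commandTopologyId) {
        return commandTopologyId >= currentTopologyId;
    }
}
```

In the scenario above, once B installs topology 19, a replayed commit still tagged with topology 18 would fail this check and not be re-applied.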
Dan Berindei <dberinde> made a comment on jira ISPN-3745 Is topologyId=18 the id of the topology that contains the joiner, or of the topology before it? If it's the new topology and the command was initially invoked remotely with topology 17, then the command was forwarded; otherwise it was likely retransmitted by JGroups. I'm also inclined to think it's caused by JGroups retransmitting the message to the joiner and the originator not waiting for the response.
Radim Vansa <rvansa> made a comment on jira ISPN-3745 Topology 18 does not contain the joiner, 19 contains it.
Dan Berindei <dberinde> updated the status of jira ISPN-3745 to Resolved