| Summary: | StaleObjectStateException is still seen after configuring new jms implementation introduced by SOA-1310 | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [JBoss] JBoss Enterprise SOA Platform 4 | Reporter: | Darran Lofthouse <darran.lofthouse> | ||||||||
| Component: | JBPM - within SOA | Assignee: | Alejandro Guizar <alex.guizar> | ||||||||
| Status: | CLOSED NEXTRELEASE | QA Contact: | |||||||||
| Severity: | high | Docs Contact: | |||||||||
| Priority: | high | ||||||||||
| Version: | unspecified | CC: | alex.guizar, imamura.yousuke, tim.kutz, tyasuma, yusuke.yamamoto | ||||||||
| Target Milestone: | --- | ||||||||||
| Target Release: | 4.3 CP02 | ||||||||||
| Hardware: | Unspecified | ||||||||||
| OS: | Unspecified | ||||||||||
| URL: | http://jira.jboss.org/jira/browse/SOA-1476 | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | Environment: | ||||||||||
| Last Closed: | 2009-09-29 09:11:25 UTC | Type: | Bug | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Description
Darran Lofthouse
2009-09-01 14:06:40 UTC
Link: Added: This issue is related to SOA-1310
Attaching log.
Attachment: Added: server_JBPM-1953_bpm_orchestration2.log
Link: Added: This issue is related to JBPM-1953
The JBPM-1952 test case reproduces the situation where asynchronous forked nodes reach a join. When the join node executes, it acquires a pessimistic lock on the common parent of the arriving tokens. Before issuing a SELECT ... FOR UPDATE statement, though, Hibernate does a version check and may throw a StaleObjectStateException (SOSE). Unfortunately, this Hibernate 'feature' kinda defeats the point of locking the parent token. I cannot think of a way around it.
JBPM1952Test completes successfully by retrying the jobs after SOSEs occur in the join node. By setting async='exclusive' in the join node, SOSEs occur less frequently, although they do not go away completely.
<fork name='fork'>
  <transition to='c1' name='to_c1'/>
  <transition to='c2' name='to_c2'/>
  <transition to='c3' name='to_c3'/>
</fork>
<node name='c1' async='true'>
  <transition to='join' />
</node>
<node name='c2' async='true'>
  <transition to='join' />
</node>
<node name='c3' async='true'>
  <transition to='join' />
</node>
<join name='join' async='exclusive'>
  <transition to='d' />
</join>
Note that the message service must implement the async='exclusive' mode in order to mitigate SOSEs in the join node. 'Exclusive' means to acquire all other exclusive jobs for the current process instance and execute them serially.
This is no surprise and has been stated on a number of occasions: the work only removes the conflict during job acquisition. Any concurrency in the process execution, which remains unchanged, can still generate stale object exceptions.
Try using
<!-- JMS scheduler -->
<service name="scheduler">
<factory>
<bean class="org.jboss.soa.esb.services.jbpm.integration.timer.JmsSchedulerServiceFactory">
<field name="connectionFactoryJndiName"><string value="XAConnectionFactory"/></field>
</bean>
</factory>
</service>
<!-- End of JMS scheduler -->
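The retry behaviour mentioned earlier in this thread (JBPM1952Test passes because jobs are retried after an SOSE) can be pictured with a small sketch. This is only an illustration, not jBPM's job executor code: `StaleRetry`, `StaleStateException` and `runWithRetry` are hypothetical names, and the exception class is a plain stand-in for Hibernate's StaleObjectStateException so that the example compiles without any libraries.

```java
import java.util.function.Supplier;

/** Hypothetical helper, not part of jBPM or Hibernate. */
class StaleRetry {

    /** Stand-in for Hibernate's StaleObjectStateException (assumption, for illustration only). */
    static class StaleStateException extends RuntimeException {
    }

    /**
     * Runs the job and retries it when it fails with a stale-state error,
     * giving up after maxAttempts tries and rethrowing the last failure.
     */
    static <T> T runWithRetry(Supplier<T> job, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        StaleStateException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return job.get();
            } catch (StaleStateException e) {
                // A real executor would roll back the transaction and
                // reload the token state before trying again.
                last = e;
            }
        }
        throw last;
    }
}
```

Retrying only hides the conflict; it does not remove the contention on the parent token, which is why the thread goes on to look at the lock itself.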
Link: Added: This issue is related to JBPM-1952
Setting reference to related support case.
Help Desk Ticket Reference: Added: https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=336913
Has anyone tried Jirka's suggestion? I can't try it; I don't have the platform build. Would anyone be so kind as to point me to the download location? Apart from that, I don't see how switching to the JMS scheduler will make any difference here. We are discussing conflicts during token join, not job acquisition.
We have been debugging this issue internally for the last couple of weeks, but had not come across this ticket, nor the related JBPM tickets referenced above. However, after a little more reading and some tinkering, we have found a solution. I thought I would share it with you.
In the execute() method of the Join, the parent token is locked using the following code:
if (session != null) {
  // force version increment by default (LockMode.FORCE)
  LockMode lockMode = parentLockMode != null ? LockMode.parse(parentLockMode) : LockMode.FORCE;
  log.debug("acquiring " + lockMode + " lock on " + parentToken);
  // lock updates as appropriate, no need to flush here
  session.lock(parentToken, lockMode);
}
The problem with this snippet is that in Hibernate, session.lock(Object, LockMode), when used on a versioned object, locks the row not by ID, but by ID AND VERSION. This prevents a second thread from executing concurrently, but will always throw an SOSE in the subsequent threads when LockMode.FORCE is used, OR when the locked object is modified in any way, thereby changing its version.
To obtain a non-versioned lock on the object, we instead used session.load(Class, Id, LockMode), which locks the row based on ID alone and, once it owns the lock, returns the current version.
In our particular case, we have implemented a custom action which replaces the Join, and also performs some work of aggregating list variables from the multiple child token variable sets. The resulting variable is then saved as a variable on the parentToken. Because this causes a version bump on the parentToken, we have altered our lock to be on the ProcessInstance, rather than on the parentToken.
These changes (locking on the ProcessInstance, and using the versionless lock via session.load()) appear to have eliminated the SOSEs and related deadlocking that we have been seeing.
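To make the difference between the two locking styles concrete, here is a toy in-memory model. It does not use Hibernate at all: `VersionedRowStore`, `StaleVersionException`, `lockByIdAndVersion` and `lockById` are hypothetical names chosen for this sketch, and the map simply stands in for a versioned table. The first method mimics the ID-and-version lock described above for session.lock(Object, LockMode); the second mimics the ID-only lock described for session.load(Class, Id, LockMode).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a versioned table; NOT Hibernate. All names are hypothetical. */
class VersionedRowStore {

    /** Models the stale-state failure raised when the caller's version is out of date. */
    static class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) {
            super(message);
        }
    }

    // Maps row id to its current version number.
    private final Map<Long, Integer> versions = new HashMap<>();

    /** Creates a row at version 0. */
    void insert(long id) {
        versions.put(id, 0);
    }

    /** Models an update committed by another transaction: the row version goes up by one. */
    void bumpVersion(long id) {
        versions.merge(id, 1, Integer::sum);
    }

    /** Lock keyed on ID AND VERSION: fails if the caller holds an outdated version. */
    void lockByIdAndVersion(long id, int expectedVersion) {
        int current = versions.get(id);
        if (current != expectedVersion) {
            throw new StaleVersionException("row " + id + ": expected version "
                    + expectedVersion + " but found " + current);
        }
    }

    /** Lock keyed on ID alone: always succeeds for an existing row and returns its current version. */
    int lockById(long id) {
        return versions.get(id);
    }
}
```

In this model, a caller that read the row before `bumpVersion` ran fails in `lockByIdAndVersion`, while `lockById` hands back the fresh version. The model ignores real row locking and blocking; it only shows why a version-qualified lock can fail where an ID-only lock does not.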
I am attaching two source files, to illustrate. One is our custom Join action. The second is a util class which we used to extract the locking code during debugging, which is referenced in the Join action itself. Most of the join logic was originally cribbed from the standard Join source code (working off of JBPM version 3.2.5 at the time), but does not include special case joins such as nOfM, or scripted joins.
Source files for custom join.
Attachment: Added: VariableMergingJoinAction.java
Attachment: Added: ProcessSynchronizationUtil.java
Tim, I tried your non-versioned lock suggestion and it does seem to eliminate a great many SOSEs even without async="exclusive". Thank you for this hint.
StaleObjectStateException is still seen after configuring the JMS scheduler, but BPMOrchestration2Test runs OK (it passes every time). Verified on CP02 CR4.
The changes which disable the logging contention will also be required, as will changes to the BPMOrchestration2 QS. The process definition, as it currently stands, contains a contention on the variable assignment when handling the service responses.
Added to 4.3.CP02 release notes as resolved:
SOA-1476 The JMS implementation of the ESB to jBPM integration could produce Stale Object State Exceptions in many circumstances. Several of these have been eliminated.