Bug 1287799 - bpm cluster fails to stop probably due to waiting for a TransactionLockInterceptor.execute()
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: JBoss BPMS Platform 6
Classification: Retired
Component: Business Central
Version: 6.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Maciej Swiderski
QA Contact: Radovan Synek
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-12-02 17:07 UTC by Radovan Synek
Modified: 2015-12-03 12:17 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-12-03 12:17:20 UTC
Type: Bug
Embargoed:


Attachments
jstack output (102.03 KB, text/plain)
2015-12-02 17:07 UTC, Radovan Synek
server log node one (2.89 MB, text/plain)
2015-12-02 17:09 UTC, Radovan Synek
server log node two (997.70 KB, text/plain)
2015-12-02 17:10 UTC, Radovan Synek

Description Radovan Synek 2015-12-02 17:07:43 UTC
Created attachment 1101567
jstack output

Description of problem:
In a BPM cluster with two EAP nodes in a domain, EAP node#1 fails to stop in some cases and the corresponding process has to be killed.
When the EAP is stopped, node#2 shuts down but node#1 hangs. A jstack report taken at that moment shows a thread waiting on a lock in org.drools.persistence.jta.TransactionLockInterceptor.execute(TransactionLockInterceptor.java:81). See the full report in the attachment. No such thread exists before the attempt to stop the EAP.
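
For anyone who wants to check for the same symptom without reading a full jstack dump, the following is a minimal sketch, not part of the original report. It scans the stack traces of the JVM it runs in for the frame quoted above. The class name LockWaitScanner and its methods are hypothetical helpers; only the interceptor class and method names come from the jstack output. It would have to run inside the EAP JVM (for example from a deployed diagnostic component) to see the hanging thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper; not part of drools or jBPM.
class LockWaitScanner {
    // Class and method names taken from the jstack frame quoted in this report.
    static final String TARGET_CLASS = "org.drools.persistence.jta.TransactionLockInterceptor";
    static final String TARGET_METHOD = "execute";

    /** Returns true if any frame of the given stack is TransactionLockInterceptor.execute(). */
    static boolean isInInterceptor(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            if (TARGET_CLASS.equals(frame.getClassName())
                    && TARGET_METHOD.equals(frame.getMethodName())) {
                return true;
            }
        }
        return false;
    }

    /** Lists live threads of the current JVM that are inside the interceptor, with their state. */
    static List<String> findThreadsInInterceptor() {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (isInInterceptor(e.getValue())) {
                hits.add(e.getKey().getName() + " [" + e.getKey().getState() + "]");
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> hits = findThreadsInInterceptor();
        if (hits.isEmpty()) {
            System.out.println("No thread is currently inside " + TARGET_CLASS);
        } else {
            hits.forEach(System.out::println);
        }
    }
}
```

A thread listed here in state WAITING or BLOCKED during shutdown would match the behaviour described above; the sketch does not identify which lock the thread is waiting on.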

Please note that Business Central was deployed together with the EJB services application, and that processes and deployments were triggered via remote EJBs. Node#1 was also restarted several times to simulate a failover scenario.

Version-Release number of selected component (if applicable):
6.2.0.CR2

How reproducible:
30% - 50%

Comment 1 Radovan Synek 2015-12-02 17:09:38 UTC
Created attachment 1101568
server log node one

Comment 2 Radovan Synek 2015-12-02 17:10:16 UTC
Created attachment 1101569
server log node two

Comment 3 Radovan Synek 2015-12-02 17:55:43 UTC
Raising the urgency of this issue, as it likely shows up not only after test runs, when the EAP should be stopped, but also during failover scenarios.
It is in fact also a serious test blocker.

Comment 4 Maciej Swiderski 2015-12-02 18:05:05 UTC
Radek, have you configured TransactionLockInterceptor to be turned on? It should be turned off by default.

In general, it proved not to be as efficient a solution as initially thought, so it is not really recommended; it should be used only in very special cases.

Comment 5 Radovan Synek 2015-12-03 12:17:20 UTC
Closing the issue: the configuration change proposed by Maciej, i.e. omitting the system property org.kie.tx.lock.enabled=true, resolved it.

The current documentation does not mention this property, so installations that follow it will leave the interceptor disabled, which is the default.
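
To confirm whether a given JVM was started with the property set, a small check like the one below may help. This is a sketch under assumptions: the class TxLockConfig is hypothetical, and it assumes the value is interpreted as a plain boolean with "false" as the fallback, which this report implies but does not show. Only the property name org.kie.tx.lock.enabled comes from the comment above.

```java
// Hypothetical helper; how the product itself reads the property is not shown in this report.
class TxLockConfig {
    // Property name as given in this bug.
    static final String PROPERTY = "org.kie.tx.lock.enabled";

    /** Assumption: an absent property means the interceptor is off, matching the stated default. */
    static boolean isTxLockEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "false"));
    }

    public static void main(String[] args) {
        System.out.println(PROPERTY + " -> "
                + (isTxLockEnabled() ? "enabled" : "disabled (default)"));
    }
}
```

Running the same check with and without -Dorg.kie.tx.lock.enabled=true on the command line shows both outcomes. In a domain setup the property would typically be defined in the domain or host configuration instead, so that is where to look when removing it.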

