Bug 1287799 - bpm cluster fails to stop probably due to waiting for a TransactionLockInterceptor.execute()
Status: CLOSED NOTABUG
Product: JBoss BPMS Platform 6
Classification: JBoss
Component: Business Central
Version: 6.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Assigned To: Maciej Swiderski
QA Contact: Radovan Synek
Keywords: Regression, TestBlocker
Reported: 2015-12-02 12:07 EST by Radovan Synek
Modified: 2015-12-03 07:17 EST
Doc Type: Bug Fix
Last Closed: 2015-12-03 07:17:20 EST
Type: Bug

Attachments
jstack output (102.03 KB, text/plain)
2015-12-02 12:07 EST, Radovan Synek
server log node one (2.89 MB, text/plain)
2015-12-02 12:09 EST, Radovan Synek
server log node two (997.70 KB, text/plain)
2015-12-02 12:10 EST, Radovan Synek

Description Radovan Synek 2015-12-02 12:07:43 EST
Created attachment 1101567
jstack output

Description of problem:
In a BPM cluster with two EAP nodes in a domain, EAP node#1 fails to stop in some cases and the corresponding process has to be killed.
When the EAP is stopped, node#2 shuts down, but node#1 hangs. A jstack report taken at that moment shows a thread waiting on a lock in org.drools.persistence.jta.TransactionLockInterceptor.execute(TransactionLockInterceptor.java:81). See the full report in the attachment. Before the attempt to stop the EAP, there is no such thread.
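
The same check can be done from inside a JVM without reading a full jstack dump. The following is only a minimal sketch of that idea, assuming the code can be run inside the server JVM (for example from a diagnostic hook); the class LockWaiterScan and its methods are hypothetical names for illustration and were not part of the original report. It uses the standard java.lang.management API to list threads that are waiting or blocked with a given frame on their stack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of jBPM or Drools.
public class LockWaiterScan {

    /** Returns true if any frame of the stack belongs to the given class and method. */
    public static boolean hasFrame(StackTraceElement[] stack, String className, String methodName) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().equals(className)
                    && frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /** Lists live threads that are waiting or blocked and have the given frame on their stack. */
    public static List<String> findWaiters(String className, String methodName) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> result = new ArrayList<>();
        // Stack traces are enough here, so monitor and synchronizer details are skipped.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            Thread.State state = info.getThreadState();
            boolean notRunning = state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING
                    || state == Thread.State.BLOCKED;
            if (notRunning && hasFrame(info.getStackTrace(), className, methodName)) {
                result.add(info.getThreadName() + " [" + state + "]");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> waiters = findWaiters(
                "org.drools.persistence.jta.TransactionLockInterceptor", "execute");
        System.out.println(waiters.isEmpty() ? "no waiting threads found" : waiters);
    }
}
```

Run standalone, the scan only sees its own JVM's threads, so it is useful only when executed in the hanging server process; an external jstack dump, as attached to this bug, remains the simpler option when the process cannot be instrumented.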

Please note that Business Central was deployed together with the EJB services application, and that processes and deployments were triggered via remote EJBs. Node#1 was also restarted several times to simulate a failover scenario.

Version-Release number of selected component (if applicable):
6.2.0.CR2

How reproducible:
30% - 50%
Comment 1 Radovan Synek 2015-12-02 12:09 EST
Created attachment 1101568
server log node one
Comment 2 Radovan Synek 2015-12-02 12:10 EST
Created attachment 1101569
server log node two
Comment 3 Radovan Synek 2015-12-02 12:55:43 EST
Raising the urgency of this issue, as it likely occurs not only when the EAP is stopped after running tests but also during failover scenarios.
It is in fact also a serious test blocker.
Comment 4 Maciej Swiderski 2015-12-02 13:05:05 EST
Radek, have you configured TransactionLockInterceptor to be turned on? It should be turned off by default.

In general, it proved not to be as efficient a solution as initially thought, so it is not really recommended, except in very special cases.
Comment 5 Radovan Synek 2015-12-03 07:17:20 EST
Closing the issue, as the configuration change proposed by Maciej, i.e. omitting the system property org.kie.tx.lock.enabled=true, resolved it.

The current documentation does not mention this property, so by default it will remain disabled.
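
To confirm whether a node was started with the property set, a small check such as the one below can help. This is a sketch under stated assumptions: the class TxLockPropertyCheck is a hypothetical name, and treating a missing value as "false" mirrors the "off by default" behaviour described in this thread rather than the actual parsing code in the product, which is not shown here.

```java
import java.util.Properties;

// Hypothetical helper, not part of jBPM or Drools.
public class TxLockPropertyCheck {

    // Property name taken from this bug report.
    public static final String TX_LOCK_PROPERTY = "org.kie.tx.lock.enabled";

    /**
     * Returns true only if the property is present and equals "true" (case-insensitive).
     * Assumption: an absent property means the interceptor is off.
     */
    public static boolean isTxLockEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(TX_LOCK_PROPERTY, "false"));
    }

    public static void main(String[] args) {
        boolean enabled = isTxLockEnabled(System.getProperties());
        System.out.println(TX_LOCK_PROPERTY + " enabled: " + enabled);
    }
}
```

Started with -Dorg.kie.tx.lock.enabled=true on the command line, the check reports the property as enabled; with the property omitted, as in the resolution above, it reports it as disabled.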
