Bug 534394 (RHQ-1195)

Summary: be able to interrupt a thread whose tx has timed out
Product: [Other] RHQ Project
Reporter: John Mazzitelli <mazz>
Component: Core Server
Assignee: John Mazzitelli <mazz>
Status: CLOSED NEXTRELEASE
QA Contact: Corey Welton <cwelton>
Severity: medium
Priority: medium
Version: unspecified
Keywords: Improvement
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
URL: http://jira.rhq-project.org/browse/RHQ-1195
Whiteboard:
Fixed In Version: 1.2
Doc Type: Enhancement

Description John Mazzitelli 2008-11-28 19:59:00 UTC
Read this thread:

http://www.jboss.com/index.html?module=bb&op=viewtopic&t=146170

and its related one:

http://www.jboss.com/index.html?module=bb&op=viewtopic&t=146198

In short, when a tx timeout occurs, all the tx manager does is abort the transaction but does NOT interrupt the thread(s) running in that transaction.

I propose we implement the @InterruptOnTimeout annotation that I describe in the forum post above. This is important for things like our quartz jobs, which time out after X minutes but may keep running for much longer. For most of our tx methods there is no point in continuing after a timeout, but that is exactly what happens today.

In fact, I would think almost all of our methods that run in a tx would want to be told when the tx times out. So we might instead want an annotation that does NOT interrupt on timeout, e.g. @DoNotInterruptOnTimeout, which would tell the container to keep the behavior we currently have: the method continues even if the tx times out.

Comment 1 John Mazzitelli 2008-11-30 04:19:30 UTC
svn rev 2139

annotation: org.rhq.enterprise.server.common.InterruptOnTransactionTimeout
This is only valid on methods (it is not a type annotation).

If this is not found on a SLSB method, the default behavior is to NOT interrupt. If you want the method to be interrupted, you have to indicate that explicitly with:

@InterruptOnTransactionTimeout(true)

I set it up this way because it is how the current system works (no interrupts) - so essentially this does not change the known behavior of the system. If you want to be interrupted, you have to ask the system to change its behavior as above.
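
To illustrate the opt-in rule described above, here is a minimal sketch in plain Java. The nested annotation is a stand-in with the same simple name as the real one, not the RHQ source; its `default true` value, the `ExampleBean` class, and the `shouldInterrupt` helper are all assumptions made for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class InterruptAnnotationSketch {

    // Stand-in for org.rhq.enterprise.server.common.InterruptOnTransactionTimeout.
    // METHOD-only target matches "only valid on methods"; the default value is an assumption.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface InterruptOnTransactionTimeout {
        boolean value() default true;
    }

    // Hypothetical bean, used only to exercise the three cases.
    public static class ExampleBean {
        @InterruptOnTransactionTimeout(true)
        public void longRunningJob() { }

        @InterruptOnTransactionTimeout(false)
        public void mustFinish() { }

        public void unannotated() { }
    }

    // True only when the annotation is present AND its value is true.
    public static boolean shouldInterrupt(Method method) {
        InterruptOnTransactionTimeout ann =
            method.getAnnotation(InterruptOnTransactionTimeout.class);
        return ann != null && ann.value();
    }

    // Convenience lookup for no-argument public methods.
    public static boolean shouldInterrupt(Class<?> type, String methodName) {
        try {
            return shouldInterrupt(type.getMethod(methodName));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"longRunningJob", "mustFinish", "unannotated"}) {
            System.out.println(name + " -> interrupt on timeout: "
                + shouldInterrupt(ExampleBean.class, name));
        }
    }
}
```

Only the method that explicitly asks for it is flagged for interruption; a missing annotation behaves the same as a false value, which keeps the existing no-interrupt behavior.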

There is now an EJB3 interceptor, org.rhq.enterprise.server.common.TransactionInterruptInterceptor, in the EJB interceptor chain. It adds a new custom JBossTM CheckedAction class to the underlying JBossTM transaction. The interceptor also looks at the method being invoked; if the method has the above annotation and its value is true, the interceptor tells that new CheckedAction class to interrupt any threads that are still active when the transaction is aborted (due to a timeout, for example).
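
The effect of interrupting on abort can be simulated with the JDK alone. The sketch below does not use the JBossTM API: `AbortHandler` is a hypothetical stand-in for the custom CheckedAction, and `runScenario` is an invented helper that plays the role of a transaction being aborted while a worker thread is still blocked.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptOnAbortSketch {

    // Hypothetical stand-in for the custom CheckedAction (not the JBossTM class).
    public static class AbortHandler {
        private final boolean interruptThreads;

        public AbortHandler(boolean interruptThreads) {
            this.interruptThreads = interruptThreads;
        }

        // Invoked when a transaction is aborted while threads are still active in it.
        // Returns how many threads were interrupted.
        public int onAbort(List<Thread> activeThreads) {
            int interrupted = 0;
            for (Thread t : activeThreads) {
                if (interruptThreads && t.isAlive()) {
                    t.interrupt();
                    interrupted++;
                }
            }
            return interrupted;
        }
    }

    // Starts a blocked worker, "aborts" its transaction, and reports whether
    // the worker observed an interrupt.
    public static boolean runScenario(boolean interruptOnTimeout) {
        final AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                // Simulated long-running work inside the transaction.
                release.await(30, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                sawInterrupt.set(true);
            }
        }, "simulated-slsb-worker");
        worker.start();

        try {
            started.await();
            int interrupted = new AbortHandler(interruptOnTimeout)
                .onAbort(Collections.singletonList(worker));
            if (interrupted == 0) {
                // Nobody was interrupted, so let the worker finish on its own.
                release.countDown();
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sawInterrupt.get();
    }

    public static void main(String[] args) {
        System.out.println("interrupt enabled, worker interrupted: " + runScenario(true));
        System.out.println("interrupt disabled, worker interrupted: " + runScenario(false));
    }
}
```

Note that an interrupt only helps when the running code cooperates: it must be blocked in an interruptible call or check its interrupt flag, otherwise it keeps running just as it does without the interceptor.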

If that annotation is not there or not true, a CheckedAction is still installed; it simply provides better logging than what JBossTM provides out of the box. (This better logging exists whether or not the interrupt occurs.) The old, uninformative warning you get when tx timeouts occur looks something like this:

2008-11-23 01:09:35 WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N]
[com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check
- atomic action a0b0c21:ab6:4928f32d:1225 aborting with 1 threads active!

The new log message depends on the priority configured for the log category in jboss-log4j.xml:

   <!-- INFO level gives full stacks to all threads active in aborted transactions -->
   <!-- Set to WARN to only log these events with single-line messages -->
   <category name="org.rhq.enterprise.server.common.TransactionInterruptInterceptor">
     <priority value="WARN"/>
   </category>

If that is set to WARN, the log message is only marginally better than the old message (it at least tells you the names of the threads whose transaction has been aborted/timed out). But if it's set to INFO, you get not only that marginally better log message but, more importantly, the full stack trace of each thread whose transaction has timed out. This has the major benefit of telling us developers which SLSB method/functionality actually took too long before the tx timeout triggered.
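
The difference between the two levels can be sketched as follows. The message wording here is invented for illustration and is not the actual RHQ log format; `describe` and its `verbose` flag are hypothetical, with `verbose=true` standing in for INFO and `verbose=false` for WARN.

```java
public class AbortLogSketch {

    // Builds the message for one thread still active in an aborted transaction.
    // verbose=false yields a single line; verbose=true appends the thread's stack.
    public static String describe(String txId, Thread thread, boolean verbose) {
        StringBuilder sb = new StringBuilder();
        sb.append("Transaction [").append(txId)
          .append("] aborted with thread [").append(thread.getName())
          .append("] still active");
        if (verbose) {
            for (StackTraceElement frame : thread.getStackTrace()) {
                sb.append(System.lineSeparator()).append("    at ").append(frame);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Thread current = Thread.currentThread();
        System.out.println(describe("example-tx-id", current, false));
        System.out.println(describe("example-tx-id", current, true));
    }
}
```

The stack frames are what make the verbose form useful: they show where the thread was when its transaction was aborted.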


Comment 2 Corey Welton 2009-02-13 21:52:46 UTC
QA Closing - code level fix.

Comment 3 Red Hat Bugzilla 2009-11-10 20:27:49 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1195
This bug relates to RHQ-1159