534394 – (RHQ-1195) be able to interrupt a thread whose tx has timed out

Summary: be able to interrupt a thread whose tx has timed out

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	RHQ-1195
Product:	RHQ Project
Classification:	Other
Component:	Core Server
Sub Component:
Version:	unspecified
Hardware:	All
OS:	All
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	John Mazzitelli
QA Contact:	Corey Welton
Docs Contact:
URL:	http://jira.rhq-project.org/browse/RH...
Whiteboard:
Depends On:
Blocks:

Reported:	2008-11-28 19:59 UTC by John Mazzitelli
Modified:	2009-02-13 21:52 UTC
CC List:	0 users
Fixed In Version:	1.2
Clone Of:
Environment:
Last Closed:
Embargoed:

Description John Mazzitelli 2008-11-28 19:59:00 UTC

Read this thread:

http://www.jboss.com/index.html?module=bb&op=viewtopic&t=146170

and its related one:

http://www.jboss.com/index.html?module=bb&op=viewtopic&t=146198

In short, when a tx timeout occurs, all the tx manager does is abort the transaction but does NOT interrupt the thread(s) running in that transaction.

I propose we implement the @InterruptOnTimeout annotation that I describe in the forum post above. This is important for things like our Quartz jobs, which time out after X minutes but may keep running even longer. It makes no sense for most of our tx methods to continue after a timeout, but that's exactly what happens today.

In fact, I would think almost all of our methods that run in a tx would want to be told when the tx times out. So we might instead want an annotation that does NOT interrupt on timeout, e.g. @DoNotInterruptOnTimeout, and use that where needed. It would essentially tell the container that we want the behavior we currently have: the method continues even if the tx times out.

Comment 1 John Mazzitelli 2008-11-30 04:19:30 UTC

svn rev 2139

annotation: org.rhq.enterprise.server.common.InterruptOnTransactionTimeout
This is only valid on methods (it is not a type annotation).

If this is not found on a SLSB method, the default behavior is to NOT interrupt. If you want the method to be interrupted, you have to indicate that explicitly with:

@InterruptOnTransactionTimeout(true)

I set it up this way because it is how the current system works (no interrupts) - so essentially this does not change the known behavior of the system. If you want to be interrupted, you have to ask the system to change its behavior as above.
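
For illustration only, here is a minimal sketch of how a method-level annotation like this might be declared and looked up via reflection. It is not the code from svn rev 2139: the default value of false, the ExampleManagerBean class, and the InterruptPolicy helper are all assumptions made up for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Assumed shape: visible at runtime, allowed on methods only, one boolean value.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface InterruptOnTransactionTimeout {
    boolean value() default false; // assumed default, chosen to match "no interrupt unless asked"
}

// Hypothetical bean, used only to show where the annotation goes.
class ExampleManagerBean {
    @InterruptOnTransactionTimeout(true)
    public void longRunningJob() { }

    public void ordinaryMethod() { }
}

// Hypothetical helper: decides whether a method asked to be interrupted.
class InterruptPolicy {
    // True only when the annotation is present and its value is true.
    static boolean shouldInterrupt(Method method) {
        InterruptOnTransactionTimeout a =
            method.getAnnotation(InterruptOnTransactionTimeout.class);
        return a != null && a.value();
    }

    // Convenience overload that looks up a public no-arg method by name.
    static boolean shouldInterrupt(Class<?> type, String methodName) {
        try {
            return shouldInterrupt(type.getMethod(methodName));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }
}
```

With this sketch, shouldInterrupt(ExampleManagerBean.class, "longRunningJob") is true, while the unannotated ordinaryMethod keeps the existing no-interrupt behavior.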

There is now an EJB3 interceptor, org.rhq.enterprise.server.common.TransactionInterruptInterceptor, in the EJB interceptor chain. It adds a new custom JBossTM CheckedAction class to the underlying JBossTM transaction. The interceptor also looks at the method being invoked and if it has the above annotation and its value is true, it tells that new CheckedAction class to interrupt threads that are active when a transaction is aborted (due to timeouts for example).
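
The effect described above can be simulated with plain JDK threads. The sketch below does not use the EJB3 interceptor API or the JBossTM CheckedAction class; SimulatedTransaction and TimeoutDemo are hypothetical stand-ins that only show the idea: when the transaction is aborted, threads still active in it are interrupted if, and only if, the flag was set.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a transaction plus its abort-time hook.
class SimulatedTransaction {
    private final Set<Thread> activeThreads = ConcurrentHashMap.newKeySet();
    private final boolean interruptOnAbort;

    SimulatedTransaction(boolean interruptOnAbort) {
        this.interruptOnAbort = interruptOnAbort;
    }

    void enlist(Thread t) { activeThreads.add(t); }
    void delist(Thread t) { activeThreads.remove(t); }

    // Called when the transaction is aborted (for example by a timeout).
    // Returns how many threads were still active at that moment.
    int abort() {
        int active = 0;
        for (Thread t : activeThreads) {
            active++;
            if (interruptOnAbort) {
                t.interrupt();
            }
        }
        return active;
    }
}

class TimeoutDemo {
    // Starts a slow worker, aborts the simulated transaction while the worker
    // is still busy, and reports whether the worker saw an interrupt.
    static boolean workerInterrupted(boolean interruptOnAbort) {
        SimulatedTransaction tx = new SimulatedTransaction(interruptOnAbort);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                release.await(5, TimeUnit.SECONDS); // stands in for slow work
            } catch (InterruptedException e) {
                interrupted.set(true); // the method learns the tx is gone
            } finally {
                tx.delist(Thread.currentThread());
            }
        });
        tx.enlist(worker);
        worker.start();
        try {
            started.await();
            tx.abort();           // the "timeout" fires here
            worker.join(200);     // an interrupted worker exits almost at once
            release.countDown();  // otherwise let the worker finish normally
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return interrupted.get();
    }
}
```

Without the flag, the worker keeps going after the abort, which is the behavior the system had before this change.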

If that annotation is not there or is not true, a CheckedAction is still installed; it simply provides better logging than what JBossTM provides out of the box. (This better logging exists whether or not the interrupt occurs.) The old, uninformative warning you get when tx timeouts occur looks something like this:

2008-11-23 01:09:35 WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N]
[com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check
- atomic action a0b0c21:ab6:4928f32d:1225 aborting with 1 threads active!

The new log message depends on the level of the following log category, as configured in jboss-log4j.xml:

   <!-- INFO level gives full stacks to all threads active in aborted transactions -->
   <!-- Set to WARN to only log these events with single-line messages -->
   <category name="org.rhq.enterprise.server.common.TransactionInterruptInterceptor">
     <priority value="WARN"/>
   </category>

If that is set to WARN, the log message is only marginally better than the old message (it at least tells you the names of the threads whose transaction has been aborted/timed out). But if it's set to INFO, you get not only that marginally better log message but, more importantly, the full stack trace of each thread whose transaction has timed out. This has the major benefit of telling us developers which SLSB method/functionality actually took too long before the tx timeout triggered.
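
As a rough sketch of the difference between the two levels, the helper below builds either a one-line message or the same message followed by the thread's stack frames, using only Thread.getStackTrace() from the JDK. The AbortedThreadReport class and the message wording are hypothetical; the real interceptor's log format is not shown in this bug.

```java
// Hypothetical helper; the message text is invented for illustration.
class AbortedThreadReport {
    // includeStack = false gives a single-line, WARN-style message;
    // includeStack = true appends one line per stack frame, INFO-style.
    static String describe(String txId, Thread thread, boolean includeStack) {
        StringBuilder sb = new StringBuilder();
        sb.append("Transaction [").append(txId)
          .append("] aborted with thread [").append(thread.getName())
          .append("] still active");
        if (includeStack) {
            for (StackTraceElement frame : thread.getStackTrace()) {
                sb.append(System.lineSeparator()).append("    at ").append(frame);
            }
        }
        return sb.toString();
    }
}
```

The stack frames are what point at the slow SLSB method; the single-line form only names the thread.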

Comment 2 Corey Welton 2009-02-13 21:52:46 UTC

QA Closing - code level fix.

Comment 3 Red Hat Bugzilla 2009-11-10 20:27:49 UTC

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1195
This bug relates to RHQ-1159