Bug 724212 (BRMS-155) - CLONE -Deadlock when RuleAgent thread refreshes rules while another thread creates a statefulSession
Keywords:
Status: CLOSED NEXTRELEASE
Alias: BRMS-155
Product: JBoss Enterprise BRMS Platform 5
Classification: JBoss
Component: unspecified
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: 5.0.1
Assignee: Mark Proctor
QA Contact:
URL: http://jira.jboss.org/jira/browse/BRM...
Whiteboard:
Depends On:
Blocks:
Reported: 2009-07-07 12:59 UTC by nwallace
Modified: 2009-10-05 05:47 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Windows XP SP3 Sun JRE 1.5.0_14
Last Closed: 2009-09-01 12:18:47 UTC
Type: Bug



Links
System                | ID       | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | BRMS-155 | 0       | None     | None   | None    | Never

Description nwallace 2009-07-07 12:59:44 UTC
Date of First Response: 2009-09-10 00:43:16
securitylevel_name: Public

I believe I have discovered a deadlock that can occur in Drools 4.0.7 if the RuleAgent refreshes its associated RuleBase while another thread is creating a new stateful session on the same RuleBase.

The thread dump below shows the deadlock between the two threads "Timer-15" and "DataSource(com.acme.Source)-2". Thread "Timer-15" is the timer thread created by the RuleAgent rule refresh mechanism to check whether the rule files have changed, and to refresh the rules when a change is found. If it finds changes to the rules, it obtains a lock (<0x189a8d88> (a java.util.HashMap)) and proceeds to remove the old version of the changed package from the RuleBase. To do this it needs to obtain another lock (<0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase)) before it can call removeRule.

However, in another thread, "DataSource(com.acme.Source)-2", a new stateful session is being created on the same RuleBase. As the dump shows, this thread has already obtained the lock <0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase) that the Timer thread is waiting for, and is itself waiting for the lock <0x189a8d88> (a java.util.HashMap) that the Timer thread has already locked, hence the deadlock.
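
For illustration only, here is a minimal sketch of the usual remedy for this kind of lock-order inversion: both code paths take the two monitors in the same order, so neither thread can hold one while waiting for the other. The class OrderedLockRuleBase and its methods are hypothetical stand-ins for the ReteooRuleBase monitor and the java.util.HashMap monitor seen in the dump. This is not Drools code, and it is not a claim about how the Drools fix was implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a rule base guarded by two monitors; not Drools code. */
class OrderedLockRuleBase {
    // Plays the role of the java.util.HashMap monitor from the thread dump.
    private final Map<String, String> pkgs = new HashMap<String, String>();
    private int sessionsCreated = 0;

    // Refresh path: rule base monitor first, then the package map.
    void addPackage(String name) {
        synchronized (this) {
            synchronized (pkgs) {
                pkgs.put(name, name);
            }
        }
    }

    // Refresh path: same order as every other method in this class.
    void removePackage(String name) {
        synchronized (this) {
            synchronized (pkgs) {
                pkgs.remove(name);
            }
        }
    }

    // Session path: also rule base monitor first, then the package map.
    int newSession() {
        synchronized (this) {
            synchronized (pkgs) {
                sessionsCreated++;
                return sessionsCreated;
            }
        }
    }

    /** Runs both paths concurrently; returns true if both finish within the timeout. */
    static boolean runConcurrently(final int iterations, long timeoutSeconds) {
        final OrderedLockRuleBase base = new OrderedLockRuleBase();
        final CountDownLatch done = new CountDownLatch(2);

        Thread refresher = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < iterations; i++) {
                    base.addPackage("pkg");
                    base.removePackage("pkg");
                }
                done.countDown();
            }
        }, "refresh-thread");

        Thread creator = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < iterations; i++) {
                    base.newSession();
                }
                done.countDown();
            }
        }, "session-thread");

        // Daemon threads so a hang would not keep the JVM alive.
        refresher.setDaemon(true);
        creator.setDaemon(true);
        refresher.start();
        creator.start();
        try {
            return done.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling OrderedLockRuleBase.runConcurrently(2000, 30) should return true. If removePackage instead took pkgs before this, the interleaving shown in the dump would become possible again.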

In my own application I worked around this by not using the RuleAgent's built-in refresh mechanism at all. Instead, I periodically call refreshRuleBase() on the RuleAgent in the same thread that creates the stateful session, which avoids the deadlock.
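
A rough sketch of that workaround follows. The RuleSource interface and the SameThreadRefresher class are hypothetical names introduced here for illustration. In a real application they would wrap RuleAgent.refreshRuleBase() and newStatefulSession() on the RuleBase, which are the calls named in this report. The refresh interval and the way the current time is passed in are also assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical abstraction over the agent and its rule base; not a Drools type. */
interface RuleSource {
    void refreshRuleBase();
    Object newStatefulSession();
}

/** Refreshes rules and creates sessions on one thread, so no second thread competes for the locks. */
class SameThreadRefresher {
    private final RuleSource source;
    private final long intervalMillis;
    private long lastRefreshMillis;

    SameThreadRefresher(RuleSource source, long intervalMillis, long nowMillis) {
        this.source = source;
        this.intervalMillis = intervalMillis;
        this.lastRefreshMillis = nowMillis;
    }

    /** Must only be called from the single worker thread that uses the sessions. */
    Object newSession(long nowMillis) {
        if (nowMillis - lastRefreshMillis >= intervalMillis) {
            // The refresh happens here, on the caller's thread, never on a timer thread.
            source.refreshRuleBase();
            lastRefreshMillis = nowMillis;
        }
        return source.newStatefulSession();
    }

    /** Small demonstration with a recording fake; returns the order of calls made. */
    static List<String> demo() {
        final List<String> calls = new ArrayList<String>();
        RuleSource fake = new RuleSource() {
            public void refreshRuleBase() { calls.add("refresh"); }
            public Object newStatefulSession() { calls.add("session"); return new Object(); }
        };
        SameThreadRefresher refresher = new SameThreadRefresher(fake, 1000L, 0L);
        refresher.newSession(500L);   // interval not yet elapsed: session only
        refresher.newSession(1000L);  // interval elapsed: refresh, then session
        refresher.newSession(1500L);  // only 500 ms since last refresh: session only
        return calls;
    }
}
```

The trade-off is that rule changes are only picked up when the worker thread next asks for a session, rather than on a fixed timer.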


"Timer-15" daemon prio=6 tid=0x02f4b0e8 nid=0x1864 waiting for monitor entry [0x038cf000..0x038cfa68]
	at org.drools.reteoo.ReteooRuleBase.removeRule(ReteooRuleBase.java:270)
	- waiting to lock <0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase)
	at org.drools.common.AbstractRuleBase.removeRule(AbstractRuleBase.java:656)
	at org.drools.common.AbstractRuleBase.removePackage(AbstractRuleBase.java:570)
	- locked <0x189a8d88> (a java.util.HashMap)
	at org.drools.agent.PackageProvider.removePackage(PackageProvider.java:45)
	at org.drools.agent.PackageProvider.applyChanges(PackageProvider.java:63)
	at org.drools.agent.RuleAgent.refreshRuleBase(RuleAgent.java:320)
	at org.drools.agent.RuleAgent$2.run(RuleAgent.java:438)
	at java.util.TimerThread.mainLoop(Unknown Source)
	at java.util.TimerThread.run(Unknown Source)



"DataSource(com.acme.Source)-2" daemon prio=4 tid=0x03027610 nid=0xfac waiting for monitor entry [0x0378f000..0x0378fb68]
	at org.drools.reteoo.ReteooRuleBase.newStatefulSession(ReteooRuleBase.java:225)
	- waiting to lock <0x189a8d88> (a java.util.HashMap)
	- locked <0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase)
	at org.drools.common.AbstractRuleBase.newStatefulSession(AbstractRuleBase.java:284)
	at com.acme.RunRules.flush(RunRules.java:3337)
	at com.acme.ControlThread.run(ControlThread.java:465)
	at java.lang.Thread.run(Unknown Source)

Comment 1 nwallace 2009-07-07 13:01:00 UTC
Link: Added: This issue is related to JBRULES-1876


Comment 2 nwallace 2009-09-01 12:18:47 UTC
Fix in place.

Comment 3 David Le Sage 2009-09-10 04:43:16 UTC
For documenting this in the Release Notes, can you please confirm the following and fill in the missing information. Dot point explanations are fine:

The CAUSE (what was actually broken)
 * A deadlock would occur in Drools 4.0.7 if the RuleAgent refreshed its associated RuleBase whilst a new stateful session was being created by another thread on that same RuleBase.

CONSEQUENCES of the bug (how it impacts users.)
 * This would result in a deadlock. 


The FIX (what was changed to eliminate this bug) and 
 *

RESULTS of the fix (what now happens for users.)
 * The error no longer occurs???



Comment 4 David Le Sage 2009-09-23 05:31:17 UTC
We are still awaiting the outstanding information for the Release Notes on this one.  Please provide it as soon as possible. Thanks.

Comment 5 Dana Mison 2009-10-05 05:47:18 UTC
Added to the 5.0.CP01 release notes as resolved:

JBRULES-1876
The RuleAgent can now safely refresh its associated RuleBase whilst a new stateful session is being created on the RuleBase by another thread. Previously this could result in a deadlock.

