Bug 1014612

Summary: Deadlock in DB when deploying dashbuilder to EAP domain with multiple nodes
Product: [Retired] JBoss BPMS Platform 6 Reporter: Radovan Synek <rsynek>
Component: BAM Assignee: Roger Martínez <romartin>
Status: CLOSED CURRENTRELEASE QA Contact: Radovan Synek <rsynek>
Severity: high Docs Contact:
Priority: high    
Version: 6.0.0 CC: pzapataf, romartin, vigoyal
Target Milestone: ER5   
Target Release: 6.0.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Summary: Due to an underlying synchronization issue, deploying Dashbuilder to an EAP domain with more than one node causes a deadlock when all the nodes try to execute the initial modules DML statements at the same time. Dashbuilder gets deployed, but the node with the failure doesn't start and the user cannot log in.
Cause: All nodes were trying to execute the initial modules DML statements at the same time.
Consequence: The operation failed and the node did not start up.
Fix: Synchronize the initial modules DML statements using a database table.
Result: Only one node executes the initial modules DML statements.
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-08-06 20:10:52 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
node one server log none
node two server log none

Description Radovan Synek 2013-10-02 11:53:48 UTC
Created attachment 806397 [details]
node one server log

Description of problem:
When deploying dashbuilder to an EAP domain with 2 nodes, an exception is raised in the server log saying "Deadlock found when trying to get lock". I turned the Hibernate option "show_sql" on, and it seems that dashbuilder tries to create and initialize its tables on every node of the domain. I am attaching server logs from both nodes (with the show_sql Hibernate debug messages); I guess they could help.

In the end, dashbuilder was deployed successfully, but I am unable to log in to the node where the deadlock error was raised.

Version-Release number of selected component (if applicable):
BPMS-6.0.0.ER3

Steps to Reproduce:
1. prepare & clean DB (I used mysql55)
2. start EAP 6.1 in domain mode with 2 nodes
3. deploy dashbuilder
4. see server logs for error stacktraces

Additional info:
In several cases the deadlock did not occur, so I guess it cannot be reproduced every time.
This issue happens no matter if dashbuilder is being deployed together with business-central or not.

Comment 1 Radovan Synek 2013-10-02 11:54:23 UTC
Created attachment 806398 [details]
node two server log

Comment 2 David Gutierrez 2013-10-07 09:59:54 UTC
First of all, I think the database schema should be created right before deployment. That way you avoid this kind of issue.

In production environments, schema creation is usually restricted to DBA staff, so to be honest I think we should not worry about schema creation clashing, because this is something that is not actually going to happen.

In fact, I think the BPMS installation tool will be responsible for the schema creation, if I'm not wrong.

Therefore, I recommend creating the schema first and then starting up the two nodes. The schema creation files can be found here:

* https://github.com/droolsjbpm/dashboard-builder/tree/master/modules/dashboard-webapp/src/main/webapp/WEB-INF/etc/sql

Comment 3 Radovan Synek 2013-10-07 11:48:30 UTC
David,

I agree that the schema should be created by a DB admin, and the installation tool could also be helpful. But I can see this deadlock even after creating the schema before starting the deployment. Take a look at the attached server logs: there are insert statements before the deadlock occurred, so the schema had already been created and the problem arose when the tables were being initialized with data.

Comment 4 David Gutierrez 2013-10-10 16:34:13 UTC
When the app starts for the first time, it not only tries to initialize the schema but also creates some runtime data, i.e. the built-in showcase dashboard.

It is not possible to have two hosts trying to create the same runtime data at the same time. This is like having two users trying to create the same document in the same folder at the same time: obviously, a concurrency issue is to be expected in both cases.

The only solution I think can be applied here is to ensure only one server is started against a shared DB at a given time.

Comment 5 Radovan Synek 2013-10-11 07:53:24 UTC
Hello David,

I understand there is an issue with concurrency - but if the application has the ability to initialize itself and it is supposed to be deployed to a cluster, it should handle initialization on several nodes and use some synchronization to ensure that a deadlock in the DB cannot occur.

Starting only one server, deploying the application, and then starting another one is a workaround only for EAP in standalone mode. When using an EAP domain, the application is deployed on the whole group of nodes at once. Of course, we can find a similar workaround even for domain mode, but I doubt this is how we want our customers to use EAP domain mode (to put it simply, turning domain mode into standalone).

Comment 6 Roger Martínez 2013-10-21 14:51:34 UTC
Hi,

We have implemented a "database locking" mechanism using a new table to register and handle the nodes connected to the database.

When one node starts the database population process, it sets a special status in its table row, so the other nodes skip executing the initial modules DML statements.

I have tested this behavior with an EAP as a domain controller and 3 connected nodes, using a TCP shared H2 database.

NOTE: Because only one node executes those DML statements, the rest of the nodes start up more quickly than the installer node. Until all statements have been executed, navigation on the other nodes is not possible (the database is empty). After discussing with the team, we consider this behaviour negligible.
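
For readers who want to see the general idea, here is a minimal sketch of electing a single installer node. It is not the code from the commits below. The class and method names (InstallerElection, LockTable, tryClaim, markInstalled, awaitInstalled) are hypothetical, and an in-memory map stands in for the database table so the example runs without a JDBC driver. With a real database, tryClaim would presumably be an INSERT guarded by a unique key, and the waiting nodes would poll the status column.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class InstallerElection {

    /** In-memory stand-in for the lock table (hypothetical; not the real schema). */
    static final class LockTable {
        private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();
        private final CountDownLatch installed = new CountDownLatch(1);

        /** Atomically claims the installer role; only the first caller gets true. */
        boolean tryClaim(String nodeId) {
            return rows.putIfAbsent("initial-modules", nodeId) == null;
        }

        /** Called by the installer node once the initial data has been written. */
        void markInstalled() {
            installed.countDown();
        }

        /** Lets the other nodes wait until the installer node has finished. */
        boolean awaitInstalled(long timeout, TimeUnit unit) throws InterruptedException {
            return installed.await(timeout, unit);
        }
    }

    /** Starts nodeCount simulated nodes at once; returns how many ran the DML. */
    static int startNodes(int nodeCount) {
        LockTable table = new LockTable();
        AtomicInteger dmlRuns = new AtomicInteger();
        CountDownLatch startTogether = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(nodeCount);

        for (int i = 0; i < nodeCount; i++) {
            String nodeId = "node-" + i;
            pool.submit(() -> {
                startTogether.await();              // all nodes start at the same moment
                if (table.tryClaim(nodeId)) {
                    dmlRuns.incrementAndGet();      // placeholder for the initial modules DML
                    table.markInstalled();
                } else {
                    table.awaitInstalled(30, TimeUnit.SECONDS);
                }
                return null;
            });
        }

        startTogether.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("nodes did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return dmlRuns.get();
    }

    public static void main(String[] args) {
        System.out.println("DML executed by " + startNodes(3) + " node(s)");
    }
}
```

The key design choice is that the claim is a single atomic operation, so nodes never contend over the data rows themselves. The losing nodes simply wait, which matches the slower startup of the installer node described in the note above.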

Commits for master
------------------

- https://github.com/droolsjbpm/dashboard-builder/commit/6f5ee9bc99ece6c2ec6c0496a1666f2e58b29c82

Commits for 6.0.x
------------------

- https://github.com/droolsjbpm/dashboard-builder/commit/b2dc14a269ded05e83e11bf94a2f5f83dd15b743

Comment 8 Radovan Synek 2013-12-05 08:19:33 UTC
Verified on BPMS-6.0.0.ER5