Bug 1014612 - Deadlock in DB when deploying dashbuilder to EAP domain with multiple nodes
Status: CLOSED CURRENTRELEASE
Product: JBoss BPMS Platform 6
Classification: JBoss
Component: BAM
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ER5
Target Release: 6.0.0
Assigned To: Roger Martínez
QA Contact: Radovan Synek
Reported: 2013-10-02 07:53 EDT by Radovan Synek
Modified: 2014-08-06 16:10 EDT
Doc Type: Bug Fix
Doc Text:
Summary: Due to an underlying synchronization issue, deploying Dashbuilder to an EAP domain with more than one node causes a deadlock when all the nodes try to execute the initial modules DML statements at the same time. Dashbuilder gets deployed, but the node with the failure does not start and the user cannot log in.
Cause: All nodes were trying to execute the initial modules DML statements at the same time.
Consequence: The operation failed and the node did not start up.
Fix: Synchronize the initial modules DML statements using a database table.
Result: Only one node executes the initial modules DML statements.
Last Closed: 2014-08-06 16:10:52 EDT
Type: Bug

Attachments
node one server log (208.49 KB, text/x-log), 2013-10-02 07:53 EDT, Radovan Synek
node two server log (2.34 MB, text/x-log), 2013-10-02 07:54 EDT, Radovan Synek

Description Radovan Synek 2013-10-02 07:53:48 EDT
Created attachment 806397
node one server log

Description of problem:
When deploying dashbuilder to an EAP domain with 2 nodes, an exception is raised in the server log saying "Deadlock found when trying to get lock". I turned on the Hibernate option "show_sql", and it seems that dashbuilder tries to create and initialize its tables on every node of the domain. I am attaching server logs from both nodes (including the show_sql Hibernate debug messages), which I hope will help.

In the end, dashbuilder was deployed successfully, but I am unable to log in on the node where the deadlock error was raised.

Version-Release number of selected component (if applicable):
BPMS-6.0.0.ER3

Steps to Reproduce:
1. prepare & clean DB (I used mysql55)
2. start EAP 6.1 in domain mode with 2 nodes
3. deploy dashbuilder
4. see server logs for error stacktraces

Additional info:
In several cases the deadlock did not occur, so I guess it cannot be reproduced every time.
This issue happens regardless of whether dashbuilder is deployed together with business-central.
Comment 1 Radovan Synek 2013-10-02 07:54:23 EDT
Created attachment 806398
node two server log
Comment 2 David Gutierrez 2013-10-07 05:59:54 EDT
First of all, I think the database schema should be created before the application is deployed. That way you avoid this kind of issue.

In production environments, schema creation is usually restricted to DBA staff, so to be honest I think we should not worry about schema creation clashing, because it is not actually going to happen.

In fact, I think the BPMS installation tool will be responsible for the schema creation, if I'm not wrong.

Therefore, I recommend creating the schema first and then starting up the two nodes. The schema creation files can be found here:

* https://github.com/droolsjbpm/dashboard-builder/tree/master/modules/dashboard-webapp/src/main/webapp/WEB-INF/etc/sql
Comment 3 Radovan Synek 2013-10-07 07:48:30 EDT
David,

I agree that the schema should be created by a DB admin, and the installation tool could also be helpful. But I can see this deadlock even after creating the schema before starting the deployment. Take a look at the attached server logs: there are insert statements before the deadlock occurred, so the schema had already been created and the problem arose while the tables were being initialized with data.
Comment 4 David Gutierrez 2013-10-10 12:34:13 EDT
When the app starts for the first time, it not only tries to initialize the schema but also creates some runtime data, e.g. the built-in showcase dashboard.

It is not possible to have two hosts creating the same runtime data at the same time. This is like having two users trying to create the same document in the same folder at the same time: obviously, a concurrency issue is to be expected in both cases.

The only solution I think can be applied here is to ensure that only one server is started against a shared DB at a given time.
Comment 5 Radovan Synek 2013-10-11 03:53:24 EDT
Hello David,

I understand there is a concurrency issue, but if the application is able to initialize itself and is supposed to be deployed to a cluster, it should handle initialization on several nodes and use some synchronization to ensure that a DB deadlock cannot occur.

Starting only one server, deploying the application, and then starting another one is a workaround only for EAP in standalone mode. When using an EAP domain, the application is deployed to the whole group of nodes at once. Of course, we can find a similar workaround for domain mode too, but I doubt this is how we want our customers to use EAP domain mode (to put it simply, turning domain mode into standalone).
Comment 6 Roger Martínez 2013-10-21 10:51:34 EDT
Hi,

We have implemented a "database locking" mechanism using a new table to register and handle the nodes connected to the database.

When one node starts the database population process, it sets a special status in its table row, so the other nodes skip the initial modules DML execution.

I have tested this behavior with an EAP instance as a domain controller and 3 connected nodes, using a shared H2 database over TCP.

NOTE: Because only one node executes those DML statements, the rest of the nodes start up quicker than the installer node. Until all statements have been executed, navigation on the other nodes is not possible (the database is empty). After discussing with the team, this behaviour can be considered negligible.
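
For readers who want a feel for the idea, here is a minimal sketch of table-based locking. It is not the Dashbuilder implementation (see the commits below for that). It is a simplified variant that assumes a single lock row claimed through a primary key, and it uses Python with SQLite only so that it runs without extra dependencies. The table names `install_lock` and `module_data`, the statuses, and the module names are all hypothetical.

```python
import os
import sqlite3
import tempfile
import threading

# Hypothetical names throughout: the real Dashbuilder tables and statuses may differ.
LOCK_NAME = "initial_modules"
INITIAL_MODULES = ["showcase_dashboard", "default_workspace"]


def create_schema(db_path):
    """Create the schema once, before any node starts."""
    conn = sqlite3.connect(db_path)
    with conn:
        conn.execute(
            "CREATE TABLE install_lock ("
            " lock_name TEXT PRIMARY KEY,"  # a single row per lock, so one claim wins
            " node_id TEXT NOT NULL,"
            " status TEXT NOT NULL)"        # 'INSTALLING' or 'DONE'
        )
        conn.execute("CREATE TABLE module_data (name TEXT PRIMARY KEY)")
    conn.close()


def start_node(db_path, node_id, results):
    """Simulate one node starting up against the shared database."""
    conn = sqlite3.connect(db_path, timeout=10)
    try:
        try:
            # Claim the installer role. The primary key makes the claim atomic:
            # only the first INSERT succeeds, later ones raise IntegrityError.
            with conn:
                conn.execute(
                    "INSERT INTO install_lock VALUES (?, ?, 'INSTALLING')",
                    (LOCK_NAME, node_id),
                )
        except sqlite3.IntegrityError:
            results[node_id] = "skipped"  # another node is (or was) the installer
            return
        # Only the installer node reaches this point and runs the initial DML.
        with conn:
            conn.executemany(
                "INSERT INTO module_data VALUES (?)",
                [(name,) for name in INITIAL_MODULES],
            )
            conn.execute(
                "UPDATE install_lock SET status = 'DONE' WHERE lock_name = ?",
                (LOCK_NAME,),
            )
        results[node_id] = "installed"
    finally:
        conn.close()


def simulate_domain(node_ids):
    """Start all nodes at once and report what each one did."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "shared.db")
        create_schema(db_path)
        results = {}
        threads = [
            threading.Thread(target=start_node, args=(db_path, node_id, results))
            for node_id in node_ids
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        conn = sqlite3.connect(db_path)
        rows = conn.execute("SELECT COUNT(*) FROM module_data").fetchone()[0]
        status = conn.execute("SELECT status FROM install_lock").fetchone()[0]
        conn.close()
    return results, rows, status


if __name__ == "__main__":
    print(simulate_domain(["node-1", "node-2", "node-3"]))
```

Which node becomes the installer depends on timing, but exactly one should run the inserts while the others skip them. As in the note above, the skipping nodes in this sketch do not wait for the status to become 'DONE', so they may briefly see an empty database.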

Commits for master
------------------

- https://github.com/droolsjbpm/dashboard-builder/commit/6f5ee9bc99ece6c2ec6c0496a1666f2e58b29c82

Commits for 6.0.x
------------------

- https://github.com/droolsjbpm/dashboard-builder/commit/b2dc14a269ded05e83e11bf94a2f5f83dd15b743
Comment 8 Radovan Synek 2013-12-05 03:19:33 EST
Verified on BPMS-6.0.0.ER5
