Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 778804 (SOA-1254)

Summary:

Deadlock in JobExecutor on Oracle10g

Product:

[JBoss] JBoss Enterprise SOA Platform 4

Reporter:

Martin Vecera <mvecera>

Component:

JBPM - within SOA

Assignee:

Martin Vecera <mvecera>

Status:

CLOSED NEXTRELEASE

QA Contact:

Severity:

urgent

Docs Contact:

Priority:

urgent

Version:

4.3 CP01

CC:

pmacik

Target Milestone:

---

Target Release:

4.3 CP01

Hardware:

Unspecified

OS:

Unspecified

URL:

http://jira.jboss.org/jira/browse/SOA-1254

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Oracle10g, other databases (MySQL, PostgreSQL) work fine

Last Closed:

2009-04-02 14:54:28 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
Performance4.esb	none
server.log.bz2	none
server.tracker.log.bz2	none
indexes.diff	none
bpm_orchestration_deadlock.tar.bz2	none

Description Martin Vecera 2009-03-26 15:27:38 UTC

Date of First Response: 2009-03-28 04:21:05
project_key: SOA

We found the following issue when trying to execute performance test based on bpm_orchestration4. We are using the attached .esb archive (including sources) and sending 10 messages (transacted JMS session, persistent messages, messages are of type TextMessage).

2009-03-26 16:15:37,469 ERROR [org.hibernate.event.def.AbstractFlushingEventListener] (JbpmJobExector:10.34.32.154:1) Could not synchronize database state with session
org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
        at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:87)
        at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
        at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:254)
        at org.hibernate.jdbc.AbstractBatcher.prepareStatement(AbstractBatcher.java:93)
        at org.hibernate.jdbc.AbstractBatcher.prepareStatement(AbstractBatcher.java:88)
        at org.hibernate.persister.entity.AbstractEntityPersister.delete(AbstractEntityPersister.java:2487)
        at org.hibernate.persister.entity.AbstractEntityPersister.delete(AbstractEntityPersister.java:2697)
        at org.hibernate.action.EntityDeleteAction.execute(EntityDeleteAction.java:74)
        at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:250)
        at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:234)
        at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:146)
        at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:298)
        at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:27)
        at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)
        at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:338)
        at org.hibernate.transaction.CacheSynchronization.beforeCompletion(CacheSynchronization.java:59)
        at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:114)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:247)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
        at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1406)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
        at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
        at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:146)
        at org.jbpm.persistence.jta.JtaDbPersistenceService.endUserTransaction(JtaDbPersistenceService.java:94)
        at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:63)
        at org.jbpm.svc.Services.close(Services.java:294)
        at org.jbpm.JbpmContext.close(JbpmContext.java:136)
        at org.jbpm.job.executor.JobExecutorThread.executeJob(JobExecutorThread.java:194)
        at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:60)
Caused by: java.sql.BatchUpdateException: ORA-00060: deadlock detected while waiting for resource

        at oracle.jdbc.driver.DatabaseError.throwBatchUpdateException(DatabaseError.java:343)
        at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:10698)
        at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:620)
        at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:48)
        at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:247)
        ... 27 more

The messages cannot be delivered and end in DLQ.

We tried to use jms-jca-provider but it didn't help. Attached is a server log showing the problem with DEBUG level.

Comment 1 Martin Vecera 2009-03-26 15:28:14 UTC

Attachment: Added: Performance4.esb

Comment 2 Martin Vecera 2009-03-26 15:28:32 UTC

Attachment: Added: server.log.bz2

Comment 3 Martin Vecera 2009-03-26 15:31:38 UTC

The issue was introduced in CR4 when jBPM was updated to fix the performance problem.

Comment 4 Martin Vecera 2009-03-27 13:45:51 UTC

Attached server log with tracker JDBC driver installed on JbpmDS.

Comment 5 Martin Vecera 2009-03-27 13:45:51 UTC

Attachment: Added: server.tracker.log.bz2

Comment 6 Kevin Conner 2009-03-28 08:21:05 UTC

My current theory, from looking through the last log, is that the deadlock is caused by a modification to the JBPM_BYTEARRAY table, specifically 'delete from JBPM_BYTEARRAY where ID_=?.'

From what I can see there are two tables which reference this table as a foreign key, JBPM_BYTEBLOCK and JBPM_VARIABLEINSTANCE, but I cannot find any indexes for their constraints.  I believe the outcome of this is that Oracle will be forced into enforcing a table lock onto these tables whenever a row is deleted from JBPM_BYTEARRAY.

My knowledge of Oracle is limited so advice should be sought from other sources, however I believe that this theory could easily be tested from a command line client.

Comment 7 Kevin Conner 2009-03-28 08:22:32 UTC

I should also add that I believe a solution would be to create the indexes on the foreign key constraints.

We should also check all other constraints to see whether they require indexes.

Comment 8 Alejandro Guizar 2009-03-28 15:36:55 UTC

Kevin, thank you for your insights. In the original log that Martin posted, the error occurs around inserting a ByteArray. Deadlocks during both insertion and deletion can be explained by a missing index resulting in locking the whole table/block.

I am going to review the index declarations in the relationships you mention and others.

Comment 9 Kevin Conner 2009-03-29 08:41:59 UTC

Hiya Alejandro.

The original log shows the same issue as the second, with the last statement being executed on the deadlocked thread being the full table lock on delete of JBPM_BYTEARRAY.

The insert you are referring to is from a different thread and is not relevant to the deadlock, that thread should continue once the attempt to acquire a full table lock has been broken.

Kev

Comment 10 Alejandro Guizar 2009-03-30 00:49:39 UTC

Link: Added: This issue depends JBPM-1367

Comment 11 Alejandro Guizar 2009-03-30 20:46:48 UTC

One of our users seems to have hit this very same problem recently, and proposed some index additions: JBPM-2094

Comment 12 Alejandro Guizar 2009-03-30 20:46:48 UTC

Link: Added: This issue related JBPM-2094

Comment 13 Alejandro Guizar 2009-03-31 08:46:35 UTC

I committed a test case based on the process definition in bpm_orchestration 4. 

http://fisheye.jboss.com/changelog/JbpmSvn/?cs=4363

Unfortunately I could not reproduce the problem.

http://jbpm.dyndns.org:8180/hudson/job/jBPM3-JDK1.5/5/

Any thoughts as to what is missing in my test case? A pointer to the original test case would help too.

Comment 14 Alejandro Guizar 2009-03-31 09:02:07 UTC

The attached patch provides the missing foreign key indexes in many-to-one mappings. It is ready to be applied once the deadlock is reproduced.

Comment 15 Alejandro Guizar 2009-03-31 09:02:07 UTC

Attachment: Added: indexes.diff

Comment 16 trev 2009-03-31 11:27:37 UTC

 The messages must be persistent and there must be a  fork/join in the process

Comment 17 trev 2009-03-31 11:28:14 UTC

We are applying the patch and will hand it over to QE to try the test aginst it

Comment 18 Kevin Conner 2009-03-31 12:08:07 UTC

The real trigger for this instance of the bug is concurrent access (from any source) to the BYTEARRAY variable tables within the context of a transaction.

This is not necessarily from the same process instance although the fork/join within the bpm orchestration 4 does provoke it quickly.

Comment 19 Martin Vecera 2009-03-31 14:31:05 UTC

Quickstart to reproduce the problem.
Run with ant deploy, ant deployProcess, ant startProcess

Comment 20 Martin Vecera 2009-03-31 14:31:05 UTC

Attachment: Added: bpm_orchestration_deadlock.tar.bz2

Comment 21 Alejandro Guizar 2009-03-31 22:51:52 UTC

I was able to reproduce at last. I was missing the byte array variable updates as Kevin mentioned.
The good news is that after applying the patch the problem went away :-)

Comment 22 Martin Vecera 2009-04-01 13:14:53 UTC

The fix has been verified with test build from Julian and Trevor. Waiting for another CR to resolve the issue.

Comment 23 Martin Vecera 2009-04-02 14:54:28 UTC

Verified with CR6.