Bug 586943 - Configuration for a compatible group times out trying to update the config for the second group member
Alias: None
Product: RHQ Project
Classification: Other
Component: Configuration
Version: 1.4
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: John Sanda
QA Contact: Corey Welton
Depends On:
Blocks: jon-sprint10-bugs
Reported: 2010-04-28 14:44 UTC by John Sefler
Modified: 2010-08-12 16:49 UTC (History)

Fixed In Version: 2.4
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2010-08-12 16:49:59 UTC

Attachments
stack trace (173.82 KB, image/png)
2010-04-28 14:44 UTC, John Sefler

Description John Sefler 2010-04-28 14:44:39 UTC
Created attachment 409856
stack trace

Description of problem:
After creating a compatible group of Datasources, I try to use the group CONFIGURATION to update the config for the members of the group and the update times out.  It appears to update the first member of the group, but times out on the rest.

Version-Release number of selected component (if applicable):
version: 3.0.0-SNAPSHOT
build number: 20fe0ec 

How reproducible:

Steps to Reproduce:
1. on an inventoried JBossAS Server, create two Datasources:
   a. navigate to the server's INVENTORY tab
   b. Create New: - Datasource  OK
       default - the default template  CONTINUE
       Resource Name:  JohnsDatasource1
       Type:           No TX Datasource
       JNDI Name:      JohnsDatasource1
       Driver Class:   ${rhq.server.database.driver-class}
       Connection Url: ${rhq.server.database.connection-url}
   c. Create New: - Datasource  OK
       default - the default template  CONTINUE
       Resource Name:  JohnsDatasource2
       Type:           Local TX Datasource
       JNDI Name:      JohnsDatasource2
       Driver Class:   ${rhq.server.database.driver-class}
       Connection Url: ${rhq.server.database.connection-url}
2. Groups > New Group
       Name: JohnsDatasources
       Contains: Compatible Resources   - Datasource
       check JohnsDatasource1
       check JohnsDatasource2
       click the right arrow
       set Max Pool Size 21
3. Click to the CURRENT subtab

Actual results:

Configuration update is currently in progress. Please wait a few moments and refresh the page.

After ten minutes of refreshing, the individual datasource configuration change fails on the second datasource.

If I open the datasource individually and look at its CONFIGURATION/HISTORY, I find a Failure with a stack trace:
Timed out - did not complete after 872741 ms (the timeout period was 600000 ms)

See the attached screenshot.

PS. Nice stack trace formatting... https://bugzilla.redhat.com/show_bug.cgi?id=535805

Comment 1 Charles Crouch 2010-05-12 16:39:55 UTC
This sounds like a regression.

Comment 2 John Sanda 2010-05-13 14:58:18 UTC
Here is the underlying exception from the server log:

javax.persistence.PersistenceException: org.hibernate.HibernateException: A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance: org.rhq.core.domain.configuration.PropertyMap.map
        at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:629)
        at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:524)
        at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImpl
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:247)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)
        at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)
        at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)
        at org.jboss.aspects.tx.TxPolicy.endTransaction(TxPolicy.java:175)
        at org.jboss.aspects.tx.TxPolicy.invokeInOurTx(TxPolicy.java:87)
        at org.jboss.aspects.tx.TxInterceptor$Required.invoke(TxInterceptor.java:191)
        at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
        at org.jboss.aspects.tx.TxPropagationInterceptor.invoke(TxPropagationInterceptor.java:95)
        at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
        at org.jboss.ejb3.stateless.StatelessInstanceInterceptor.invoke(StatelessInstanceInterceptor.java:62)
        at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
        at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
        at org.jboss.aspects.security.AuthenticationInterceptor.invoke(AuthenticationInterceptor.java:77)
        at org.jboss.ejb3.security.Ejb3AuthenticationInterceptor.invoke(Ejb3AuthenticationInterceptor.java:110)
        at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
        at org.jboss.ejb3.ENCPropagationInterceptor.invoke(ENCPropagationInterceptor.java:46)
        at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
        at org.jboss.ejb3.asynchronous.AsynchronousInterceptor.invoke(AsynchronousInterceptor.java:106)
        at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
        at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:240)
        at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:210)
        at org.jboss.ejb3.stateless.StatelessLocalProxy.invoke(StatelessLocalProxy.java:84)
        at $Proxy338.executeResourceConfigurationUpdate(Unknown Source)
        at org.rhq.enterprise.server.configuration.job.GroupResourceConfigurationUpdateJob.executeConfigurationUpdate(GroupResourceConfigurationUpdateJob.java:62)
        at org.rhq.enterprise.server.configuration.job.AbstractGroupConfigurationUpdateJob.processGroupConfigurationUpdate(AbstractGroupConfigurationUpdateJob.java:85)
        at org.rhq.enterprise.server.configuration.job.AbstractGroupConfigurationUpdateJob.execute(AbstractGroupConfigurationUpdateJob.java:64)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:525)
Caused by: org.hibernate.HibernateException: A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance: org.rhq.core.domain.configuration.PropertyMap.map
        at org.hibernate.engine.Collections.processDereferencedCollection(Collections.java:96)
        at org.hibernate.engine.Collections.processUnreachableCollection(Collections.java:39)
        at org.hibernate.event.def.AbstractFlushingEventListener.flushCollections(AbstractFlushingEventListener.java:218)
        at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:77)
[org.rhq.enterprise.server.configuration.job.GroupResourceConfigurationUpdateJob] Failed to execute one or more Resource Configuration updates that were part of a group update - details: java.lang.RuntimeException:javax.transaction.RollbackException: [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] Can't commit because the transaction is in aborted state -> javax.transaction.RollbackException:[com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] Can't commit because the transaction is in aborted state -> javax.persistence.PersistenceException:org.hibernate.HibernateException: A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance: org.rhq.core.domain.configuration.PropertyMap.map -> org.hibernate.HibernateException:A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance: org.rhq.core.domain.configuration.PropertyMap.map

Comment 3 John Sanda 2010-05-14 11:51:07 UTC
After further testing I found that this bug is not specific to a particular resource type, and it will happen even with a group containing only a single resource. It does not happen, though, when you update the resource configuration outside of a group. I was also able to produce the same Hibernate exception for PropertyList.list.
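
For context, Hibernate generally raises this "no longer referenced by the owning entity instance" error when the collection instance it is tracking on an entity is swapped for a different instance instead of being modified in place. The following is only a sketch of that general pattern in plain Java, with no Hibernate dependency; OwnedMapSketch and its methods are hypothetical names, not RHQ code, and nothing here identifies where the dereference actually happens in this bug.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder class; it only mimics the shape of an entity that owns a map.
final class OwnedMapSketch {
    private Map<String, String> map = new LinkedHashMap<>();

    // Risky with orphan removal: this swaps out the collection instance
    // that a persistence layer would be tracking.
    void replaceMap(Map<String, String> newMap) {
        this.map = newMap;
    }

    // Safer: keep the same collection instance and change only its contents.
    void setMapInPlace(Map<String, String> newMap) {
        if (newMap == this.map) {
            return; // nothing to do, and avoids clearing our own contents
        }
        this.map.clear();
        if (newMap != null) {
            this.map.putAll(newMap);
        }
    }

    Map<String, String> getMap() {
        return map;
    }
}
```

After setMapInPlace the owner still holds the original map object, whereas replaceMap leaves the original instance dereferenced.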

The Hibernate exceptions occur because the underlying collections in PropertyMap and PropertyList, which are mapped with a cascade style of delete_orphan, are getting dereferenced. I have not been able to determine when or where the PropertyMap.map and PropertyList.list collections are getting dereferenced. I have pushed a commit to master that works around the problem by creating a deep copy of the Configuration object, stripped of any Hibernate proxies.

commit hash - b75773e304ac6a5fc6ac6f76fb7ab8301499d91b
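
To illustrate the idea behind the workaround, here is a minimal sketch of a deep copy that rebuilds a nested property tree using only plain java.util collections, so that no persistence-managed collection wrappers are carried into the copy. The classes below (ConfigCopySketch, Prop, SimpleProp, MapProp, ListProp) are hypothetical, simplified stand-ins and are not the RHQ domain classes; this is not the code from the commit above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins for the property classes (not the real API).
final class ConfigCopySketch {

    interface Prop {
        String name();
    }

    static final class SimpleProp implements Prop {
        final String name;
        final String value;

        SimpleProp(String name, String value) {
            this.name = name;
            this.value = value;
        }

        public String name() { return name; }
    }

    static final class MapProp implements Prop {
        final String name;
        final Map<String, Prop> map;

        MapProp(String name, Map<String, Prop> map) {
            this.name = name;
            this.map = map;
        }

        public String name() { return name; }
    }

    static final class ListProp implements Prop {
        final String name;
        final List<Prop> list;

        ListProp(String name, List<Prop> list) {
            this.name = name;
            this.list = list;
        }

        public String name() { return name; }
    }

    // Recursively copies a non-null property tree. Every map and list in the
    // result is a freshly created LinkedHashMap or ArrayList, whatever
    // collection implementation the source happened to hold.
    static Prop deepCopy(Prop source) {
        if (source instanceof SimpleProp) {
            SimpleProp simple = (SimpleProp) source;
            return new SimpleProp(simple.name, simple.value);
        }
        if (source instanceof MapProp) {
            Map<String, Prop> fresh = new LinkedHashMap<>();
            for (Map.Entry<String, Prop> entry : ((MapProp) source).map.entrySet()) {
                fresh.put(entry.getKey(), deepCopy(entry.getValue()));
            }
            return new MapProp(source.name(), fresh);
        }
        if (source instanceof ListProp) {
            List<Prop> fresh = new ArrayList<>();
            for (Prop member : ((ListProp) source).list) {
                fresh.add(deepCopy(member));
            }
            return new ListProp(source.name(), fresh);
        }
        throw new IllegalArgumentException("Unknown property type: " + source.getClass());
    }
}
```

Because the copy shares no collection instances with its source, later changes to the source collections do not affect the copy, and vice versa.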

I am moving this to ON_QA, but I am going to continue investigating to see if I can figure out where those collections are getting dereferenced. I am also going to try to get some automated tests in place for this. This should be reproducible via test code.

Comment 4 Corey Welton 2010-05-21 15:04:29 UTC
Going to bounce this back to dev.  I created a group of rhq agents and attempted to change a timeout value somewhere in the Miscellaneous pane.  First one works, subsequent ones fail.

Comment 5 John Sanda 2010-05-21 20:59:34 UTC
I saw this happen on Corey's server earlier today; however, there was no Hibernate exception. The failed updates just reported a timeout. The configuration update appears to fail consistently for a group of five agents. I also reproduced the timeout with a group of two agents as well as a group of two data sources. I have not yet been able to reproduce it locally, though.

Comment 6 John Sanda 2010-05-26 18:53:05 UTC
I tested this again on Corey's server today with a more recent build and did not run into the timeout. Moving back to ON_QA so that Corey can retest.

Comment 7 Sudhir D 2010-06-01 13:25:17 UTC
I verified this on Build# 182 Version: 2.4 SNAPSHOT.

Configuration update is working as expected. I can see both datasources' configurations getting updated correctly.

There is no timeout happening with this build. 

Marking this bug as verified.

Comment 8 Corey Welton 2010-08-12 16:49:59 UTC
Mass-closure of verified bugs against JON.
