Bug 713744 - ResourceMetadataManagerBean Hangs on Oracle 11g
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Core Server
Version: 4.0.1
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Assigned To: John Sanda
QA Contact: Mike Foley
Depends On:
Blocks: jon30-bugs
Reported: 2011-06-16 09:08 EDT by Stefan Negrea
Modified: 2012-02-07 14:28 EST
CC List: 2 users

Doc Type: Bug Fix
Last Closed: 2012-02-07 14:28:33 EST
Attachments: None
Description Stefan Negrea 2011-06-16 09:08:51 EDT
ResourceMetadataManagerBean hangs on line 401 (operationMetadataMgr.updateMetadata(...)) in method mergeExistingType when connected to Oracle 11g.

How reproducible:
The problem can be reproduced easily by running enterprise/server/jar tests.

Steps to Reproduce:
1. Go to modules/enterprise/server/jar module.
2. Run ResourceMetadataManagerBeanTest tests using an Oracle database test connection.
3. Observe the behaviour of the upgradePlugin test method.
  
Actual results:
Integration tests hang indefinitely.

Expected results:
Integration tests pass.

Additional info:
Tested the code locally and the code hangs on a socket read (connected to the Oracle 11g server) used by the JDBC driver. The CPU activity on the Oracle 11g server is negligible, which means the server is not under heavy load processing the request just sent.
Comment 1 John Mazzitelli 2011-06-30 17:00:11 EDT
this also happens on postgres when I tweaked the dbunit .xml files.

hangs here:

org.rhq.enterprise.server.resource.metadata.ResourceMetadataManagerBeanTest.upgradePlugin()
Comment 2 John Mazzitelli 2011-06-30 17:21:47 EDT
If I uncomment this in plugin_v1.xml:

<!-- THIS CAUSES A DEADLOCK DURING PLUGIN REGISTRATION!
        <bundle-target>
           <destination-base-dir name="bundleTarget1">
               <value-context>pluginConfiguration</value-context>
               <value-name>connectionPropertyY</value-name>
           </destination-base-dir>
        </bundle-target>
-->

ResourceMetadataManagerBeanTest hangs during plugin update.
Comment 3 John Mazzitelli 2011-06-30 17:22:15 EDT
raising to high priority - this may be a problem with plugin updates for plugins that define bundle-target elements.
Comment 4 John Sanda 2011-07-01 12:11:00 EDT
There were a few XXXMetadataManagerBean methods that were missing the REQUIRES_NEW transaction attribute and were not performing their updates in their own, separate transactions. I was able to reproduce the deadlock described in comment 2, and putting in REQUIRES_NEW so that those methods execute in their own transactions resolved the issue.

commit hash: 0ef2b576b995d39ecc31d6b84d52c9d57d9fa498
Comment 5 John Sanda 2011-07-01 12:54:22 EDT
Looks like the code changes in 0ef2b576b995d39ecc31d6b84d52c9d57d9fa498 did not resolve the issue on oracle. Moving back to ON_DEV for further investigation.
Comment 6 John Sanda 2011-07-01 14:13:12 EDT
My previous comment was based on the tests hanging in the oracle hudson job. I proceeded to test locally against oracle 10g. Without the fix, I can reproduce the deadlock. With the fix, there is no deadlock and the tests pass. It could be that the oracle instance used by hudson is not in a good state. Going to move this back to ON_QA as I have tested this against both oracle and postgresql.
Comment 7 John Sanda 2011-07-08 08:57:42 EDT
Found another place through a test where ResourceMetadataManagerBean is deadlocking. The method, testAddDeleteTemplate, in the class UpdateConfigurationSubsystemTest triggers the deadlock which is happening in ResourceMetadataManagerBean.mergeExistingType at the following line (399ish):

resourceConfigMetadataMgr.updateResourceConfigurationDefinition(existingType, resourceType)

I do not think that the above method call is causing the deadlock. I think it has to do with transaction boundaries established prior to mergeExistingType being called. Moving back to ON_DEV for further investigation.
Comment 8 John Sanda 2011-07-08 21:56:23 EDT
This deadlock (described in comment 7) was discovered and reproduced by running the test UpdateConfigurationSubsystemTest.testAddDeleteTemplate. The deadlock occurs in ResourceMetadataManagerBean.mergeExistingType on line 397, which is:

  resourceConfigMetadataMgr.updateResourceConfigurationDefinition(existingType,
      resourceType);

The problem, as with all of these deadlock issues, involves the table rhq_config. In the test method, a plugin that declares a config template is installed and then updated. In an initial transaction that is started in PluginManagerBean, a write lock is acquired to update rhq_plugin. updateResourceConfigurationDefinition executes in its own transaction, and because the test involves some config templates, a write lock is needed for rhq_config since there is a one-to-one association between rhq_config_template and rhq_config. That lock cannot be obtained because rhq_plugin has a FK with rhq_config for the ServerPlugin class, hence the deadlock.
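
The shape of this deadlock can be sketched in plain Java. This is only an illustration, not RHQ code: the class and method names are hypothetical, a single ReentrantLock stands in for the contended database write lock, and a second thread stands in for the separate transaction. The outer "transaction" holds the lock and waits for the inner one to finish, while the inner one waits for that same lock. The sketch uses a timeout so that it terminates; a real database call would simply keep waiting.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in, not RHQ code: the lock plays the role of the database
// write lock and the worker thread plays the role of the separate transaction.
class NestedTransactionDeadlockSketch {

    /**
     * Returns true if the inner "transaction" obtained the lock within
     * timeoutMillis while the outer "transaction" was still holding it.
     */
    static boolean innerCanLockWhileOuterHolds(long timeoutMillis) throws Exception {
        ReentrantLock writeLock = new ReentrantLock();
        ExecutorService innerTx = Executors.newSingleThreadExecutor();

        writeLock.lock(); // outer transaction: lock taken, not yet committed
        try {
            Future<Boolean> inner = innerTx.submit(() -> {
                // inner transaction: needs the same lock to do its update
                boolean acquired = writeLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
                if (acquired) {
                    writeLock.unlock();
                }
                return acquired;
            });
            // The outer transaction cannot commit until the inner call returns.
            return inner.get();
        } finally {
            writeLock.unlock(); // outer "commit" happens only after the inner call
            innerTx.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inner acquired lock: " + innerCanLockWhileOuterHolds(200));
    }
}
```

With the outer holder never releasing before the inner call returns, the inner attempt can only time out, which corresponds to the indefinite hang seen in the tests.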

With this commit, PluginManagerBean.registerPlugin has been refactored to avoid the deadlock. It first calls a new method, installPluginJar, which executes in its own transaction and handles updating the Plugin object. After that method is called, registerPluginTypes is called. That call starts a new transaction, and registerPluginTypes now only handles the metadata updates.
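
A minimal sketch of the sequencing described above, assuming a single non-reentrant permit as a stand-in for the contended write lock. The class, the inOwnTransaction helper, and the events list are hypothetical and only illustrate the ordering; the real methods are EJB methods with container-managed transactions. The point is that the first step commits and releases its lock before the second step begins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

// Hypothetical illustration, not the actual PluginManagerBean.
class RegisterPluginSketch {
    // One permit stands in for the write lock. Unlike a reentrant lock,
    // a semaphore permit cannot be taken twice by the same caller.
    private final Semaphore writeLock = new Semaphore(1);
    final List<String> events = new ArrayList<>();

    // Runs the work in its own "transaction": take the lock, work, then release.
    private boolean inOwnTransaction(String name, Runnable work) {
        if (!writeLock.tryAcquire()) {
            events.add(name + " blocked");
            return false;
        }
        try {
            work.run();
            events.add(name + " committed");
            return true;
        } finally {
            writeLock.release();
        }
    }

    // Two transactions run back to back instead of one nested inside the other.
    boolean registerPlugin() {
        boolean installed = inOwnTransaction("installPluginJar", () -> { /* update Plugin */ });
        boolean registered = inOwnTransaction("registerPluginTypes", () -> { /* update metadata */ });
        return installed && registered;
    }

    public static void main(String[] args) {
        RegisterPluginSketch sketch = new RegisterPluginSketch();
        System.out.println(sketch.registerPlugin() + " " + sketch.events);
    }
}
```

Because installPluginJar has already released the permit by the time registerPluginTypes runs, neither step waits on the other.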

commit hash: 3cc1ede3a509763de99633876bf90891204890ee

I will wait to move this back to ON_QA until we get a passing hudson build on the oracle job.
Comment 9 Mike Foley 2011-07-25 16:29:37 EDT
documenting passing tests on Oracle

https://hudson.qa.jboss.com/hudson/view/RHQ%20Core/job/rhq-master/220/

additionally, i use oracle as my rhq repo on a daily basis ... including running the ui automation.
Comment 10 Mike Foley 2012-02-07 14:28:33 EST
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE
