Red Hat Bugzilla – Bug 713744
ResourceMetadataManagerBean Hangs on Oracle 11g
Last modified: 2012-02-07 14:28:33 EST
ResourceMetadataManagerBean hangs on line 401 (operationMetadataMgr.updateMetadata(...)) in method mergeExistingType when connected to Oracle 11g.
The problem can be reproduced easily by running enterprise/server/jar tests.
Steps to Reproduce:
1. Go to modules/enterprise/server/jar module.
2. Run ResourceMetadataManagerBeanTest tests using an Oracle database test connection.
3. Observe the behaviour of the upgradePlugin test method.
Actual results:
Integration tests hang indefinitely.
Expected results:
Integration tests pass.
Tested the code locally: it hangs on a socket read used by the JDBC driver (connected to the Oracle 11g server). The CPU activity on the Oracle 11g server is negligible, which means the server is not under heavy load processing the request just sent.
This also happens on PostgreSQL when I tweaked the dbunit .xml files.
If I uncomment this in plugin_v1.xml:
<!-- THIS CAUSES A DEADLOCK DURING PLUGIN REGISTRATION!
ResourceMetadataManagerBeanTest hangs during plugin update.
Raising to high priority: this may be a problem with plugin updates for plugins that define bundle-target elements.
There were a few XXXMetadataManagerBean methods that were missing the REQUIRES_NEW transaction attribute and were not performing their updates in their own, separate transactions. I was able to reproduce the deadlock described in comment 2, and adding REQUIRES_NEW so that those methods execute in their own transactions resolved the issue.
commit hash: 0ef2b576b995d39ecc31d6b84d52c9d57d9fa498
Looks like the code changes in 0ef2b576b995d39ecc31d6b84d52c9d57d9fa498 did not resolve the issue on Oracle. Moving back to ON_DEV for further investigation.
My previous comment was based on the tests hanging in the Oracle Hudson job. I proceeded to test locally against Oracle 10g. Without the fix, I can reproduce the deadlock. With the fix, there is no deadlock and the tests pass. It could be that the Oracle instance used by Hudson is not in a good state. Going to move this back to ON_QA as I have tested this against both Oracle and PostgreSQL.
Found another place through a test where ResourceMetadataManagerBean is deadlocking. The method, testAddDeleteTemplate, in the class UpdateConfigurationSubsystemTest triggers the deadlock which is happening in ResourceMetadataManagerBean.mergeExistingType at the following line (399ish):
I do not think that the above method call is causing the deadlock. I think it has to do with transaction boundaries established prior to mergeExistingType being called. Moving back to ON_DEV for further investigation.
This deadlock (described in comment 7) was discovered and reproduced by running the test UpdateConfigurationSubsystemTest.testAddDeleteTemplate. The deadlock occurs in ResourceMetadataManagerBean.mergeExistingType on line 397, which is,
The problem, as with all of these deadlock issues, involves the table rhq_config. In the test method, a plugin that declares a config template is installed and then updated. In an initial transaction that is started in PluginManagerBean, a write lock is acquired to update rhq_plugin. updateResourceConfigurationDefinition executes in its own transaction, and because the test involves some config templates, it needs a write lock on rhq_config, since there is a one-to-one association between rhq_config_template and rhq_config. That lock cannot be obtained because rhq_plugin has a FK to rhq_config for the ServerPlugin class, and the initial transaction cannot finish until the nested call returns, hence the deadlock.
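To illustrate the shape of this deadlock outside the container, here is a minimal sketch in plain Java. It is not RHQ code: the class and method names are hypothetical, and a single-permit Semaphore stands in for the database write lock that the outer transaction holds and the nested transaction needs. The nested call uses a timeout so that the demo terminates instead of hanging the way the tests do.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; names do not come from the RHQ code base.
public class SuspendedTxDeadlockSketch {

    /** Simulates the nested (REQUIRES_NEW-style) transaction; true if it obtained the lock. */
    static boolean innerTransaction(Semaphore writeLock, long timeoutMillis) {
        try {
            boolean acquired = writeLock.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                writeLock.release(); // "commit" the nested transaction
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Outer transaction takes the lock, then calls the nested one while still holding it. */
    static boolean outerThenInner(long timeoutMillis) {
        Semaphore writeLock = new Semaphore(1); // stand-in for the DB write lock
        writeLock.acquireUninterruptibly();     // outer transaction locks the row
        try {
            // The outer transaction cannot commit until this call returns,
            // and this call cannot get the lock until the outer one commits.
            return innerTransaction(writeLock, timeoutMillis);
        } finally {
            writeLock.release();
        }
    }

    /** Outer transaction commits (releases the lock) before the next one starts. */
    static boolean sequential(long timeoutMillis) {
        Semaphore writeLock = new Semaphore(1);
        writeLock.acquireUninterruptibly();
        writeLock.release();                    // first transaction commits
        return innerTransaction(writeLock, timeoutMillis);
    }

    public static void main(String[] args) {
        System.out.println("nested while outer holds lock: " + outerThenInner(200));
        System.out.println("after outer commits: " + sequential(200));
    }
}
```

With a real database there is no timeout on the nested call by default, which would match the indefinite socket read observed in the JDBC driver.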
With this commit, PluginManagerBean.registerPlugin has been refactored to avoid the deadlock. It first calls a new method, installPluginJar, which executes in its own transaction and handles updating the Plugin object. After that method returns, registerPluginTypes is called. That call starts a new transaction, and registerPluginTypes now only handles the metadata updates.
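The ordering described above can be sketched roughly as follows. Only the method names registerPlugin, installPluginJar and registerPluginTypes come from this report; the class, the inNewTransaction helper (a stand-in for a container-managed transaction boundary) and the event log are assumptions made for illustration, and the real method signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the call order; not the actual PluginManagerBean.
public class RegisterPluginSketch {
    final List<String> log = new ArrayList<>();

    // Stand-in for a separate transaction around a unit of work.
    void inNewTransaction(String name, Runnable work) {
        log.add("begin " + name);
        work.run();
        log.add("commit " + name);
    }

    // Step 1: update the Plugin object and commit, releasing its locks.
    void installPluginJar(String plugin) {
        inNewTransaction("installPluginJar", () -> log.add("update Plugin " + plugin));
    }

    // Step 2: metadata updates only, started after step 1 has committed.
    void registerPluginTypes(String plugin) {
        inNewTransaction("registerPluginTypes", () -> log.add("merge metadata for " + plugin));
    }

    // No transaction spans both steps, so no plugin lock is held during the metadata merge.
    List<String> registerPlugin(String plugin) {
        installPluginJar(plugin);
        registerPluginTypes(plugin);
        return log;
    }

    public static void main(String[] args) {
        new RegisterPluginSketch().registerPlugin("demo-plugin").forEach(System.out::println);
    }
}
```

The point of the split is that the Plugin update commits before any metadata transaction begins, so the metadata updates never wait on a lock held by a still-open caller.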
commit hash: 3cc1ede3a509763de99633876bf90891204890ee
I will wait to move this back to ON_QA until we get a passing Hudson build on the Oracle job.
Documenting passing tests on Oracle.
Additionally, I use Oracle as my RHQ repo on a daily basis, including running the UI automation.
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE