Red Hat Bugzilla – Full Text Bug Listing
|Summary:||ORA-02049 during upgrade from JON-3.1.0.GA to JON-3.1.1.ER1 with Oracle|
|Product:||[Other] RHQ Project||Reporter:||Filip Brychta <fbrychta>|
|Component:||Core Server||Assignee:||RHQ Project Maintainer <rhq-maint>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:||Mike Foley <mfoley>|
|Version:||JON 3.1.1||CC:||hrupp, jsanda, jshaughn, mazz|
|Target Release:||JON 3.1.1|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|:||848384 (view as bug list)||Environment:|
|Last Closed:||2013-09-03 11:02:32 EDT||Type:||Bug|
Description Filip Brychta 2012-08-07 10:35:39 EDT
Created attachment 602762 [details]
complete logs

Description of problem:
The following exception appears in the server log during the upgrade from JON-3.1.0.GA to JON-3.1.1.ER1: 'java.sql.BatchUpdateException: ORA-02049: timeout: distributed transaction waiting for lock'. Installation also takes much longer than usual.

Version-Release number of selected component (if applicable):
JON-3.1.1.ER1

How reproducible:
3 of 3

Steps to Reproduce:
1. JON-3.1.0.GA with the eap plugin installed and running, EAP-5.1.2 running (profile ALL), RHQ agent and EAP imported
2. Follow the upgrade procedure http://docs.redhat.com/docs/en-US/JBoss_Operations_Network/3.1/html/Installation_Guide/upgrading.html
   note: plugins were copied before JON-3.1.1.ER1 was started - cp jon-plugin-pack-eap-3.1.1.ER1/* jon-server-3.1.1.ER1/plugins/
3. Check logs

Actual results:
'java.sql.BatchUpdateException: ORA-02049: timeout: distributed transaction waiting for lock' exception in server log

Expected results:
no exceptions

Additional info:
complete logs attached
Comment 2 Heiko W. Rupp 2012-08-10 09:14:54 EDT
So to recap, you had a JON 3.1.0.GA server set up and then tried to upgrade to a 3.1.1.ER1. Which plugins were installed on 3.1.0? Did you on upgrade load the plugin packs before starting the 3.1.1 server? Was an as7/eap6 in inventory of the 3.1.0 server?
Comment 3 Filip Brychta 2012-08-10 10:36:56 EDT
(In reply to comment #2)
> So to recap, you had a JON 3.1.0.GA server set up and then tried to upgrade
> to a 3.1.1.ER1.

Yes

> Which plugins were installed on 3.1.0?

jon-plugin-pack-eap-3.1.0.GA

> Did you on upgrade load the plugin
> packs before starting the 3.1.1 server?

Yes

> Was an as7/eap6 in inventory of the 3.1.0 server?

EAP-5.1.2 was running and imported to inventory.

More accurately:
1- JON-3.1.0.GA with jon-plugin-pack-eap-3.1.0.GA set up, EAP-5.1.2 running with the 'all' profile (./run.sh -c all) and imported to inventory; the rhq agent resource was imported as well
2- the rhq agent was prepared for upgrade (running in the background according to the upgrade manual)
3- follow the upgrade manual:
   - stop the 3.1.0.GA server
   - cp plugins to $RHQ_SERVER_HOME/plugins (cp jon-plugin-pack-eap-3.1.1.ER1/* jon-server-3.1.1.ER1/plugins/)
   - start the 3.1.1.ER1 server and finish the upgrade
Comment 4 John Mazzitelli 2012-08-13 11:28:45 EDT
Can I assume this setting in your rhq-server.properties is "1"?

# The number of concurrent threads used to deploy plugins.
# Currently, it is not recommended to increase this value.
rhq.server.plugin-deployer-threads=1

This needs to be 1. Nothing larger.
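The check asked for above can be scripted rather than eyeballed. The following is only a minimal sketch: the property name comes from the comment, while the class name, the method name, and the inline sample text are hypothetical and not part of RHQ. It uses nothing beyond the standard java.util.Properties parser.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.OptionalInt;
import java.util.Properties;

// Hypothetical helper, not part of RHQ: reads the deployer thread count
// from properties-formatted text.
class PluginDeployerThreadsCheck {
    // Property name as quoted in the comment above.
    static final String KEY = "rhq.server.plugin-deployer-threads";

    // Returns the configured value, or an empty OptionalInt when the key is not set.
    static OptionalInt deployerThreads(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty(KEY);
        return raw == null ? OptionalInt.empty() : OptionalInt.of(Integer.parseInt(raw.trim()));
    }

    public static void main(String[] args) {
        // Sample text standing in for rhq-server.properties (assumption: real code
        // would pass a FileReader pointing at the actual file).
        String sample = "# The number of concurrent threads used to deploy plugins.\n"
                + KEY + "=1\n";
        OptionalInt threads = deployerThreads(new StringReader(sample));
        if (threads.isPresent() && threads.getAsInt() == 1) {
            System.out.println(KEY + " is 1, as required");
        } else {
            System.out.println(KEY + " is missing or not 1: " + threads);
        }
    }
}
```

Returning an OptionalInt avoids guessing what the server does when the key is absent, which this thread does not state.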
Comment 6 Filip Brychta 2012-08-14 03:17:43 EDT
(In reply to comment #4)
> can I assume this setting in your rhq-server.properties is "1" ???
>
> # The number of concurrent threads used to deploy plugins.
> # Currently, it is not recommended to increase this value.
> rhq.server.plugin-deployer-threads=1
>
> This needs to be 1. Nothing larger.

Yes, I did not touch this property.
Comment 7 Jay Shaughnessy 2012-08-15 09:22:19 EDT
commit d36c9fbdf488fbc6d7e8e6b31bad506e1d0d150d
Author: Jay Shaughnessy <firstname.lastname@example.org>
Date: Wed Aug 15 09:20:44 2012 -0400

[Bug 846353 - ORA-02049 during upgrade from JON-3.1.0.GA to JON-3.1.1.ER1 wit

After a lot of investigation it seems that the problem occurs occasionally when we remove obsolete properties from resource configuration. It does not happen every time; in fact it is fairly rare, although for the same DB and the same plugin update it is repeatable. This makes it seem like the locking issue is due mainly to unpredictable locking at the DB level, and likely to the fact that we occasionally hit a page lock due to some other prior update to the config table. Since the config table stores so many different types of data, it's not obvious how we would identify the conflict.

The approach taken was to try to reduce possible conflict by making metadata update transactions more fine-grained. Prior to this change we used a single encompassing transaction for a plugin update, which means all types were updated under one umbrella transaction. It was not strictly one transaction, because we already use nested transactions in several places, but using one umbrella transaction increases the chance of that transaction holding a lock that could affect a nested transaction. We still maintain the umbrella transaction, but this commit breaks it up such that a nested transaction is used for the update of each resource type in the plugin. That means each type update will not hold any locks once it has completed. This change seems to be working, as the AS7 plugin now updates successfully.

Additionally:
- added some more INFO level logging to give some basic progress during a plugin update.
- added some more debug logging as well
- removed a bunch of unnecessary em.flush calls
- used the return value of some em.merge calls to ensure using the up to date entity.

Cherry pick of master c9eb53bb8dab7393d9b5383fbe5bab99088487ed.
Test Notes: Upgrade as many plugins as possible from older versions to the latest (4.5 versions) on both Oracle and Postgres. If possible, having data in inventory will make the test even more robust, although it was not necessary to reproduce the original issue.
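To illustrate why per-type transactions reduce lock exposure, here is a toy Java model. It is not the RHQ implementation and uses no database or JPA: the ToyTransaction class, the method names, and the "config-rows-of-" lock labels are all hypothetical. It also leaves out the outer umbrella transaction that the commit keeps, and only counts how many locks a single transaction holds at its peak under each strategy.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: compares lock accumulation for one umbrella transaction
// versus one short transaction per resource type.
class TransactionGranularitySketch {

    // Hypothetical stand-in for a DB transaction; it only tracks held locks.
    static final class ToyTransaction {
        private final Set<String> heldLocks = new HashSet<>();

        void lock(String row) { heldLocks.add(row); }

        int heldCount() { return heldLocks.size(); }

        // Committing releases every lock the transaction holds.
        void commit() { heldLocks.clear(); }
    }

    // All types updated in a single transaction: locks pile up until the end.
    static int peakLocksUnderUmbrella(List<String> resourceTypes) {
        ToyTransaction tx = new ToyTransaction();
        int peak = 0;
        for (String type : resourceTypes) {
            tx.lock("config-rows-of-" + type);
            peak = Math.max(peak, tx.heldCount());
        }
        tx.commit();
        return peak;
    }

    // One transaction per type: each commit releases its locks before the next type.
    static int peakLocksPerType(List<String> resourceTypes) {
        int peak = 0;
        for (String type : resourceTypes) {
            ToyTransaction tx = new ToyTransaction();
            tx.lock("config-rows-of-" + type);
            peak = Math.max(peak, tx.heldCount());
            tx.commit();
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList("typeA", "typeB", "typeC");
        System.out.println("umbrella peak locks: " + peakLocksUnderUmbrella(types)); // prints 3
        System.out.println("per-type peak locks: " + peakLocksPerType(types));       // prints 1
    }
}
```

In the model the umbrella strategy's peak grows with the number of resource types, while the per-type strategy stays at one, which mirrors the commit's reasoning that a completed type update should hold no locks.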
Comment 8 John Sanda 2012-08-22 01:50:27 EDT
Moving to ON_QA. The JON 3.1.1 ER3 build is available at https://brewweb.devel.redhat.com/buildinfo?buildID=230321.
Comment 9 Filip Brychta 2012-08-27 05:14:23 EDT
Verified on JON 3.1.1 ER3 for described scenario. More complex scenarios to do.
Comment 10 Heiko W. Rupp 2013-09-03 11:02:32 EDT
Bulk closing of old issues in VERIFIED state.