Bug 594706

Summary: Upgrade JON 2.3.1 to JON 2.4 with Oracle 10g gives error in server log: ORA-00060: deadlock detected while waiting for resource
Product: [Other] RHQ Project Reporter: Rajan Timaniya <rtimaniy>
Component: Core Server Assignee: Jay Shaughnessy <jshaughn>
Status: CLOSED CURRENTRELEASE QA Contact: Corey Welton <cwelton>
Severity: medium Docs Contact:
Priority: urgent    
Version: 3.0.0 CC: jshaughn
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 2.4 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-08-12 16:58:40 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Bug Depends On:    
Bug Blocks: 565628, 591531    
Attachments:
Description    Flags
server log     none
server log     none

Description Rajan Timaniya 2010-05-21 12:07:56 UTC
Created attachment 415655 [details]
server log

Description of problem:
Upgrading JON 2.3.1 to JON 2.4 with Oracle 10g produces an exception (JDBCExceptionReporter) in the server log file.

2010-05-21 16:15:46,256 INFO  [org.rhq.enterprise.server.core.plugin.ProductPluginDeployer] Deploying [15] new or updated agent plugins: [JBossAS, JBossAS5, RHQServer, IIS, Database, JBossCache3, Postgres, Augeas, JMX, JBossCache, Platforms, Apache, Hibernate, Tomcat, RHQAgent]
2010-05-21 16:15:51,931 WARN  [org.hibernate.util.JDBCExceptionReporter] SQL Error: 60, SQLState: 61000
2010-05-21 16:15:51,943 ERROR [org.hibernate.util.JDBCExceptionReporter] ORA-00060: deadlock detected while waiting for resource
2010-05-21 16:15:51,944 WARN  [org.hibernate.util.JDBCExceptionReporter] SQL Error: 60, SQLState: 61000
2010-05-21 16:15:51,944 ERROR [org.hibernate.util.JDBCExceptionReporter] ORA-00060: deadlock detected while waiting for resource
2010-05-21 16:15:51,944 ERROR [org.hibernate.event.def.AbstractFlushingEventListener] Could not synchronize database state with session
org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
	at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:87)
	at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
	at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:254)
	at org.hibernate.jdbc.AbstractBatcher.prepareStatement(AbstractBatcher.java:93)

Version-Release number of selected component (if applicable):
JON build #160 - Revision:  10620
http://hudson-qe.rhq.rdu.redhat.com:8080/job/jon/160/

How reproducible:
Always

Steps to Reproduce:
1) Install JON 2.3.1 with Oracle 10g

2) Install and configure 3 agents

3) Perform operations on JON 2.3.1

	(i) Create alerts for platform, JBoss AS4, RHQ Agent, Compatible group (alert condition is Free Memory > 0.1 OR Total Memory < 999999)

	(ii) Create groups (Mixed, Compatible, Group definition)

	(iii) Content repository synchronization (for JBoss Patch)

	(iv) Create Roles and Users

	(v) Change INVENTORY -> CONNECTION of JBoss AS4 

	(vi) Change configuration values of 'Postgres' and create 'History'

	(vii) Perform LDAP settings

	(viii) Create 'SCHEDULES' for 'View Process List' on platform

4) Upgrade JON 2.3.1 to JON 2.4 (JON build #160)

5) Log-in to JON 2.4 (JON build #160) and verify all performed operations (alert, schedules, groups, users, roles, LDAP settings, etc..)

6) Check server log for any database error/exception
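
Step 6 can be partly automated. The following is only a minimal sketch, assuming the server log uses the same line layout as the excerpts quoted in this report (timestamp, level, logger in square brackets, message). The function name `find_db_errors` and the `sample` list are illustrative and are not part of JON or RHQ.

```python
import re

# Matches the log lines quoted in this report:
# "<date> <time>,<ms> <LEVEL> [<logger>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+(?P<msg>.*)$"
)
# Oracle error codes have the form ORA-NNNNN.
ORA_RE = re.compile(r"ORA-\d{5}")


def find_db_errors(lines):
    """Return (timestamp, logger, ORA code, message) for ERROR lines that carry an ORA code."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("level") != "ERROR":
            continue
        code = ORA_RE.search(m.group("msg"))
        if code:
            hits.append((m.group("ts"), m.group("logger"), code.group(0), m.group("msg")))
    return hits


# Two lines copied from the description above, used as sample input.
sample = [
    "2010-05-21 16:15:51,931 WARN  [org.hibernate.util.JDBCExceptionReporter] SQL Error: 60, SQLState: 61000",
    "2010-05-21 16:15:51,943 ERROR [org.hibernate.util.JDBCExceptionReporter] ORA-00060: deadlock detected while waiting for resource",
]
for hit in find_db_errors(sample):
    print(hit)
```

To scan a real log, pass an open file object instead of `sample`. The sketch only reports ERROR lines that contain an Oracle error code, so other exceptions (for example the LockAcquisitionException stack trace) still need a manual look.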



  
Actual results:
Upgrade JON 2.3.1 to JON 2.4 with Oracle 10g gives exception (JDBCExceptionReporter) in server log file.

Expected results:
Upgrade JON 2.3.1 to JON 2.4 should upgrade with all its data without any error/exception.


Additional info:
Please refer to the attached server log.

Comment 1 Rajan Timaniya 2010-05-21 12:14:06 UTC
Note: There are no other database connections except those from JON.

Comment 2 Charles Crouch 2010-05-24 17:57:20 UTC
Rajan, 
Can you determine if this also fails on Postgres?
Thanks

Comment 3 Jay Shaughnessy 2010-05-25 15:01:03 UTC
This is fallout from recent work done (by me) to fix other issues in plugin upgrade regarding prop def enumerations.  I added code to perform better cleanup of obsolete prop defs (and related enums).  This cleanup step seems to be causing the deadlocks. We do concurrent plugin registration/updates and, it seems, Oracle's locking mechanisms are causing issues when two (or more) plugins are trying to update the Logging resource (which uses enums) for JVMs.

The original work is in d3fa4b8aceb726dc8eda7c6164e4693cb415512c.

Anyway, it seems removing the cleanup code solves the issue. This leaves some useless prop def structure in the database after an upgrade. Note, the old code did not work at all in this area and also did not clean up, so we're at least somewhat better off.

I'm going to spend a bit more time trying to see if we can do some cleanup of obsolete prop defs, possibly in a more batch-type mechanism as opposed to the incremental approach that is failing. But if it seems too hairy, perhaps we should just leave it as is; legacy DBs most likely have a bunch of garbage already.
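
For readers unfamiliar with this class of deadlock, here is a minimal illustration of the general mechanism. It is not the RHQ code and not the fix applied for this bug (the fix was to remove the cleanup step). Two concurrent workers that lock the same shared rows in opposite orders can each end up waiting on the other; taking the locks in one agreed order avoids that. The lock names `prop_def` and `prop_def_enum` and the worker names are hypothetical stand-ins, and in-process locks are used here in place of database row locks.

```python
import threading

# Hypothetical stand-ins for database rows shared by several plugins.
locks = {"prop_def": threading.Lock(), "prop_def_enum": threading.Lock()}
updated = []


def update_rows(worker, wanted):
    """Lock every requested row in one global (sorted) order, then record the update."""
    ordered = sorted(wanted)  # same order for every worker, whatever order it asked for
    for name in ordered:
        locks[name].acquire()
    try:
        updated.append((worker, tuple(ordered)))
    finally:
        for name in reversed(ordered):
            locks[name].release()


# The two workers request the rows in opposite orders. If each locked in the
# order requested, they could deadlock; sorting first rules that out.
t1 = threading.Thread(target=update_rows, args=("plugin-A", ["prop_def", "prop_def_enum"]))
t2 = threading.Thread(target=update_rows, args=("plugin-B", ["prop_def_enum", "prop_def"]))
for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join(timeout=5)

print(sorted(worker for worker, _ in updated))
```

Whether a consistent ordering like this is practical for concurrent plugin registration is a separate question that this sketch does not answer.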

Comment 4 Jay Shaughnessy 2010-05-25 18:35:28 UTC
I think it's not worth trying to do the cleanup.  I've commented out the cleanup code and added a bit of doc and a debug log message.

I think this issue is resolved.

Comment 5 Jay Shaughnessy 2010-05-25 18:36:47 UTC
commit 63ad14acc094e44dad2a79885a8ef72368c08782

Comment 6 Rajan Timaniya 2010-05-27 09:57:51 UTC
Verified on JON build #170 - Revision: 10621

Observed that the error 'ORA-00060: deadlock detected while waiting for resource' no longer appears.

Comment 7 Rajan Timaniya 2010-05-31 08:53:55 UTC
Description of problem:
Again observed 'ORA-00060: deadlock detected while waiting for resource' during the upgrade from JON 2.3.1 to JON 2.4 (JON build #177) with Oracle 10g.

2010-05-31 13:02:11,684 INFO  [org.rhq.enterprise.server.resource.ResourceManagerBean] User [org.rhq.core.domain.auth.Subject[id=1,name=admin]] is marking resource [Resource[id=10124, type=RHQ Server Content Source Server Plugin Subsystem, key=rhq.serverplugin:service=ContentSourcePluginService, name=RHQ Server Content Source Server Plugin Subsystem, parent=rajantest RHQ Server,  JBoss AS 4.2.3.GA default (0.0.0.0:2099)]] for asynchronous deletion
2010-05-31 13:02:19,019 WARN  [org.hibernate.util.JDBCExceptionReporter] SQL Error: 60, SQLState: 61000
2010-05-31 13:02:19,019 ERROR [org.hibernate.util.JDBCExceptionReporter] ORA-00060: deadlock detected while waiting for resource

2010-05-31 13:02:19,020 ERROR [org.rhq.enterprise.server.resource.ResourceManagerBean] Bulk named query delete error for 'ResourceConfigurationUpdate.deleteByResources2' for [10124]: org.hibernate.exception.LockAcquisitionException: could not execute update query

Version-Release number of selected component (if applicable):
JON build #177 - Revision:  10621
http://hudson-qe.rhq.rdu.redhat.com:8080/job/jon/177/

1) Install JON 2.3.1 on Oracle 10g
2) Create some alerts
   (i) Alert:
       Alert -1: Created on platform, the alert condition is Free Memory > 0.1
       Alert -2: Created on JBoss AS server, the alert condition is Free Memory > 0.1
       Alert -3: Created on RHQ Agent, the alert condition is Free Memory > 0.1
3) Upgrade JON 2.3.1 to JON 2.4 (build #177) (same database Oracle 10g)
4) Login to JON 2.4 and select platform or (any) resource

Actual results:
Upgrade JON 2.3.1 to JON 2.4 with Oracle 10g gives error/exception - ORA-00060: deadlock detected while waiting for resource

Expected results:
Upgrade JON 2.3.1 to JON 2.4 should upgrade with all its data without any error/exception.

Comment 8 Rajan Timaniya 2010-05-31 08:55:47 UTC
Created attachment 418209 [details]
server log

Comment 9 Charles Crouch 2010-06-02 20:43:17 UTC
This should be marked VERIFIED, right, Rajan?

Comment 10 Rajan Timaniya 2010-06-03 04:44:30 UTC
Charles, when I tested this bug on JON build #170, I did not observe the 'ORA-00060: deadlock detected while waiting for resource' error. But during the upgrade test for JON build #177, I observed the same error again.

Should I open a separate bug for this?

Comment 11 Jay Shaughnessy 2010-06-03 19:09:02 UTC
This deadlock is completely different from the original deadlock in this issue.  It probably merits a separate BZ entry.

I don't think the repro steps are sufficient as this must have taken place after an uninventory of resources.  Checking back with QA about this...

Comment 12 Rajan Timaniya 2010-06-04 11:02:27 UTC
Since the deadlock error mentioned in comment 7 (https://bugzilla.redhat.com/show_bug.cgi?id=594706#c7) is different from the original deadlock, I will retest this with the JON 2.4 Beta0 release and open a new bug.

Comment 13 Corey Welton 2010-08-12 16:58:40 UTC
Mass-closure of verified bugs against JON.