Bug 749707 - Quartz lock error if alert sender waits after scheduleing operation
Summary: Quartz lock error if alert sender waits after scheduleing operation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Alerts
Version: 4.1
Hardware: Unspecified
OS: Unspecified
medium
medium vote
Target Milestone: ---
: ---
Assignee: RHQ Project Maintainer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-10-28 01:30 UTC by Steven North
Modified: 2014-05-08 18:33 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-05-08 18:33:17 UTC


Attachments (Terms of Use)
Example alert pluign that illustrates problem. See build.sh to build stand-alone. (10.69 MB, application/x-gzip-compressed)
2011-10-28 01:30 UTC, Steven North
no flags Details
This is the soruce files and XML. (3.37 KB, application/x-gzip-compressed)
2011-10-28 01:37 UTC, Steven North
no flags Details

Description Steven North 2011-10-28 01:30:41 UTC
Created attachment 530595 [details]
Example alert pluign that illustrates problem.  See build.sh to build stand-alone.

Description of problem:

AN alert that waits after executing an operation on  a resource causes Quartz error.

Version-Release number of selected component (if applicable):

RHQ 4.1

How reproducible:

Build attached alert plugin, load onto RHQ server.  Create an alert definition on a Linux platform resource conditioned on Free Memory > 0 (should alert right away) and diable after one trigger.  Alert sender will schedule "veiwProcessList" operation on the platform resource then wait for 5 minutes.  Quartz will throw execptions like those shown below.  (I tried just waiting, but the schedule operation is needed to cause the problem.)

  
Actual results:


2011-10-27 19:40:44,747 INFO  [org.rhq.enterprise.server.plugin.pc.alert.AlertSenderPluginManager] Server plugin [alert-problem] is loaded but disabled
2011-10-27 19:40:44,747 INFO  [org.rhq.enterprise.server.plugin.pc.MasterServerPluginContainer] Loaded server plugin [alert-problem]
2011-10-27 19:40:44,764 INFO  [org.rhq.enterprise.server.plugin.pc.alert.AlertSenderPluginManager] Enabling server plugin [alert-problem]
2011-10-27 19:40:45,172 INFO  [com.ocsystems.rtiplugin.MyAlertComponent] PROBLEM: init:
2011-10-27 19:40:45,175 INFO  [com.ocsystems.rtiplugin.MyAlertComponent] PROBLEM: start:
2011-10-27 19:40:45,175 INFO  [org.rhq.enterprise.server.plugin.ServerPluginsBean] Server plugin [alert-problem] has been registered in the database
2011-10-27 19:40:45,176 INFO  [org.rhq.enterprise.server.core.plugin.ServerPluginScanner] Registered server plugin [alert-problem], version 1
2011-10-27 19:41:10,560 INFO  [org.rhq.enterprise.server.alert.engine.jms.AlertConditionConsumerBean] AlertCondition[id=10311] has been removed after it was triggered; this message will be discarded
2011-10-27 19:41:23,874 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:26,265 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,295 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,439 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,677 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,879 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,943 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:28,315 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:28,839 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:29,110 INFO  [com.ocsystems.rtiplugin.MyAlertSender] PROBLEM: send:
2011-10-27 19:41:29,412 INFO  [com.ocsystems.rtiplugin.MyAlertSender] PROBLEM: wait 300000
2011-10-27 19:41:31,637 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:35,108 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:57,259 INFO  [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] rhel5-64 took [134]ms to reload global cache
2011-10-27 19:41:57,628 INFO  [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] rhel5-64 took [348]ms to reload cache for 1 agents
2011-10-27 19:41:59,406 ERROR [org.quartz.core.ErrorLogger] An error occured while releasing trigger 'org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob.org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob'
org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout]
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3753)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3722)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.releaseAcquiredTrigger(JobStoreSupport.java:2805)
	at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:305)
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
	at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90)
	at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92)
	... 5 more
2011-10-27 19:43:18,549 ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: Error handling misfires: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout
org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout]
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.doRecoverMisfires(JobStoreSupport.java:3124)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:3887)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3907)
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
	at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90)
	at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92)
	... 4 more
2011-10-27 19:44:59,428 ERROR [org.quartz.core.ErrorLogger] An error occured while releasing trigger 'org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob.org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob'
org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout]
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3753)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3722)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.releaseAcquiredTrigger(JobStoreSupport.java:2805)
	at org.quartz.core.QuartzSchedulerThread.releaseTriggerRetryLoop(QuartzSchedulerThread.java:534)
	at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:314)
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
	at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90)
	at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92)
	... 6 more
2011-10-27 19:46:29,469 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517270: alertDefinitionId=10301: releasing write lock after being locked for millis=300592
2011-10-27 19:46:29,469 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517394: alertDefinitionId=10301: acquired write lock in millis=202881
2011-10-27 19:46:29,479 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517604: alertDefinitionId=10301: acquired write lock in millis=138655
2011-10-27 19:46:29,483 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517814: alertDefinitionId=10301: acquired write lock in millis=79202
2011-10-27 19:46:29,488 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1518033: alertDefinitionId=10301: acquired write lock in millis=19375
2011-10-27 19:46:35,956 INFO  [STDOUT] (I) Remove RTI metrics snapshot: /var/opt/ocsystems/rti/ee/log/jboss_default/rti-metrics-20111027-193535-11020_rhel5-64.txt

Expected results:


Operation should execute, alert wait, and no error in server log.


Additional info:

Comment 1 Steven North 2011-10-28 01:37:24 UTC
Created attachment 530596 [details]
This is the soruce files and XML.

This is just the source files and XML.

Comment 2 Jay Shaughnessy 2014-05-08 18:33:17 UTC
I believe this was probably fixed when we restructured the transactioning and execution order for alert notifications.


Note You need to log in before you can comment on or make changes to this bug.