Bug 749707 - Quartz lock error if alert sender waits after scheduleing operation
Quartz lock error if alert sender waits after scheduleing operation
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Alerts (Show other bugs)
4.1
Unspecified Unspecified
medium Severity medium (vote)
: ---
: ---
Assigned To: RHQ Project Maintainer
Mike Foley
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2011-10-27 21:30 EDT by Steven North
Modified: 2014-05-08 14:33 EDT (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-05-08 14:33:17 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Example alert pluign that illustrates problem. See build.sh to build stand-alone. (10.69 MB, application/x-gzip-compressed)
2011-10-27 21:30 EDT, Steven North
no flags Details
This is the soruce files and XML. (3.37 KB, application/x-gzip-compressed)
2011-10-27 21:37 EDT, Steven North
no flags Details

  None (edit)
Description Steven North 2011-10-27 21:30:41 EDT
Created attachment 530595 [details]
Example alert pluign that illustrates problem.  See build.sh to build stand-alone.

Description of problem:

AN alert that waits after executing an operation on  a resource causes Quartz error.

Version-Release number of selected component (if applicable):

RHQ 4.1

How reproducible:

Build attached alert plugin, load onto RHQ server.  Create an alert definition on a Linux platform resource conditioned on Free Memory > 0 (should alert right away) and diable after one trigger.  Alert sender will schedule "veiwProcessList" operation on the platform resource then wait for 5 minutes.  Quartz will throw execptions like those shown below.  (I tried just waiting, but the schedule operation is needed to cause the problem.)

  
Actual results:


2011-10-27 19:40:44,747 INFO  [org.rhq.enterprise.server.plugin.pc.alert.AlertSenderPluginManager] Server plugin [alert-problem] is loaded but disabled
2011-10-27 19:40:44,747 INFO  [org.rhq.enterprise.server.plugin.pc.MasterServerPluginContainer] Loaded server plugin [alert-problem]
2011-10-27 19:40:44,764 INFO  [org.rhq.enterprise.server.plugin.pc.alert.AlertSenderPluginManager] Enabling server plugin [alert-problem]
2011-10-27 19:40:45,172 INFO  [com.ocsystems.rtiplugin.MyAlertComponent] PROBLEM: init:
2011-10-27 19:40:45,175 INFO  [com.ocsystems.rtiplugin.MyAlertComponent] PROBLEM: start:
2011-10-27 19:40:45,175 INFO  [org.rhq.enterprise.server.plugin.ServerPluginsBean] Server plugin [alert-problem] has been registered in the database
2011-10-27 19:40:45,176 INFO  [org.rhq.enterprise.server.core.plugin.ServerPluginScanner] Registered server plugin [alert-problem], version 1
2011-10-27 19:41:10,560 INFO  [org.rhq.enterprise.server.alert.engine.jms.AlertConditionConsumerBean] AlertCondition[id=10311] has been removed after it was triggered; this message will be discarded
2011-10-27 19:41:23,874 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:26,265 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,295 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,439 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,677 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,879 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:27,943 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:28,315 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:28,839 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:29,110 INFO  [com.ocsystems.rtiplugin.MyAlertSender] PROBLEM: send:
2011-10-27 19:41:29,412 INFO  [com.ocsystems.rtiplugin.MyAlertSender] PROBLEM: wait 300000
2011-10-27 19:41:31,637 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:35,108 WARN  [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
2011-10-27 19:41:57,259 INFO  [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] rhel5-64 took [134]ms to reload global cache
2011-10-27 19:41:57,628 INFO  [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] rhel5-64 took [348]ms to reload cache for 1 agents
2011-10-27 19:41:59,406 ERROR [org.quartz.core.ErrorLogger] An error occured while releasing trigger 'org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob.org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob'
org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout]
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3753)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3722)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.releaseAcquiredTrigger(JobStoreSupport.java:2805)
	at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:305)
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
	at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90)
	at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92)
	... 5 more
2011-10-27 19:43:18,549 ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: Error handling misfires: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout
org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout]
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.doRecoverMisfires(JobStoreSupport.java:3124)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:3887)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3907)
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
	at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90)
	at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92)
	... 4 more
2011-10-27 19:44:59,428 ERROR [org.quartz.core.ErrorLogger] An error occured while releasing trigger 'org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob.org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob'
org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout]
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3753)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3722)
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.releaseAcquiredTrigger(JobStoreSupport.java:2805)
	at org.quartz.core.QuartzSchedulerThread.releaseTriggerRetryLoop(QuartzSchedulerThread.java:534)
	at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:314)
Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
	at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90)
	at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92)
	... 6 more
2011-10-27 19:46:29,469 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517270: alertDefinitionId=10301: releasing write lock after being locked for millis=300592
2011-10-27 19:46:29,469 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517394: alertDefinitionId=10301: acquired write lock in millis=202881
2011-10-27 19:46:29,479 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517604: alertDefinitionId=10301: acquired write lock in millis=138655
2011-10-27 19:46:29,483 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517814: alertDefinitionId=10301: acquired write lock in millis=79202
2011-10-27 19:46:29,488 INFO  [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1518033: alertDefinitionId=10301: acquired write lock in millis=19375
2011-10-27 19:46:35,956 INFO  [STDOUT] (I) Remove RTI metrics snapshot: /var/opt/ocsystems/rti/ee/log/jboss_default/rti-metrics-20111027-193535-11020_rhel5-64.txt

Expected results:


Operation should execute, alert wait, and no error in server log.


Additional info:
Comment 1 Steven North 2011-10-27 21:37:24 EDT
Created attachment 530596 [details]
This is the soruce files and XML.

This is just the source files and XML.
Comment 2 Jay Shaughnessy 2014-05-08 14:33:17 EDT
I believe this was probably fixed when we restructured the transactioning and execution order for alert notifications.

Note You need to log in before you can comment on or make changes to this bug.