Bug 749707
Summary: | Quartz lock error if alert sender waits after scheduleing operation | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | Steven North <swn> | ||||||
Component: | Alerts | Assignee: | RHQ Project Maintainer <rhq-maint> | ||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> | ||||||
Severity: | medium | Docs Contact: | |||||||
Priority: | medium | ||||||||
Version: | 4.1 | CC: | hrupp, jshaughn | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2014-05-08 18:33:17 UTC | Type: | --- | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Created attachment 530596 [details]
This is the soruce files and XML.
This is just the source files and XML.
I believe this was probably fixed when we restructured the transactioning and execution order for alert notifications. |
Created attachment 530595 [details] Example alert pluign that illustrates problem. See build.sh to build stand-alone. Description of problem: AN alert that waits after executing an operation on a resource causes Quartz error. Version-Release number of selected component (if applicable): RHQ 4.1 How reproducible: Build attached alert plugin, load onto RHQ server. Create an alert definition on a Linux platform resource conditioned on Free Memory > 0 (should alert right away) and diable after one trigger. Alert sender will schedule "veiwProcessList" operation on the platform resource then wait for 5 minutes. Quartz will throw execptions like those shown below. (I tried just waiting, but the schedule operation is needed to cause the problem.) Actual results: 2011-10-27 19:40:44,747 INFO [org.rhq.enterprise.server.plugin.pc.alert.AlertSenderPluginManager] Server plugin [alert-problem] is loaded but disabled 2011-10-27 19:40:44,747 INFO [org.rhq.enterprise.server.plugin.pc.MasterServerPluginContainer] Loaded server plugin [alert-problem] 2011-10-27 19:40:44,764 INFO [org.rhq.enterprise.server.plugin.pc.alert.AlertSenderPluginManager] Enabling server plugin [alert-problem] 2011-10-27 19:40:45,172 INFO [com.ocsystems.rtiplugin.MyAlertComponent] PROBLEM: init: 2011-10-27 19:40:45,175 INFO [com.ocsystems.rtiplugin.MyAlertComponent] PROBLEM: start: 2011-10-27 19:40:45,175 INFO [org.rhq.enterprise.server.plugin.ServerPluginsBean] Server plugin [alert-problem] has been registered in the database 2011-10-27 19:40:45,176 INFO [org.rhq.enterprise.server.core.plugin.ServerPluginScanner] Registered server plugin [alert-problem], version 1 2011-10-27 19:41:10,560 INFO [org.rhq.enterprise.server.alert.engine.jms.AlertConditionConsumerBean] AlertCondition[id=10311] has been removed after it was triggered; this message will be discarded 2011-10-27 19:41:23,874 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:26,265 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:27,295 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:27,439 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:27,677 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:27,879 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:27,943 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:28,315 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:28,839 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:29,110 INFO [com.ocsystems.rtiplugin.MyAlertSender] PROBLEM: send: 2011-10-27 19:41:29,412 INFO [com.ocsystems.rtiplugin.MyAlertSender] PROBLEM: wait 300000 2011-10-27 19:41:31,637 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:35,108 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory! 2011-10-27 19:41:57,259 INFO [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] rhel5-64 took [134]ms to reload global cache 2011-10-27 19:41:57,628 INFO [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] rhel5-64 took [348]ms to reload cache for 1 agents 2011-10-27 19:41:59,406 ERROR [org.quartz.core.ErrorLogger] An error occured while releasing trigger 'org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob.org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob' org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout] at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112) at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112) at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3753) at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3722) at org.quartz.impl.jdbcjobstore.JobStoreSupport.releaseAcquiredTrigger(JobStoreSupport.java:2805) at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:305) Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273) at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90) at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342) at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92) ... 5 more 2011-10-27 19:43:18,549 ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: Error handling misfires: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout] at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112) at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112) at org.quartz.impl.jdbcjobstore.JobStoreSupport.doRecoverMisfires(JobStoreSupport.java:3124) at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:3887) at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3907) Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273) at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90) at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342) at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92) ... 4 more 2011-10-27 19:44:59,428 ERROR [org.quartz.core.ErrorLogger] An error occured while releasing trigger 'org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob.org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob' org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: ERROR: canceling statement due to statement timeout [See nested exception: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout] at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:112) at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:112) at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3753) at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3722) at org.quartz.impl.jdbcjobstore.JobStoreSupport.releaseAcquiredTrigger(JobStoreSupport.java:2805) at org.quartz.core.QuartzSchedulerThread.releaseTriggerRetryLoop(QuartzSchedulerThread.java:534) at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:314) Caused by: org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273) at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeQuery(CachedPreparedStatement.java:90) at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342) at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:92) ... 6 more 2011-10-27 19:46:29,469 INFO [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517270: alertDefinitionId=10301: releasing write lock after being locked for millis=300592 2011-10-27 19:46:29,469 INFO [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517394: alertDefinitionId=10301: acquired write lock in millis=202881 2011-10-27 19:46:29,479 INFO [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517604: alertDefinitionId=10301: acquired write lock in millis=138655 2011-10-27 19:46:29,483 INFO [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1517814: alertDefinitionId=10301: acquired write lock in millis=79202 2011-10-27 19:46:29,488 INFO [org.rhq.enterprise.server.util.concurrent.AlertSerializer] tid= 1518033: alertDefinitionId=10301: acquired write lock in millis=19375 2011-10-27 19:46:35,956 INFO [STDOUT] (I) Remove RTI metrics snapshot: /var/opt/ocsystems/rti/ee/log/jboss_default/rti-metrics-20111027-193535-11020_rhel5-64.txt Expected results: Operation should execute, alert wait, and no error in server log. Additional info: