Bug 1119331
| Field | Value |
|---|---|
| Summary | Job Misfire handler fails: AlertAvailabilityDurationJob' job:Job class must implement the Job interface |
| Product | [JBoss] JBoss Operations Network |
| Component | Monitoring - Alerts |
| Version | JON 3.1.2 |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | unspecified |
| Reporter | Larry O'Leary <loleary> |
| Assignee | Lukas Krejci <lkrejci> |
| QA Contact | Mike Foley <mfoley> |
| CC | genman, hrupp, jshaughn, lkrejci, loleary, mmahoney |
| Target Milestone | DR02 |
| Target Release | JON 3.3.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Type | Bug |
| Doc Type | Bug Fix |
| Last Closed | 2014-12-11 14:02:18 UTC |
| Doc Text | A hotfix changed how availability duration jobs were scheduled (from the Quartz job service to an EJB timer). Jobs that were scheduled before the hotfix was applied, but had not been evaluated or had not expired when the server was shut down, then produced Job Misfire Handler errors. The fix now takes into account availability duration jobs left behind and changes them to fit the new scheduling mechanism. |

Given that hotfixes are not allowed to touch the database, I believe this is a case of stale data in the database. org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob used to be a Quartz job class and was later converted into an EJB timer. Note that this change is NOT present in the release/jon3.1.x branch, only in the hotfix branch. The only explanation I have for the exception is that the job was kept in the database, but after the hotfix was applied the job class could no longer carry out the job. That said, I assume this exception is essentially harmless, because the job runs normally as an EJB timer.

in master
commit 29d93c61270b35c9a8733fcf83d3a8dde78d907b
Author: Lukas Krejci <lkrejci>
Date: Wed Jul 16 21:48:22 2014 +0200

    [BZ 1119331] Handle the jobs potentially left behind after conversion
    of the avail duration check job into an EJB time.

commit a90aae26a2574f74bcae1a7fab78e7164b759d1a
Author: Lukas Krejci <lkrejci>
Date: Wed Jul 16 21:58:18 2014 +0200

    [BZ 1119331] Fix a typo in the DB upgrade step

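The stale-data explanation given earlier can be illustrated with a small, self-contained sketch. The nested `Job` interface and both classes here are hypothetical stand-ins declared locally so the example compiles without Quartz; they are not the RHQ or Quartz classes. The sketch only assumes that the scheduler performs a type check equivalent to `Job.class.isAssignableFrom(jobClass)` when it reloads a persisted job, which is consistent with the `IllegalArgumentException` message reported in this bug.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StaleJobCheck {
    /** Hypothetical stand-in for the scheduler's Job interface. */
    interface Job { void execute(); }

    /** A class that still implements the stand-in Job interface. */
    static class StillAJob implements Job { public void execute() {} }

    /** A class that was converted to a timer and no longer implements Job. */
    static class ConvertedToTimer { void onTimeout() {} }

    /** Assumed shape of the type check that rejects a non-Job class. */
    static void requireJob(Class<?> jobClass) {
        if (!Job.class.isAssignableFrom(jobClass)) {
            throw new IllegalArgumentException("Job class must implement the Job interface.");
        }
    }

    /** Returns the names of persisted triggers whose job class fails the check. */
    static List<String> findStaleTriggers(Map<String, Class<?>> persisted) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Class<?>> e : persisted.entrySet()) {
            try {
                requireJob(e.getValue());
            } catch (IllegalArgumentException ex) {
                stale.add(e.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Simulated rows left in the job store; trigger names are illustrative.
        Map<String, Class<?>> persisted = new LinkedHashMap<>();
        persisted.put("SOME_OTHER_TRIGGER", StillAJob.class);
        persisted.put("AVAIL_DURATION_DOWN-10004", ConvertedToTimer.class);
        System.out.println(findStaleTriggers(persisted)); // [AVAIL_DURATION_DOWN-10004]
    }
}
```

In this sketch the row itself is fine; it is the class it points at that changed shape, which is why the error keeps recurring until the leftover row is handled.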
*** Bug 1120445 has been marked as a duplicate of this bug. ***

Setting to modified as this is in release/jon3.3.x

Moving to ON_QA as available to test with brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993

Steps to reproduce:
1. Install and start JBoss EAP 6 standalone server.
2. Install and start JBoss ON 3.1.2.GA system (pre server hotfix-05).
3. Import JBoss EAP standalone server into inventory.
4. Configure JBoss EAP standalone resource's connection settings.
5. Create availability duration alert definition (stays down 10 minutes) on JBoss EAP server.
   *Name*: `Down too long`
   *Condition Type*: _Availability Duration_
   *Availability Duration*: _Stays Down_
   *Duration*: _10_ _minutes_
6. Verify alert definition is working:
   #. Shut down JBoss EAP server.
   #. Wait 11 minutes.
      Alert should have been triggered.
   #. Start JBoss EAP server.
   #. Wait for availability to report as UP.
7. Shut down JBoss EAP server and wait 2 minutes.
8. Shut down JBoss ON system.
9. Wait 8 minutes.
10. Apply server hotfix-05 or later.
11. Start JBoss ON system.
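
The timing in steps 7 to 11 can be checked with simple arithmetic. The sketch below is only an illustration under stated assumptions: the 10-minute duration check is assumed to start at the moment the EAP server is shut down (in practice it starts when availability is reported as DOWN), the start time is arbitrary, and applying the hotfix is assumed to take one minute. `MisfireWindow` and `firesWhileDown` are hypothetical names, not part of JBoss ON.

```java
import java.time.Duration;
import java.time.Instant;

public class MisfireWindow {
    /** True when the scheduled fire time falls in [serverDown, serverUp). */
    static boolean firesWhileDown(Instant scheduled, Instant serverDown, Instant serverUp) {
        return !scheduled.isBefore(serverDown) && scheduled.isBefore(serverUp);
    }

    public static void main(String[] args) {
        // Arbitrary illustrative start time for step 7.
        Instant eapDown = Instant.parse("2014-07-22T13:00:00Z");
        // Assumed: the duration check is due 10 minutes after the EAP shutdown.
        Instant fireAt = eapDown.plus(Duration.ofMinutes(10));
        // Step 8: JBoss ON is shut down 2 minutes later.
        Instant jonDown = eapDown.plus(Duration.ofMinutes(2));
        // Steps 9 to 11: wait 8 minutes, then (assumed) 1 minute to apply the hotfix.
        Instant jonUp = jonDown.plus(Duration.ofMinutes(8)).plus(Duration.ofMinutes(1));

        System.out.println(firesWhileDown(fireAt, jonDown, jonUp)); // true
    }
}
```

Under these assumptions the check comes due while the JBoss ON server is stopped, so the persisted job is still pending when the hotfixed server starts.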
Actual results:
JBoss ON server.log reports the following ERROR every four minutes:
2014-07-22 14:01:56,624 ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: Error handling misfires: Couldn't store trigger 'AVAIL_DURATION_DOWN-10004' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface.
org.quartz.JobPersistenceException: Couldn't store trigger 'AVAIL_DURATION_DOWN-10004' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface. [See nested exception: java.lang.IllegalArgumentException: Job class must implement the Job interface.]
Expected results:
No error.
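
To confirm which triggers are affected on a given server, the ERROR lines can be scanned for the trigger name and job class. The following is a minimal sketch, assuming the log line format shown under "Actual results"; `MisfireLogScan` and `parse` are hypothetical names. The pattern requires the `MisfireHandler:` prefix so that the accompanying `JobPersistenceException` line is not counted as a second occurrence.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MisfireLogScan {
    // Matches only the MisfireHandler ERROR line, not the exception line that follows it.
    private static final Pattern MISFIRE = Pattern.compile(
        "MisfireHandler: Error handling misfires: Couldn't store trigger '([^']+)' for '([^']+)' "
        + "job:Job class must implement the Job interface");

    /** Returns {triggerName, jobClassName} if the line is the misfire error, else empty. */
    static Optional<String[]> parse(String line) {
        Matcher m = MISFIRE.matcher(line);
        return m.find()
            ? Optional.of(new String[] { m.group(1), m.group(2) })
            : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "2014-07-22 14:01:56,624 ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] "
            + "MisfireHandler: Error handling misfires: Couldn't store trigger "
            + "'AVAIL_DURATION_DOWN-10004' for "
            + "'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' "
            + "job:Job class must implement the Job interface.";
        parse(line).ifPresent(r -> System.out.println(r[0] + " -> " + r[1]));
    }
}
```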
Created attachment 924525: Alert

Description of problem:
Every four minutes, the following ERROR is reported in the server log:

ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: Error handling misfires: Couldn't store trigger 'AVAIL_DURATION_DOWN-712781' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface.
org.quartz.JobPersistenceException: Couldn't store trigger 'AVAIL_DURATION_DOWN-712781' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface. [See nested exception: java.lang.IllegalArgumentException: Job class must implement the Job interface.]
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.storeTrigger(JobStoreSupport.java:1246)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.doUpdateOfMisfiredTrigger(JobStoreSupport.java:1014)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.recoverMisfiredJobs(JobStoreSupport.java:956)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.doRecoverMisfires(JobStoreSupport.java:3126)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:3887)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3907)
Caused by: java.lang.IllegalArgumentException: Job class must implement the Job interface.
    at org.quartz.JobDetail.setJobClass(JobDetail.java:280)
    at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.selectJobDetail(StdJDBCDelegate.java:897)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.storeTrigger(JobStoreSupport.java:1202)
    ... 5 more

This error is also reported every four minutes from a different server at different times.

On server start, it is evident that an availability duration alert was scheduled and missed:

DEBUG [org.rhq.enterprise.server.scheduler.EnhancedSchedulerImpl] Looks like repeating job [org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob:org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob] is already scheduled - removing it so it can be rescheduled...

Version-Release number of selected component (if applicable):
3.1.2 - Build: d2b7ee5:38ab852 -- Server Hotfix-08

How reproducible:
It is not clear how this issue first occurs. However, once it does, it happens constantly every four minutes.