Bug 1119331

Summary: Job Misfire handler fails: AlertAvailabilityDurationJob' job:Job class must implement the Job interface
Product: [JBoss] JBoss Operations Network
Reporter: Larry O'Leary <loleary>
Component: Monitoring - Alerts
Assignee: Lukas Krejci <lkrejci>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Docs Contact:
Priority: unspecified
Version: JON 3.1.2
CC: genman, hrupp, jshaughn, lkrejci, loleary, mmahoney
Target Milestone: DR02
Target Release: JON 3.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
A hotfix change to how availability duration jobs were scheduled (from using the Quartz job service to an EJB timer) caused jobs scheduled before the hotfix was applied--but not evaluated or expired when the server was shut down--to produce Job Misfire Handler errors. The fix now takes into account availability duration jobs left behind and changes the job to meet the new scheduling mechanism.
Last Closed: 2014-12-11 14:02:18 UTC
Type: Bug
Attachments:
Description: Alert; Flags: none

Description Larry O'Leary 2014-07-14 14:25:31 UTC
Description of problem:
Every four minutes, the following ERROR is reported in the server log:

    ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: Error handling misfires: Couldn't store trigger 'AVAIL_DURATION_DOWN-712781' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface.
    org.quartz.JobPersistenceException: Couldn't store trigger 'AVAIL_DURATION_DOWN-712781' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface. [See nested exception: java.lang.IllegalArgumentException: Job class must implement the Job interface.]
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.storeTrigger(JobStoreSupport.java:1246)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.doUpdateOfMisfiredTrigger(JobStoreSupport.java:1014)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.recoverMisfiredJobs(JobStoreSupport.java:956)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.doRecoverMisfires(JobStoreSupport.java:3126)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:3887)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3907)
    Caused by: java.lang.IllegalArgumentException: Job class must implement the Job interface.
        at org.quartz.JobDetail.setJobClass(JobDetail.java:280)
        at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.selectJobDetail(StdJDBCDelegate.java:897)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.storeTrigger(JobStoreSupport.java:1202)
        ... 5 more


This error is also reported every four minutes from a different server at different times.

On server start, it is evident that an availability duration alert was scheduled and missed:

    DEBUG [org.rhq.enterprise.server.scheduler.EnhancedSchedulerImpl] Looks like repeating job [org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob:org.rhq.enterprise.server.scheduler.jobs.SavedSearchResultCountRecalculationJob] is already scheduled - removing it so it can be rescheduled...


Version-Release number of selected component (if applicable):
3.1.2 - Build: d2b7ee5:38ab852 -- Server Hotfix-08

How reproducible:
It is not clear how this issue first occurs. However, once it does, the error recurs every four minutes.
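To see which triggers are affected and how often the error repeats, the server log can be scanned for the first line of the error. The following is only a rough sketch: the class name `MisfireLogScan` and the helper `countByTrigger` are made up for illustration, and the pattern is based solely on the log lines quoted in this report.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RHQ or JBoss ON.
class MisfireLogScan {
    // Matches the ERROR line quoted in this report and captures the trigger name.
    // The follow-up JobPersistenceException line lacks the "MisfireHandler:" prefix,
    // so each occurrence is counted once.
    private static final Pattern MISFIRE = Pattern.compile(
        "MisfireHandler: Error handling misfires: Couldn't store trigger '([^']+)'");

    // Returns trigger name -> number of misfire errors, in order of first appearance.
    static Map<String, Integer> countByTrigger(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = MISFIRE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // With an argument, read that file (assumed to be UTF-8 text);
        // without one, fall back to a sample line shaped like the one in this report.
        List<String> lines = args.length > 0
            ? Files.readAllLines(Paths.get(args[0]))
            : Arrays.asList("ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: "
                + "Error handling misfires: Couldn't store trigger 'AVAIL_DURATION_DOWN-712781' "
                + "for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' "
                + "job:Job class must implement the Job interface.");
        countByTrigger(lines).forEach((trigger, n) -> System.out.println(trigger + " " + n));
    }
}
```

Run against a server.log path, this prints one line per trigger name with its error count, which should make it easier to tell whether one stale trigger or several are involved.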

Comment 1 Lukas Krejci 2014-07-15 21:39:16 UTC
Given that hotfixes are not allowed to touch the database, I believe this is a case of stale data in the database.

The org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob used to be a Quartz job class that was later transformed into an EJB timer.

Note that this change is NOT present in the release/jon3.1.x branch, just the hotfix branch.

The only explanation I have for the exception is that the job was kept in the database, but after the hotfix was applied, the job class could no longer carry out the job.
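The failure mode can be illustrated without Quartz on the classpath. In the sketch below, `Job` is a local stand-in for `org.quartz.Job`, and `OldAvailDurationJob`, `NewAvailDurationJob` and `JobClassCheck` are hypothetical names. The check is an assumption about the kind of validation behind the message in the stack trace (`JobDetail.setJobClass`), not a copy of the Quartz source.

```java
// Local stand-in for org.quartz.Job so the sketch compiles on its own.
interface Job {
    void execute();
}

// Hypothetical "before" shape: the class is a scheduler job.
class OldAvailDurationJob implements Job {
    public void execute() {
        // the availability duration condition would be evaluated here
    }
}

// Hypothetical "after" shape: the work is driven by an EJB timer instead,
// so the class no longer implements the job interface.
class NewAvailDurationJob {
    void onTimeout() {
        // the availability duration condition would be evaluated here
    }
}

class JobClassCheck {
    // Assumed shape of the validation: reject any class that is not a Job.
    static void requireJobClass(Class<?> jobClass) {
        if (jobClass == null) {
            throw new IllegalArgumentException("Job class cannot be null.");
        }
        if (!Job.class.isAssignableFrom(jobClass)) {
            throw new IllegalArgumentException("Job class must implement the Job interface.");
        }
    }

    // Returns "ok" or the validation message, for easy printing.
    static String check(Class<?> jobClass) {
        try {
            requireJobClass(jobClass);
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(check(OldAvailDurationJob.class));
        System.out.println(check(NewAvailDurationJob.class));
    }
}
```

A job row stored before the hotfix still names the class, so loading it after the hotfix would hit the second case each time the misfire handler runs.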

Comment 2 Lukas Krejci 2014-07-15 21:40:54 UTC
That said, I assume this exception is essentially harmless because the job is being run normally but as an EJB timer.

Comment 7 Lukas Krejci 2014-07-16 19:52:07 UTC
in master

commit 29d93c61270b35c9a8733fcf83d3a8dde78d907b
Author: Lukas Krejci <lkrejci>
Date:   Wed Jul 16 21:48:22 2014 +0200

    [BZ 1119331] Handle the jobs potentially left behind after conversion
    of the avail duration check job into an EJB time.

Comment 8 Lukas Krejci 2014-07-16 19:59:38 UTC
commit a90aae26a2574f74bcae1a7fab78e7164b759d1a
Author: Lukas Krejci <lkrejci>
Date:   Wed Jul 16 21:58:18 2014 +0200

    [BZ 1119331] Fix a typo in the DB upgrade step

Comment 12 Jay Shaughnessy 2014-07-17 13:43:04 UTC
*** Bug 1120445 has been marked as a duplicate of this bug. ***

Comment 16 Heiko W. Rupp 2014-07-31 15:27:31 UTC
Setting to MODIFIED, as this is in release/jon3.3.x.

Comment 17 Simeon Pinder 2014-07-31 15:51:33 UTC
Moving to ON_QA, as this is available to test with the brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993

Comment 18 Larry O'Leary 2014-08-05 23:07:35 UTC
Steps to reproduce: 

1.  Install and start JBoss EAP 6 standalone server.
2.  Install and start JBoss ON 3.1.2.GA system (pre server hotfix-05).
3.  Import JBoss EAP standalone server into inventory.
4.  Configure JBoss EAP standalone resource's connection settings.
5.  Create availability duration alert definition (stays down 10 minutes) on JBoss EAP server.

    *Name*: `Down too long`
    *Condition Type*: _Availability Duration_
    *Availability Duration*: _Stays Down_
    *Duration*: _10_ _minutes_
    
6.  Verify alert definition is working:

    #.  Shut down JBoss EAP server.
    #.  Wait 11 minutes.
    
    Alert should have been triggered.
    
    #.  Start JBoss EAP server.
    #.  Wait for availability to report as UP.
    
7.  Shut down JBoss EAP server and wait 2 minutes.
8.  Shut down JBoss ON system.
9.  Wait 8 minutes.
10. Apply server hotfix-05 or later.
11. Start JBoss ON system.


Actual results:
JBoss ON server.log reports the following ERROR every four minutes:

    2014-07-22 14:01:56,624 ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] MisfireHandler: Error handling misfires: Couldn't store trigger 'AVAIL_DURATION_DOWN-10004' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface.
    org.quartz.JobPersistenceException: Couldn't store trigger 'AVAIL_DURATION_DOWN-10004' for 'org.rhq.enterprise.server.scheduler.jobs.AlertAvailabilityDurationJob' job:Job class must implement the Job interface. [See nested exception: java.lang.IllegalArgumentException: Job class must implement the Job interface.]


Expected results:
No error.

Comment 19 Matt Mahoney 2014-08-06 16:01:48 UTC
Created attachment 924525: Alert