Bug 535204 (RHQ-1925) - Lots of quartz warnings in HA mode
Summary: Lots of quartz warnings in HA mode
Keywords:
Status: CLOSED WONTFIX
Alias: RHQ-1925
Product: RHQ Project
Classification: Other
Component: No Component
Version: 1.2
Hardware: All
OS: All
Priority: low
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: RHQ Project Maintainer
QA Contact:
URL: http://jira.rhq-project.org/browse/RH...
Whiteboard:
Depends On:
Blocks: rhq_triage
Reported: 2009-04-03 13:02 UTC by Heiko W. Rupp
Modified: 2010-08-25 15:39 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-08-25 15:39:36 UTC
Embargoed:



Description Heiko W. Rupp 2009-04-03 13:02:00 UTC
I am running two servers in HA mode (with no agent connected yet).

The server that was started first repeatedly logs the following message:

14:51:15,503 WARN  [JobStoreCMT] This scheduler instance (fedora9.home.pilhuhn.de1238762655424) is still active but was recovered by another instance in the cluster.  This may cause inconsistent behavior.

The logging of this message seems to have started around the time the 2nd server was starting its scheduler.
Note that the clocks are not completely in sync (they are a few seconds apart).
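
The instance name in the warning, fedora9.home.pilhuhn.de1238762655424, looks like a host name followed by an epoch-millisecond timestamp, which is the shape Quartz produces when the instance id is auto-generated. Assuming that is the case here, the following sketch splits such an id and converts the suffix to a UTC instant, which gives a rough idea of when that scheduler instance was started. The class and method names are hypothetical and not part of Quartz or RHQ.

```java
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: decodes an id of the assumed form hostName + epochMillis. */
public final class InstanceIdDecoder {
    // Assumption: the id ends in exactly 13 digits (epoch milliseconds).
    private static final Pattern ID = Pattern.compile("^(.*?)(\\d{13})$");

    private InstanceIdDecoder() {}

    /** Returns the part of the id before the 13-digit suffix. */
    public static String hostOf(String instanceId) {
        return match(instanceId).group(1);
    }

    /** Interprets the 13-digit suffix as milliseconds since the epoch. */
    public static Instant startedAt(String instanceId) {
        return Instant.ofEpochMilli(Long.parseLong(match(instanceId).group(2)));
    }

    private static Matcher match(String instanceId) {
        Matcher m = ID.matcher(instanceId);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not of the form host + epoch millis: " + instanceId);
        }
        return m;
    }

    public static void main(String[] args) {
        String id = "fedora9.home.pilhuhn.de1238762655424";
        System.out.println(hostOf(id));     // prints fedora9.home.pilhuhn.de
        System.out.println(startedAt(id));  // prints 2009-04-03T12:44:15.424Z
    }
}
```

For the id in this report the suffix decodes to 12:44:15 UTC on 2009-04-03, which would be seven minutes before the 14:51:15 log line if the server's local time was UTC+2 (an assumption, since the log line carries no time zone).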


Comment 1 Joseph Marques 2009-04-03 13:10:38 UTC
from my readings on this issue, i don't think this is fatal.  however, just to be on the safe side, do a little investigation for me.  try to figure out whether all servers are still round-robin'ing the clustered quartz jobs, and whether the ejb-timer-based jobs are still functioning on each server.

note: the quartz documentation notes that the clocks must be within 1 second of each other -- http://www.opensymphony.com/quartz/wikidocs/TutorialLesson11.html
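
The one-second requirement can be checked by comparing each server's clock against a common reference, for example the database time. The following is a minimal sketch of the comparison only; how the reference time is obtained is left out. The class name, method names and the MAX_SKEW constant are illustrative assumptions based on the note above and are not Quartz API.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: checks whether a local clock is close enough to a reference clock. */
public final class ClockSkewCheck {
    // Taken from the 1-second figure mentioned above; adjust as needed.
    public static final Duration MAX_SKEW = Duration.ofSeconds(1);

    private ClockSkewCheck() {}

    /** Absolute difference between the two clock readings. */
    public static Duration skew(Instant local, Instant reference) {
        return Duration.between(reference, local).abs();
    }

    /** True if the local reading is within maxSkew of the reference, in either direction. */
    public static boolean withinTolerance(Instant local, Instant reference, Duration maxSkew) {
        return skew(local, reference).compareTo(maxSkew) <= 0;
    }

    public static void main(String[] args) {
        Instant reference = Instant.parse("2009-04-03T12:51:15Z");
        // A node a few seconds off, as in the description, fails the check.
        Instant local = reference.plusSeconds(3);
        System.out.println(skew(local, reference));                       // prints PT3S
        System.out.println(withinTolerance(local, reference, MAX_SKEW));  // prints false
    }
}
```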

Comment 2 John Mazzitelli 2009-08-17 17:11:23 UTC
also saw this when the oracle database was killed, then the server was killed (the entire test environment was shut down unbeknownst to me).

I think this might have something to do with a stateful job that was in progress during that shutdown. Since stateful jobs can only run on a single box, if the system was killed and then restarted, that stateful job might get restarted on a different box from the one where it was originally running. This is just a guess, but I definitely started getting these warn messages when I restarted after the DB and server were abruptly killed.

Comment 3 John Mazzitelli 2009-08-17 17:13:53 UTC
just found out that the clock on one of my nodes is off by 12 hours. I bet that's the cause.
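
A clock that is 12 hours off would plausibly explain the warning. In a clustered job store each node periodically records a check-in timestamp taken from its own clock, and its peers treat the node as failed (and recover its jobs) when that timestamp looks too old by their clocks. The sketch below is a simplified model of such a rule, not Quartz's actual implementation; the class name, the 30-second interval and the 7.5-second tolerance are assumptions chosen for illustration.

```java
/** Simplified, hypothetical model of check-in based failure detection in a cluster. */
public final class CheckinModel {
    private CheckinModel() {}

    /**
     * Returns true if a peer whose last check-in was stamped lastCheckinMillis (peer's clock)
     * looks failed to an observer whose own clock currently reads observerNowMillis.
     */
    public static boolean looksFailed(long lastCheckinMillis, long observerNowMillis,
                                      long checkinIntervalMillis, long toleranceMillis) {
        return lastCheckinMillis + checkinIntervalMillis + toleranceMillis < observerNowMillis;
    }

    public static void main(String[] args) {
        long interval = 30_000L;       // assumed check-in interval
        long tolerance = 7_500L;       // assumed grace period
        long peerCheckin = 1_000_000L; // peer just checked in, by its own clock

        // Observer clock 1 second ahead of the peer: the peer looks alive.
        System.out.println(looksFailed(peerCheckin, peerCheckin + 1_000L, interval, tolerance));  // prints false

        // Observer clock 12 hours ahead: the same fresh check-in looks 12 hours stale.
        long twelveHours = 12L * 60 * 60 * 1000;
        System.out.println(looksFailed(peerCheckin, peerCheckin + twelveHours, interval, tolerance));  // prints true
    }
}
```

In this model the node with the correct clock is still running, yet it is "recovered" by the peer whose clock is ahead, which matches the wording of the JobStoreCMT warning in the description.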

Comment 4 Red Hat Bugzilla 2009-11-10 20:49:36 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1925


Comment 5 wes hayutin 2010-02-16 16:52:25 UTC
Temporarily adding the keyword "SubBug" so we can be sure we have accounted for all the bugs.

keyword:
new = Tracking + FutureFeature + SubBug

Comment 6 wes hayutin 2010-02-16 16:58:12 UTC
making sure we're not missing any bugs in rhq_triage

Comment 7 Corey Welton 2010-08-25 15:39:36 UTC
Closing per 25-Aug triage.

