535204 – (RHQ-1925) Lots of quartz warnings in HA mode

Summary: Lots of quartz warnings in HA mode

Keywords:
Status:	CLOSED WONTFIX
Alias:	RHQ-1925
Product:	RHQ Project
Classification:	Other
Component:	No Component
Sub Component:
Version:	1.2
Hardware:	All
OS:	All
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	RHQ Project Maintainer
QA Contact:
Docs Contact:
URL:	http://jira.rhq-project.org/browse/RH...
Whiteboard:
Depends On:
Blocks:	rhq_triage

Reported:	2009-04-03 13:02 UTC by Heiko W. Rupp
Modified:	2010-08-25 15:39 UTC (History)
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2010-08-25 15:39:36 UTC
Embargoed:

Description Heiko W. Rupp 2009-04-03 13:02:00 UTC

I am running two servers in HA mode (with no agent connected yet).

The server that was started first repeatedly logs the following message:

14:51:15,503 WARN  [JobStoreCMT] This scheduler instance (fedora9.home.pilhuhn.de1238762655424) is still active but was recovered by another instance in the cluster.  This may cause inconsistent behavior.

The logging of this message seems to have started around the time the 2nd server was starting its scheduler.
Note that the clocks of the two machines are not completely in sync (they are off by a small number of seconds).

Comment 1 Joseph Marques 2009-04-03 13:10:38 UTC

from my readings on this issue, i don't think this is fatal.  however, just to be on the safe side, do a little investigation for me.  try to figure out whether all servers are still round-robin'ing the clustered quartz jobs, and whether the ejb-timer-based jobs are still functioning on each server.

note: the quartz documentation notes that the clocks must be within 1 second of each other -- http://www.opensymphony.com/quartz/wikidocs/TutorialLesson11.html
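
to illustrate why clock skew could produce this warning, here is a minimal sketch of a check-in based failure detector. it is a simplified model, not Quartz's actual implementation: the class `ClusterCheckinSketch`, the method `looksFailed`, and the 15 second interval and 5 second grace period are all assumptions made up for the example. the idea is that each node records a check-in timestamp using its own clock, and a peer compares that timestamp against its own clock to decide whether the node has died.

```java
// Hypothetical model of cluster failure detection based on check-in timestamps.
// Not taken from Quartz or RHQ; all names and values here are illustrative.
final class ClusterCheckinSketch {

    // An observer treats a peer as failed when the peer's last check-in
    // (stamped with the peer's clock) is older than interval + grace
    // when measured against the observer's own clock.
    static boolean looksFailed(long peerLastCheckinMillis,
                               long observerNowMillis,
                               long checkinIntervalMillis,
                               long graceMillis) {
        long deadline = peerLastCheckinMillis + checkinIntervalMillis + graceMillis;
        return observerNowMillis > deadline;
    }

    public static void main(String[] args) {
        long interval = 15_000L;      // assumed check-in interval
        long grace = 5_000L;          // assumed grace period
        long peerCheckin = 1_000_000L; // peer checked in at this time (peer's clock)
        long trueElapsed = 1_000L;     // one real second has passed since then

        // Observer clock in sync with the peer: the peer looks alive.
        long inSyncNow = peerCheckin + trueElapsed;
        System.out.println("in sync:  failed=" + looksFailed(peerCheckin, inSyncNow, interval, grace));

        // Observer clock 12 hours ahead: the healthy peer looks long dead,
        // so the observer would try to recover it.
        long skew = 12L * 60 * 60 * 1000;
        long skewedNow = peerCheckin + trueElapsed + skew;
        System.out.println("12h skew: failed=" + looksFailed(peerCheckin, skewedNow, interval, grace));
    }
}
```

in this model a healthy node is declared failed as soon as the observer's clock runs ahead by more than the interval plus grace, and smaller skews eat into that margin. that would match the "still active but was recovered by another instance" wording of the warning.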

Comment 2 John Mazzitelli 2009-08-17 17:11:23 UTC

I also saw this when the Oracle database was killed and then the server was killed (the entire test environment was shut down, unbeknownst to me).

I think this might have something to do with a stateful job that was in progress during that shutdown. Since a stateful job can only run on a single box at a time, if the system was killed and then restarted, that stateful job might get restarted on a different box from the one where it was originally running. This is just a guess, but I definitely started getting these WARN messages when I restarted after the DB and server were abruptly killed.

Comment 3 John Mazzitelli 2009-08-17 17:13:53 UTC

I just found out that the clock on one of my nodes is off by 12 hours. I bet that's the cause.

Comment 4 Red Hat Bugzilla 2009-11-10 20:49:36 UTC

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1925

Comment 5 wes hayutin 2010-02-16 16:52:25 UTC

Temporarily adding the keyword "SubBug" so we can be sure we have accounted for all the bugs.

keyword:
new = Tracking + FutureFeature + SubBug

Comment 6 wes hayutin 2010-02-16 16:58:12 UTC

making sure we're not missing any bugs in rhq_triage

Comment 7 Corey Welton 2010-08-25 15:39:36 UTC

Closing per 25-Aug triage.
