Bug 681006 - Abnormally high cpu usage in taskomatic after upgrade to Satellite 5.4
Status: CLOSED ERRATA
Product: Red Hat Satellite 5
Classification: Red Hat
Component: Other
Version: 540
Hardware: All Linux
Priority: urgent
Severity: high
Assigned To: Jan Pazdziora
QA Contact: Pavel Novotny
Keywords: Regression
Depends On:
Blocks: sat54-errata
Reported: 2011-02-28 13:46 EST by Karl Abbott
Modified: 2011-05-23 15:07 EDT
Fixed In Version: quartz-1.8.4-1.el5sat
Doc Type: Bug Fix
Last Closed: 2011-03-21 11:25:44 EDT


Description Karl Abbott 2011-02-28 13:46:13 EST
Description of problem:

Customer repeatedly experiences very high CPU usage since upgrading to Satellite 5.4, stemming from taskomatic and, more notably, its Java process.

CPU usage has not gone down in 3 weeks of running Satellite 5.4.

Version-Release number of selected component (if applicable):

Satellite 5.4 -- latest packages.

Problem packages:

quartz-1.8.1
quartz-oracle-1.8.1

How reproducible:

100% at customer's site -- 3 to 4 other customers with this or similar problem. Not 100% sure yet if this is the same problem as those other customers -- working with fellow TAMs to figure that one out.

Steps to Reproduce:
1. Upgrade to Satellite 5.4
2. Run quartz and quartz-oracle at version 1.8.1
  
Actual results:

Unexplained CPU hike that won't die down.

Expected results:

Normal CPU usage across the board.

Additional info:

In this particular case, we started by gathering thread dumps out of the taskomatic java process and noticed that the CPU usage time was being dominated by the Quartz Scheduler around lines 280-300.
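The exact procedure used to tie the thread dumps to CPU usage is not recorded here. One common approach on Linux, sketched below under stated assumptions, is to read the ID of the busiest thread from a per-thread view such as `top -H` and then look up the thread whose `nid` field (the same ID in hexadecimal) matches in a `jstack`-style dump. The function name `find_thread_by_lwp` and the sample dump are hypothetical; the sample is hand-written for illustration and is not output from the customer's system.

```python
import re

# Header line of one thread in a HotSpot-style thread dump, for example:
# "name" prio=10 tid=0x... nid=0x1a2b runnable [0x...]
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)".*\bnid=(?P<nid>0x[0-9a-fA-F]+)')


def find_thread_by_lwp(dump_text, lwp):
    """Return (thread name, stack lines) for the thread whose nid equals
    the decimal thread ID `lwp`, or None if no thread matches."""
    name, stack, matched = None, [], False
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            if matched:
                break  # the next thread starts here, so the stack is complete
            matched = int(header.group("nid"), 16) == lwp
            if matched:
                name = header.group("name")
        elif matched and line.strip():
            stack.append(line.strip())
    return (name, stack) if name is not None else None


# Hand-written sample (hypothetical thread names, classes and addresses).
SAMPLE_DUMP = '''
"example-worker-1" prio=10 tid=0x00000000 nid=0x1a2a waiting on condition [0x00000000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at example.Worker.run(Worker.java:42)

"example-scheduler" prio=10 tid=0x00000000 nid=0x1a2b runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE
    at example.Scheduler.run(Scheduler.java:287)
'''

if __name__ == "__main__":
    # 6699 is the decimal form of 0x1a2b, as a per-thread CPU view would show it.
    print(find_thread_by_lwp(SAMPLE_DUMP, 6699))
```

Taking several dumps a few seconds apart and checking that the same thread sits in the same few lines each time helps distinguish a spinning thread from one that merely happened to be running when the dump was taken.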

Upon researching information on the Quartz Scheduler online, I found this bug:

https://jira.terracotta.org/jira/browse/QTZ-50

The regression described in that bug is a 100% CPU utilization problem in quartz that hits right at line 287, the very part of the Quartz Scheduler that is acting up on the customer's setup.

This bug was introduced in 1.8.1 and was fixed in 1.8.3. The latest version of quartz is 1.8.4 and so I rolled unsupported rpms for quartz and quartz-oracle of 1.8.4 based on the spec found in brew. Those packages can be found at:

http://people.redhat.com/kabbott/sat-quartz
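To check other systems for exposure, the version range above can be expressed as a small test. This is a minimal sketch that assumes every release from 1.8.1 up to, but not including, 1.8.3 is affected; the report only states where the bug was introduced and where it was fixed, so treating 1.8.2 as affected is an inference. The function names are hypothetical, and the input is expected in the `version` or `version-release` form used in this report (for example `1.8.1-3.el5sat`).

```python
INTRODUCED_IN = (1, 8, 1)  # first quartz version with the regression
FIXED_IN = (1, 8, 3)       # first quartz version with the fix


def parse_version(text):
    """Turn '1.8.1' or '1.8.1-3.el5sat' into a comparable tuple (1, 8, 1)."""
    upstream = text.split("-", 1)[0]  # drop the package release suffix, if any
    return tuple(int(part) for part in upstream.split("."))


def is_affected(version):
    """True if the version falls in the assumed affected range."""
    return INTRODUCED_IN <= parse_version(version) < FIXED_IN


if __name__ == "__main__":
    for candidate in ("1.8.1-3.el5sat", "1.8.4-1.el5sat"):
        print(candidate, is_affected(candidate))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings directly, where "1.8.10" would sort before "1.8.3".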

My customer was willing to test these packages with the understanding that they were unsupported test packages. After upgrading to quartz-1.8.4, the abnormally high CPU usage went away immediately.

I would like to ask that we rebase quartz to 1.8.4 as soon as possible, so that I can move my customer to a supported configuration that doesn't have the abnormally high CPU usage problem.
Comment 2 Jan Pazdziora 2011-03-04 06:05:46 EST
Rebased to quartz-1.8.4, tagged and built.
Comment 4 Xixi 2011-03-08 16:45:10 EST
This bug may impact rhn-search too as it uses the same quartz scheduler as taskomatic.
Comment 6 Pavel Novotny 2011-03-14 14:22:45 EDT
Verified.

Old package(s) (quartz-1.8.1-3.el5sat):
  Observed regular 100% CPU usage peaks caused by taskomatic Java processes. 

New package(s) (quartz-1.8.4-1.el5sat):
  The problem is gone; no unusually high CPU usage observed.
Comment 7 errata-xmlrpc 2011-03-21 11:25:44 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0367.html
