Bug 734599 - it takes 14 seconds to update a metric schedule on a compat group with 1,000 members
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Core UI
Version: 4.0.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Robert Buck
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: jon3 jon30-perf rhq-gui-timeouts
Reported: 2011-08-30 21:25 UTC by Ian Springer
Modified: 2013-08-06 00:40 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-02-07 19:20:21 UTC
Embargoed:


Attachments
patch (11.85 KB, patch)
2011-09-15 14:01 UTC, Robert Buck
ui error #1 (32.73 KB, image/png)
2011-09-29 19:28 UTC, Mike Foley
ui error #2 (76.22 KB, image/png)
2011-09-29 19:29 UTC, Mike Foley
details of error updating collection interval (6.82 KB, text/plain)
2011-09-29 19:29 UTC, Mike Foley

Description Ian Springer 2011-08-30 21:25:04 UTC
This exceeds our acceptable duration of 10s for an RPC call.

Comment 1 Robert Buck 2011-09-15 14:01:52 UTC
Created attachment 523386
patch

Comment 2 Robert Buck 2011-09-15 14:04:29 UTC
commit 50804e161fad368da955a45f8e6d89ae934c4e64
Author: Robert Buck <rbuck>
Date:   2011-09-15 10:03:58 -0400

  [BZ 734599] Change notification of schedule updates to agents so it uses quartz, reducing the time to update a metric schedule on a compat group with 1,000 members from 14s to 1.5s.
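
The speedup described in the commit message comes from moving agent notification off the calling thread. The following is a minimal sketch of that general pattern only, not the code from the patch: it assumes a plain `java.util.concurrent` executor as a stand-in for the Quartz job, and the class name, `notifyAgent`, `updateSchedules`, and the 1 ms simulated delay are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DeferredAgentNotification {
    // Records which (hypothetical) resource ids have been notified.
    static final List<Integer> NOTIFIED = new CopyOnWriteArrayList<>();

    // Hypothetical stand-in for the slow per-agent remote call.
    static void notifyAgent(int resourceId) {
        try {
            Thread.sleep(1); // simulated network latency (assumption)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        NOTIFIED.add(resourceId);
    }

    // Queues the notifications on a background executor and returns at once.
    // In the real fix a Quartz job plays the role of this executor.
    static CompletableFuture<Void> updateSchedules(List<Integer> resourceIds, Executor background) {
        // Persisting the new collection interval would happen here, synchronously.
        List<Integer> snapshot = new ArrayList<>(resourceIds);
        return CompletableFuture.runAsync(
                () -> snapshot.forEach(DeferredAgentNotification::notifyAgent), background);
    }

    public static void main(String[] args) {
        ExecutorService background = Executors.newSingleThreadExecutor();
        List<Integer> group = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            group.add(i);
        }

        long start = System.nanoTime();
        CompletableFuture<Void> pending = updateSchedules(group, background);
        long callerMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        pending.join(); // wait here only so the demo can report the final count
        background.shutdown();
        System.out.println("caller returned after ~" + callerMillis
                + " ms; agents notified: " + NOTIFIED.size());
    }
}
```

The caller's elapsed time covers only the queuing step, while the per-agent work continues in the background. The trade-off is that the caller no longer learns about notification failures directly.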

Comment 3 Robert Buck 2011-09-15 15:04:42 UTC
branch: feature/performance
commit f93ffa9d47045e3026f3297b63ca2473bad36d73
Author: Robert Buck <rbuck>
Date:   2011-09-15 11:03:34 -0400

    [BZ 734599] Change notification of schedule updates to agents so it uses quartz, reducing the time to update a metric schedule on a compat group with 1,000 members from 14s to 1.5s.

Comment 4 Charles Crouch 2011-09-22 12:56:58 UTC
Waiting for review

Comment 5 Robert Buck 2011-09-29 11:41:49 UTC
To reproduce:

1. Set up a dynagroup having 1000+ resources.
2. Update the metric schedules for multiple metrics, e.g. change the collection interval from the 30-minute default to 30 seconds.
3. The operation should now take less than 2 seconds.

Comment 6 Mike Foley 2011-09-29 19:28:16 UTC
i am documenting some issues that occurred while verifying this:

on a compat group with 1 item (RHQ agent)...

attachment 1 ... UI error after updating collection interval
attachment 2 ... error in message center
attachment 3 ... details of exception

Comment 7 Mike Foley 2011-09-29 19:28:49 UTC
Created attachment 525629
ui error #1

Comment 8 Mike Foley 2011-09-29 19:29:16 UTC
Created attachment 525630
ui error #2

Comment 9 Mike Foley 2011-09-29 19:29:57 UTC
Created attachment 525631
details of error updating collection interval

Comment 10 Mike Foley 2011-09-29 19:30:51 UTC
i am unable to verify this.  

bob ... can we talk about this?

Comment 11 Mike Foley 2011-09-29 20:23:23 UTC
Documenting the keystrokes to create a compatible group:


top-level menu ....Inventory
Compatible Groups  (in lower left corner)
click button New
enter a name for the compat group ... i used 'test'
add 1 item  ... i added the RHQ Agent
click finish

Comment 12 Robert Buck 2011-09-30 16:13:20 UTC
Here are my test results:

BEFORE:
            <schedules>
               <schedule>
                  <schedule-id>10034</schedule-id>
                  <name>NumberSuccessfulCommandsSent</name>
                  <enabled>true</enabled>
                  <interval>30000</interval>
               </schedule>
            </schedules>

AFTER:
            <schedules>
               <schedule>
                  <schedule-id>10034</schedule-id>
                  <name>NumberSuccessfulCommandsSent</name>
                  <enabled>true</enabled>
                  <interval>45000</interval>
               </schedule>
            </schedules>

Agent Metrics:
              AgentHomeDirectory: /some/path/to/agents/local/rhq-agent
      AgentServerClockDifference: 3
    AverageExecutionTimeReceived: 0
        AverageExecutionTimeSent: 21
                     CurrentTime: Fri Sep 30 12:09:29 EDT 2011
                JVMActiveThreads: 34
                   Memory - Heap: Used: 43.37 MB, Committed: 80.81 MB, Max: 119.34 MB
               Memory - Non Heap: Used: 25.44 MB, Committed: 40.60 MB, Max: 117.44 MB
             NumberAgentRestarts: 1
        NumberCommandsActiveSent: 0
           NumberCommandsInQueue: 0
           NumberCommandsSpooled: 0
    NumberFailedCommandsReceived: 0
        NumberFailedCommandsSent: 0
NumberSuccessfulCommandsReceived: 3
    NumberSuccessfulCommandsSent: 108
     NumberTotalCommandsReceived: 3
         NumberTotalCommandsSent: 108
            ReasonForLastRestart: PROCESS_START
                         Sending: true
                          Uptime: 26.2 minutes (1573)
                         Version: 4.1.0-SNAPSHOT

Steps:
(1) create a new compat group, name it "test"
(2) select available resource "RHQ Agent"
(3) after clicking finish, select item, select monitoring, schedules
(4) then choose 1 metric from the list, and set the collection interval
    e.g. "number of commands successfully sent", change 10 minutes to 30 seconds

Comment 13 Robert Buck 2011-09-30 16:14:28 UTC
Oh yeah, in the local agent, the commands I ran were:

> inventory --xml
> metrics

Comment 14 Robert Buck 2011-10-03 12:11:35 UTC
I have re-verified this, and so has Jay. It works for both of us, even after performing a sterile build. This is either an environmental issue or an Oracle-only issue, though I suspect the former, as there were no changes on the database side.

Comment 15 Mike Foley 2011-10-03 14:57:22 UTC
tested with 10/03/2011 build on Postgres = PASS (same as Bob and Jay)
tested with 10/03/2011 build on Oracle   = FAIL

issue is Oracle-specific.

Comment 16 Ian Springer 2011-10-05 15:14:57 UTC
[master b857cf3] fixes this by changing NotifyAgentsOfScheduleUpdatesJob to use:

LookupUtil.getSchedulerBean()

rather than:

jobExecutionContext.getScheduler()

to obtain the Quartz scheduler interface. We don't use the latter anywhere else in the code base. Presumably the issue with using it was that it returned a non-XA-aware scheduler, which messed things up when the same thread then tried to make an EJB call.

Comment 17 Mike Foley 2011-10-05 18:00:10 UTC
encountered an error while trying to verify this BZ

which i logged here:  https://bugzilla.redhat.com/show_bug.cgi?id=743683

Comment 18 Mike Foley 2011-10-05 20:08:55 UTC
verified build #475

Comment 19 Mike Foley 2012-02-07 19:20:21 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE

