This exceeds our acceptable duration of 10s for an RPC call.
Created attachment 523386 [details] patch
commit 50804e161fad368da955a45f8e6d89ae934c4e64
Author: Robert Buck <rbuck>
Date: 2011-09-15 10:03:58 -0400

[BZ 734599] Change notification of schedule updates to agents so it uses quartz, reducing the time to update a metric schedule on a compat group with 1,000 members from 14s to 1.5s.
branch: feature/performance
commit f93ffa9d47045e3026f3297b63ca2473bad36d73
Author: Robert Buck <rbuck>
Date: 2011-09-15 11:03:34 -0400

[BZ 734599] Change notification of schedule updates to agents so it uses quartz, reducing the time to update a metric schedule on a compat group with 1,000 members from 14s to 1.5s.
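For reviewers, the idea behind the commit can be sketched roughly as follows: the request thread only queues the agent notifications, and a background worker delivers them. This is a minimal illustration, not the patch itself. It uses a plain `ExecutorService` as a stand-in for the Quartz job, and every name in it (`DeferredAgentNotification`, `AgentNotifier`, `updateSchedules`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names are RHQ classes.
public class DeferredAgentNotification {

    // Stand-in for the slow per-agent remote call.
    interface AgentNotifier {
        void notifyAgent(int resourceId, long intervalMillis);
    }

    // Stand-in for the Quartz scheduler that runs the notification job.
    private final ExecutorService background = Executors.newSingleThreadExecutor();
    private final AgentNotifier notifier;

    DeferredAgentNotification(AgentNotifier notifier) {
        this.notifier = notifier;
    }

    // Called on the request thread. Returns as soon as the work is queued;
    // the latch reaches zero once every agent has been attempted.
    CountDownLatch updateSchedules(List<Integer> resourceIds, long intervalMillis) {
        CountDownLatch done = new CountDownLatch(resourceIds.size());
        List<Integer> snapshot = new ArrayList<>(resourceIds);
        background.submit(() -> {
            for (int id : snapshot) {
                try {
                    notifier.notifyAgent(id, intervalMillis);
                } catch (RuntimeException e) {
                    // A real job would log the failure and possibly retry.
                } finally {
                    done.countDown();
                }
            }
        });
        return done;
    }

    void shutdown() {
        background.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        DeferredAgentNotification service = new DeferredAgentNotification(
            (id, ms) -> System.out.println("notified resource " + id + " of interval " + ms));
        CountDownLatch done = service.updateSchedules(Arrays.asList(1, 2, 3), 30000L);
        System.out.println("request thread returned");
        done.await(5, TimeUnit.SECONDS);
        service.shutdown();
    }
}
```

The caller's latency then depends on queueing the work rather than on the number of group members, which is consistent with the improvement the commit message reports.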
Waiting for review
To reproduce:
1. Set up a dynagroup with 1000+ resources.
2. Update the metric schedules for multiple metrics, e.g. change the collection interval from the 30-minute default to 30 seconds.
3. The operation should now take less than 2 seconds.
I am documenting some issues that occurred while verifying this on a compat group with 1 item (RHQ agent):
attachment 1 [details] ... UI error after updating collection interval
attachment 2 [details] ... error in message center
attachment 3 [details] ... details of exception
Created attachment 525629 [details] ui error #1
Created attachment 525630 [details] ui error #2
Created attachment 525631 [details] details of error updating collection interval
I am unable to verify this. Bob ... can we talk about this?
Documenting the keystrokes to create a compatible group:
top-level menu ... Inventory
Compatible Groups (in lower left corner)
click button New
enter a name for the compat group ... I used 'test'
add 1 item ... I added the RHQ Agent
click Finish
Here are my test results:

BEFORE:
<schedules>
  <schedule>
    <schedule-id>10034</schedule-id>
    <name>NumberSuccessfulCommandsSent</name>
    <enabled>true</enabled>
    <interval>30000</interval>
  </schedule>
</schedules>

AFTER:
<schedules>
  <schedule>
    <schedule-id>10034</schedule-id>
    <name>NumberSuccessfulCommandsSent</name>
    <enabled>true</enabled>
    <interval>45000</interval>
  </schedule>
</schedules>

Agent Metrics:
AgentHomeDirectory: /some/path/to/agents/local/rhq-agent
AgentServerClockDifference: 3
AverageExecutionTimeReceived: 0
AverageExecutionTimeSent: 21
CurrentTime: Fri Sep 30 12:09:29 EDT 2011
JVMActiveThreads: 34
Memory - Heap: Used: 43.37 MB, Committed: 80.81 MB, Max: 119.34 MB
Memory - Non Heap: Used: 25.44 MB, Committed: 40.60 MB, Max: 117.44 MB
NumberAgentRestarts: 1
NumberCommandsActiveSent: 0
NumberCommandsInQueue: 0
NumberCommandsSpooled: 0
NumberFailedCommandsReceived: 0
NumberFailedCommandsSent: 0
NumberSuccessfulCommandsReceived: 3
NumberSuccessfulCommandsSent: 108
NumberTotalCommandsReceived: 3
NumberTotalCommandsSent: 108
ReasonForLastRestart: PROCESS_START
Sending: true
Uptime: 26.2 minutes (1573)
Version: 4.1.0-SNAPSHOT

Steps:
(1) create a new compat group, name it "test"
(2) select available resource "RHQ Agent"
(3) after clicking finish, select item, select monitoring, schedules
(4) then choose 1 metric from the list, and set the collection interval, e.g. "number of commands successfully sent", change 10 minutes to 30 seconds
Oh yeah, in the local agent, the commands I ran were:
> inventory --xml
> metrics
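The BEFORE/AFTER comparison above could also be checked mechanically rather than by eye. The sketch below assumes the agent output contains a `<schedules>` fragment shaped like the one quoted in the test results. The class and method names (`ScheduleIntervalCheck`, `intervalFor`) are made up for illustration and are not part of RHQ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, assuming the <schedules> fragment shape quoted above.
public class ScheduleIntervalCheck {

    // Returns the <interval> value (milliseconds) of the named schedule,
    // or -1 if the schedule or its interval is missing.
    static long intervalFor(String xml, String metricName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList schedules = doc.getElementsByTagName("schedule");
        for (int i = 0; i < schedules.getLength(); i++) {
            Element schedule = (Element) schedules.item(i);
            if (metricName.equals(text(schedule, "name"))) {
                String interval = text(schedule, "interval");
                return interval == null ? -1 : Long.parseLong(interval.trim());
            }
        }
        return -1;
    }

    // Text of the first child element with the given tag, or null if absent.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    private static String fragment(long interval) {
        return "<schedules><schedule>"
            + "<schedule-id>10034</schedule-id>"
            + "<name>NumberSuccessfulCommandsSent</name>"
            + "<enabled>true</enabled>"
            + "<interval>" + interval + "</interval>"
            + "</schedule></schedules>";
    }

    public static void main(String[] args) throws Exception {
        long before = intervalFor(fragment(30000), "NumberSuccessfulCommandsSent");
        long after = intervalFor(fragment(45000), "NumberSuccessfulCommandsSent");
        System.out.println(before + " -> " + after); // prints "30000 -> 45000"
    }
}
```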
I have re-verified this, and Jay has too. It works for both of us, even after performing a sterile build. This is either an environmental issue or an Oracle-only issue, though I suspect the former as there were no changes on the database side.
Tested with 10/03/2011 build on Postgres = PASS (same as Bob and Jay)
Tested with 10/03/2011 build on Oracle = FAIL
The issue is Oracle-specific.
[master b857cf3] fixes this by changing NotifyAgentsOfScheduleUpdatesJob to use:

LookupUtil.getSchedulerBean()

rather than:

jobExecutionContext.getScheduler()

to obtain the Quartz scheduler interface. We don't use the latter anywhere else in the code base. Presumably the issue with using it was that it returned a non-XA-aware scheduler, which messed things up when the same thread then tried to make an EJB call.
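To make the shape of that change concrete, here is a rough, self-contained sketch. The types below are illustrative stand-ins, not the Quartz or RHQ classes. `Lookup` plays the role of `LookupUtil`, and the job body (scheduling a follow-up job) is a hypothetical placeholder for whatever the real job uses the scheduler for.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-ins only: these are NOT the Quartz or RHQ types.
public class SchedulerLookupSketch {

    interface Scheduler {
        void scheduleJob(String jobName);
    }

    // Records what was scheduled so the difference is observable.
    static class RecordingScheduler implements Scheduler {
        final String label;
        final List<String> jobs = new ArrayList<>();

        RecordingScheduler(String label) {
            this.label = label;
        }

        @Override
        public void scheduleJob(String jobName) {
            jobs.add(jobName);
        }
    }

    // Stand-in for the context handed to a running job.
    static class JobExecutionContext {
        private final Scheduler scheduler;

        JobExecutionContext(Scheduler scheduler) {
            this.scheduler = scheduler;
        }

        Scheduler getScheduler() {
            return scheduler;
        }
    }

    // Stand-in for a central lookup such as LookupUtil.
    static class Lookup {
        static Scheduler managed;

        static Scheduler getSchedulerBean() {
            return managed;
        }
    }

    static class NotifyJob {
        void execute(JobExecutionContext context) {
            // Before the fix: Scheduler scheduler = context.getScheduler();
            Scheduler scheduler = Lookup.getSchedulerBean();
            scheduler.scheduleJob("follow-up"); // hypothetical use of the scheduler
        }
    }

    public static void main(String[] args) {
        RecordingScheduler fromContext = new RecordingScheduler("context");
        RecordingScheduler fromLookup = new RecordingScheduler("lookup");
        Lookup.managed = fromLookup;
        new NotifyJob().execute(new JobExecutionContext(fromContext));
        System.out.println(fromLookup.label + " scheduler received " + fromLookup.jobs);
    }
}
```

The point of the pattern is only that the job ignores the scheduler instance attached to its execution context and asks the same lookup the rest of the code base uses.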
Encountered an error while trying to verify this BZ, which I logged here: https://bugzilla.redhat.com/show_bug.cgi?id=743683
verified build #475
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE