Bug 536444 - (RHQ-792) asynchronous updating of metric templates / schedules
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Monitoring
Version: 1.0.1
Hardware: All All
Priority: high  Severity: medium
Assigned To: Joseph Marques
QA Contact: Corey Welton
URL: http://jira.rhq-project.org/browse/RH...
Keywords: Improvement
Reported: 2008-09-06 02:12 EDT by Joseph Marques
Modified: 2010-08-12 12:51 EDT (History)
Fixed In Version: 2.4
Doc Type: Enhancement
Last Closed: 2010-08-12 12:51:43 EDT
Description Joseph Marques 2008-09-06 02:12:00 EDT
today, an update to a metric schedule requires that the corresponding agent be online.  while this doesn't seem that restrictive, it becomes problematic at large scales.  in particular, updating a metric template (which updates all schedules for all resources of the corresponding type) requires that the agents for all of those resources be online.

improvement: make schedule updates asynchronous.  using the UI should return immediately, whether or not the agent is online.  if the agent is online, it should get the update (relatively) immediately.  if the agent is offline, there should be logic by which the agent checks whether it needs to pick up updated schedules from the server.
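
A minimal sketch of the requested behavior, not the RHQ implementation. Every name here (Agent, ServerSchedule, Server, updateInterval, onAgentConnect) is hypothetical, and the use of a last-modified value to decide what an agent still needs is an assumption made for illustration. The server records the change and returns at once; an online agent is synced right away, an offline agent catches up when it reconnects.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: none of these types are RHQ classes. */
public class AsyncScheduleUpdateSketch {

    /** Stand-in for an agent and its local copy of the schedules. */
    static class Agent {
        boolean online;
        long lastSync; // server clock value at the last successful sync
        final Map<Integer, Long> intervals = new HashMap<>(); // schedule id -> interval (ms)
    }

    /** Server-side schedule plus the logical time it was last modified. */
    static class ServerSchedule {
        final int id;
        long intervalMs;
        long mtime;

        ServerSchedule(int id, long intervalMs, long mtime) {
            this.id = id;
            this.intervalMs = intervalMs;
            this.mtime = mtime;
        }
    }

    static class Server {
        final Map<Integer, ServerSchedule> schedules = new HashMap<>();
        long clock = 0; // logical clock keeps the example deterministic

        /** Records the change and returns at once; pushes only if the agent is up. */
        void updateInterval(int scheduleId, long intervalMs, Agent agent) {
            ServerSchedule s = schedules.get(scheduleId);
            s.intervalMs = intervalMs;
            s.mtime = ++clock;
            if (agent.online) {
                sync(agent);
            }
        }

        /** Called when an agent comes back: it picks up whatever changed while it was down. */
        void onAgentConnect(Agent agent) {
            agent.online = true;
            sync(agent);
        }

        /** Copies every schedule modified since the agent's last sync. */
        private void sync(Agent agent) {
            for (ServerSchedule s : schedules.values()) {
                if (s.mtime > agent.lastSync) {
                    agent.intervals.put(s.id, s.intervalMs);
                }
            }
            agent.lastSync = clock;
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        server.schedules.put(1, new ServerSchedule(1, 60_000L, 0L));

        Agent agent = new Agent(); // starts offline
        server.updateInterval(1, 31_000L, agent); // returns immediately
        System.out.println("while offline: " + agent.intervals.get(1));

        server.onAgentConnect(agent);
        System.out.println("after reconnect: " + agent.intervals.get(1));
    }
}
```

In this sketch the caller never blocks on the agent: the offline case simply leaves the agent's copy stale until onAgentConnect runs.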
Comment 1 Joseph Marques 2008-09-06 19:18:10 EDT
rev1373 - test to make sure the corresponding AgentClient is up, before interacting with it (use 2000ms timeout); 
on the crazy off-chance that the AgentClient is available but sending the report fails, catch Throwable to make sure the caller's request will continue to completion; 
update mtime's of resources whose MeasurementSchedules are being changed (directly on the resource, or indirectly through the metric template); 
while i was at it, improve the performance of the end-to-end flow for metric template updates by batching all ResourceMeasurementSchedulesRequest's for a single agent into a single remote method call and, more importantly, a single check against the availability of the corresponding AgentClient; 
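
A rough sketch of the flow described in rev1373, under stated assumptions: the AgentClient interface below is a hypothetical stand-in (its ping and updateSchedules methods are invented for this example and are not the real RHQ API), and requests are plain strings rather than ResourceMeasurementSchedulesRequest objects. It groups pending updates by agent, does one availability check per agent with a 2000ms timeout, sends one batched call, and catches Throwable so a failed send does not abort the caller's request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: the types here are hypothetical, not RHQ classes. */
public class BatchedSchedulePushSketch {

    /** Hypothetical remote handle to one agent. */
    interface AgentClient {
        boolean ping(long timeoutMillis);
        void updateSchedules(List<String> requests);
    }

    /** One pending change: the owning agent and an opaque request payload. */
    static class PendingUpdate {
        final String agentName;
        final String request;

        PendingUpdate(String agentName, String request) {
            this.agentName = agentName;
            this.request = request;
        }
    }

    /** Sends at most one batched call per agent; returns how many calls succeeded. */
    static int pushAll(List<PendingUpdate> updates, Map<String, AgentClient> clients) {
        // Group requests by agent so each agent is pinged and called only once.
        Map<String, List<String>> byAgent = new LinkedHashMap<>();
        for (PendingUpdate u : updates) {
            byAgent.computeIfAbsent(u.agentName, k -> new ArrayList<>()).add(u.request);
        }

        int remoteCalls = 0;
        for (Map.Entry<String, List<String>> e : byAgent.entrySet()) {
            AgentClient client = clients.get(e.getKey());
            try {
                if (client != null && client.ping(2000L)) {
                    client.updateSchedules(e.getValue());
                    remoteCalls++;
                }
            } catch (Throwable t) {
                // Deliberately broad, as in the comment above: a failed send must
                // not stop the caller's request. The agent syncs up later.
            }
        }
        return remoteCalls;
    }
}
```

Agents that are down or fail mid-send are simply skipped here; the sketch relies on them picking up the change the next time they connect.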
Comment 2 Joseph Marques 2008-09-06 19:52:12 EDT
test 1 - single resource, update schedule, agent online

1) go to some resource in inventory whose agent is up
2) navigate to monitor > configuration subtab
3) change some collection interval to something odd like 42s
4) go to that agent prompt and execute "inventory -x -e inv.dat"
5) then in a sep terminal execute "cat inv.dat | grep 42000 -c" and make sure the count is precisely 1
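
For anyone scripting the check in step 5: "grep 42000 -c" reports the number of lines that contain 42000. A small sketch of the same count follows. The layout of inv.dat is not shown in this report, so the sample lines are made up purely for illustration, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mirrors "grep -c" for a plain (non-regex) search string. */
public class ExportCheckSketch {

    /** Counts the lines that contain the given text at least once. */
    static long countLinesContaining(List<String> lines, String needle) {
        return lines.stream().filter(line -> line.contains(needle)).count();
    }

    public static void main(String[] args) {
        // Made-up sample lines; the real inv.dat format may look nothing like this.
        List<String> sample = Arrays.asList(
                "schedule-a interval=42000",
                "schedule-b interval=60000");
        System.out.println(countLinesContaining(sample, "42000"));
    }
}
```

For the real check, the lines would come from the exported file, for example via java.nio.file.Files.readAllLines.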

test 2 - single resource, disable schedule, agent online

1) go to some resource in inventory whose agent is up
2) navigate to monitor > configuration subtab
3) disable the collection interval that you previously marked with 42s
4) go to that agent prompt and execute "inventory -x -e inv.dat"
5) then in a sep terminal execute "cat inv.dat | grep 42000 -B 1" to make sure this schedule has been disabled agent-side
Comment 3 Joseph Marques 2008-09-06 19:52:23 EDT
test 3 - single resource, update schedule, agent offline

1) shut down the agent, then repeat steps 1-3 from test 1...but change the time to, say, 31 seconds
2) turn the agent back on and wait 30-60 seconds for the first inventory report to be sent (you can confirm when this happens if you tail the server log)
3) repeat steps 4 & 5 from test 1

test 4 - single resource, disable schedule, agent offline

1) shut down the agent, then repeat steps 1-3 from test 2...but disable the one you just set to 31 seconds
2) turn the agent back on and wait 30-60 seconds for the first inventory report to be sent (you can confirm when this happens if you tail the server log)
3) repeat steps 4 & 5 from test 2
Comment 4 Joseph Marques 2008-09-06 19:52:34 EDT
test 5 - multi-resource, update schedule, agent online

1) go to admin > monitoring defaults > choose some resource type (and make sure you count the number of resources of that type in your inventory, we'll call this value X)
2) change some collection interval for a single schedule to something odd like 53s
3) go to that agent prompt and execute "inventory -x -e inv.dat"
4) then in a sep terminal execute "cat inv.dat | grep 53000 -c" and make sure the count is precisely X

test 6 - multi-resource, disable schedule, agent online

1) go to admin > monitoring defaults > choose some resource type (and make sure you count the number of resources of that type in your inventory, we'll call this value X)
2) disable the collection interval that you previously marked with 53s
3) go to that agent prompt and execute "inventory -x -e inv.dat"
4) then in a sep terminal execute "cat inv.dat | grep 53000 -B 1" and make sure that all X entries are disabled now
Comment 5 Joseph Marques 2008-09-06 19:58:02 EDT
test 7 - multi-resource, update schedule, agent offline

1) shut down the agent, then repeat steps 1 & 2 from test 5, but change the time to, say, 67 seconds
2) turn the agent back on and wait 30-60 seconds for the first inventory report to be sent (you can confirm when this happens if you tail the server log) 
3) repeat steps 3 & 4 from test 5

test 8 - multi-resource, disable schedule, agent offline

1) shut down the agent, then repeat steps 1 & 2 from test 6, but disable the one you just set to 67 seconds
2) turn the agent back on and wait 30-60 seconds for the first inventory report to be sent (you can confirm when this happens if you tail the server log) 
3) repeat steps 3 & 4 from test 6
Comment 6 Joseph Marques 2008-09-06 20:33:55 EDT
rev1375 - scale back logging on the server-side for measurement schedule and metric template updates; 
Comment 7 Corey Welton 2008-09-19 14:04:57 EDT
QA Verified.
Comment 8 Red Hat Bugzilla 2009-11-10 16:17:00 EST
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-792
This bug is related to RHQ-1996
This bug is related to RHQ-2246
Comment 9 wes hayutin 2010-02-16 16:09:30 EST
Mass move to component = Monitoring
Comment 10 Joseph Marques 2010-07-01 01:20:47 EDT
This shouldn't be in assigned state.  It has been verified for nearly 9 months now.
Comment 11 Corey Welton 2010-07-01 08:58:41 EDT
QA Closing :)
Comment 12 Corey Welton 2010-08-12 12:51:43 EDT
Mass-closure of verified bugs against JON.
