Bug 536048 (RHQ-43)

Summary: Metric avg wrong for groups
Product: [Other] RHQ Project
Reporter: Heiko W. Rupp <hrupp>
Component: Monitoring
Assignee: Charles Crouch <ccrouch>
Status: CLOSED WONTFIX
QA Contact:
Severity: medium
Docs Contact:
Priority: high
Version: 0.1
CC: cwelton, hbrock, jshaughn
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
URL: http://jira.rhq-project.org/browse/RHQ-43
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-05-12 15:46:23 EDT
Type: ---
Bug Depends On:    
Bug Blocks: 678340    

Description Heiko W. Rupp 2008-03-06 14:52:00 EST
20:31:06 < ghinkle> there is another problem in metrics... the aggregation of group data
20:32:47 < ghinkle> if i have metric A scheduled for 5 minutes on one resource and 10 minutes for another and then look at them in a group where the buckets are say 10 minutes surrounding all data points, the average displayed will be 6.667 instead of 7.5

Comment 1 Heiko W. Rupp 2008-03-07 12:24:50 EST
The avg and sum values for groups are wrong.
In particular, it makes no sense that if one value is collected 10 times in a timespan and the other only once, the first one gets about 90% of the weight in the avg and the other about 10%.
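
A minimal sketch of the discrepancy from the IRC log above. The sample values are an assumption (the log does not state them): the resource on the 5-minute schedule is taken to report 5 twice within the 10-minute bucket, and the resource on the 10-minute schedule to report 10 once. The function and resource names are hypothetical and are not RHQ code. Pooling every raw data point reproduces the 6.667 figure, while averaging each resource first gives 7.5.

```python
from statistics import mean


def pooled_average(samples_by_resource):
    """Average over every raw data point in the bucket.

    Resources that report more often contribute more points and
    therefore carry more weight in the result.
    """
    all_points = [v for points in samples_by_resource.values() for v in points]
    return mean(all_points)


def per_resource_average(samples_by_resource):
    """Average each resource's points first, then average those means.

    Every resource carries equal weight, regardless of its schedule.
    """
    return mean(mean(points) for points in samples_by_resource.values() if points)


# Assumed data for one 10-minute bucket (values are illustrative only).
bucket = {
    "resource-5min": [5.0, 5.0],   # collected every 5 minutes: two points
    "resource-10min": [10.0],      # collected every 10 minutes: one point
}

print(round(pooled_average(bucket), 3))  # 6.667
print(per_resource_average(bucket))      # 7.5
```

With 10 points from one resource and 1 from another, the pooled form gives the first resource 10/11 of the weight, which is the roughly 90/10 split described in this comment.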

Comment 2 Charles Crouch 2008-03-11 10:32:52 EDT
Joe, could you investigate what's required to address this issue?
Comment 3 Charles Crouch 2008-04-02 11:05:22 EDT
(10:00:22 AM) ghinkle: ccrouch, i recommend we push rhq-43 out of the next release

We had similar problems in JON 1.4
Comment 4 John Mazzitelli 2008-04-02 11:08:11 EDT
I think the "workaround" would be to not have different schedules for the same metric across resources in the group.  I'm sure there are reasons for doing it otherwise, but it seems to me the more common use case would be to collect the same metric with the same collection schedule across resources.
Comment 5 Joseph Marques 2008-12-18 13:08:54 EST
mazz, in general that sounds like a good strategy, but it's nearly impossible to prevent that in practice.  the issue lies in the fact that with a ManyToMany relationship between Resources and ResourceGroups, a resource (and, thus, its metrics) may actually be in many, many different groups.  

pragmatically speaking, your suggested workaround would then scale up to requiring that all metrics of a particular type must be moved in tandem - i.e., disallow individual schedules and only allow updates at the metric template-level...because if we didn't do it that way the logic would simply be too complex to get correct across all current group permutations in the system.

i think this is fixable, just not fun / simple.
Comment 6 John Mazzitelli 2008-12-18 13:12:25 EST
maybe we check and if we see two or more different collection interval values, we just put a yellow bar at the top of the graphs and say, "these values may be inaccurate due to the differing intervals across resources"
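
A rough sketch of the check proposed in this comment, assuming the collection interval of the metric is available per group member as a plain mapping from resource name to interval in seconds. The function names and the data shape are hypothetical and do not come from the RHQ code base.

```python
WARNING_TEXT = (
    "These values may be inaccurate due to the differing intervals "
    "across resources"
)


def has_mixed_intervals(interval_by_resource):
    """Return True if the group members use two or more distinct intervals."""
    return len(set(interval_by_resource.values())) > 1


def group_graph_warning(interval_by_resource):
    """Return the warning text for the graph banner, or None if not needed."""
    if has_mixed_intervals(interval_by_resource):
        return WARNING_TEXT
    return None


# Illustrative group: one member on a 5-minute schedule, one on 10 minutes.
intervals = {"resource-a": 300, "resource-b": 600}
print(group_graph_warning(intervals))
```

This only flags the condition; it does not correct the aggregated values themselves.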
Comment 7 Red Hat Bugzilla 2009-11-10 16:09:19 EST
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-43
Comment 8 wes hayutin 2010-02-16 16:09:50 EST
Mass move to component = Monitoring