Description of problem:
In order to improve the granularity of the monitoring used for the System Dashboard, we would like to lower the sampling interval from 1 minute.

Related bug: https://bugzilla.redhat.com/show_bug.cgi?id=1306626

In order to lower the interval, a change to the engine heartbeat is also required.

Steps to Reproduce:
1. Install engine+dwh
2. In debug mode, verify that the time sampling takes is less than the interval.
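As a rough illustration of the check in step 2, the sketch below times a single sampling pass and compares it with the configured interval. The `sample_once` callable is a hypothetical stand-in for whatever performs one collection pass; it is not part of ovirt-engine-dwh.

```python
import time


def sampling_fits_interval(sample_once, interval_seconds):
    """Time one sampling pass and report whether it finished within the interval.

    sample_once is a hypothetical callable standing in for one collection pass.
    Returns a (fits, elapsed_seconds) tuple.
    """
    start = time.monotonic()
    sample_once()
    elapsed = time.monotonic() - start
    return elapsed < interval_seconds, elapsed


if __name__ == "__main__":
    # A no-op pass trivially fits inside a 20-second interval.
    fits, elapsed = sampling_fits_interval(lambda: None, 20)
    print(f"fits={fits} elapsed={elapsed:.6f}s")
```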
What is the new required collection interval? 20 or 15 seconds?
Again, I'm not sure that's needed. Leaving needinfo on Dary. If it is changed, we must make sure cfme isn't broken in any way.
We need to lower the interval for better accuracy in the dashboards. AFAIK, cfme collects our samples and divides them by 4 in order to match the vmware 20-second interval. So we need to decide whether to align with vmware at 20 seconds or get better accuracy at 15 seconds. A fix for cfme will be required in both cases.
So what happens if the released cfme works with 4.0? Will it just fail? We should test that carefully, and then decide whether to change it. Anyway, I wouldn't do that for 4.0, only for 4.1. Let's discuss next week.
Please look into changing this in 4.0 without breaking CFME collection. This is critical to have valuable monitoring info.
2 options for implementation:
1. Add the interval between samples to History_configurations, for cfme ease of use.
2. Have cfme read the minutes in status directly from the db. Each sample holds the "minutes in status" column, which represents the time between samples.

From the dwh perspective, the 2nd option is preferred, since it does not involve changes to the dwh. An update to ovirt-metrics in cfme is required in both cases. Please update on how to proceed with this.
(In reply to Shirly Radco from comment #6)
> 2 options for implementation:
> 1. Adding to History_configurations the Interval between samples for cfme
> ease of use.
> 2. Have cfme read the minutes in status directly from the db - each sample
> holds the "minutes in status" column that represents the time between
> samples.
>

Can you give an example of how option #2 looks today and how it will look with a 20-second interval?
In ovirt-engine-dwhd.conf.in the default is set to:

# Samples Collection Interleave in Seconds
DWH_SAMPLING=60

Currently, each record in the samples table has "minutes_in_status"; for the default setup it is equal to 1.00 (1 minute). For 20 seconds this will be 0.33.
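To make the arithmetic concrete, here is a minimal sketch of how the stored value relates to DWH_SAMPLING and what a consumer such as cfme would recover from it. It assumes the column keeps two decimal places, as the 1.00 and 0.33 figures suggest; the function names are illustrative only and do not come from the dwh or ovirt-metrics code.

```python
from decimal import Decimal


def minutes_in_status(sampling_seconds):
    """Value stored per sample, assuming the column keeps two decimal places."""
    return (Decimal(sampling_seconds) / Decimal(60)).quantize(Decimal("0.01"))


def interval_seconds(minutes):
    """Interval a consumer would recover from the stored minutes value."""
    return minutes * 60


if __name__ == "__main__":
    for sampling in (60, 20, 15):
        stored = minutes_in_status(sampling)
        print(sampling, stored, interval_seconds(stored))
```

With a 20-second interval the stored 0.33 converts back to 19.80 seconds rather than 20, so a consumer would need to round, while 60 and 15 seconds convert back exactly.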
When lowering the sampling interval, we will change this to seconds_in_status for a more accurate calculation.
(In reply to Shirly Radco from comment #9)
> When lowering the sampling interval we will change this to seconds_in_status
> for more accurate calculation.

So you'll have both? CFME also works with older versions.
I can maintain a calculated column of 'minutes_in_status'.
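One way to picture a calculated 'minutes_in_status' is as a value derived from the stored seconds on read, so that older consumers keep seeing minutes. The sketch below models that with a hypothetical `Sample` record; the real change would live in the dwh schema or views, and nothing here reflects the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical sample record; only seconds_in_status is stored."""

    seconds_in_status: int

    @property
    def minutes_in_status(self):
        # Derived on read, so it always agrees with the stored seconds.
        return self.seconds_in_status / 60


if __name__ == "__main__":
    for seconds in (60, 20, 15):
        sample = Sample(seconds_in_status=seconds)
        print(seconds, round(sample.minutes_in_status, 2))
```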
Verified in ovirt-engine-dwh-4.0.2-1.el7ev.noarch
*** Bug 1108144 has been marked as a duplicate of this bug. ***