Bug 1501996 - NOR doesn't use 30 days' worth of metrics
Summary: NOR doesn't use 30 days' worth of metrics
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: C&U Capacity and Utilization
Version: 5.8.0
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: GA
Target Release: 5.10.0
Assignee: James Wong
QA Contact: Tasos Papaioannou
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-13 16:44 UTC by Tasos Papaioannou
Modified: 2019-02-07 23:00 UTC (History)

Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-07 23:00:32 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Links
System                 | ID             | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2019:0212 | 0       | None     | None   | None    | 2019-02-07 23:00:38 UTC

Description Tasos Papaioannou 2017-10-13 16:44:27 UTC
Description of problem:

The values shown in "Normal Operating Ranges (over 30 days)" only ever use 29 days of metric rollups.

Version-Release number of selected component (if applicable):

5.8.2.2

How reproducible:

100%


Steps to Reproduce:
1.) Capture at least 30 days' worth of metrics for a VM.
2.) View the Normal Operating Ranges pane in the VM summary page.
3.) Compare the values to the average over the last 30 daily rollup records in the database.

Actual results:

Normal Operating Ranges only uses the past 29 daily rollups.

Expected results:

Normal Operating Ranges uses the past 30 daily rollups.

Additional info:

Example:

1.) Today is 10/13, and 9/13 was 30 days ago.

vmdb_production=# select now(), now() - interval '30 days' as thirty_days_ago;
              now              |        thirty_days_ago        
-------------------------------+-------------------------------
 2017-10-13 16:24:07.530773+00 | 2017-09-13 16:24:07.530773+00
(1 row)

2.) The auto_test_services VM has daily rollups from the last 30 days:

vmdb_production=# select timestamp, resource_name from metric_rollups where resource_type='VmOrTemplate' and capture_interval_name='daily' and resource_name='auto_test_services' order by timestamp, resource_name;
      timestamp      |   resource_name    
---------------------+--------------------
 2017-09-13 00:00:00 | auto_test_services
 2017-09-14 00:00:00 | auto_test_services
 2017-09-15 00:00:00 | auto_test_services
[...]
 2017-10-11 00:00:00 | auto_test_services
 2017-10-12 00:00:00 | auto_test_services
(30 rows)

3.) The average of cpu_usagemhz_rate_average over the 30 daily rollups is 66.19:

vmdb_production=# select avg(cpu_usagemhz_rate_average) from metric_rollups where resource_type='VmOrTemplate' and capture_interval_name='daily' and resource_name='auto_test_services';
       avg        
------------------
 66.1854722222222
(1 row)

4.) The average of cpu_usagemhz_rate_average over the last 29 daily rollups is 66.16:

vmdb_production=# select avg(cpu_usagemhz_rate_average) from metric_rollups where resource_type='VmOrTemplate' and capture_interval_name='daily' and resource_name='auto_test_services' and timestamp>'2017-09-13 00:00:00';
       avg        
------------------
 66.1573850574713
(1 row)

5.) The Normal Operating Ranges pane claims to be over 30 days, but shows the 29-day average:

Normal Operating Ranges (over 30 days)
CPU
    Average | 66.16 MHz
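
One way to read the numbers above, offered as an unconfirmed hypothesis rather than a diagnosis: if the NOR query selects daily rollups whose timestamp falls between `now - 30 days` and `now`, the 09-13 midnight rollup sits just before the window start and the rollup for the current day does not exist yet, so only 29 rows qualify. The Ruby sketch below replays the dates from this example. `daily_rollups` and `rollups_in_window` are hypothetical helpers written for illustration, not ManageIQ code.

```ruby
# Sketch only: a hypothetical time-window filter, not the ManageIQ implementation.
SECONDS_PER_DAY = 24 * 60 * 60

# Build `count` daily rollup timestamps (midnight UTC), starting at first_day.
def daily_rollups(first_day, count)
  (0...count).map { |i| first_day + i * SECONDS_PER_DAY }
end

# Keep the rollups whose timestamp falls inside [now - days, now].
def rollups_in_window(rollups, now, days)
  window_start = now - days * SECONDS_PER_DAY
  rollups.select { |ts| ts >= window_start && ts <= now }
end

# Dates taken from the example: rollups for 2017-09-13 .. 2017-10-12,
# and "now" as returned by the first query.
NOR_ROLLUPS = daily_rollups(Time.utc(2017, 9, 13), 30)
NOR_NOW     = Time.utc(2017, 10, 13, 16, 24, 7)

selected = rollups_in_window(NOR_ROLLUPS, NOR_NOW, 30)
puts "rollups available: #{NOR_ROLLUPS.size}"
puts "rollups in window: #{selected.size}"
puts "oldest selected:   #{selected.first}"
```

With these inputs the window keeps 29 of the 30 rollups, and the oldest one kept is 2017-09-14, which matches the `timestamp>'2017-09-13 00:00:00'` filter used in step 4.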

Comment 2 James Wong 2018-02-28 22:39:02 UTC
Tasos,

Any chance you have a setup that can reproduce this issue?

I am not able to reproduce on my end: 

I have a DB with more than 30 days' worth of data. However, the data is older, so I hacked the code a bit to force the query to run from the last available daily rollup. It does return 30 days of data.

`perfs` is the query object that will be used to calculate the NOR.


(byebug) perfs.all.collect(&:timestamp)
  MetricRollup Load (5.1ms)  SELECT "metric_rollups".* FROM "metric_rollups" WHERE "metric_rollups"."time_profile_id" = 10000000000001 AND "metric_rollups"."capture_interval_name" = $1 AND ("metric_rollups"."timestamp" BETWEEN $2 AND $3) AND "metric_rollups"."resource_type" = 'VmOrTemplate' AND "metric_rollups"."resource_id" = 10000000000168 ORDER BY timestamp  [["capture_interval_name", "daily"], ["timestamp", "2017-10-18 03:00:00"], ["timestamp", "2017-11-17 03:00:00"]]
  MetricRollup Inst Including Associations (1.0ms - 30rows)
[Thu, 19 Oct 2017 00:00:00 UTC +00:00, Fri, 20 Oct 2017 00:00:00 UTC +00:00, Sat, 21 Oct 2017 00:00:00 UTC +00:00, Sun, 22 Oct 2017 00:00:00 UTC +00:00, Mon, 23 Oct 2017 00:00:00 UTC +00:00, Tue, 24 Oct 2017 00:00:00 UTC +00:00, Wed, 25 Oct 2017 00:00:00 UTC +00:00, Thu, 26 Oct 2017 00:00:00 UTC +00:00, Fri, 27 Oct 2017 00:00:00 UTC +00:00, Sat, 28 Oct 2017 00:00:00 UTC +00:00, Sun, 29 Oct 2017 00:00:00 UTC +00:00, Mon, 30 Oct 2017 00:00:00 UTC +00:00, Tue, 31 Oct 2017 00:00:00 UTC +00:00, Wed, 01 Nov 2017 00:00:00 UTC +00:00, Thu, 02 Nov 2017 00:00:00 UTC +00:00, Fri, 03 Nov 2017 00:00:00 UTC +00:00, Sat, 04 Nov 2017 00:00:00 UTC +00:00, Sun, 05 Nov 2017 00:00:00 UTC +00:00, Mon, 06 Nov 2017 00:00:00 UTC +00:00, Tue, 07 Nov 2017 00:00:00 UTC +00:00, Wed, 08 Nov 2017 00:00:00 UTC +00:00, Thu, 09 Nov 2017 00:00:00 UTC +00:00, Fri, 10 Nov 2017 00:00:00 UTC +00:00, Sat, 11 Nov 2017 00:00:00 UTC +00:00, Sun, 12 Nov 2017 00:00:00 UTC +00:00, Mon, 13 Nov 2017 00:00:00 UTC +00:00, Tue, 14 Nov 2017 00:00:00 UTC +00:00, Wed, 15 Nov 2017 00:00:00 UTC +00:00, Thu, 16 Nov 2017 00:00:00 UTC +00:00, Fri, 17 Nov 2017 00:00:00 UTC +00:00]
(byebug) perfs.all.collect(&:timestamp).count
  MetricRollup Load (4.7ms)  SELECT "metric_rollups".* FROM "metric_rollups" WHERE "metric_rollups"."time_profile_id" = 10000000000001 AND "metric_rollups"."capture_interval_name" = $1 AND ("metric_rollups"."timestamp" BETWEEN $2 AND $3) AND "metric_rollups"."resource_type" = 'VmOrTemplate' AND "metric_rollups"."resource_id" = 10000000000168 ORDER BY timestamp  [["capture_interval_name", "daily"], ["timestamp", "2017-10-18 03:00:00"], ["timestamp", "2017-11-17 03:00:00"]]
  MetricRollup Inst Including Associations (0.7ms - 30rows)
30

Thanks,
James
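
A note on the window in the query log above, as a sketch under assumptions rather than a statement about the actual code path: the `BETWEEN` range runs from 2017-10-18 03:00 to 2017-11-17 03:00, so it contains 30 midnight timestamps (10-19 through 11-17), and all 30 exist because the window was forced to end at the last available daily rollup. If the rollup for the final day were not present, the same window would return 29 rows. `midnights_between` is a hypothetical helper.

```ruby
# Sketch only: counts midnight-UTC daily timestamps inside an inclusive
# [from, to] range, mirroring SQL BETWEEN. Not ManageIQ code.
DAY_SECONDS = 24 * 60 * 60

def midnights_between(from, to)
  ts = Time.utc(from.year, from.month, from.day)
  ts += DAY_SECONDS if ts < from # first midnight at or after `from`
  result = []
  while ts <= to
    result << ts
    ts += DAY_SECONDS
  end
  result
end

# Window bounds copied from the query log.
slots = midnights_between(Time.utc(2017, 10, 18, 3), Time.utc(2017, 11, 17, 3))
puts "daily slots in window: #{slots.size}"

# Assumption for illustration: the rollup for the last day is not written yet.
without_last_day = slots.reject { |ts| ts == slots.last }
puts "rows if last day is missing: #{without_last_day.size}"
```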

Comment 3 Tasos Papaioannou 2018-03-14 18:19:35 UTC
Hi James,

I apologize for the delay; I was on leave for several weeks. I don't have a reproducer at the moment, but I can work on setting one up and let you know when it's ready.

-Tasos

Comment 5 Tasos Papaioannou 2018-04-16 19:13:04 UTC
Appliance is:

10.8.197.172
web UI: admin/smartvm
SSH:    root/smartvm

Under the rhv41 provider, the cu-24x7 VM shows:

Memory
Max     | 1.36 GB
High    | 1.2 GB
Average | 953.28 MB
Low     | 680.06 MB

Looking in the database, we see 31 consecutive daily rollups, through yesterday:

****
vmdb_production=# select now();
              now              
-------------------------------
 2018-04-16 15:04:17.760951-04
(1 row)

vmdb_production=# select timestamp, derived_memory_used from metric_rollups where capture_interval_name='daily' and resource_type='VmOrTemplate' and resource_name='cu-24x7' and parent_ems_id=6 order by timestamp desc;
      timestamp      | derived_memory_used 
---------------------+---------------------
 2018-04-15 00:00:00 |             1392.64
 2018-04-14 00:00:00 |             1351.68
 2018-04-13 00:00:00 |    1340.49185185185
 2018-04-12 00:00:00 |             1310.72
 2018-04-11 00:00:00 |    1274.44385185185
 2018-04-10 00:00:00 |    1244.54874074074
 2018-04-09 00:00:00 |    1214.94755555556
 2018-04-08 00:00:00 |    1184.05688888889
 2018-04-07 00:00:00 |    1147.05066666667
 2018-04-06 00:00:00 |    1113.94133333333
 2018-04-05 00:00:00 |    1085.39052679495
 2018-04-04 00:00:00 |    1055.01392592593
 2018-04-03 00:00:00 |    1023.97155555556
 2018-04-02 00:00:00 |    984.101925925926
 2018-04-01 00:00:00 |    953.239703703704
 2018-03-31 00:00:00 |    922.955851851852
 2018-03-30 00:00:00 |    892.378074074074
 2018-03-29 00:00:00 |              860.16
 2018-03-28 00:00:00 |    820.954074074074
 2018-03-27 00:00:00 |    790.641777777778
 2018-03-26 00:00:00 |    761.486222222222
 2018-03-25 00:00:00 |    730.396444444445
 2018-03-24 00:00:00 |              696.32
 2018-03-23 00:00:00 |    660.432592592593
 2018-03-22 00:00:00 |    630.878814814815
 2018-03-21 00:00:00 |    600.357925925926
 2018-03-20 00:00:00 |     570.30162962963
 2018-03-19 00:00:00 |              532.48
 2018-03-18 00:00:00 |    499.190518518519
 2018-03-17 00:00:00 |    468.840296296297
 2018-03-16 00:00:00 |    438.158222222222
(31 rows)
****

The average of derived_memory_used from the last 30 rollups doesn't match what's in the web UI:

****
vmdb_production=# select avg(derived_memory_used) from metric_rollups where capture_interval_name='daily' and resource_type='VmOrTemplate' and resource_name='cu-24x7' and parent_ems_id=6 and timestamp >= '2018-03-17 00:00:00';
       avg        
------------------
 937.133758300572
(1 row)
****

The average from the last 29 rollups does match:

****
vmdb_production=# select avg(derived_memory_used) from metric_rollups where capture_interval_name='daily' and resource_type='VmOrTemplate' and resource_name='cu-24x7' and parent_ems_id=6 and timestamp >= '2018-03-18 00:00:00';
       avg        
------------------
 953.281808714513
(1 row)
****
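
The same comparison can be repeated outside the database. A minimal Ruby sketch, using the 31 `derived_memory_used` values exactly as listed above (newest first); `average_of_latest` is a hypothetical helper introduced here for illustration.

```ruby
# derived_memory_used for cu-24x7, newest (2018-04-15) to oldest (2018-03-16),
# copied from the query output in this comment.
DERIVED_MEMORY_USED = [
  1392.64, 1351.68, 1340.49185185185, 1310.72, 1274.44385185185,
  1244.54874074074, 1214.94755555556, 1184.05688888889, 1147.05066666667,
  1113.94133333333, 1085.39052679495, 1055.01392592593, 1023.97155555556,
  984.101925925926, 953.239703703704, 922.955851851852, 892.378074074074,
  860.16, 820.954074074074, 790.641777777778, 761.486222222222,
  730.396444444445, 696.32, 660.432592592593, 630.878814814815,
  600.357925925926, 570.30162962963, 532.48, 499.190518518519,
  468.840296296297, 438.158222222222
].freeze

# Average of the `count` most recent daily rollups.
def average_of_latest(values, count)
  latest = values.first(count)
  latest.sum / latest.size
end

puts format("last 30 rollups: %.2f MB", average_of_latest(DERIVED_MEMORY_USED, 30))
puts format("last 29 rollups: %.2f MB", average_of_latest(DERIVED_MEMORY_USED, 29))
```

The 29-rollup average rounds to 953.28, the figure shown in the web UI, while the 30-rollup average rounds to 937.13.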

Comment 7 CFME Bot 2018-06-04 18:11:27 UTC
New commit detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/f08b0cdc1d6c99e8e7b158184673e4389409feed
commit f08b0cdc1d6c99e8e7b158184673e4389409feed
Author:     James Wong <jwong>
AuthorDate: Wed May  2 09:03:10 2018 -0400
Commit:     James Wong <jwong>
CommitDate: Wed May  2 09:03:10 2018 -0400

    NOR covers 30 days

    fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1501996

 app/models/metric/long_term_averages.rb | 3 +-
 app/models/vm_or_template/right_sizing.rb | 12 +-
 spec/models/metric_spec.rb | 5 +-
 spec/models/vm_or_template_spec.rb | 6 +
 4 files changed, 19 insertions(+), 7 deletions(-)

Comment 8 Tasos Papaioannou 2018-07-11 14:58:35 UTC
Verified on 5.10.0.3.

Comment 9 errata-xmlrpc 2019-02-07 23:00:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0212

