Is this a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1038869 ? In that ticket we are seeing the "wrong" value for total CPUs in various places. However, the backend database is correct. One issue there is in the UI where the summary and other places report vcpus instead of total cores. In addition, there is a problem with chargeback using the wrong value.
One interesting thing I found in that other ticket is that our VMware appliance is configured with 4 cpus @ 1 core each = 4 logical CPUs, but our RHEV appliance is configured with 1 cpu @ 4 cores = 4 logical CPUs. If chargeback is looking at CPUs, then different numbers are used for the two appliances even though they have the same capacity. I think chargeback should be looking at the logical CPUs column.
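To make the mismatch concrete, here is a minimal sketch of the two appliance configurations. The `VmCpuConfig` class, the two charge functions, and the rate are hypothetical names made up for illustration; they are not the CFME schema or the chargeback code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmCpuConfig:
    """Hypothetical CPU layout of a VM (illustration only, not a CFME model)."""
    sockets: int
    cores_per_socket: int

    @property
    def logical_cpus(self) -> int:
        # Logical CPUs = sockets x cores per socket.
        return self.sockets * self.cores_per_socket


RATE_PER_CPU = 10  # made-up rate, only to show the difference


def charge_by_sockets(vm: VmCpuConfig) -> int:
    """Charge on the 'cpus' count alone, ignoring cores per socket."""
    return vm.sockets * RATE_PER_CPU


def charge_by_logical_cpus(vm: VmCpuConfig) -> int:
    """Charge on the logical CPU count."""
    return vm.logical_cpus * RATE_PER_CPU


vmware_appliance = VmCpuConfig(sockets=4, cores_per_socket=1)
rhev_appliance = VmCpuConfig(sockets=1, cores_per_socket=4)

for name, vm in (("vmware", vmware_appliance), ("rhev", rhev_appliance)):
    print(name, vm.logical_cpus, charge_by_sockets(vm), charge_by_logical_cpus(vm))
```

Both appliances have 4 logical CPUs, so charging on logical CPUs treats them the same, while charging on the cpus count alone does not.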
There are really four things going on here.

1) The customer has not configured the Metrics authentication for the RHEVM provider. Therefore absolutely no realtime compute metrics are being collected for the RHEVM VMs. (I know this is non-obvious given that there are hourly rollups, but there's more.)

2) Storage C&U has an out-of-the-box default configuration to collect Storage metrics every 2 hours. *AND*, when Storage metrics are collected, it attempts to update an hourly rollup record for each VM on that Storage. Here's the kicker: Storage C&U and Compute C&U happen on independent schedules. This means that when the Storage C&U gets ready to update the hourly rollup record for each VM, it looks for an hourly row to update. If there's no hourly row yet, it creates one, assuming that Compute C&U will just come along later and update that same row.

3) Anytime an hourly record is created for a VM, a set of "derived" fields are calculated. Among those derived fields are:
   * cpu_usagemhz_rate_average
   * derived_vm_numvcpus
   * derived_memory_used
   * derived_memory_available
When looking at the metric rollup data in comment #4, it's clear that cpu_usagemhz_rate_average and derived_memory_used are nil, while derived_vm_numvcpus and derived_memory_available are non-nil. This is because the cpu_usagemhz_rate_average and derived_memory_used values come from Compute metrics, while derived_vm_numvcpus and derived_memory_available are static values that come from inventory collection (refresh).

4) Finally, the chargeback report is configured to charge for "allocated" memory and CPU.

Taking this all into account, we can explain everything happening here:

--> There are VM hourly rollups every two hours for RHEVM VMs because:
    - there are no RHEVM VM realtime metrics because it's not configured, and
    - the Storage C&U, running every 2 hours, has created rollups for VMs without realtime metrics but with static memory and CPU values collected from inventory (refresh).
--> The chargeback report shows values because it is going off of hourly rollups for VMs and looking at CPU and memory allocated to the VMs. The underlying question is: What *should* happen?
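The sequence in points 2) and 3) can be sketched roughly as follows. This is a simplified in-memory model, not the actual C&U code: the function names, the `rollups` dictionary, the VM name, and the inventory values are all assumptions made for illustration. Only the four column names come from the rollup data discussed above.

```python
from typing import Dict, Optional, Tuple

# In-memory stand-in for the hourly rollup table, keyed by (vm_name, hour).
Rollup = Dict[str, Optional[float]]
rollups: Dict[Tuple[str, int], Rollup] = {}


def find_or_create_hourly_row(vm: str, hour: int) -> Rollup:
    """Return the hourly row for this VM, creating an empty one if needed."""
    key = (vm, hour)
    if key not in rollups:
        rollups[key] = {
            "cpu_usagemhz_rate_average": None,
            "derived_memory_used": None,
            "derived_vm_numvcpus": None,
            "derived_memory_available": None,
        }
    return rollups[key]


def storage_capture(vm: str, hour: int, inventory: Dict[str, float]) -> None:
    """Storage C&U: touches the hourly row and fills the inventory-based fields."""
    row = find_or_create_hourly_row(vm, hour)
    row["derived_vm_numvcpus"] = inventory["numvcpus"]
    row["derived_memory_available"] = inventory["memory_mb"]


def compute_capture(vm: str, hour: int, cpu_mhz: float, memory_used_mb: float) -> None:
    """Compute C&U: would fill the usage fields on the same row.

    Never called below, because Metrics authentication is not configured.
    """
    row = find_or_create_hourly_row(vm, hour)
    row["cpu_usagemhz_rate_average"] = cpu_mhz
    row["derived_memory_used"] = memory_used_mb


inventory = {"numvcpus": 4, "memory_mb": 8192}  # made-up inventory values

# Storage C&U runs every 2 hours; Compute C&U never runs for this VM.
for hour in range(0, 6, 2):
    storage_capture("rhevm-vm-1", hour, inventory)

for (vm, hour), row in sorted(rollups.items()):
    print(vm, hour, row)
```

In this model the only rows that exist are at hours 0, 2 and 4, each with nil usage values and non-nil allocated values, which is the pattern a chargeback report on "allocated" resources would pick up.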
Spoke to Oleg, and he had a great idea. If we bump Storage C&U up from collecting every two hours to every hour, then we don't have any gaps, and we automatically get the derived values hourly.
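As a quick check of that reasoning, here is a small sketch that lists which hourly rows would be missing for a given capture interval. It assumes, as described earlier in this bug, that each Storage C&U run only touches the hourly row for the hour it runs in; `missing_hours` is a hypothetical helper, not part of the product.

```python
from typing import List


def missing_hours(total_hours: int, interval_hours: int) -> List[int]:
    """Hours in [0, total_hours) that get no rollup row when a capture
    runs every `interval_hours`, starting at hour 0."""
    covered = set(range(0, total_hours, interval_hours))
    return sorted(set(range(total_hours)) - covered)


print(missing_hours(6, 2))  # capture every two hours
print(missing_hours(6, 1))  # capture every hour
```

With a two-hour interval every other hour has no row; with a one-hour interval nothing is missing.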
https://github.com/ManageIQ/manageiq/pull/2815
New commit detected on manageiq/master:
https://github.com/ManageIQ/manageiq/commit/435a5b58d56f194922b1b3d818e065141ad6f95c

commit 435a5b58d56f194922b1b3d818e065141ad6f95c
Author:     Jason Frey <jfrey>
AuthorDate: Thu Apr 30 15:34:16 2015 -0400
Commit:     Jason Frey <jfrey>
CommitDate: Fri May 1 12:59:34 2015 -0400

    Do not derive "available" values if we don't have any usage values.

    The lack of cpu or mem usage values implies that the target being collected is either off, or not configured for collection. In both cases, collecting "allocated" values does not make sense. If off, the target will not be given those resources, so they are not really available. If not configured for collection, then we should not be doing the derivation at all.

    The circumstance for this situation occurs when normal C&U for a target is not enabled, but storage C&U still occurs. When storage C&U comes along it calls process_derived_columns, but some of those derived columns should not be calculated in that state. If the normal C&U were to come along later, then it would fill in the missing details.

    https://bugzilla.redhat.com/show_bug.cgi?id=1038869
    https://bugzilla.redhat.com/show_bug.cgi?id=1212164

 vmdb/app/models/metric/common.rb           |   2 +
 vmdb/app/models/metric/processing.rb       |  15 ++-
 vmdb/spec/factories/metric.rb              |  21 +---
 vmdb/spec/factories/metric_rollup.rb       |  20 ++++
 vmdb/spec/models/metric/processing_spec.rb | 180 +++++++++++++++++++++++++++++
 vmdb/spec/models/metric_spec.rb            |   6 +-
 6 files changed, 223 insertions(+), 21 deletions(-)
 create mode 100644 vmdb/spec/factories/metric_rollup.rb
 create mode 100644 vmdb/spec/models/metric/processing_spec.rb
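The idea in the commit message can be illustrated with a small guard. This is only a sketch of the rule as described, written in Python rather than the project's Ruby, and it is not the code from processing.rb: the function names are hypothetical, and both the choice of which columns count as "usage" and "available" and the any-usage-present condition are assumptions based on the column names discussed earlier in this bug.

```python
from typing import Dict, Optional

Rollup = Dict[str, Optional[float]]

# Assumed "usage" columns; these come from Compute C&U.
USAGE_COLUMNS = ("cpu_usagemhz_rate_average", "derived_memory_used")


def has_usage_values(row: Rollup) -> bool:
    """True if at least one usage column has been collected for this row."""
    return any(row.get(col) is not None for col in USAGE_COLUMNS)


def derive_available_columns(row: Rollup, inventory: Dict[str, float]) -> Rollup:
    """Fill the inventory-based columns only when usage values exist."""
    if not has_usage_values(row):
        # Target is off or not configured for collection: leave the
        # "available" columns nil so nothing is reported as allocated.
        return row
    row["derived_vm_numvcpus"] = inventory["numvcpus"]
    row["derived_memory_available"] = inventory["memory_mb"]
    return row


inventory = {"numvcpus": 4, "memory_mb": 8192}  # made-up inventory values

storage_only_row = derive_available_columns(
    {"cpu_usagemhz_rate_average": None, "derived_memory_used": None}, inventory
)
with_usage_row = derive_available_columns(
    {"cpu_usagemhz_rate_average": 1200.0, "derived_memory_used": 2048.0}, inventory
)
print(storage_only_row)
print(with_usage_row)
```

Under this rule a row created only by storage C&U carries no allocated values, and a later Compute C&U pass that brings usage values would allow them to be filled in.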
New commit detected on manageiq/master:
https://github.com/ManageIQ/manageiq/commit/8abba9ae65cc87696f7c45399ceb2dae100cd610

commit 8abba9ae65cc87696f7c45399ceb2dae100cd610
Author:     Jason Frey <jfrey>
AuthorDate: Thu Apr 30 15:27:23 2015 -0400
Commit:     Jason Frey <jfrey>
CommitDate: Thu Apr 30 17:27:37 2015 -0400

    Change storage capture to 60m to avoid leaving gaps in metrics_rollups.

    https://bugzilla.redhat.com/show_bug.cgi?id=1038869
    https://bugzilla.redhat.com/show_bug.cgi?id=1212164

 vmdb/config/vmdb.tmpl.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
New commit detected on cfme/5.3.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=0da46e4efcb3cce468b9f32ab92609187858b207

commit 0da46e4efcb3cce468b9f32ab92609187858b207
Author:     Jason Frey <jfrey>
AuthorDate: Thu Apr 30 15:34:16 2015 -0400
Commit:     Jason Frey <jfrey>
CommitDate: Wed May 6 12:46:14 2015 -0400

    Do not derive "available" values if we don't have any usage values.

    The lack of cpu or mem usage values implies that the target being collected is either off, or not configured for collection. In both cases, collecting "allocated" values does not make sense. If off, the target will not be given those resources, so they are not really available. If not configured for collection, then we should not be doing the derivation at all.

    The circumstance for this situation occurs when normal C&U for a target is not enabled, but storage C&U still occurs. When storage C&U comes along it calls process_derived_columns, but some of those derived columns should not be calculated in that state. If the normal C&U were to come along later, then it would fill in the missing details.

    https://bugzilla.redhat.com/show_bug.cgi?id=1038869
    https://bugzilla.redhat.com/show_bug.cgi?id=1212164
    https://bugzilla.redhat.com/show_bug.cgi?id=1219144

 vmdb/app/models/metric/common.rb           |   2 +
 vmdb/app/models/metric/processing.rb       |  15 ++-
 vmdb/spec/factories/metric.rb              |  21 +---
 vmdb/spec/factories/metric_rollup.rb       |  20 ++++
 vmdb/spec/models/metric/processing_spec.rb | 180 +++++++++++++++++++++++++++++
 vmdb/spec/models/metric_spec.rb            |   6 +-
 6 files changed, 223 insertions(+), 21 deletions(-)
 create mode 100644 vmdb/spec/factories/metric_rollup.rb
 create mode 100644 vmdb/spec/models/metric/processing_spec.rb
New commit detected on cfme/5.3.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=5398fb75e9779e88967a2b8dea803f90e03c05fa

commit 5398fb75e9779e88967a2b8dea803f90e03c05fa
Author:     Jason Frey <jfrey>
AuthorDate: Thu Apr 30 15:27:23 2015 -0400
Commit:     Jason Frey <jfrey>
CommitDate: Wed May 6 12:46:02 2015 -0400

    Change storage capture to 60m to avoid leaving gaps in metrics_rollups.

    https://bugzilla.redhat.com/show_bug.cgi?id=1038869
    https://bugzilla.redhat.com/show_bug.cgi?id=1212164
    https://bugzilla.redhat.com/show_bug.cgi?id=1219144

 vmdb/config/vmdb.tmpl.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Verified that there are no gaps in metrics_rollups when a VM is powered off. Verified in 5.4.0.2.
*** Bug 1004057 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-1100.html