Bug 1325405 - C&U Metrics Processor memory and timeout issues associated with 'perf_rollup' method and vmware host and vm instances
Summary: C&U Metrics Processor memory and timeout issues associated with 'perf_rollup'...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Performance
Version: 5.5.0
Hardware: All
OS: All
Priority: high
Severity: urgent
Target Milestone: GA
Target Release: 5.5.4
Assignee: Keenan Brock
QA Contact: Nandini Chandra
URL:
Whiteboard: c&u
Depends On: 1322485
Blocks:
Reported: 2016-04-08 18:09 UTC by Chris Pelland
Modified: 2019-11-14 07:45 UTC (History)

Fixed In Version: 5.5.4.0
Doc Type: Bug Fix
Doc Text:
Previously, the Capacity and Utilization metrics processor worker fetched all historical performance data to report metrics, causing the query to fail due to the extremely large amount of data to process. This has been fixed in the code by only loading recent performance state records. As a result, the process no longer times out and the Capacity and Utilization metrics are reported successfully.
Clone Of: 1322485
Environment:
Last Closed: 2016-05-31 13:42:24 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1101 0 normal SHIPPED_LIVE CFME 5.5.4 bug fixes and enhancement update 2016-05-31 17:40:10 UTC

Comment 8 Keenan Brock 2016-05-06 19:58:11 UTC
Let me describe the actual bug:

before:
    When rolling up metrics, the system fetched all historical performance data (vim_performance_states).
    This query alone took over 24 minutes to run and timed out.

after:
    Just fetch the performance data for the current hour/day.
    date added to query: SELECT "vim_performance_states".* FROM "vim_performance_states" ...
    no longer see errors with text "timed out after "
    no longer see "Timed Out Active Message"
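
The before/after difference can be sketched with a small in-memory database. This is only an illustration, not the CFME code: the column names `resource_id` and `timestamp`, the one-hour window, and the helper names are assumptions, and the real query is elided in the comment above.

```python
import sqlite3
from datetime import datetime, timedelta


def build_demo_db():
    """Create an in-memory table that stands in for vim_performance_states."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vim_performance_states ("
        "id INTEGER PRIMARY KEY, resource_id INTEGER, timestamp TEXT)"
    )
    start = datetime(2016, 4, 1, 0, 0, 0)
    # Three resources, one row per hour for three days (made-up data).
    rows = [
        (resource_id, (start + timedelta(hours=h)).isoformat(sep=" "))
        for resource_id in (1, 2, 3)
        for h in range(72)
    ]
    conn.executemany(
        "INSERT INTO vim_performance_states (resource_id, timestamp) VALUES (?, ?)",
        rows,
    )
    return conn


def fetch_all_states(conn, resource_ids):
    """'Before' shape: filter by the id list only, so all history is returned."""
    placeholders = ",".join("?" for _ in resource_ids)
    sql = (
        'SELECT * FROM "vim_performance_states" '
        f"WHERE resource_id IN ({placeholders})"
    )
    return conn.execute(sql, list(resource_ids)).fetchall()


def fetch_states_for_hour(conn, resource_ids, hour_start):
    """'After' shape: same id list plus a date condition (assumed one hour)."""
    placeholders = ",".join("?" for _ in resource_ids)
    sql = (
        'SELECT * FROM "vim_performance_states" '
        f"WHERE resource_id IN ({placeholders}) "
        "AND timestamp >= ? AND timestamp < ?"
    )
    hour_end = hour_start + timedelta(hours=1)
    params = list(resource_ids) + [
        hour_start.isoformat(sep=" "),
        hour_end.isoformat(sep=" "),
    ]
    return conn.execute(sql, params).fetchall()
```

With the id-only filter the result grows with every hour of retained history; with the date condition it stays bounded by the number of ids.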

reproduction 1:
    access a system that has run cap&u for many days. (any provider type)
    see before and after to note change in error rates

reproduction 2:
    change log levels to debug (so you will see the sql running on the server)
    note the vim_performance_states query. It will still have the long id list, but it will also have a date query in it
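
A rough way to check both reproductions against a log is sketched below. It only counts lines; the function name is hypothetical, and the assumption that the date condition shows up as a quoted "timestamp" column in the logged SQL is a guess, since the full query is not shown in this report.

```python
# Message fragments quoted in this comment.
TIMEOUT_MARKERS = ("timed out after ", "Timed Out Active Message")
STATES_QUERY = 'FROM "vim_performance_states"'
# Assumption: the date condition references a column logged as "timestamp".
DATE_COLUMN = '"timestamp"'


def summarize_log(lines):
    """Count timeout messages and vim_performance_states queries in log lines."""
    summary = {"timeouts": 0, "state_queries": 0, "state_queries_with_date": 0}
    for line in lines:
        if any(marker in line for marker in TIMEOUT_MARKERS):
            summary["timeouts"] += 1
        if STATES_QUERY in line:
            summary["state_queries"] += 1
            # Look for the date column only after the FROM clause.
            tail = line.split(STATES_QUERY, 1)[1]
            if DATE_COLUMN in tail:
                summary["state_queries_with_date"] += 1
    return summary
```

Running it over a log captured before and after the update (for example `summarize_log(open(path))`, with `path` pointing at whichever log file holds the debug SQL) should show the timeout count dropping and every vim_performance_states query carrying a date condition.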

Comment 9 Nandini Chandra 2016-05-12 19:56:42 UTC
On my appliance, I changed the log levels to debug, but I wasn't able to see this query at all in the logs:

SELECT "vim_performance_states".* FROM "vim_performance_states" ...

Reproducer:
1) Manage a provider and enable C&U collection for the provider.
2) Capture C&U data for a few hours/days.
3) Disable C&U collection for at least 1 day.
4) Re-enable C&U collection.

Before fix:
When C&U collection is re-enabled, CFME fetches all historical performance data.

After fix:
When C&U collection is re-enabled, CFME fetches performance data for the current hour only.

Verified that CFME fetches performance data for the current hour only by
looking at the DB itself. Marking this as VERIFIED.

Verified in 5.5.4.0.

Comment 11 errata-xmlrpc 2016-05-31 13:42:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1101

