| Summary: | C&U Metrics Processor memory and timeout issues associated with 'perf_rollup' method and VMware host and VM instances | ||
|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Chris Pelland <cpelland> |
| Component: | Performance | Assignee: | Keenan Brock <kbrock> |
| Status: | CLOSED ERRATA | QA Contact: | Nandini Chandra <nachandr> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | ||
| Version: | 5.5.0 | CC: | carnott, cpelland, dmetzger, fdewaley, jhardy, jprause, kbrock, mfeifer, nachandr, obarenbo, thenness |
| Target Milestone: | GA | Keywords: | ZStream |
| Target Release: | 5.5.4 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | c&u | ||
| Fixed In Version: | 5.5.4.0 | Doc Type: | Bug Fix |
| Doc Text: | Previously, the Capacity and Utilization metrics processor worker fetched all historical performance data to report metrics, causing the query to fail due to the extremely large amount of data to process. This has been fixed in the code by only loading recent performance state records. As a result, the process no longer times out and the Capacity and Utilization metrics are reported successfully. | Story Points: | --- |
| Clone Of: | 1322485 | Environment: | |
| Last Closed: | 2016-05-31 13:42:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | 1322485 | ||
| Bug Blocks: | |||

On my appliance, I changed the log levels to debug, but I wasn't able to see this query in the logs at all:

SELECT "vim_performance_states".* FROM "vim_performance_states" ...

Reproducer:
1) Manage a provider and enable C&U collection for the provider.
2) Capture C&U data for a few hours/days.
3) Disable C&U collection for at least 1 day.
4) Re-enable C&U collection.

Before fix: When C&U collection is re-enabled, CFME fetches all historical performance data.
After fix: When C&U collection is re-enabled, CFME fetches performance data for the current hour only.

Verified that CFME fetches performance data for the current hour only by looking at the DB itself. Marking this as VERIFIED.

Verified in 5.5.4.0.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1101
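
The before/after behaviour in the reproducer above can be illustrated with a small, self-contained sketch. It is not the CFME code: it uses an in-memory SQLite table as a stand-in for `vim_performance_states`, and the column names (`resource_id`, `timestamp`), the helper names, and the demo data are all assumptions made for illustration. It only shows why bounding the query by the current hour keeps the result set small no matter how much history has accumulated.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical, simplified stand-in for the vim_performance_states table.
# The real table has more columns; these two are assumed for the sketch.
SCHEMA = """
CREATE TABLE vim_performance_states (
    id          INTEGER PRIMARY KEY,
    resource_id INTEGER NOT NULL,
    timestamp   TEXT    NOT NULL  -- ISO-8601 start of the hour the row covers
)
"""


def build_demo(hours=72, resource_ids=(1, 2, 3)):
    """Create an in-memory DB with one state row per resource per hour."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    current_hour = datetime(2016, 5, 31, 13, 0, 0)  # arbitrary demo value
    for offset in range(hours):
        ts = (current_hour - timedelta(hours=offset)).isoformat()
        for rid in resource_ids:
            conn.execute(
                "INSERT INTO vim_performance_states (resource_id, timestamp) "
                "VALUES (?, ?)",
                (rid, ts),
            )
    return conn, current_hour


def states_unbounded(conn, resource_ids):
    """Before the fix: every historical row for the given resource IDs."""
    marks = ",".join("?" * len(resource_ids))
    sql = (
        "SELECT id FROM vim_performance_states "
        f"WHERE resource_id IN ({marks}) ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, list(resource_ids))]


def states_for_hour(conn, resource_ids, hour_start):
    """After the fix: same ID list, plus a bound on the timestamp."""
    marks = ",".join("?" * len(resource_ids))
    sql = (
        "SELECT id FROM vim_performance_states "
        f"WHERE resource_id IN ({marks}) AND timestamp = ? ORDER BY id"
    )
    params = [*resource_ids, hour_start.isoformat()]
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn, current_hour = build_demo()
    ids = [1, 2, 3]
    print("unbounded rows:   ", len(states_unbounded(conn, ids)))
    print("current-hour rows:", len(states_for_hour(conn, ids, current_hour)))
```

The unbounded query grows with every hour of collected history, while the bounded one returns at most one row per resource for the requested hour.
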

Let me describe the actual bug:

before: When rolling up metrics, the system fetches all historical performance data (vim_performance_states). This query alone took over 24 minutes to run and timed out.

after: Just fetch the performance data for the current hour/day. A date was added to the query:

SELECT "vim_performance_states".* FROM "vim_performance_states" ...

no longer see errors with text "timed out after "
no longer see "Timed Out Active Message"

reproduction 1:
Access a system that has run cap&u for many days (any provider type). Compare before and after to note the change in error rates.

reproduction 2:
Change log levels to debug (so you will see the SQL running on the server). Note the vim_performance_states query. It will still have the long id list, but it will also have a date condition in it.
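
For reproduction 1, the change in error rates can be compared by counting the two messages quoted above in a log file captured before the fix and one captured after. The sketch below is only an assumption about how one might do that: the function name is hypothetical, the log path is supplied by the caller, and nothing about the surrounding log line format is assumed beyond the two quoted substrings.

```python
import sys

# The two message fragments quoted in this report; matched as plain substrings.
TIMEOUT_MARKERS = {
    "timed_out_after": "timed out after ",
    "timed_out_active_message": "Timed Out Active Message",
}


def count_timeouts(lines):
    """Return how many lines contain each timeout marker."""
    counts = {name: 0 for name in TIMEOUT_MARKERS}
    for line in lines:
        for name, marker in TIMEOUT_MARKERS.items():
            if marker in line:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Usage: python count_timeouts.py <path-to-log-file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for name, total in count_timeouts(log).items():
            print(f"{name}: {total}")
```

Running it against a log from before the upgrade and again against a log from after gives two sets of counts to compare; after the fix both counts are expected to drop to zero for rollup work.
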