Bug 1414485 - CloudForms generating WARN messages from OpenShift metrics cpu average out of range
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: C&U Capacity and Utilization
Version: 5.6.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: GA
Target Release: cfme-future
Assignee: Greg Blomquist
QA Contact: Einat Pacifici
Whiteboard: container
Depends On:
Reported: 2017-01-18 16:04 UTC by myoder
Modified: 2021-09-09 12:05 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-09-18 11:15:43 UTC
Category: Bug
Cloudforms Team: Container Management
Target Upstream Version:


Description myoder 2017-01-18 16:04:20 UTC
Description of problem: 

CloudForms is generating a large number of WARN messages in the logs based on metrics it collects from OpenShift.  The reported CPU usage rate average ranges from the hundreds to the thousands of percent, well above the 100% ceiling.  The WARN message below shows the largest value I saw in the logs.

[----] W, [2017-01-17T04:50:53.711909 #28092:3d3998]  WARN -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::ContainerGroup#perf_process) [realtime] ManageIQ::Providers::Kubernetes::ContainerManager::ContainerGroup name: [logstash-5-ipi6a], id: [123000000001006] Timestamp: [2017-01-17T09:37:20Z], Column [cpu_usage_rate_average]: 'percent value 22624.395684399165 is out of range, resetting to 100.0'
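
For triage, a WARN line like the one above can be split into its fields with a short script. This is only a minimal Python sketch: the regular expression is inferred from the single line quoted in this report (not from a documented log format), and the names `WARN_RE` and `parse_warn` are illustrative and not part of CloudForms.

```python
import re

# Assumption: layout inferred from the one WARN line quoted in this bug report.
WARN_RE = re.compile(
    r"MIQ\((?P<klass>[\w:]+)#perf_process\).*?"
    r"name: \[(?P<name>[^\]]+)\], id: \[(?P<id>\d+)\] "
    r"Timestamp: \[(?P<timestamp>[^\]]+)\], "
    r"Column \[(?P<column>[^\]]+)\]: "
    r"'percent value (?P<value>[-+0-9.eE]+) is out of range"
)


def parse_warn(line):
    """Return a dict of fields for an out-of-range WARN line, or None."""
    match = WARN_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["value"] = float(record["value"])  # reported percent value
    return record


# The log line from this report, joined into a single string.
sample = (
    "[----] W, [2017-01-17T04:50:53.711909 #28092:3d3998]  WARN -- : "
    "MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::ContainerGroup#perf_process) "
    "[realtime] ManageIQ::Providers::Kubernetes::ContainerManager::ContainerGroup "
    "name: [logstash-5-ipi6a], id: [123000000001006] "
    "Timestamp: [2017-01-17T09:37:20Z], "
    "Column [cpu_usage_rate_average]: "
    "'percent value 22624.395684399165 is out of range, resetting to 100.0'"
)

record = parse_warn(sample)
print(record["klass"], record["name"], record["column"], record["value"])
```

The `klass` field distinguishes Container Group entries from Container entries, which helps when checking which entities are affected.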

Version-Release number of selected component (if applicable):

How reproducible:
Seems to be consistently generating these messages.

Steps to Reproduce:

Actual results:

Expected results:

Additional info:  

Currently I am only seeing this WARN message for 5 Container Groups and 1 Container.  The customer environment has 11 nodes.  I am working on verifying whether these Container Groups and Containers are on the same node.

Over the course of about 4 hours, this message has generated about 2000 log lines.
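
To check which entities produce these lines and how quickly they accumulate, the messages can be tallied per entity name. The following is a rough Python sketch under stated assumptions: the pattern is inferred from the one WARN line quoted above, `summarize` and `lines_per_hour` are hypothetical helper names, and the second and third demo lines are synthetic, invented only to exercise the code.

```python
import re
from collections import defaultdict

# Assumption: pattern inferred from the WARN line quoted in this report.
OUT_OF_RANGE_RE = re.compile(
    r"name: \[(?P<name>[^\]]+)\].*?"
    r"'percent value (?P<value>[0-9.eE+-]+) is out of range"
)


def summarize(lines):
    """Return {entity name: (line count, largest reported value)}."""
    counts = defaultdict(int)
    peaks = defaultdict(float)
    for line in lines:
        match = OUT_OF_RANGE_RE.search(line)
        if match is None:
            continue  # not an out-of-range WARN line
        name = match.group("name")
        counts[name] += 1
        peaks[name] = max(peaks[name], float(match.group("value")))
    return {name: (counts[name], peaks[name]) for name in counts}


def lines_per_hour(total_lines, hours):
    """Average number of log lines generated per hour."""
    return total_lines / hours


demo_lines = [
    # Abbreviated from the report:
    "name: [logstash-5-ipi6a], id: [123000000001006] Column [cpu_usage_rate_average]: "
    "'percent value 22624.395684399165 is out of range, resetting to 100.0'",
    # Synthetic lines (made-up names and values):
    "name: [logstash-5-ipi6a], id: [123000000001006] Column [cpu_usage_rate_average]: "
    "'percent value 450.5 is out of range, resetting to 100.0'",
    "name: [example-pod-a], id: [1] Column [cpu_usage_rate_average]: "
    "'percent value 812.0 is out of range, resetting to 100.0'",
]

for name, (count, peak) in sorted(summarize(demo_lines).items()):
    print(name, count, peak)

# Figures from this report: about 2000 lines in about 4 hours.
print(lines_per_hour(2000, 4))
```

With the figures reported here, the average works out to about 500 lines per hour.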

Comment 3 Dave Johnson 2017-07-14 03:47:14 UTC
Please assess the importance of this issue and update the priority accordingly.  Somewhere it was missed in the bug triage process.  Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#priority for a reminder on each priority's definition.

If it's something like a tracker bug where it doesn't matter, please set it to Low/Low.

Comment 7 Yaacov Zamir 2017-12-18 07:18:26 UTC

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1524626

This sounds similar, but it is not a duplicate. Do you have a way to check whether the
patches that fix #1524626 also solve this issue?

The patches are:
https://github.com/ManageIQ/manageiq-providers-kubernetes/pull/187 - scrape every 60s.
https://github.com/ManageIQ/manageiq-providers-kubernetes/pull/159 - reflector scraping.
