Bug 1475034

Summary: Metrics chart reporting 74000 Millicores for an app running on a node with only 8 cores
Product: OpenShift Container Platform Reporter: Eric Jones <erjones>
Component: Hawkular Assignee: Solly Ross <sross>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Junqi Zhao <juzhao>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.3.1 CC: aos-bugs, erjones, mrichter, mwringe, pweil, sross
Target Milestone: ---   
Target Release: 3.3.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-11-03 13:43:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Eric Jones 2017-07-25 21:51:43 UTC
Description of problem:
An application with several replicas that had been running just fine suddenly has metrics reporting significantly more cores than is possible (the node has 8 cores, but the app reported 74,000 millicores).


Version-Release number of selected component (if applicable):
OpenShift Container Platform 3.3.1.11

Additional info:
Attaching files shortly

Comment 2 Matt Wringe 2017-07-26 17:50:01 UTC
@sross: it looks like Heapster is using 15s for its interval, and I believe at this interval we can sometimes get strange CPU usage results back. Is this something we have seen before? A very large CPU spike that is clearly nonsense.
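
(For scale, assuming the usual rate calculation over a single 15s window: 74,000 millicores is 74 full cores, i.e. roughly 74 × 15 ≈ 1,110 CPU-seconds attributed to one window, while an 8-core node can only accumulate 8 × 15 = 120 CPU-seconds in that time, so the value has to be an artifact of the calculation rather than a real burst.)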

Comment 3 Solly Ross 2017-07-28 19:24:29 UTC
those logs do not look like a healthy Heapster :-/

I'd try switching to an interval of 30s, as well as checking what the summary endpoint says, and seeing what happens if you switch to the summary source (`--source=kubernetes.summary_api:...` instead of `--source=kubernetes:...`).
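
A minimal sketch of those two changes, assuming the stock heapster replication controller (copy the master URL and kubelet options from whatever the current `--source=kubernetes:...` argument already uses):

    --source=kubernetes.summary_api:${MASTER_URL}?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250
    --metric_resolution=30s

The summary source reads each kubelet's `/stats/summary` endpoint (port 10250 by default), so fetching that URL directly on the affected node, with a token that can read node stats, is a quick way to check whether the raw cAdvisor numbers already look wrong before Heapster aggregates them.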

We've seen spikes like that due to bad (non-monotonically increasing) CPU metrics and overflow, or occasionally due to bad metrics coming from Kubelet/cAdvisor, but I thought we'd fixed most of those issues.
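
To make the overflow failure mode concrete, here is a minimal sketch (not Heapster's actual code) of what happens when a cumulative CPU counter goes backwards and the delta is taken with unsigned arithmetic; the wrapped value turns into an absurdly large rate:

    // Minimal sketch (not Heapster's actual code): cumulative CPU time is
    // reported in nanoseconds; usage is derived as (delta / window).
    package main

    import "fmt"

    func main() {
        var prev uint64 = 9_000_000_000 // cumulative CPU ns at the previous scrape
        var cur uint64 = 8_500_000_000  // counter went backwards (restart / bad sample)
        window := 15.0                  // seconds between scrapes

        delta := cur - prev // unsigned subtraction wraps instead of going negative
        millicores := float64(delta) / 1e6 / window // ns -> CPU-ms per second of window

        fmt.Printf("reported usage: %.0f millicores\n", millicores)
    }

The bad-sample case from the Kubelet/cAdvisor side looks similar: any window where the reported cumulative value jumps by more than the window can actually hold produces a rate the node can't physically sustain.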