Bug 1462681

Summary: Containerized install: repeated "Could not get instant cpu stats". Heapster not running.
Product: OpenShift Container Platform Reporter: Mike Fiedler <mifiedle>
Component: Node Assignee: Derek Carr <decarr>
Status: CLOSED NOTABUG QA Contact: DeShuai Ma <dma>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.6.0 CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-06-19 10:08:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Mike Fiedler 2017-06-19 09:20:38 UTC
Description of problem:

OCP 3.6.116 installed containerized on AWS. Node logs show repeated instances of:

Could not get instant cpu stats: different number of cpus
Could not get instant cpu stats: cumulative stats decrease

I saw https://bugzilla.redhat.com/show_bug.cgi?id=1437325, but the commentary there was that the message is not a problem unless it repeats. In this case it definitely repeats. Heapster is not running.

--system-reserved for CPU does not appear to be taking effect in this environment. I will file that as a separate bz, but wanted to note it here in case it is relevant.


Version-Release number of selected component (if applicable): 3.6.116


How reproducible: Always


Steps to Reproduce:
1. Install OCP 3.6 containerized on AWS EC2 (1 master/etcd, 1 infra, 2 nodes)
2. journalctl -fu atomic-openshift-node and watch for the messages in the description above
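
To confirm that the messages really repeat rather than appear once, it can help to tally them instead of watching the live log. The following is a minimal sketch, not part of the original report: the function name count_cpu_stat_warnings is hypothetical, and it only matches the two message variants quoted in the description.

```shell
# Tally the two "Could not get instant cpu stats" variants quoted in the description.
# Reads log lines on stdin, so it can be fed from journalctl or from a saved log file.
count_cpu_stat_warnings() {
    grep -oE 'Could not get instant cpu stats: (different number of cpus|cumulative stats decrease)' \
        | sort | uniq -c | sort -rn
}

# Example invocation (not run here; the one-hour window is an arbitrary choice):
#   journalctl -u atomic-openshift-node --since "1 hour ago" --no-pager | count_cpu_stat_warnings
```

Each output line is a count followed by the message text. A count that keeps growing between runs suggests the warning is recurring.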


Actual results:

Repeated warning messages about being unable to retrieve cpu stats. Also, --system-reserved for CPU has no effect, but I will log that as a separate issue.

Comment 1 Mike Fiedler 2017-06-19 10:08:24 UTC
While trying to gather logs for this issue, I noticed that atomic-openshift-master was running as an unconfigured process. Once I removed that package with yum and made sure the systemd units were clean, I no longer saw the issue. Closing this bug.
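
The comment does not say which commands were used to find the stray package. As a rough sketch under that caveat, a check along these lines might surface a leftover atomic-openshift-master RPM on a host that is supposed to be containerized. The function name find_master_rpm is hypothetical, and the idea that the RPM should be absent on such a host is an assumption drawn from this comment, not a documented rule.

```shell
# List any installed atomic-openshift-master RPMs.
# Reads package names on stdin, one per line, as printed by `rpm -qa`.
# Assumption: on a containerized host this RPM is not expected, so any hit deserves a look.
find_master_rpm() {
    grep -E '^atomic-openshift-master(-|$)'
}

# Example invocations (not run here):
#   rpm -qa | find_master_rpm
#   systemctl list-unit-files 'atomic-openshift*'
```

An empty result from the first command means no such RPM is installed. The second command lists the related systemd unit files so leftovers can be reviewed by hand.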