Bug 1633387

Summary: KubeletTooManyPods statically compares against 100 instead of --max-pods (-10)
Product: OpenShift Container Platform Reporter: Justin Pierce <jupierce>
Component: Monitoring Assignee: Frederic Branczyk <fbranczy>
Status: CLOSED ERRATA QA Contact: Junqi Zhao <juzhao>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 3.11.0 CC: cvogel, fbranczy, mrobson, vjaypurk
Target Milestone: --- Flags: vjaypurk: needinfo? (fbranczy)
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1690951 Environment:
Last Closed: 2019-06-04 10:40:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1690951    

Description Justin Pierce 2018-09-26 21:04:30 UTC
Description of problem:
In our starter clusters, --max-pods is set to 250. Because the alert compares
the pod count against a fixed value of 100, a KubeletTooManyPods warning is
raised persistently.

Version-Release number of selected component (if applicable):
v3.11.0-0.21.0

How reproducible:
100%

Steps to Reproduce:
1. Set kubelet max-pods to something > 110 (the default). 
2. Fill node with > 100 pods
3. KubeletTooManyPods will be reported

Actual results:
KubeletTooManyPods reported.

Expected results:
KubeletTooManyPods should be relative to configured --max-pods.
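
The difference between the reported and the expected behavior can be sketched as follows. This is only an illustration, not the actual alerting rule: the function names are hypothetical, and the margin of 10 is an assumption based on reading the "(-10)" in the summary as "fire 10 pods below --max-pods" (which, with the default of 110, gives the static 100).

```python
def too_many_pods_static(running_pods: int, threshold: int = 100) -> bool:
    """Reported behavior: a fixed threshold that ignores --max-pods."""
    return running_pods > threshold


def too_many_pods_relative(running_pods: int, max_pods: int, margin: int = 10) -> bool:
    """Expected behavior (sketch): fire only within `margin` pods of --max-pods.

    `margin` is an assumed value, not taken from the real rule.
    """
    return running_pods > max_pods - margin


if __name__ == "__main__":
    # Starter-cluster case from the description: --max-pods=250, 101 pods running.
    print("static rule fires:  ", too_many_pods_static(101))
    print("relative rule fires:", too_many_pods_relative(101, max_pods=250))
```

With the default --max-pods of 110 both checks agree; they diverge only once --max-pods is raised, which is the case described above.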

Comment 1 Frederic Branczyk 2018-09-28 10:04:40 UTC
This is indeed a bug, and improving it is already in our backlog. In the meantime, the best thing I can suggest is to silence this alert. Sorry for the inconvenience.
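
For readers looking for what such a silence might look like, here is a minimal sketch that builds a silence body matching only this alert. It assumes the Alertmanager v2 silence format (matchers, startsAt, endsAt, createdBy, comment); the helper name and the example values are hypothetical, and nothing is sent to any server.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def build_silence(hours: int, created_by: str, comment: str,
                  now: Optional[datetime] = None) -> dict:
    """Build a silence body for KubeletTooManyPods (assumed Alertmanager v2 shape)."""
    start = now or datetime.now(timezone.utc)
    end = start + timedelta(hours=hours)
    return {
        "matchers": [
            # Match on the alert name only; add more matchers to narrow the scope.
            {"name": "alertname", "value": "KubeletTooManyPods", "isRegex": False},
        ],
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


if __name__ == "__main__":
    # Hypothetical values; how the body is submitted depends on your setup.
    print(json.dumps(build_silence(24, "ops", "See bug 1633387"), indent=2))
```

Keeping the silence time-bounded means the alert comes back once the silence expires, so it still needs renewing until a fix is in place.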

Comment 3 Frederic Branczyk 2019-02-22 16:42:40 UTC
The threshold is bumped to 250 pods in 4.0. The patch that changed it: https://github.com/openshift/cluster-monitoring-operator/pull/238

Comment 5 Junqi Zhao 2019-02-26 08:19:58 UTC
The cloned issue https://jira.coreos.com/browse/MON-344 is fixed; setting this to VERIFIED.

Comment 7 Matthew Robson 2019-03-19 19:32:39 UTC
Frederic, is the simple change of bumping the default value to 250 something that can be backported for a 3.11.x errata, or is silencing the only option?

Comment 8 Frederic Branczyk 2019-03-20 14:03:28 UTC
It's relatively straightforward, but it does need to be scheduled into our sprints. Whether and when to do that is a PM decision.

Comment 9 Christian Heidenreich 2019-03-20 14:16:05 UTC
Since we have already fixed this for OCP 4 and it has come up a few times, it's OK with me to backport this for the next OCP 3.11 z-release if possible.

Comment 10 Frederic Branczyk 2019-03-20 14:22:00 UTC
In that case please create an item in our backlog and make it your responsibility to have it be part of an upcoming sprint.

Comment 12 errata-xmlrpc 2019-06-04 10:40:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758