Bug 1846805
| Summary: | KubeletTooManyPods seems to take into account completed pods. | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | David Hernández Fernández <dahernan> |
| Component: | Monitoring | Assignee: | Sergiusz Urbaniak <surbania> |
| Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.4 | CC: | abodhe, alegrand, anpicker, ChetRHosey, erooth, kakkoyun, ksathe, lcosic, mloibl, pkrupa, surbania |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.6.0 | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Currently, kubelet_running_pod_count includes Completed pods, which is incorrect from the point of view of the KubeletTooManyPods alert. Since every pod has container_memory_rss exposed, we can leverage it to find the actual number of pods running on a node. (A rough sketch of this idea follows this table.) | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-10-27 16:06:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
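For illustration, the container_memory_rss idea from the Doc Text could be sketched roughly as below: count, per node, the distinct pods that expose container_memory_rss. This is only a sketch of the idea, not the change that was merged upstream; the container!=""/pod!="" selectors (to drop node- and pod-level cgroup series) and the join against kube_pod_info to obtain the node label are assumptions on my part.

  # count one series per (namespace, pod, node), i.e. each pod that reports RSS on that node,
  # then count those pods per node
  count by(node) (
    count by(namespace, pod, node) (
        container_memory_rss{container!="",pod!=""}
      * on(namespace, pod) group_left(node)
        topk by(namespace, pod) (1, kube_pod_info{job="kube-state-metrics"})
    )
  )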
This alert originates from the kubernetes-mixin upstream, hence I created https://github.com/kubernetes-monitoring/kubernetes-mixin/issues/442. @David: let's discuss the issue there, as it has ramifications for a bigger community.

Checked on one node, for example qe-anusaxen10-nhx4f-master-2:
count by(node) (
  (kube_pod_status_phase{job="kube-state-metrics",phase="Running"} == 1)
  * on(instance,pod,namespace,cluster) group_left(node)
    topk by(instance,pod,namespace,cluster) (1, kube_pod_info{job="kube-state-metrics"})
)
/
max by(node) (
  kube_node_status_capacity_pods{job="kube-state-metrics",node="qe-anusaxen10-nhx4f-master-2"} != 1
)
Element Value
{node="qe-anusaxen10-nhx4f-master-2"} 0.112
This node can allocate 250 pods:
kube_node_status_capacity_pods{job="kube-state-metrics",node="qe-anusaxen10-nhx4f-master-2"} != 1
Element Value
kube_node_status_capacity_pods{endpoint="https-main",environment="vSphere",instance="10.128.2.10:8443",job="kube-state-metrics",namespace="openshift-monitoring",node="qe-anusaxen10-nhx4f-master-2",pod="kube-state-metrics-75bcb99ff6-6td44",prometheus="openshift-monitoring/k8s",region="unknown",service="kube-state-metrics"} 250
There are 38 pods on this node: 28 Running and 10 Completed. 250 * 0.112 = 28, so the expression does not count Completed pods (a cross-check query for the Completed pods is sketched after the oc output below).
# oc get pod --all-namespaces -o wide --no-headers| grep "qe-anusaxen10-nhx4f-master-2" | wc -l
38
# oc get pod --all-namespaces -o wide | grep "qe-anusaxen10-nhx4f-master-2" | grep Running | wc -l
28
# oc get pod --all-namespaces -o wide | grep "qe-anusaxen10-nhx4f-master-2" | grep Completed | wc -l
10
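As an additional cross-check (my own suggestion, not part of the original verification): Completed pods in oc get pods output correspond to the Succeeded phase in kube-state-metrics, so the 10 Completed pods should also be countable with the same join used in the query above, e.g.:

  # Completed pods per node, counted via the Succeeded phase
  count by(node) (
      (kube_pod_status_phase{job="kube-state-metrics",phase="Succeeded"} == 1)
    * on(instance,pod,namespace,cluster) group_left(node)
      topk by(instance,pod,namespace,cluster) (1, kube_pod_info{job="kube-state-metrics"})
  )

For the node above this should return 10.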
LGTM! The PR merge is also perfect, thanks! Looking forward to seeing this included in an upcoming errata.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
########################################################################
Description of problem:

The following alert is showing up:

  KubeletTooManyPods: "Kubelet 'worker.example.com' is running at 103.2% of its Pod capacity"

while many Completed pods have still not been deleted.

########################################################################

$ oc get PrometheusRule prometheus-k8s-rules -n openshift-monitoring -o yaml | less

- alert: KubeletTooManyPods
  annotations:
    message: Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage }} of its Pod capacity.
  expr: |
    max(max(kubelet_running_pod_count{job="kubelet"}) by(instance) * on(instance) group_left(node) kubelet_node_name{job="kubelet"}) by(node) / max(kube_node_status_capacity_pods{job="kube-state-metrics"}) by(node) > 0.95
  for: 15m
  labels:
    severity: warning

# oc get pods -A -o wide | grep worker03 | wc -l
251
# oc get pods -A -o wide | grep worker03 | grep -v 'Completed|Running' | wc -l
256
# oc get pods -A -o wide | grep worker03 | grep -v 'Running' | wc -l
201

# oc describe node worker03
...
Capacity:
  cpu:                4
  ephemeral-storage:  314020844Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32936388Ki
  pods:               250
Allocatable:
  cpu:                3500m
  ephemeral-storage:  289401609352
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32321988Ki
  pods:               250
...

########################################################################
Actual results:

Completed pods are taken into account, even though they do not impact the resources used on the node.

########################################################################
Expected results:

Completed pods should not be taken into account.

########################################################################
Additional info:

This does not affect replica scalability; it is just confusing.
########################################################################
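For comparison, a corrected expression modeled on the query used during QA verification above would count only pods in the Running phase (via kube_pod_status_phase) instead of relying on kubelet_running_pod_count. The sketch below simply drops the per-node filter from that verification query and re-applies the original > 0.95 threshold; it is an illustration, not necessarily the exact expression shipped in 4.6:

  # fraction of pod capacity used per node, counting only Running-phase pods
  count by(node) (
      (kube_pod_status_phase{job="kube-state-metrics",phase="Running"} == 1)
    * on(instance,pod,namespace,cluster) group_left(node)
      topk by(instance,pod,namespace,cluster) (1, kube_pod_info{job="kube-state-metrics"})
  )
  /
  max by(node) (
    kube_node_status_capacity_pods{job="kube-state-metrics"} != 1
  ) > 0.95

With this form, the ~200 Completed pods on worker03 would no longer push the ratio above 100%.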