Based on the attached picture and testing in our cluster, resources.requests.memory for the thanos-sidecar container needs to be lowered.

Container Name: thanos-sidecar
resources: map[requests:map[cpu:1m memory:100Mi]]

Query:
(max(kube_pod_container_resource_requests{resource="memory", namespace="openshift-monitoring", container="thanos-sidecar"}) - max(container_memory_usage_bytes{namespace="openshift-monitoring", container="thanos-sidecar"})) / 1024 / 1024

Result (MiB of request unused by actual usage):
{} 69.3671875
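For clarity, the arithmetic the PromQL query above performs can be sketched as below. This is a minimal illustration, not part of the operator; the usage value is a made-up sample chosen to land near the reported result, since the real number comes from the container_memory_usage_bytes metric.

```python
# Reproduce the query's math: (request - usage) / 1024 / 1024 gives the
# unused headroom in MiB. Values here are illustrative, not from the cluster.
MIB = 1024 * 1024

request_bytes = 100 * MIB      # thanos-sidecar request: 100Mi
usage_bytes = 32_112_640       # hypothetical sampled container_memory_usage_bytes

headroom_mib = (request_bytes - usage_bytes) / MIB
print(f"{headroom_mib} MiB of the request is unused")
```

A headroom of roughly 69 MiB against a 100Mi request is what motivated lowering the request.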
*** Bug 1962305 has been marked as a duplicate of this bug. ***
I have made further adjustments in a new PR: https://github.com/openshift/cluster-monitoring-operator/pull/1172
Tested with 4.8.0-0.nightly-2021-05-21-233425. resources.requests.memory for the prometheus-operator container still needs to be changed to a bigger value; please change back to ON_QA if this is also fine.

Query:
sort(
  max by (container) (container_memory_usage_bytes{namespace="openshift-monitoring"} or on(container) container_memory_rss{namespace="openshift-monitoring"})
  - max by (container) (kube_pod_container_resource_requests{resource="memory", namespace="openshift-monitoring"})
) / 1024 / 1024

Result (MiB, usage minus request):
{container="prometheus-operator"} -27.0703125
{container="telemeter-client"} -14.890625
{container="alertmanager"} -13.015625
{container="cluster-monitoring-operator"} -12.5859375
{container="openshift-state-metrics"} -12.3671875
{container="grafana"} -11.5
{container="node-exporter"} -5.5078125
{container="kube-state-metrics"} -4.28125
{container="kube-rbac-proxy-self"} -2.484375
{container="prom-label-proxy"} -1.14453125
{container="kube-rbac-proxy"} -0.18359375
{container="kube-rbac-proxy-main"} 1.30078125
{container="thanos-sidecar"} 3.62109375
{container="kube-rbac-proxy-rules"} 3.703125
{container="prometheus-proxy"} 5.26953125
{container="oauth-proxy"} 6.1015625
{container="reload"} 7.16015625
{container="alertmanager-proxy"} 7.78515625
{container="kube-rbac-proxy-thanos"} 8.29296875
{container="prometheus-adapter"} 9.58203125
{container="grafana-proxy"} 9.8125
{container="config-reloader"} 10.67578125
{container="thanos-query"} 60.05859375
{container="prometheus"} 1112.93359375
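The PromQL sort() above orders containers by how far usage deviates from the request, which makes over- and under-requested containers easy to spot at the two ends of the list. A hypothetical sketch of the same idea outside Prometheus, using a few abridged values from the query output:

```python
# Hypothetical sketch: per-container (usage - request) deltas in MiB,
# sorted the way the PromQL sort() does. Negative means the request is
# larger than observed usage; positive means usage exceeds the request.
deltas = {
    "prometheus-operator": -27.07,  # values abridged from the query output
    "thanos-sidecar": 3.62,
    "thanos-query": 60.06,
    "prometheus": 1112.93,
}

ranked = sorted(deltas.items(), key=lambda kv: kv[1])
for container, delta in ranked:
    flag = "request above usage" if delta < 0 else "usage above request"
    print(f"{container}: {delta:+.2f} MiB ({flag})")
```

The extremes of the sorted list are the candidates for request adjustments.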
# for i in $(kubectl -n openshift-monitoring get pod --no-headers | awk '{print $1}'); do echo $i; kubectl -n openshift-monitoring get pod $i -o go-template='{{range.spec.containers}}{{"Container Name: "}}{{.name}}{{"\r\nresources: "}}{{.resources}}{{"\n"}}{{end}}'; echo -e "\n"; done

alertmanager-main-0
Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:40Mi]]
Container Name: config-reloader  resources: map[requests:map[cpu:1m memory:10Mi]]
Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

alertmanager-main-1
Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:40Mi]]
Container Name: config-reloader  resources: map[requests:map[cpu:1m memory:10Mi]]
Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

alertmanager-main-2
Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:40Mi]]
Container Name: config-reloader  resources: map[requests:map[cpu:1m memory:10Mi]]
Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

cluster-monitoring-operator-fdb9d949c-vkl5q
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: cluster-monitoring-operator  resources: map[requests:map[cpu:10m memory:75Mi]]

grafana-7bb7f88d68-7ks6f
Container Name: grafana  resources: map[requests:map[cpu:4m memory:64Mi]]
Container Name: grafana-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

kube-state-metrics-69cc98557f-stb24
Container Name: kube-state-metrics  resources: map[requests:map[cpu:2m memory:80Mi]]
Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:1m memory:15Mi]]

node-exporter-2v86g
Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:32Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]

node-exporter-427b8
Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:32Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]

node-exporter-5whz5
Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:32Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]

node-exporter-9r2bz
Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:32Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]

node-exporter-khtd6
Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:32Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]

node-exporter-psxqr
Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:32Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]

openshift-state-metrics-5f54b4ff58-w674d
Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: openshift-state-metrics  resources: map[requests:map[cpu:1m memory:32Mi]]

prometheus-adapter-6cb7687895-9bfn8
Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:40Mi]]

prometheus-adapter-6cb7687895-sppwt
Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:40Mi]]

prometheus-k8s-0
Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
Container Name: config-reloader  resources: map[requests:map[cpu:1m memory:10Mi]]
Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:25Mi]]
Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: kube-rbac-proxy-thanos  resources: map[requests:map[cpu:1m memory:10Mi]]

prometheus-k8s-1
Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
Container Name: config-reloader  resources: map[requests:map[cpu:1m memory:10Mi]]
Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:25Mi]]
Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: kube-rbac-proxy-thanos  resources: map[requests:map[cpu:1m memory:10Mi]]

prometheus-operator-fd77ffdd8-6brvp
Container Name: prometheus-operator  resources: map[requests:map[cpu:5m memory:150Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]

telemeter-client-5657ccddfb-74fhr
Container Name: telemeter-client  resources: map[requests:map[cpu:1m memory:40Mi]]
Container Name: reload  resources: map[requests:map[cpu:1m memory:10Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

thanos-querier-db74d4959-74v2r
Container Name: thanos-query  resources: map[requests:map[cpu:10m memory:12Mi]]
Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: kube-rbac-proxy-rules  resources: map[requests:map[cpu:1m memory:15Mi]]

thanos-querier-db74d4959-lbqbv
Container Name: thanos-query  resources: map[requests:map[cpu:10m memory:12Mi]]
Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:15Mi]]
Container Name: kube-rbac-proxy-rules  resources: map[requests:map[cpu:1m memory:15Mi]]
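Comparing the requests listed above against raw byte metrics requires converting Kubernetes quantity strings into base units. A hypothetical helper (not part of the operator or kubectl) for the suffixes that appear in this output:

```python
# Hypothetical helper: convert the Kubernetes resource quantities seen above
# ("100Mi", "1Gi", "70m") into base units so requests can be compared
# directly with byte-valued Prometheus metrics. Only the suffixes appearing
# in this bug's output are handled; the real Kubernetes parser covers more.
SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}

def parse_quantity(q: str) -> float:
    """Return bytes for binary-suffixed memory, cores for CPU ('m' = millicores)."""
    for suffix, factor in SUFFIXES.items():
        if q.endswith(suffix):
            return float(q[: -len(suffix)]) * factor
    if q.endswith("m"):  # millicores, e.g. "70m" -> 0.07 cores
        return float(q[:-1]) / 1000
    return float(q)

print(parse_quantity("25Mi"))  # thanos-sidecar memory request, in bytes
```

Note that binary suffixes are checked before the bare "m" suffix, so "100Mi" is parsed as mebibytes rather than millicores.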
Hi Junqi, the way to calculate the correct memory request is actually documented here: https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#resources-and-limits. The guidelines say that the requested memory should be 10% higher than the 90th percentile of memory usage during a CI run. I went ahead and made a PR with the actual query that calculates the discrepancy: https://github.com/openshift/enhancements/pull/788/files

When I ran the query after the adjustments, the difference between requested and used memory was within a 20Mi range. Do you think you can do your verification with this query as well?
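The sizing rule from the CONVENTIONS.md guideline quoted above can be sketched as follows. The usage samples are made up for illustration; a real run would feed in memory-usage metrics collected during CI.

```python
# Sketch of the guideline: request = 90th percentile of observed memory
# usage, plus 10% headroom. Sample data is hypothetical.
def recommended_request_mib(samples_mib: list) -> float:
    ordered = sorted(samples_mib)
    # nearest-rank 90th percentile (a simple approximation)
    p90 = ordered[int(0.9 * (len(ordered) - 1))]
    return p90 * 1.1

usage = [18.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 28.0, 40.0]
print(f"recommended request: {recommended_request_mib(usage):.1f} MiB")
```

Using the percentile rather than the maximum keeps a one-off spike (the 40.0 sample here) from inflating the request for every cluster.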
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438