Bug 1813221
Summary: Default openshift install requests too many CPU resources to install all components; requests of components on cluster are wrong

Product: OpenShift Container Platform
Component: Monitoring
Version: 4.4
Target Release: 4.4.z
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Status: CLOSED ERRATA
Reporter: Sergiusz Urbaniak <surbania>
Assignee: Sergiusz Urbaniak <surbania>
QA Contact: Junqi Zhao <juzhao>
CC: adahiya, alegrand, anpicker, bbennett, ccoleman, erooth, esimard, jialiu, juzhao, kakkoyun, lcosic, mloibl, pkrupa, rphillips, spasquie, surbania, xiuwang
Doc Type: If docs needed, set a value
Clone Of: 1812719
Bug Depends On: 1812719
Bug Blocks: 1818806
Last Closed: 2020-07-21 10:31:05 UTC
Comment 4
Junqi Zhao
2020-04-10 06:51:35 UTC
Tested with 4.4.0-0.nightly-2020-04-16-205909; the result is the same as Comment 4. https://github.com/openshift/openshift-state-metrics/pull/46/files is already in this build, since it was merged in https://openshift-release.svc.ci.openshift.org//releasestream/4.4.0-0.nightly/release/4.4.0-0.nightly-2020-04-16-162058 and later builds.

"cpu request for config-reloader container of alertmanager pod is 100m, it is too big, same for openshift-state-metrics container of openshift-state-metrics pod" -> we also need to fix this issue. I think we should do the same for 4.4 as in 4.5; see the 4.5 fix: https://github.com/openshift/cluster-monitoring-operator/pull/755/files

# for i in $(kubectl -n openshift-monitoring get pod --no-headers | awk '{print $1}'); do echo $i; kubectl -n openshift-monitoring get pod $i -o go-template='{{range.spec.containers}}{{"Container Name: "}}{{.name}}{{"\r\nresources: "}}{{.resources}}{{"\n"}}{{end}}'; echo -e "\n"; done

alertmanager-main-0
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
alertmanager-main-1
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
alertmanager-main-2
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
cluster-monitoring-operator-5976bcb888-phhz4
  Container Name: cluster-monitoring-operator  resources: map[requests:map[cpu:10m memory:50Mi]]
grafana-746fcd5f5d-gwqsq
  Container Name: grafana  resources: map[requests:map[cpu:4m memory:100Mi]]
  Container Name: grafana-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
kube-state-metrics-749ff87465-rv85m
  Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:1m memory:40Mi]]
  Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:1m memory:40Mi]]
  Container Name: kube-state-metrics  resources: map[requests:map[cpu:2m memory:40Mi]]
node-exporter-gcmz2
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-hfvlm
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-k48pg
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-nn7r2
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-p88s8
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-wcjgl
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
openshift-state-metrics-5b7d864ff9-9qlt8
  Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:10m memory:20Mi]]
  Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:10m memory:20Mi]]
  Container Name: openshift-state-metrics  resources: map[requests:map[cpu:100m memory:150Mi]]
prometheus-adapter-864547d858-gnh6l
  Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-adapter-864547d858-nkz8h
  Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-k8s-0
  Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
  Container Name: prometheus-config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: rules-configmap-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:100Mi]]
  Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-k8s-1
  Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
  Container Name: prometheus-config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: rules-configmap-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:100Mi]]
  Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-operator-5b86546448-d7jwg
  Container Name: prometheus-operator  resources: map[requests:map[cpu:5m memory:60Mi]]
telemeter-client-88f5d5c5f-pp9th
  Container Name: telemeter-client  resources: map[requests:map[cpu:1m]]
  Container Name: reload  resources: map[requests:map[cpu:1m]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
thanos-querier-6b44f48967-97wq6
  Container Name: thanos-querier  resources: map[requests:map[cpu:5m memory:12Mi]]
  Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
thanos-querier-6b44f48967-z2rgh
  Container Name: thanos-querier  resources: map[requests:map[cpu:5m memory:12Mi]]
  Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

Tested with 4.4.0-0.nightly-2020-05-07-223059; the cpu request for the config-reloader container of the alertmanager pods is still 100m and should be reduced. The other settings are fine.

# for i in $(kubectl -n openshift-monitoring get pod --no-headers | awk '{print $1}'); do echo $i; kubectl -n openshift-monitoring get pod $i -o go-template='{{range.spec.containers}}{{"Container Name: "}}{{.name}}{{"\r\nresources: "}}{{.resources}}{{"\n"}}{{end}}'; echo -e "\n"; done

alertmanager-main-0
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
alertmanager-main-1
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
alertmanager-main-2
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
cluster-monitoring-operator-587677f6dd-gxcf4
  Container Name: cluster-monitoring-operator  resources: map[requests:map[cpu:10m memory:50Mi]]
grafana-7f56fbfb44-vjmkd
  Container Name: grafana  resources: map[requests:map[cpu:4m memory:100Mi]]
  Container Name: grafana-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
kube-state-metrics-56c6d4b486-2dg4x
  Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:1m memory:40Mi]]
  Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:1m memory:40Mi]]
  Container Name: kube-state-metrics  resources: map[requests:map[cpu:2m memory:40Mi]]
node-exporter-6zvft
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-87xzt
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-9nxhf
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-9q66q
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-xcrg2
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-xs8hd
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
openshift-state-metrics-65479646c9-r7cdv
  Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: openshift-state-metrics  resources: map[requests:map[cpu:1m memory:150Mi]]
prometheus-adapter-754766cb98-2kbpp
  Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-adapter-754766cb98-vr2kk
  Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-k8s-0
  Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
  Container Name: prometheus-config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: rules-configmap-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:100Mi]]
  Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-k8s-1
  Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
  Container Name: prometheus-config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: rules-configmap-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:1m memory:25Mi]]
  Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:100Mi]]
  Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-operator-6cf47f8648-njqh6
  Container Name: prometheus-operator  resources: map[requests:map[cpu:5m memory:60Mi]]
telemeter-client-5fdbf4f9fd-76f49
  Container Name: telemeter-client  resources: map[requests:map[cpu:1m]]
  Container Name: reload  resources: map[requests:map[cpu:1m]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
thanos-querier-5fdf958d8c-9gt9q
  Container Name: thanos-querier  resources: map[requests:map[cpu:5m memory:12Mi]]
  Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
thanos-querier-5fdf958d8c-f9b4t
  Container Name: thanos-querier  resources: map[requests:map[cpu:5m memory:12Mi]]
  Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

*** Bug 1812999 has been marked as a duplicate of this bug. ***

Did not find the fix, moving back to MODIFIED. Tested with 4.4.0-0.nightly-2020-06-02-202425; the cpu request for the config-reloader container of the alertmanager pod is still 100m:

alertmanager-main-0
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

Please don't move this back to MODIFIED until additional fixes have been merged.

https://github.com/openshift/prometheus-operator/pull/74 is in https://openshift-release.svc.ci.openshift.org/releasestream/4.4.0-0.nightly/release/4.4.0-0.nightly-2020-06-19-230820 and later builds.

Tested with 4.4.0-0.nightly-2020-06-21-210301; the cpu request for the config-reloader container of the alertmanager pod is still 100m:

alertmanager-main-0
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

The reason is that if we only specify resources.limits, resources.requests is defaulted to the same values as resources.limits:

# oc -n openshift-monitoring get sts/alertmanager-main -oyaml
...
        name: config-reloader
        resources:
          limits:
            cpu: 100m
            memory: 25Mi
...

Created attachment 1698244 [details]
alertmanager-main statefulset yaml file
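The defaulting behavior described above can be sketched with a minimal, hypothetical container spec (the container name matches the report, but the image and surrounding fields are placeholders, not the actual alertmanager-main manifest):

```yaml
# Illustrative snippet only. When a container declares only resources.limits,
# the Kubernetes API server defaults resources.requests to the same values,
# so the stored pod spec ends up requesting cpu: 100m / memory: 25Mi too.
containers:
- name: config-reloader
  image: example.com/config-reloader:latest   # placeholder image
  resources:
    limits:
      cpu: 100m
      memory: 25Mi
    # no requests block given -> requests default to the limits above
```

This is why lowering the request required more than removing it from the manifest: either the limit had to be reduced or a smaller request had to be set explicitly.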
*** Bug 1854002 has been marked as a duplicate of this bug. ***

The issue is fixed with 4.4.0-0.nightly-2020-07-08-233114:

# for i in $(kubectl -n openshift-monitoring get pod --no-headers | awk '{print $1}'); do echo $i; kubectl -n openshift-monitoring get pod $i -o go-template='{{range.spec.containers}}{{"Container Name: "}}{{.name}}{{"\r\nresources: "}}{{.resources}}{{"\n"}}{{end}}'; echo -e "\n"; done

alertmanager-main-0
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[requests:map[cpu:1m]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
alertmanager-main-1
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[requests:map[cpu:1m]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
alertmanager-main-2
  Container Name: alertmanager  resources: map[requests:map[cpu:4m memory:200Mi]]
  Container Name: config-reloader  resources: map[requests:map[cpu:1m]]
  Container Name: alertmanager-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
cluster-monitoring-operator-6f5d96f5bb-zkj8l
  Container Name: cluster-monitoring-operator  resources: map[requests:map[cpu:10m memory:50Mi]]
grafana-85d57fb957-cxfvj
  Container Name: grafana  resources: map[requests:map[cpu:4m memory:100Mi]]
  Container Name: grafana-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
kube-state-metrics-654fb85db9-cxrg2
  Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:1m memory:40Mi]]
  Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:1m memory:40Mi]]
  Container Name: kube-state-metrics  resources: map[requests:map[cpu:2m memory:40Mi]]
node-exporter-74gxn
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-7t6bx
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-9sqfg
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-qwpk6
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-r8c7p
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
node-exporter-x4z98
  Container Name: node-exporter  resources: map[requests:map[cpu:8m memory:180Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:30Mi]]
openshift-state-metrics-6776b7d69c-tzxlw
  Container Name: kube-rbac-proxy-main  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy-self  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: openshift-state-metrics  resources: map[requests:map[cpu:1m memory:150Mi]]
prometheus-adapter-578dfd9cd4-fvccl
  Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-adapter-578dfd9cd4-wn649
  Container Name: prometheus-adapter  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-k8s-0
  Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
  Container Name: prometheus-config-reloader  resources: map[requests:map[cpu:1m]]
  Container Name: rules-configmap-reloader  resources: map[requests:map[cpu:1m]]
  Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:100Mi]]
  Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-k8s-1
  Container Name: prometheus  resources: map[requests:map[cpu:70m memory:1Gi]]
  Container Name: prometheus-config-reloader  resources: map[requests:map[cpu:1m]]
  Container Name: rules-configmap-reloader  resources: map[requests:map[cpu:1m]]
  Container Name: thanos-sidecar  resources: map[requests:map[cpu:1m memory:100Mi]]
  Container Name: prometheus-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
prometheus-operator-b664c969-zkwt7
  Container Name: prometheus-operator  resources: map[requests:map[cpu:5m memory:60Mi]]
telemeter-client-8f694888-qvh7r
  Container Name: telemeter-client  resources: map[requests:map[cpu:1m]]
  Container Name: reload  resources: map[requests:map[cpu:1m]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
thanos-querier-f87789589-mjkp6
  Container Name: thanos-querier  resources: map[requests:map[cpu:5m memory:12Mi]]
  Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
thanos-querier-f87789589-r7xtv
  Container Name: thanos-querier  resources: map[requests:map[cpu:5m memory:12Mi]]
  Container Name: oauth-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: kube-rbac-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]
  Container Name: prom-label-proxy  resources: map[requests:map[cpu:1m memory:20Mi]]

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2913
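Totaling the per-container CPU requests in listings like the ones above by eye is tedious. The following Python sketch (illustrative only, not part of the bug report or the fix) sums `spec.containers[].resources.requests.cpu` over the JSON structure that `kubectl get pods -o json` returns; the quantity parsing covers only the suffixes that appear in the output above:

```python
def parse_cpu(q: str) -> float:
    """Return a CPU quantity in cores: '100m' -> 0.1, '2' -> 2.0."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_memory(q: str) -> int:
    """Return a memory quantity in bytes: '25Mi' -> 26214400, '1Gi' -> 1073741824."""
    units = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}
    for suffix, mult in units.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * mult
    return int(q)  # plain byte count, no suffix

def total_cpu_requests(pod_list: dict) -> float:
    """Sum spec.containers[].resources.requests.cpu (in cores) over a
    pod list dict, i.e. the structure `kubectl get pods -o json` prints."""
    total = 0.0
    for pod in pod_list.get("items", []):
        for container in pod["spec"]["containers"]:
            cpu = container.get("resources", {}).get("requests", {}).get("cpu")
            if cpu:  # containers without a cpu request contribute nothing
                total += parse_cpu(cpu)
    return total
```

For example, feeding it the parsed output of `kubectl -n openshift-monitoring get pods -o json` would show the namespace-wide CPU request dropping once the 100m config-reloader request became 1m.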