The following pods run in the BestEffort QoS class with no resource requests:

openshift-monitoring/kube-state-metrics
openshift-monitoring/prometheus-adapter
openshift-monitoring/prometheus-k8s
openshift-monitoring/prometheus-operator

https://github.com/openshift/origin/pull/22787

This can cause eviction, OOMKilling, and CPU starvation. Please add the following resource requests to the pods in this component:

Memory:
  kube-state-metrics: 120Mi
  prometheus-adapter: 50Mi
  prometheus-k8s: 1Gi
  prometheus-operator: 100Mi

CPU:
  prometheus-k8s: 200m
  all others: 10m
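For illustration only, a minimal sketch of what such a request looks like on a pod spec, using the prometheus-k8s values proposed above. The container name and image are placeholders; in practice these requests are set through the cluster-monitoring-operator manifests rather than on a hand-written pod:

```yaml
# Hypothetical pod sketch showing the suggested requests for prometheus-k8s.
# Setting requests (without matching limits) moves the pod from BestEffort
# to the Burstable QoS class.
apiVersion: v1
kind: Pod
metadata:
  name: prometheus-k8s-example
  namespace: openshift-monitoring
spec:
  containers:
  - name: prometheus            # placeholder container name
    image: example.com/prometheus:placeholder
    resources:
      requests:
        cpu: 200m
        memory: 1Gi
```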
The resource usage of at least kube-state-metrics and prometheus-k8s depends heavily on cluster size. Should we still go ahead with these values to have something, and fix the rest eventually with autoscaling?
Yes. Literally any setting for requests is better than none at all. The vertical pod autoscaler (VPA) can help with this later.
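For reference, a VerticalPodAutoscaler object covering prometheus-k8s might look roughly like the sketch below. This is an assumption for illustration: the VPA CRDs are not part of the default install, and the target and update mode shown are not taken from this bug:

```yaml
# Hypothetical VPA sketch: lets the VPA recommender adjust prometheus-k8s
# requests based on observed usage. Requires the vertical-pod-autoscaler
# components and CRDs to be installed in the cluster.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: prometheus-k8s
  namespace: openshift-monitoring
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: prometheus-k8s
  updatePolicy:
    updateMode: "Auto"   # recreate pods with updated requests as needed
```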
Ack, I just wanted to clarify that. We'll take care of this. Thanks!
What PR(s) fixed this?
Nevermind, found it https://github.com/openshift/cluster-monitoring-operator/pull/356
qosClass for all of these pods is now Burstable; resources.requests.memory and resources.requests.cpu have been added for:

openshift-monitoring/kube-state-metrics
openshift-monitoring/prometheus-adapter
openshift-monitoring/prometheus-k8s
openshift-monitoring/prometheus-operator

payload: 4.2.0-0.nightly-2019-06-24-160709
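For context, the pod status after the change would show something like the excerpt below (the exact output is illustrative): once a container has non-zero requests that do not equal its limits, Kubernetes assigns the Burstable QoS class instead of BestEffort.

```yaml
# Illustrative excerpt of a monitoring pod's .status after the fix.
status:
  qosClass: Burstable
```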
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922
Follow-up work on the monitoring resource requests is tracked in bug 1905330.