Description of problem:
Prometheus operator is deploying the StatefulSet and setting the memory limit too small on the rules-configmap-reloader container.

Version-Release number of selected component (if applicable):
OpenShift 4.1.8
Prometheus Operator 0.27.0
Prometheus v2.7.1

How reproducible:
100%

Steps to Reproduce:
1. Install the Prometheus operator from OperatorHub in the OpenShift UI
2. Deploy a Prometheus instance with defaults

Actual results:
The rules-configmap-reloader container in prometheus-my-prometheus-0|1 doesn't come up.

Expected results:
The prometheus-my-prometheus-0|1 pods should fully start.

Additional info:
Relevant section of the StatefulSet created by the Prometheus Operator:

  - name: rules-configmap-reloader
    image: >-
      quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057
    args:
      - '--webhook-url=http://localhost:9090/-/reload'
      - >-
        --volume-dir=/etc/prometheus/rules/prometheus-my-prometheus-rulefiles-0
    resources:
      limits:
        cpu: 25m
        memory: 10Mi

oc describe po prometheus-my-prometheus-0
...
Events:
  Type     Reason     Age                    From                                                       Message
  ----     ------     ----                   ----                                                       -------
  Normal   Scheduled  3m15s                  default-scheduler                                          Successfully assigned anewtest/prometheus-my-prometheus-0 to ip-10-0-157-150.ap-southeast-1.compute.internal
  Normal   Pulling    3m6s                   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Pulling image "quay.io/prometheus/prometheus:v2.7.1"
  Normal   Pulled     2m52s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Successfully pulled image "quay.io/prometheus/prometheus:v2.7.1"
  Normal   Pulling    2m51s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Pulling image "quay.io/coreos/prometheus-config-reloader@sha256:61b7969fd1336fd4bbac0622f9e4281f2a2b4ae02ad55748b6fdc65e2be69b73"
  Normal   Pulling    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Pulling image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057"
  Normal   Started    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Started container prometheus-config-reloader
  Normal   Pulled     2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Successfully pulled image "quay.io/coreos/prometheus-config-reloader@sha256:61b7969fd1336fd4bbac0622f9e4281f2a2b4ae02ad55748b6fdc65e2be69b73"
  Normal   Created    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Created container prometheus-config-reloader
  Normal   Started    2m31s (x2 over 2m51s)  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Started container prometheus
  Normal   Created    2m31s (x2 over 2m51s)  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Created container prometheus
  Normal   Pulled     2m31s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Successfully pulled image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057"
  Normal   Pulled     2m31s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Container image "quay.io/prometheus/prometheus:v2.7.1" already present on machine
  Warning  Failed     112s (x6 over 2m31s)   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Error: set memory limit 10485760 too low; should be at least 12582912
  Normal   Pulled     100s (x6 over 2m31s)   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Container image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057" already present on machine
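For reference, a minimal Prometheus CR along these lines exercises the defaults path from step 2. The name and namespace are taken from the pod names and events above; the serviceAccountName and the empty selector are assumptions, and no resources are specified so the operator's default limits apply:

  apiVersion: monitoring.coreos.com/v1
  kind: Prometheus
  metadata:
    name: my-prometheus
    namespace: anewtest
  spec:
    replicas: 2
    serviceAccountName: prometheus   # assumption: any service account usable by the operator
    serviceMonitorSelector: {}       # no resource requests/limits set, so operator defaults apply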
Also, you can fix the StatefulSet and set the limit to something like 15Mi, and the operator does not revert the change, which I would expect it to do when a user modifies an object that the operator created and manages.
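For anyone hitting this before a fixed operator version is available, a manual patch along these lines is one way to apply that workaround; it assumes rules-configmap-reloader is the third container in the pod template (index 2), so adjust the index if your StatefulSet differs:

  oc -n anewtest patch statefulset prometheus-my-prometheus --type=json \
    -p='[{"op": "replace", "path": "/spec/template/spec/containers/2/resources/limits/memory", "value": "15Mi"}]'

As noted above, the operator did not revert the edit, but since the StatefulSet is operator-managed this is only a stopgap until the operator itself is fixed.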
This issue was resolved in Prometheus Operator 0.29, where the `rules-configmap-reloader` container now defaults to a memory limit of 25Mi. The latest version in OperatorHub is 0.27. I will check with the OLM team to see if we can get this bumped to at least 0.29 in the Hub. Matt
Moving to MODIFIED based on Matt's comment.
Spoke to Sergiusz last week - we are going to get the version of Prometheus in OLM bumped. I will update this thread once complete. Matt
PR here: https://github.com/operator-framework/community-operators/pull/667
It's working for me now, thanks for pushing the fix!
Verified that it's working as expected on Prometheus Operator version 0.32.

Steps used to verify:
1. Provisioned a 4.1 cluster with the 4.1.0-0.nightly-2019-10-03-210327 payload
2. Subscribed to the Prometheus Operator and created a Prometheus CR
3. Checked the pod limits and confirmed that the pods are healthy

Limits:

  - resources:
      limits:
        cpu: 100m
        memory: 25Mi
      requests:
        cpu: 100m
        memory: 25Mi
    terminationMessagePath: /dev/termination-log
    name: rules-configmap-reloader
    securityContext:

oc get pods -n test
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-example-0                   3/3     Running   1          4m29s
prometheus-example-1                   3/3     Running   1          4m28s
prometheus-operator-57854c46d8-rsnmg   1/1     Running   0          4m46s

oc get events -n test | grep rules-configmap-reloader
90s   Normal   Created   pod/prometheus-example-1   Created container rules-configmap-reloader
90s   Normal   Started   pod/prometheus-example-1   Started container rules-configmap-reloader
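If it helps with re-verification, the limit can also be read directly from the generated StatefulSet; the object name prometheus-example and namespace test come from the output above, and the jsonpath filter is just one way to pull the field:

  oc -n test get statefulset prometheus-example \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="rules-configmap-reloader")].resources.limits.memory}'

which should print 25Mi here.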
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062