Description of problem:
Prometheus operator is deploying the StatefulSet and setting the memory limit too small on the rules-configmap-reloader container.

Version-Release number of selected component (if applicable):
OpenShift 4.1.8
Prometheus Operator 0.27.0
Prometheus v2.7.1

How reproducible:
100%

Steps to Reproduce:
1. Install the Prometheus operator from OperatorHub in the OpenShift UI
2. Deploy a Prometheus instance with defaults

Actual results:
The rules-configmap-reloader container in prometheus-my-prometheus-0|1 doesn't come up.

Expected results:
The prometheus-my-prometheus-0|1 pods should fully start.

Additional info:
Relevant section of the StatefulSet created by the Prometheus Operator:

  - name: rules-configmap-reloader
    image: >-
      quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057
    args:
      - '--webhook-url=http://localhost:9090/-/reload'
      - >-
        --volume-dir=/etc/prometheus/rules/prometheus-my-prometheus-rulefiles-0
    resources:
      limits:
        cpu: 25m
        memory: 10Mi

oc describe po prometheus-my-prometheus-0
...
Events:
  Type     Reason     Age                    From                                                       Message
  ----     ------     ----                   ----                                                       -------
  Normal   Scheduled  3m15s                  default-scheduler                                          Successfully assigned anewtest/prometheus-my-prometheus-0 to ip-10-0-157-150.ap-southeast-1.compute.internal
  Normal   Pulling    3m6s                   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Pulling image "quay.io/prometheus/prometheus:v2.7.1"
  Normal   Pulled     2m52s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Successfully pulled image "quay.io/prometheus/prometheus:v2.7.1"
  Normal   Pulling    2m51s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Pulling image "quay.io/coreos/prometheus-config-reloader@sha256:61b7969fd1336fd4bbac0622f9e4281f2a2b4ae02ad55748b6fdc65e2be69b73"
  Normal   Pulling    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Pulling image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057"
  Normal   Started    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Started container prometheus-config-reloader
  Normal   Pulled     2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Successfully pulled image "quay.io/coreos/prometheus-config-reloader@sha256:61b7969fd1336fd4bbac0622f9e4281f2a2b4ae02ad55748b6fdc65e2be69b73"
  Normal   Created    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Created container prometheus-config-reloader
  Normal   Started    2m31s (x2 over 2m51s)  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Started container prometheus
  Normal   Created    2m31s (x2 over 2m51s)  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Created container prometheus
  Normal   Pulled     2m31s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Successfully pulled image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057"
  Normal   Pulled     2m31s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Container image "quay.io/prometheus/prometheus:v2.7.1" already present on machine
  Warning  Failed     112s (x6 over 2m31s)   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Error: set memory limit 10485760 too low; should be at least 12582912
  Normal   Pulled     100s (x6 over 2m31s)   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal   Container image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057" already present on machine
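For reference, a minimal Prometheus CR along these lines exercises the defaults path from step 2. The name and namespace are taken from the pod names and events above; the serviceAccountName and the empty selector are assumptions, and no resources are specified so the operator's default limits apply:

  apiVersion: monitoring.coreos.com/v1
  kind: Prometheus
  metadata:
    name: my-prometheus
    namespace: anewtest
  spec:
    replicas: 2
    serviceAccountName: prometheus   # assumption: any service account usable by the operator
    serviceMonitorSelector: {}       # no resource requests/limits set, so operator defaults apply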
Also, you can fix the StatefulSet and set the limit to something like 15Mi, and the operator does not revert the change, which I would expect it to do when a user modifies an object that the operator created and manages.
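For anyone hitting this before a fixed operator version is available, a manual patch along these lines is one way to apply that workaround; it assumes rules-configmap-reloader is the third container in the pod template (index 2), so adjust the index if your StatefulSet differs:

  oc -n anewtest patch statefulset prometheus-my-prometheus --type=json \
    -p='[{"op": "replace", "path": "/spec/template/spec/containers/2/resources/limits/memory", "value": "15Mi"}]'

As noted above, the operator did not revert the edit, but since the StatefulSet is operator-managed this is only a stopgap until the operator itself is fixed.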
This issue was resolved in Prometheus Operator 0.29, where the `rules-configmap-reloader` container now defaults to a memory limit of 25Mi. The latest version in OperatorHub is 0.27. I will check with the OLM team to see if we can get this bumped to at least 0.29 in the Hub. Matt
Moving to MODIFIED based on Matt's comment.
Spoke to Sergiusz last week - we are going to get the version of Prometheus in OLM bumped. I will update this thread once complete. Matt
PR here: https://github.com/operator-framework/community-operators/pull/667
It's working for me now, thanks for pushing the fix!
Verified that it's working as expected on Prometheus Operator version 0.32.

Steps used to verify:
1. Provisioned a 4.1 cluster with the 4.1.0-0.nightly-2019-10-03-210327 payload
2. Subscribed to the Prometheus Operator and created a Prometheus CR
3. Checked the pod limits and confirmed that the pods are healthy

Limits:

  - resources:
      limits:
        cpu: 100m
        memory: 25Mi
      requests:
        cpu: 100m
        memory: 25Mi
    terminationMessagePath: /dev/termination-log
    name: rules-configmap-reloader
    securityContext:

oc get pods -n test
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-example-0                   3/3     Running   1          4m29s
prometheus-example-1                   3/3     Running   1          4m28s
prometheus-operator-57854c46d8-rsnmg   1/1     Running   0          4m46s

oc get events -n test | grep rules-configmap-reloader
90s   Normal   Created   pod/prometheus-example-1   Created container rules-configmap-reloader
90s   Normal   Started   pod/prometheus-example-1   Started container rules-configmap-reloader
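If it helps with re-verification, the limit can also be read directly from the generated StatefulSet; the object name prometheus-example and namespace test come from the output above, and the jsonpath filter is just one way to pull the field:

  oc -n test get statefulset prometheus-example \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="rules-configmap-reloader")].resources.limits.memory}'

which should print 25Mi here.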
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062