Bug 1735691

Summary: [OperatorHub]Prometheus operator creating memory limit too small
Product: OpenShift Container Platform
Component: OLM
OLM sub component: OperatorHub
Version: 4.1.z
Keywords: Regression
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Target Milestone: ---
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Reporter: nate stephany <nstephan>
Assignee: Sergiusz Urbaniak <surbania>
QA Contact: Bruno Andrade <bandrade>
CC: alegrand, anpicker, bandrade, bturner, ecordell, erooth, jfan, jiazha, jreimann, lcosic, mdorn, mloibl, pkrupa, rkshirsa, surbania
Bug Blocks: 1750605
Last Closed: 2020-01-23 11:04:29 UTC
Type: Bug

Description nate stephany 2019-08-01 09:26:45 UTC
Description of problem:
The Prometheus operator deploys a StatefulSet in which the memory limit on the rules-configmap-reloader container is too small (10Mi), so the container cannot start.

Version-Release number of selected component (if applicable):
OpenShift 4.1.8
Prometheus Operator 0.27.0
Prometheus v2.7.1


How reproducible:
100%


Steps to Reproduce:
1. Install Prometheus operator from OperatorHub in OpenShift UI
2. Deploy Prometheus instance with defaults


Actual results:
rules-configmap-reloader container in prometheus-my-prometheus-0|1 doesn't come up


Expected results:
prometheus-my-prometheus-0|1 pods should fully start


Additional info:
Relevant section of StatefulSet created by Prometheus Operator:

- name: rules-configmap-reloader
  image: >-
    quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057
  args:
    - '--webhook-url=http://localhost:9090/-/reload'
    - >-
      --volume-dir=/etc/prometheus/rules/prometheus-my-prometheus-rulefiles-0
  resources:
    limits:
      cpu: 25m
      memory: 10Mi


oc describe po prometheus-my-prometheus-0
...
Events:
  Type     Reason     Age                    From                                                      Message
  ----     ------     ----                   ----                                                      -------
  Normal   Scheduled  3m15s                  default-scheduler                                         Successfully assigned anewtest/prometheus-my-prometheus-0 to ip-10-0-157-150.ap-southeast-1.compute.internal
  Normal   Pulling    3m6s                   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Pulling image "quay.io/prometheus/prometheus:v2.7.1"
  Normal   Pulled     2m52s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Successfully pulled image "quay.io/prometheus/prometheus:v2.7.1"
  Normal   Pulling    2m51s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Pulling image "quay.io/coreos/prometheus-config-reloader@sha256:61b7969fd1336fd4bbac0622f9e4281f2a2b4ae02ad55748b6fdc65e2be69b73"
  Normal   Pulling    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Pulling image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057"
  Normal   Started    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Started container prometheus-config-reloader
  Normal   Pulled     2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Successfully pulled image "quay.io/coreos/prometheus-config-reloader@sha256:61b7969fd1336fd4bbac0622f9e4281f2a2b4ae02ad55748b6fdc65e2be69b73"
  Normal   Created    2m41s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Created container prometheus-config-reloader
  Normal   Started    2m31s (x2 over 2m51s)  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Started container prometheus
  Normal   Created    2m31s (x2 over 2m51s)  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Created container prometheus
  Normal   Pulled     2m31s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Successfully pulled image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057"
  Normal   Pulled     2m31s                  kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Container image "quay.io/prometheus/prometheus:v2.7.1" already present on machine
  Warning  Failed     112s (x6 over 2m31s)   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Error: set memory limit 10485760 too low; should be at least 12582912
  Normal   Pulled     100s (x6 over 2m31s)   kubelet, ip-10-0-157-150.ap-southeast-1.compute.internal  Container image "quay.io/coreos/configmap-reload@sha256:e2fd60ff0ae4500a75b80ebaa30e0e7deba9ad107833e8ca53f0047c42c5a057" already present on machine
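For reference, the two byte counts in the "Failed" event correspond to 10Mi and 12Mi. The following is a minimal sketch of that arithmetic, not part of the operator or the kubelet; the helper names (`quantity_to_bytes`, `limit_is_sufficient`) are hypothetical, and only the binary suffixes Ki, Mi, and Gi are handled.

```python
# Binary suffixes used by Kubernetes resource quantities (subset only).
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}

# Minimum taken from the kubelet error message in the events above.
MIN_LIMIT_BYTES = 12582912


def quantity_to_bytes(quantity):
    """Convert a quantity such as '10Mi' to bytes; plain integers pass through."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def limit_is_sufficient(quantity, minimum=MIN_LIMIT_BYTES):
    """Return True if the limit meets the minimum reported in the error."""
    return quantity_to_bytes(quantity) >= minimum


if __name__ == "__main__":
    for q in ("10Mi", "12Mi", "25Mi"):
        print(q, quantity_to_bytes(q), limit_is_sufficient(q))
```

Under these assumptions, the 10Mi limit set by the operator falls 2Mi short of the reported minimum, while any limit of 12Mi or more would pass this particular check.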

Comment 1 nate stephany 2019-08-01 09:36:22 UTC
Also, you can work around this by editing the StatefulSet and setting the limit to something like 15Mi. The operator does not revert the change, although I would expect it to when a user modifies an object that the operator created and manages.
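A rough sketch of the manual workaround described in this comment, expressed as a strategic merge patch body. The function name `build_memory_limit_patch` is hypothetical; the container name comes from the StatefulSet excerpt above, and 15Mi is the value suggested in the comment.

```python
import json


def build_memory_limit_patch(container_name, memory_limit):
    """Build a strategic merge patch that sets one container's memory limit.

    Strategic merge patches match entries in the containers list by name,
    so only the named container is changed.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            "resources": {"limits": {"memory": memory_limit}},
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    patch = build_memory_limit_patch("rules-configmap-reloader", "15Mi")
    print(json.dumps(patch))
```

The printed JSON could then be supplied to something like `oc patch statefulset prometheus-my-prometheus -n anewtest --type=strategic -p '<json>'`. The StatefulSet name is inferred from the pod name and the namespace from the events above, so both are assumptions. This is a stopgap only; the fix discussed later in this bug is the operator version bump.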

Comment 4 Matt Dorn 2019-08-20 17:50:10 UTC
This issue was resolved in Prometheus Operator 0.29, where the `rules-configmap-reloader` container now defaults to a memory limit of 25Mi.

The latest version in OperatorHub is 0.27. I will check with the OLM team to see if we can get this bumped to at least 0.29 in the Hub.

Matt

Comment 5 Evan Cordell 2019-08-27 14:25:11 UTC
Moving to modified based on Matt's comment

Comment 6 Matt Dorn 2019-08-27 14:58:11 UTC
Spoke to Sergiusz last week - we are going to get the version of Prometheus in OLM bumped. I will update this thread once complete.

Matt

Comment 9 Ben Turner 2019-09-30 22:40:40 UTC
It's working for me now, thanks for pushing the fix!

Comment 11 Bruno Andrade 2019-10-04 15:58:44 UTC
Verified that it's working as expected on Prometheus Operator version 0.32.

Steps used to reproduce:
1. Provisioned a 4.1 cluster with the 4.1.0-0.nightly-2019-10-03-210327 payload.
2. Subscribed to the Prometheus Operator and created a Prometheus CR.
3. Checked the pod limits and confirmed that the pods are healthy.

Limits:
    - resources:
        limits:
          cpu: 100m
          memory: 25Mi
        requests:
          cpu: 100m
          memory: 25Mi
      terminationMessagePath: /dev/termination-log
      name: rules-configmap-reloader
      securityContext:

oc get pods -n test
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-example-0                   3/3     Running   1          4m29s
prometheus-example-1                   3/3     Running   1          4m28s
prometheus-operator-57854c46d8-rsnmg   1/1     Running   0          4m46s


oc get events -n test | grep  rules-configmap-reloader
90s         Normal    Created               pod/prometheus-example-1                          Created container rules-configmap-reloader
90s         Normal    Started               pod/prometheus-example-1                          Started container rules-configmap-reloader
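The readiness check in step 3 could also be scripted. This is a minimal sketch that only parses the text of `oc get pods` as shown above; the function name `not_ready_pods` is hypothetical, and the parsing assumes the default column layout (NAME, READY, ...).

```python
def not_ready_pods(table):
    """Return names of pods whose READY column is not n/n."""
    pending = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        name, ready = fields[0], fields[1]
        ready_count, total = ready.split("/")
        if ready_count != total:
            pending.append(name)
    return pending


# Output copied from the verification run above.
SAMPLE = """\
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-example-0                   3/3     Running   1          4m29s
prometheus-example-1                   3/3     Running   1          4m28s
prometheus-operator-57854c46d8-rsnmg   1/1     Running   0          4m46s
"""

if __name__ == "__main__":
    print(not_ready_pods(SAMPLE))
```

An empty result means every listed pod has all of its containers ready, which matches the 3/3 state reported for both Prometheus pods.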

Comment 13 errata-xmlrpc 2020-01-23 11:04:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062