Bug 1852767 - prometheus.retention does not take effect for UWM prometheus-user-workload pod
Summary: prometheus.retention does not take effect for UWM prometheus-user-workload pod
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Lili Cosic
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-01 09:10 UTC by Junqi Zhao
Modified: 2020-10-27 16:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:11:46 UTC
Target Upstream Version:


Attachments
prometheus crd file (401.29 KB, text/plain)
2020-07-01 11:55 UTC, Junqi Zhao


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:12:05 UTC

Description Junqi Zhao 2020-07-01 09:10:23 UTC
Description of problem:
Enabled User Workload Monitoring and created the user-workload-monitoring-config ConfigMap to set prometheus.retention to 48h, but the retention time is still 15d for the prometheus-user-workload pods.
Note: the default retention time for the prometheus-k8s StatefulSet in openshift-monitoring is also 15d.
# kubectl -n openshift-user-workload-monitoring get cm user-workload-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    prometheus:
      retention: 48h
kind: ConfigMap
metadata:
  creationTimestamp: "2020-07-01T08:26:31Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:config.yaml: {}
    manager: oc
    operation: Update
    time: "2020-07-01T08:26:31Z"
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
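
For anyone reproducing this, the ConfigMap above can be generated without the server-populated fields (creationTimestamp, managedFields). This is only a minimal sketch: the function name render_uwm_config and its single argument are made up for illustration, and only the keys shown in the dump above are used.

```shell
#!/bin/sh
# Print a minimal user-workload-monitoring-config ConfigMap with the given
# Prometheus retention value. The function name and its argument are
# illustrative; the keys mirror the ConfigMap dumped above.
render_uwm_config() {
  retention="$1"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      retention: ${retention}
EOF
}

# Usage (needs cluster access, so it is left commented out):
#   render_uwm_config 48h | oc apply -f -
```

Piping the rendered manifest to `oc apply -f -` keeps the retention value in one place, which makes it easier to retest with a different duration.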

# for i in $(kubectl -n openshift-user-workload-monitoring get sts --no-headers | awk '{print $1}'); do echo $i; kubectl -n openshift-user-workload-monitoring get sts $i -oyaml | grep -i retention -A1; done
prometheus-user-workload
        - --storage.tsdb.retention.time=15d
        - --web.enable-lifecycle
thanos-ruler-user-workload
        - --tsdb.retention=24h
        - --label=thanos_ruler_replica="$(POD_NAME)"

#  for i in $(kubectl -n openshift-user-workload-monitoring get pod | grep prometheus-user-workload | awk '{print $1}'); do echo $i; kubectl -n openshift-user-workload-monitoring get pod  $i -oyaml | grep -i retention -A1; done
prometheus-user-workload-0
    - --storage.tsdb.retention.time=15d
    - --web.enable-lifecycle
prometheus-user-workload-1
    - --storage.tsdb.retention.time=15d
    - --web.enable-lifecycle

# kubectl -n openshift-monitoring get sts prometheus-k8s  -oyaml | grep -i retention -A1
        - --storage.tsdb.retention.time=15d
        - --web.enable-lifecycle
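
The grep checks above can be wrapped so that the expected and actual values are compared explicitly. The following is a rough sketch, not part of the original report: retention_from_manifest and check_retention are hypothetical helper names, and the helpers only parse the --storage.tsdb.retention.time argument out of YAML read from stdin.

```shell
#!/bin/sh
# Read a manifest on stdin and print the value of the first
# --storage.tsdb.retention.time argument found, if any.
# (Hypothetical helper name, for illustration only.)
retention_from_manifest() {
  sed -n 's/.*--storage\.tsdb\.retention\.time=\([^[:space:]"]*\).*/\1/p' | head -n 1
}

# check_retention EXPECTED: read a manifest on stdin and return 0 only if
# its retention argument equals EXPECTED. (Hypothetical helper name.)
check_retention() {
  expected="$1"
  actual="$(retention_from_manifest)"
  if [ "$actual" = "$expected" ]; then
    echo "OK: retention is $actual"
  else
    echo "MISMATCH: expected $expected, got ${actual:-<unset>}"
    return 1
  fi
}

# Usage (needs cluster access, so it is left commented out):
#   oc -n openshift-user-workload-monitoring get sts prometheus-user-workload -oyaml \
#     | check_retention 48h
```

On the affected nightly this check would be expected to report a mismatch (15d instead of 48h), matching the output shown above.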



Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-06-30-000342

How reproducible:
always

Steps to Reproduce:
1. See the description above.

Actual results:


Expected results:


Additional info:

Comment 6 Junqi Zhao 2020-07-01 11:55:25 UTC
Created attachment 1699477 [details]
prometheus crd file

Comment 8 Junqi Zhao 2020-07-01 12:07:32 UTC
# oc -n openshift-user-workload-monitoring get prometheus user-workload -oyaml | grep retention
  retention: 15d

Comment 10 Lili Cosic 2020-07-30 14:46:55 UTC
https://github.com/openshift/cluster-monitoring-operator/pull/839 The PR has been merged; I forgot to link it earlier.

Comment 12 Junqi Zhao 2020-08-03 08:49:39 UTC
Tested with 4.6.0-0.nightly-2020-08-02-091622; the issue is fixed. For the verification steps, see Comment 0.
# oc -n openshift-user-workload-monitoring get prometheus/user-workload -oyaml | grep -i retention -A1
  retention: 48h
  ruleNamespaceSelector: {}
# for i in $(oc -n openshift-user-workload-monitoring get pod | grep prometheus-user-workload | awk '{print $1}'); do echo $i; oc -n openshift-user-workload-monitoring get pod  $i -oyaml | grep -i retention -A1; done
prometheus-user-workload-0
    - --storage.tsdb.retention.time=48h
    - --web.enable-lifecycle
prometheus-user-workload-1
    - --storage.tsdb.retention.time=48h
    - --web.enable-lifecycle
# oc -n openshift-user-workload-monitoring get sts/prometheus-user-workload -oyaml | grep -i retention -A1
        - --storage.tsdb.retention.time=48h
        - --web.enable-lifecycle

Comment 15 errata-xmlrpc 2020-10-27 16:11:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

