Bug 2051470 - prometheus: Add validations for relabel configs
Summary: prometheus: Add validations for relabel configs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Jayapriya Pai
QA Contact: Junqi Zhao
Docs Contact: Brian Burt
URL:
Whiteboard:
Depends On:
Blocks: 2060718
Reported: 2022-02-07 10:48 UTC by German Parente
Modified: 2022-10-18 03:29 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Before this update, prometheus-operator accepted invalid relabel configs. With this update, it validates the relabel configs passed to it.
Clone Of:
Clones: 2060718
Environment:
Last Closed: 2022-08-10 10:47:50 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-monitoring-operator pull 1571 | 0 | None | open | Bug 2051470: Update prometheus-operator and sync jsonnet | 2022-02-28 09:58:52 UTC
Github | openshift prometheus-operator pull 151 | 0 | None | Merged | [bot] Bump openshift/prometheus-operator to v0.54.0 | 2022-02-23 14:08:39 UTC
Github | openshift prometheus-operator pull 158 | 0 | None | Merged | [bot] Bump openshift/prometheus-operator to v0.54.1 | 2022-02-28 09:58:26 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:48:15 UTC

Description German Parente 2022-02-07 10:48:34 UTC
Description of problem:

This Bugzilla only tracks the following upstream feature for the downstream product:

https://github.com/prometheus-operator/prometheus-operator/pull/4429

It is fixed in the 0.54 release of prometheus-operator, and this BZ is intended to track the matching downstream version.


How reproducible:

Add a relabelings entry to a ServiceMonitor spec and omit targetLabel. Example:

From this:

spec:
  endpoints:
  - interval: 30s
    port: 8080-tcp
    scheme: http
    path: /actuator/prometheus
    relabelings:
    - action: replace
      regex: (.+)
      sourceLabels:
      - __meta_kubernetes_namespace
      targetLabel: namespace

Then remove targetLabel:

spec:
  endpoints:
  - interval: 30s
    port: 8080-tcp
    scheme: http
    path: /actuator/prometheus
    relabelings:
    - action: replace
      regex: (.+)
      sourceLabels:
      - __meta_kubernetes_namespace

Prometheus then logs this error when it reloads its configuration:

level=error ts=2022-01-31T09:52:55.924Z caller=main.go:729 msg="Error reloading config" err="couldn't load configuration (--config.file=\"/etc/prometheus/config_out/prometheus.env.yaml\"): parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: relabel configuration for replace action requires 'target_label' value"

As a result, no new configuration is loaded at all, because of a single invalid ServiceMonitor.
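
To illustrate the kind of check that avoids this failure mode, here is a minimal Python sketch. It is not the prometheus-operator code (which is written in Go); the function names `validate_relabeling`, `first_error` and `partition_monitors` are hypothetical, and only the "replace requires targetLabel" rule from this report is modelled. It assumes that a missing action means "replace", which is the Prometheus default.

```python
from typing import Dict, List, Optional, Tuple

# Message text copied from the operator warning quoted in this bug.
MISSING_TARGET = "relabel configuration for replace action needs targetLabel value"


def validate_relabeling(relabeling: dict) -> Optional[str]:
    """Return an error message for an invalid relabeling, or None if it passes."""
    # Assumption: an absent action is treated as "replace" (the Prometheus default).
    action = str(relabeling.get("action", "replace")).lower()
    if action == "replace" and not relabeling.get("targetLabel"):
        return MISSING_TARGET
    return None


def first_error(spec: dict) -> Optional[str]:
    """Return the first relabeling error found in a monitor spec, if any."""
    for endpoint in spec.get("endpoints", []):
        for relabeling in endpoint.get("relabelings", []):
            error = validate_relabeling(relabeling)
            if error:
                return error
    return None


def partition_monitors(monitors: Dict[str, dict]) -> Tuple[List[str], Dict[str, str]]:
    """Split monitors into accepted names and skipped names mapped to the reason."""
    accepted: List[str] = []
    skipped: Dict[str, str] = {}
    for name, spec in monitors.items():
        error = first_error(spec)
        if error:
            skipped[name] = error  # skip only this monitor, keep the rest
        else:
            accepted.append(name)
    return accepted, skipped


if __name__ == "__main__":
    relabeling = {
        "action": "replace",
        "regex": "(.+)",
        "sourceLabels": ["__meta_kubernetes_namespace"],
    }
    good = {"endpoints": [{"relabelings": [dict(relabeling, targetLabel="namespace")]}]}
    bad = {"endpoints": [{"relabelings": [relabeling]}]}
    print(partition_monitors({"good": good, "bad": bad}))
```

The point of the sketch is that a per-monitor check lets one invalid ServiceMonitor be dropped while the valid ones still reach the generated configuration.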

Comment 1 Jayapriya Pai 2022-02-23 14:08:40 UTC
Relabel validation is included in 0.54, which is already updated downstream: https://github.com/openshift/prometheus-operator/pull/151
https://github.com/openshift/cluster-monitoring-operator/pull/1556 also brings this change to CMO. Once that PR is merged, we are good to close this bug.

Comment 2 Simon Pasquier 2022-02-23 14:37:37 UTC
For safety, we should probably bump to v0.54.1 once it's available.

Comment 3 Jayapriya Pai 2022-02-23 14:48:04 UTC
Sure, once 0.54.1 is released I can pull it downstream and update CMO.

Comment 8 Junqi Zhao 2022-03-03 10:30:18 UTC
Tested with 4.11.0-0.nightly-2022-03-03-061758, prometheus operator 0.54.1, prometheus 2.32.1. There is no error in the prometheus-k8s pod, and the invalid servicemonitor is skipped, so it is not loaded into the Prometheus configuration. The prometheus operator logs the warning "relabel configuration for replace action needs targetLabel value".

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- cat /etc/prometheus/config_out/prometheus.env.yaml | grep "serviceMonitor/openshift-console/console-test/0"
no result

# oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0 | grep "couldn't load configuration"
no result

# oc -n openshift-monitoring logs -c prometheus-operator $(oc -n openshift-monitoring get pod --no-headers | grep prometheus-operator | awk '{print $1}') | grep "relabel configuration for replace action needs targetLabel value"
level=warn ts=2022-03-03T10:10:33.784201269Z caller=operator.go:1837 component=prometheusoperator msg="skipping servicemonitor" error="relabel configuration for replace action needs targetLabel value" servicemonitor=openshift-console/console-test namespace=openshift-monitoring prometheus=k8s
level=warn ts=2022-03-03T10:10:34.020788319Z caller=operator.go:1837 component=prometheusoperator msg="skipping servicemonitor" error="relabel configuration for replace action needs targetLabel value" servicemonitor=openshift-console/console-test namespace=openshift-monitoring prometheus=k8s
...
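
The grep check above can also be scripted. The following Python sketch parses key=value log lines like the ones shown and collects the skipped servicemonitors with their reason. The helper names `parse_logfmt` and `skipped_servicemonitors` are hypothetical, and the parser assumes the quoting style seen in these log lines; it is not a complete logfmt implementation.

```python
import shlex
from typing import Dict, Iterable


def parse_logfmt(line: str) -> Dict[str, str]:
    """Parse one key=value log line; double-quoted values may contain spaces."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: treat the line as unparseable.
        return {}
    fields: Dict[str, str] = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens without "=" such as a literal "..."
            fields[key] = value
    return fields


def skipped_servicemonitors(lines: Iterable[str]) -> Dict[str, str]:
    """Map each skipped servicemonitor to the error reported for it."""
    skipped: Dict[str, str] = {}
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg") == "skipping servicemonitor" and "servicemonitor" in fields:
            skipped[fields["servicemonitor"]] = fields.get("error", "")
    return skipped
```

The lines could come from the output of the `oc logs` command shown above, for example read from standard input. Repeated warnings for the same servicemonitor collapse into one entry.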

Comment 13 errata-xmlrpc 2022-08-10 10:47:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

