Bug 1872253 - Invalid service monitors block the update of the user workload monitoring prometheus
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-25 09:42 UTC by Simon Pasquier
Modified: 2020-10-27 16:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:33:06 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift prometheus-operator pull 89 | 0 | None | open | Bug 1872253: skip invalid service monitors | 2020-08-31 16:07:19 UTC
Github | prometheus-operator prometheus-operator issues 3327 | 0 | None | open | Add new metric when prometheus is stuck on "creating config failed" | 2020-08-31 13:41:01 UTC
Github | prometheus-operator prometheus-operator issues 3329 | 0 | None | open | Incorrect ServiceMonitor blocks prometheus deployment | 2020-08-31 13:41:01 UTC
Github | prometheus-operator prometheus-operator pull 3445 | 0 | None | open | pkg/prometheus: skip invalid service monitors | 2020-08-31 13:41:01 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:33:08 UTC

Description Simon Pasquier 2020-08-25 09:42:00 UTC
Description of problem:
Whenever a service monitor references a key that doesn't exist in a secret or configmap, the prometheus operator stops updating the Prometheus configuration. This isn't a big issue for the infra Prometheus because we pretty much control what goes in, but it's more problematic for user workload monitoring: a single bad service monitor can effectively DoS the service, since no other service monitor change gets applied.

Version-Release number of selected component (if applicable):
4.6

How reproducible:
Always

Steps to Reproduce:
1. Enable user workload monitoring
2. Create a secret and a service monitor that references this secret with a key that doesn't exist in it
apiVersion: v1
data: {}
kind: Secret
metadata:
  name: demo
  namespace: default
type: Opaque
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: demo
  namespace: default
spec:
  endpoints:
  - port: web
    bearerTokenSecret:
      key: missing
      name: demo
  selector:
    matchLabels:
      app: demo

3. Create a valid service monitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: demo2
  namespace: default
spec:
  endpoints:
  - port: web
  selector:
    matchLabels:
      app: demo2
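
To make it explicit why the first service monitor is invalid, here is a minimal Python sketch that resolves its `bearerTokenSecret` reference against the secret from step 2. The manifests are reduced to plain dicts, and `resolve_secret_key` and `invalid_endpoints` are hypothetical helpers written for this illustration; they are not functions of the prometheus operator.

```python
# Manifests from step 2, reduced to the fields that matter here.
secret = {
    "kind": "Secret",
    "metadata": {"name": "demo", "namespace": "default"},
    "data": {},  # the secret exists but holds no keys
}

service_monitor = {
    "kind": "ServiceMonitor",
    "metadata": {"name": "demo", "namespace": "default"},
    "spec": {
        "endpoints": [
            {"port": "web", "bearerTokenSecret": {"name": "demo", "key": "missing"}}
        ]
    },
}


def resolve_secret_key(secrets, namespace, ref):
    """Return the value stored under ref['key'], or raise KeyError."""
    for s in secrets:
        meta = s["metadata"]
        if meta["namespace"] == namespace and meta["name"] == ref["name"]:
            data = s.get("data") or {}
            if ref["key"] not in data:
                raise KeyError(
                    f"key {ref['key']!r} not found in secret {namespace}/{ref['name']}"
                )
            return data[ref["key"]]
    raise KeyError(f"secret {namespace}/{ref['name']} not found")


def invalid_endpoints(monitor, secrets):
    """List (port, error) pairs for endpoints whose secret reference fails."""
    namespace = monitor["metadata"]["namespace"]
    problems = []
    for endpoint in monitor["spec"]["endpoints"]:
        ref = endpoint.get("bearerTokenSecret")
        if ref is None:
            continue  # nothing to resolve for this endpoint
        try:
            resolve_secret_key(secrets, namespace, ref)
        except KeyError as err:
            problems.append((endpoint.get("port"), err.args[0]))
    return problems


if __name__ == "__main__":
    for port, error in invalid_endpoints(service_monitor, [secret]):
        print(f"endpoint {port}: {error}")
```

The secret `default/demo` exists, so the failure comes only from the missing key; the `demo2` service monitor from step 3 has no secret reference and would pass this check.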

Actual results:
The second service monitor isn't present in the Prometheus configuration.

Expected results:
The second service monitor should be present in the Prometheus configuration.
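
The difference between the actual and expected results can be sketched as two ways of building the configuration. This is a simplified Python model under stated assumptions, not the operator's Go implementation: monitors are plain dicts, and `check`, `build_fail_fast`, and `build_skip_invalid` are hypothetical names chosen for this example.

```python
def check(monitor, secret_keys):
    """Raise ValueError if the monitor references a key absent from the secret."""
    key = monitor.get("secret_key")
    if key is not None and key not in secret_keys:
        raise ValueError(f"{monitor['name']}: key {key!r} not found")


def build_fail_fast(monitors, secret_keys):
    """Behavior reported in this bug: one bad monitor aborts the whole build."""
    for monitor in monitors:
        check(monitor, secret_keys)  # the first error propagates to the caller
    return [monitor["name"] for monitor in monitors]


def build_skip_invalid(monitors, secret_keys):
    """Expected behavior: drop bad monitors and keep the valid ones."""
    kept, skipped = [], []
    for monitor in monitors:
        try:
            check(monitor, secret_keys)
        except ValueError as err:
            skipped.append(str(err))  # record the reason and move on
            continue
        kept.append(monitor["name"])
    return kept, skipped


# The two service monitors from the reproduction steps; the secret has no keys.
monitors = [
    {"name": "default/demo", "secret_key": "missing"},
    {"name": "default/demo2"},
]

if __name__ == "__main__":
    kept, skipped = build_skip_invalid(monitors, secret_keys=set())
    print("kept:", kept)
    print("skipped:", skipped)
```

With the fail-fast variant, `default/demo2` never reaches the configuration because the error raised for `default/demo` stops the build. With the skipping variant, only the invalid monitor is left out.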

Additional info:
https://github.com/prometheus-operator/prometheus-operator/issues/3327

Comment 6 errata-xmlrpc 2020-10-27 16:33:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

