Bug 2012770 - when using expression metric openshift_apps_deploymentconfigs_last_failed_rollout_time namespace label is re-written
Summary: when using expression metric openshift_apps_deploymentconfigs_last_failed_rol...
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-controller-manager
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.10.0
Assignee: Filip Krepinsky
QA Contact: zhou ying
Depends On:
Reported: 2021-10-11 09:10 UTC by German Parente
Modified: 2022-03-10 16:19 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause & Consequence: The openshift_apps_deploymentconfigs_last_failed_rollout_time metric had the wrong namespace label and an extra exported_namespace label.
Fix & Result: The openshift_apps_deploymentconfigs_last_failed_rollout_time metric has the correct namespace label, and the exported_namespace label is no longer present.
Clone Of:
Last Closed: 2022-03-10 16:18:42 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-openshift-controller-manager-operator pull 230 | 0 | None | open | Bug 2012770: honor labels in openshift-controller-manager metrics | 2021-11-15 20:25:08 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:19:10 UTC

Description German Parente 2021-10-11 09:10:05 UTC
Description of problem:

When using this expression in an alert rule:

expr: count_over_time(openshift_apps_deploymentconfigs_last_failed_rollout_time{exported_namespace="ns1",name="prometheus-example-app",namespace="openshift-kube-controller-manager"}[1m]) > 0 

to trigger when a deployment config has been unavailable, the rule is re-written to:

expr: count_over_time(openshift_apps_deploymentconfigs_last_failed_rollout_time{exported_namespace="ns1",name="prometheus-example-app",namespace="ns1"}[1m]) > 0 

After discussion with the monitoring team, the issue is that the ServiceMonitor in the OpenShift controller manager operator should set "honor_labels: true".
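
For illustration only, a minimal sketch of what such a setting looks like on a ServiceMonitor endpoint. The resource name, namespace, selector, port, and interval below are placeholders and are not copied from the operator's actual manifests. In the ServiceMonitor API the field is spelled honorLabels; honor_labels is the corresponding key in the Prometheus scrape configuration.

```yaml
# Hypothetical ServiceMonitor; metadata, selector, port, and interval
# are placeholders chosen for this sketch.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-controller-manager       # placeholder name
  namespace: example-namespace           # placeholder namespace
spec:
  selector:
    matchLabels:
      app: example-controller-manager    # placeholder label
  endpoints:
    - port: https                        # placeholder port name
      interval: 30s                      # placeholder interval
      # Keep labels exposed by the scraped metric (e.g. namespace="ns1")
      # instead of renaming them to exported_<label> on conflict.
      honorLabels: true
```

Without honorLabels, Prometheus resolves a label conflict by keeping the target's namespace label and renaming the scraped one to exported_namespace, which matches the labels seen in the expression above.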

Version-Release number of selected component (if applicable): 4.8

Comment 1 Gabe Montero 2021-10-14 19:36:41 UTC
talking DC metrics ... transferring

Comment 2 Filip Krepinsky 2021-11-15 20:30:34 UTC
- when testing this, it is necessary to make sure that the Prometheus resource does not set overrideHonorLabels: true
- the alert rule can be simplified to

expr: count_over_time(openshift_apps_deploymentconfigs_last_failed_rollout_time{name="prometheus-example-app",namespace="ns1"}[1m]) > 0

Comment 8 errata-xmlrpc 2022-03-10 16:18:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

