Bug 2012770
| Summary: | when using expression metric openshift_apps_deploymentconfigs_last_failed_rollout_time namespace label is re-written | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | German Parente <gparente> |
| Component: | openshift-controller-manager | Assignee: | Filip Krepinsky <fkrepins> |
| openshift-controller-manager sub component: | apps | QA Contact: | zhou ying <yinzhou> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | aos-bugs, gmontero |
| Version: | 4.8 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.10.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Cause & Consequence: The openshift_apps_deploymentconfigs_last_failed_rollout_time metric has the wrong namespace label and an extra exported_namespace label. Fix & Result: The openshift_apps_deploymentconfigs_last_failed_rollout_time metric has the correct namespace label and the exported_namespace label is no longer present. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-03-10 16:18:42 UTC | Type: | Bug |
talking DC metrics ... transferring

- when testing this, make sure that Prometheus is not configured with overrideHonorLabels: true
- the alert rule can be simplified to
  expr: count_over_time(openshift_apps_deploymentconfigs_last_failed_rollout_time{name="prometheus-example-app",namespace="ns1"}[1m]) > 0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:0056
Description of problem:
When using this expression in an alert rule:
expr: count_over_time(openshift_apps_deploymentconfigs_last_failed_rollout_time{exported_namespace="ns1",name="prometheus-example-app",namespace="openshift-kube-controller-manager"}[1m]) > 0
to trigger when a deployment config has been unavailable, the rule is re-written to:
expr: count_over_time(openshift_apps_deploymentconfigs_last_failed_rollout_time{exported_namespace="ns1",name="prometheus-example-app",namespace="ns1"}[1m]) > 0
After discussion with monitoring team, the issue is that the service monitor in openshift controller manager operator should have "honor_labels: true"
Version-Release number of selected component (if applicable): 4.8
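
For readers unfamiliar with honor_labels, the following is a minimal Python sketch of how Prometheus resolves a conflict between a label exposed by the scraped metric and a label attached by the scrape target. It is a simplified model and not the Prometheus implementation: the function name resolve_labels and its parameters are hypothetical, and the target namespace value is copied from the expression in this report. It assumes the documented behavior that, without honor_labels, the scraped label is renamed to exported_<name> and the target label wins, and that overrideHonorLabels forces that renaming regardless of the service monitor setting.

```python
def resolve_labels(scraped, target, honor_labels=False, override_honor_labels=False):
    """Merge labels exposed by a metric with labels attached by the scrape target.

    Hypothetical helper; a simplified model of Prometheus label conflict handling.
    """
    # overrideHonorLabels on the Prometheus resource disables honor_labels everywhere.
    effective_honor = honor_labels and not override_honor_labels

    result = dict(scraped)
    for name, value in target.items():
        if name not in scraped:
            # No conflict: the target label is simply attached.
            result[name] = value
        elif not effective_honor:
            # Conflict, labels not honored: the scraped value moves to
            # exported_<name> and the target value takes the original name.
            result["exported_" + name] = scraped[name]
            result[name] = value
        # Conflict, labels honored: keep the scraped value, ignore the target value.
    return result


# Labels exposed by the metric itself (values taken from the report).
scraped = {"name": "prometheus-example-app", "namespace": "ns1"}
# Label attached by the scrape target (namespace value as quoted in the report).
target = {"namespace": "openshift-kube-controller-manager"}

before_fix = resolve_labels(scraped, target, honor_labels=False)
after_fix = resolve_labels(scraped, target, honor_labels=True)
overridden = resolve_labels(scraped, target, honor_labels=True,
                            override_honor_labels=True)
```

Under these assumptions, before_fix carries both namespace="openshift-kube-controller-manager" and exported_namespace="ns1", after_fix carries only namespace="ns1", and overridden matches before_fix, which is why the fix has to be tested without overrideHonorLabels: true.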