Bug 1667331

Summary: Duplicate alert for a pod that has a single container
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Monitoring
Assignee: Frederic Branczyk <fbranczy>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.1.0
CC: surbania
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-06-04 10:42:02 UTC
Type: Bug
Attachments:
Description | Flags
CPUThrottlingHigh alerts for cluster-monitoring-operator pod | none

Description Junqi Zhao 2019-01-18 07:51:08 UTC
Created attachment 1521429
CPUThrottlingHigh alerts for cluster-monitoring-operator pod

Description of problem:
Cloned from https://jira.coreos.com/browse/MON-521
As the attached screenshot shows, there are two firing CPUThrottlingHigh alerts for the cluster-monitoring-operator pod.

The first one is: 39% throttling of CPU in namespace openshift-monitoring for container cluster-monitoring-operator in pod cluster-monitoring-operator-96ff8b5c9-d8cd7.

The second one is: 35% throttling of CPU in namespace openshift-monitoring for container in pod cluster-monitoring-operator-96ff8b5c9-d8cd7.

The second alert is missing the container name.

*******************************************************************************

The cluster-monitoring-operator pod has only one container:

$ oc -n openshift-monitoring get pod cluster-monitoring-operator-96ff8b5c9-d8cd7 -o jsonpath="{.spec.containers[*].name}"
cluster-monitoring-operator

I also checked the alerts through the API:

********************************************************************************

{
    "labels": {
        "alertname": "CPUThrottlingHigh",
        "container_name": "cluster-monitoring-operator",
        "namespace": "openshift-monitoring",
        "pod_name": "cluster-monitoring-operator-96ff8b5c9-d8cd7",
        "severity": "warning"
    },
    "annotations": {
        "message": "36% throttling of CPU in namespace openshift-monitoring for container cluster-monitoring-operator in pod cluster-monitoring-operator-96ff8b5c9-d8cd7."
    },
    "state": "firing",
    "activeAt": "2019-01-18T06:56:01.534253519Z",
    "value": 36.43564356435644
}, {
    "labels": {
        "alertname": "CPUThrottlingHigh",
        "namespace": "openshift-monitoring",
        "pod_name": "cluster-monitoring-operator-96ff8b5c9-d8cd7",
        "severity": "warning"
    },
    "annotations": {
        "message": "46% throttling of CPU in namespace openshift-monitoring for container  in pod cluster-monitoring-operator-96ff8b5c9-d8cd7."
    },
    "state": "firing",
    "activeAt": "2019-01-17T09:52:01.534253519Z",
    "value": 45.78544061302682
}

 

********************************************************************************
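The duplicate can also be spotted programmatically. The following is a minimal sketch, assuming alert objects shaped like the API output above. The helper name find_alerts_missing_container and the inlined sample data are illustrative only and are not part of any OpenShift or Prometheus tooling.

```python
from collections import defaultdict

# Sample alerts copied from the API output above (annotations omitted for brevity).
ALERTS = [
    {
        "labels": {
            "alertname": "CPUThrottlingHigh",
            "container_name": "cluster-monitoring-operator",
            "namespace": "openshift-monitoring",
            "pod_name": "cluster-monitoring-operator-96ff8b5c9-d8cd7",
            "severity": "warning",
        },
        "state": "firing",
        "value": 36.43564356435644,
    },
    {
        "labels": {
            "alertname": "CPUThrottlingHigh",
            "namespace": "openshift-monitoring",
            "pod_name": "cluster-monitoring-operator-96ff8b5c9-d8cd7",
            "severity": "warning",
        },
        "state": "firing",
        "value": 45.78544061302682,
    },
]


def find_alerts_missing_container(alerts, alertname="CPUThrottlingHigh"):
    """Return (namespace, pod_name) pairs that have a duplicate unnamed alert.

    Hypothetical helper: a pod is reported only if it has at least one alert
    without a container_name label AND at least one alert with it.
    """
    with_name = defaultdict(int)
    without_name = defaultdict(int)
    for alert in alerts:
        labels = alert.get("labels", {})
        if labels.get("alertname") != alertname:
            continue
        key = (labels.get("namespace"), labels.get("pod_name"))
        if labels.get("container_name"):
            with_name[key] += 1
        else:
            without_name[key] += 1
    return sorted(key for key in without_name if key in with_name)


if __name__ == "__main__":
    for namespace, pod in find_alerts_missing_container(ALERTS):
        print(f"duplicate alert without container_name: {namespace}/{pod}")
```

Run against the two alerts from this report, the helper flags the cluster-monitoring-operator pod; a pod with only named alerts, or only an unnamed one, is not reported.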

I think the second alert should not appear; something seems to be wrong in the code that determines the container name.
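One possible explanation, which this report does not confirm, is that the alert expression also matches a pod-level series whose container_name label is empty. Prometheus treats an empty label value as an absent label, so such a series would produce an alert with no container name. The sketch below simulates that idea under stated assumptions: the counter values, the 25% threshold, and the function throttling_alerts are all hypothetical, chosen only so the result resembles the two alerts above.

```python
# Hypothetical per-series counters: (labels, throttled_periods, total_periods).
# The numbers are invented so the ratios resemble the alerts in this report.
SERIES = [
    ({"namespace": "openshift-monitoring",
      "pod_name": "cluster-monitoring-operator-96ff8b5c9-d8cd7",
      "container_name": "cluster-monitoring-operator"}, 36, 100),
    # Assumed pod-level series: container_name is the empty string.
    ({"namespace": "openshift-monitoring",
      "pod_name": "cluster-monitoring-operator-96ff8b5c9-d8cd7",
      "container_name": ""}, 46, 100),
]

THRESHOLD_PERCENT = 25  # assumed threshold, not taken from this report


def throttling_alerts(series, require_container_name):
    """Return alert dicts for series whose throttling exceeds the threshold."""
    alerts = []
    for labels, throttled, total in series:
        if require_container_name and not labels.get("container_name"):
            continue  # skip series that carry no container name
        if total == 0:
            continue  # avoid division by zero for idle series
        percent = 100.0 * throttled / total
        if percent > THRESHOLD_PERCENT:
            # Mimic Prometheus: labels with empty values are dropped.
            kept = {k: v for k, v in labels.items() if v != ""}
            alerts.append({"labels": kept, "value": percent})
    return alerts


if __name__ == "__main__":
    print(len(throttling_alerts(SERIES, require_container_name=False)))
    print(len(throttling_alerts(SERIES, require_container_name=True)))
```

With require_container_name=False the sketch yields two alerts for the same pod, one of them without a container_name label; with True only the named container remains. Whether the actual fix worked this way is not stated in this report.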




Version-Release number of selected component (if applicable):
payload: registry.svc.ci.openshift.org/ocp/release@sha256:85736576e39221daf368cc82be51cdb2509a77d2446ed98734286fc0ea99656c

cluster-monitoring-operator image: registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-15-184339@sha256:7b88121a3c893297c1be75261bae142b6312227c72bc72a0a64c68363a96601f

How reproducible:
Always

Steps to Reproduce:
1. Log in to the web console as a cluster-admin user and check the alerts under "Monitoring -> Alerts".

Actual results:
One alert is missing the container name.

Expected results:
The alert that is missing the container name should not be shown.

Additional info:

Comment 1 Frederic Branczyk 2019-02-06 13:16:11 UTC
This was fixed with https://github.com/openshift/cluster-monitoring-operator/pull/221

Comment 3 Junqi Zhao 2019-02-19 01:40:44 UTC
Setting this to VERIFIED since the issue it was cloned from, https://jira.coreos.com/browse/MON-521, is fixed.

Comment 6 errata-xmlrpc 2019-06-04 10:42:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758