Bug 1975432 - Alert InstallPlanStepAppliedWithWarnings does not resolve
Summary: Alert InstallPlanStepAppliedWithWarnings does not resolve
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.8.0
Assignee: Kevin Rizza
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On: 1975824
Blocks:
Reported: 2021-06-23 16:13 UTC by Ben Luddy
Modified: 2021-07-27 23:13 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1975824
Environment:
Last Closed: 2021-07-27 23:13:39 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift operator-framework-olm pull 102 | 0 | None | open | Bug 1975432: Resolve InstallPlanStepAppliedWithWarnings alert after some time. | 2021-06-25 14:02:28 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 23:13:53 UTC

Description Ben Luddy 2021-06-23 16:13:05 UTC
Description of problem: The alert InstallPlanStepAppliedWithWarnings is supposed to fire when the catalog-operator receives a warning from the API server while applying a resource on the user's behalf, and to resolve once no warnings have been received for a period of time. Instead, the alert keeps firing after the warnings have stopped.

Version-Release number of selected component (if applicable): 4.8+


How reproducible: Always


Steps to Reproduce:
1. Install an operator that will trigger a warning. An easy example is a bundle containing v1beta1 CustomResourceDefinitions, which will produce a deprecation warning.
2. Wait 5-10 minutes.

Actual results: The alert is still firing.

Expected results: The alert stops firing when no warnings have been encountered for a period of time.
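The expected behavior can be illustrated with a small simulation. This is only a sketch: it models the expression that appears in the alert's generatorURL later in this report, sum(increase(installplan_warnings_total[5m])) > 0, using a simplified increase() that ignores Prometheus's extrapolation and counter-reset handling. The scrape data, the 30-second scrape interval, and the function names are hypothetical.

```python
from typing import List, Tuple

WINDOW_SECONDS = 5 * 60  # the [5m] range in the alert expression

Sample = Tuple[int, float]  # (timestamp in seconds, counter value)


def increase(samples: List[Sample], now: int, window: int = WINDOW_SECONDS) -> float:
    """Simplified increase(): last minus first sample within [now - window, now].

    Unlike Prometheus, this ignores extrapolation and counter resets.
    """
    in_window = [value for ts, value in samples if now - window <= ts <= now]
    if len(in_window) < 2:
        return 0.0
    return in_window[-1] - in_window[0]


def alert_firing(series: List[List[Sample]], now: int) -> bool:
    """Evaluate sum(increase(installplan_warnings_total[5m])) > 0 over all series."""
    return sum(increase(samples, now) for samples in series) > 0


# Hypothetical scrape data: one warning is counted at t=60s, none afterwards.
counter = [(t, 0.0 if t < 60 else 1.0) for t in range(0, 901, 30)]

print(alert_firing([counter], now=120))  # the warning is still inside the window
print(alert_firing([counter], now=900))  # no new warnings for more than 5 minutes
```

With a condition of this shape, the alert stops firing once the counter has been flat for the whole window, which matches the expected results above.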

Comment 4 Jian Zhang 2021-06-28 10:10:41 UTC
[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-25-182927   True        False         28m     Cluster version is 4.8.0-0.nightly-2021-06-25-182927
[cloud-user@preserve-olm-env jian]$ oc adm release info registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-06-25-182927 --commits|grep lifecycle
  operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         4628ec71b4c928908101dc318556061923974298

1. Install the community etcd-operator, which uses v1beta1 CustomResourceDefinitions.
[cloud-user@preserve-olm-env jian]$ oc get sub
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha
[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded

2. Check the firing alerts for "InstallPlanStepAppliedWithWarnings":

[cloud-user@preserve-olm-env jian]$ curl -k -H "Authorization: Bearer $(oc -n openshift-monitoring sa get-token prometheus-k8s)" https://alertmanager-main-openshift-monitoring.apps.pdhamdhe-4828.qe.gcp.devcluster.openshift.com/api/v1/alerts| jq 
...
    {
      "labels": {
        "alertname": "InstallPlanStepAppliedWithWarnings",
        "prometheus": "openshift-monitoring/k8s",
        "severity": "warning"
      },
      "annotations": {
        "message": "The API server returned a warning during installation or upgrade of an operator. An Event with reason \"AppliedWithWarnings\" has been created with complete details, including a reference to the InstallPlan step that generated the warning."
      },
      "startsAt": "2021-06-28T05:41:34.233Z",
      "endsAt": "2021-06-28T05:45:34.233Z",
      "generatorURL": "https://prometheus-k8s-openshift-monitoring.apps.pdhamdhe-4828.qe.gcp.devcluster.openshift.com/graph?g0.expr=sum%28increase%28installplan_warnings_total%5B5m%5D%29%29+%3E+0&g0.tab=1",
      "status": {
        "state": "unprocessed",
        "silencedBy": [],
        "inhibitedBy": []
      },
      "receivers": [
        "Default"
      ],
      "fingerprint": "fe12e9a2d3e5ccd7"
    }
  ]
}

3. Wait 5 minutes.

4. Check again:
[cloud-user@preserve-olm-env jian]$ curl -k -H "Authorization: Bearer $(oc -n openshift-monitoring sa get-token prometheus-k8s)" https://alertmanager-main-openshift-monitoring.apps.pdhamdhe-4828.qe.gcp.devcluster.openshift.com/api/v1/alerts| jq |grep -i "InstallPlanStepApplied" -A20
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1766  100  1766    0     0  11773      0 --:--:-- --:--:-- --:--:-- 11773

No "InstallPlanStepAppliedWithWarnings" alert found; verified.
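The jq and grep check above could also be scripted. The following is a minimal sketch, not part of the original verification: it filters an alerts response by alertname and decodes the PromQL expression from generatorURL. The "status"/"data" envelope is an assumption about the response shape (the output above is truncated), the sample is trimmed from that output, and the helper names are hypothetical.

```python
from typing import Any, Dict, List
from urllib.parse import parse_qs, urlparse


def find_alerts(payload: Dict[str, Any], alertname: str) -> List[Dict[str, Any]]:
    """Return the alerts whose 'alertname' label equals the given name."""
    return [
        alert
        for alert in payload.get("data", [])  # assumed envelope key
        if alert.get("labels", {}).get("alertname") == alertname
    ]


def alert_expression(alert: Dict[str, Any]) -> str:
    """Decode the expression carried in the generatorURL 'g0.expr' parameter."""
    query = parse_qs(urlparse(alert.get("generatorURL", "")).query)
    return query.get("g0.expr", [""])[0]


# Sample trimmed from the output in step 2; the envelope is assumed.
SAMPLE = {
    "status": "success",
    "data": [
        {
            "labels": {
                "alertname": "InstallPlanStepAppliedWithWarnings",
                "prometheus": "openshift-monitoring/k8s",
                "severity": "warning",
            },
            "generatorURL": (
                "https://prometheus-k8s-openshift-monitoring.apps.pdhamdhe-4828"
                ".qe.gcp.devcluster.openshift.com/graph?g0.expr=sum%28increase"
                "%28installplan_warnings_total%5B5m%5D%29%29+%3E+0&g0.tab=1"
            ),
        }
    ],
}

matches = find_alerts(SAMPLE, "InstallPlanStepAppliedWithWarnings")
print(len(matches))
print(alert_expression(matches[0]))
```

An empty list from find_alerts corresponds to the result of step 4, where grep produced no output.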

Comment 11 errata-xmlrpc 2021-07-27 23:13:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

