Hello,

The OpenShift Monitoring Team has published a set of guidelines for writing alerting rules in OpenShift, including a basic style guide. You can find these here:

https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#style-guide

A subset of these are now being enforced in OpenShift End-to-End tests [1], with temporary exceptions for existing non-compliant rules.

This component was found to have the following issues:

* Alerts without summary and/or description annotations:
  - CertifiedOperatorsCatalogError
  - CommunityOperatorsCatalogError
  - RedhatMarketplaceCatalogError
  - RedhatOperatorsCatalogError

Alerts MUST include summary and description annotations. Think of summary as the first line of a commit message, or an email subject line. It should be brief but informative. The description is the longer, more detailed explanation of the alert. The enhancement document linked above has examples of alerts with these annotations.

Thank you!

Repo: operator-framework/operator-marketplace

[1]: https://github.com/openshift/origin/commit/097e7a6
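For anyone who wants to check rules locally before the End-to-End tests do, the requirement above can be sketched as a small Python check. This is only an illustration, not the code the origin tests use: the function name `missing_annotations` and the sample input are hypothetical, and the rules are assumed to be already parsed into dictionaries shaped like Prometheus alerting rules (`alert`, `annotations`).

```python
from typing import Dict, List

# The two annotations the guideline says every alert MUST carry.
REQUIRED_ANNOTATIONS = ("summary", "description")


def missing_annotations(rules: List[dict]) -> Dict[str, List[str]]:
    """Map each alert name to the required annotations it lacks."""
    problems: Dict[str, List[str]] = {}
    for rule in rules:
        name = rule.get("alert")
        if name is None:
            continue  # recording rules have no "alert" key; skip them
        annotations = rule.get("annotations") or {}
        missing = []
        for key in REQUIRED_ANNOTATIONS:
            value = annotations.get(key)
            # An absent, non-string, or blank value counts as missing.
            if not (isinstance(value, str) and value.strip()):
                missing.append(key)
        if missing:
            problems[name] = missing
    return problems


# Hypothetical sample input: one rule without annotations, one compliant rule.
sample_rules = [
    {"alert": "CertifiedOperatorsCatalogError"},
    {
        "alert": "OperatorHubSourceError",
        "annotations": {
            "summary": "The certified-operators source is in non-ready state for more than 10 minutes.",
            "description": "Operators shipped via the certified-operators source are not available for installation until the issue is fixed.",
        },
    },
]

print(missing_annotations(sample_rules))
```

An empty result means every alert in the input carries both annotations; it says nothing about whether the wording follows the style guide.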
Any progress on this issue? The monitoring team could help if needed.
verify:

1) install cluster with this PR

zhaoxia@xzha-mac openshift-tests-private % oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.ci.test-2022-06-13-054554-ci-ln-zn13fyk-latest   True        False         39m     Cluster version is 4.11.0-0.ci.test-2022-06-13-054554-ci-ln-zn13fyk-latest

zhaoxia@xzha-mac openshift-tests-private % oc get catsrc
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
certified-operators   Certified Operators   grpc   Red Hat     61m
community-operators   Community Operators   grpc   Red Hat     61m
redhat-marketplace    Red Hat Marketplace   grpc   Red Hat     61m
redhat-operators      Red Hat Operators     grpc   Red Hat     61m

2) make catsrc certified-operators pod is pending

oc patch catsrc certified-operators -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge

zhaoxia@xzha-mac openshift-tests-private % oc get pod
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-7sns9               0/1     Pending   0          8s
community-operators-8kqfv               1/1     Running   0          40m
marketplace-operator-85d9b67789-g7tjw   1/1     Running   0          43m
redhat-marketplace-bcc47                1/1     Running   0          40m
redhat-operators-w6cbn                  1/1     Running   0          40m

3) check alert

zhaoxia@xzha-mac openshift-tests-private % curl -k -H "Authorization: Bearer $(oc create token prometheus-k8s -n openshift-monitoring)" https://$(oc get route prometheus-k8s -n openshift-monitoring -o=jsonpath='{.spec.host}')/api/v1/alerts| jq -r '.data.alerts[] | select (.labels.alertname == "OperatorHubSourceError")'
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.25:8443",
    "job": "catalog-operator-metrics",
    "name": "certified-operators",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-6468cdd79f-4gnqb",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the certified-operators source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by certified-operators source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=certified-operators) to diagnose and repair.",
    "summary": "The certified-operators source is in non-ready state for more than 10 minutes."
  },
  "state": "firing",
  "activeAt": "2022-06-13T06:38:33.120882761Z",
  "value": "0e+00"
}

4) stop other catalogs

oc patch catsrc community-operators -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge
oc patch catsrc redhat-marketplace -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge
oc patch catsrc redhat-operators -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge

5) check alert

zhaoxia@xzha-mac openshift-tests-private % curl -k -H "Authorization: Bearer $(oc create token prometheus-k8s -n openshift-monitoring)" https://$(oc get route prometheus-k8s -n openshift-monitoring -o=jsonpath='{.spec.host}')/api/v1/alerts| jq -r '.data.alerts[] | select (.labels.alertname == "OperatorHubSourceError")'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10566    0 10566    0     0   7908      0 --:--:--  0:00:01 --:--:--  7998
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.25:8443",
    "job": "catalog-operator-metrics",
    "name": "certified-operators",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-6468cdd79f-4gnqb",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the certified-operators source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by certified-operators source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=certified-operators) to diagnose and repair.",
    "summary": "The certified-operators source is in non-ready state for more than 10 minutes."
  },
  "state": "firing",
  "activeAt": "2022-06-13T06:38:33.120882761Z",
  "value": "0e+00"
}
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.25:8443",
    "job": "catalog-operator-metrics",
    "name": "community-operators",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-6468cdd79f-4gnqb",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the community-operators source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by community-operators source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=community-operators) to diagnose and repair.",
    "summary": "The community-operators source is in non-ready state for more than 10 minutes."
  },
  "state": "firing",
  "activeAt": "2022-06-13T06:51:33.120882761Z",
  "value": "0e+00"
}
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.25:8443",
    "job": "catalog-operator-metrics",
    "name": "redhat-marketplace",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-6468cdd79f-4gnqb",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the redhat-marketplace source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by redhat-marketplace source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=redhat-marketplace) to diagnose and repair.",
    "summary": "The redhat-marketplace source is in non-ready state for more than 10 minutes."
  },
  "state": "pending",
  "activeAt": "2022-06-13T06:53:33.120882761Z",
  "value": "0e+00"
}
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.25:8443",
    "job": "catalog-operator-metrics",
    "name": "redhat-operators",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-6468cdd79f-4gnqb",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the redhat-operators source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by redhat-operators source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=redhat-operators) to diagnose and repair.",
    "summary": "The redhat-operators source is in non-ready state for more than 10 minutes."
  },
  "state": "pending",
  "activeAt": "2022-06-13T06:54:03.120882761Z",
  "value": "0e+00"
}

LGTM, verified.
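The alert check in steps 3 and 5 amounts to filtering the `/api/v1/alerts` response by `alertname` and reading each entry's `state`. A minimal Python sketch of the same filter follows, in case a per-source summary is easier to read than the raw JSON. The function name `sources_by_state` is made up for illustration, the sample response is trimmed to the fields used here with source names and states copied from step 5, and `SomeOtherAlert` is a placeholder entry added only to show that unrelated alerts are skipped.

```python
from collections import defaultdict
from typing import Dict, List


def sources_by_state(response: dict,
                     alertname: str = "OperatorHubSourceError") -> Dict[str, List[str]]:
    """Group catalog source names by alert state for one alert name."""
    grouped = defaultdict(list)
    for alert in response.get("data", {}).get("alerts", []):
        labels = alert.get("labels", {})
        if labels.get("alertname") != alertname:
            continue  # same effect as the jq select() in the transcript
        grouped[alert.get("state", "unknown")].append(labels.get("name", "<unnamed>"))
    return {state: sorted(names) for state, names in grouped.items()}


def _alert(alertname: str, name: str, state: str) -> dict:
    """Build a trimmed alert entry holding only the fields used above."""
    return {"labels": {"alertname": alertname, "name": name}, "state": state}


# Trimmed stand-in for the Prometheus response seen in step 5.
sample_response = {
    "status": "success",
    "data": {
        "alerts": [
            _alert("OperatorHubSourceError", "certified-operators", "firing"),
            _alert("OperatorHubSourceError", "community-operators", "firing"),
            _alert("OperatorHubSourceError", "redhat-marketplace", "pending"),
            _alert("OperatorHubSourceError", "redhat-operators", "pending"),
            _alert("SomeOtherAlert", "placeholder", "firing"),  # hypothetical, filtered out
        ]
    },
}

print(sources_by_state(sample_response))
```

Against a live cluster the `response` argument would be the parsed body returned by the curl command above; fetching it is left out of the sketch on purpose.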
verify:

zhaoxia@xzha-mac ~ % oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.0-0.nightly-2022-07-24-180529   True        False         38m     Cluster version is 4.12.0-0.nightly-2022-07-24-180529

zhaoxia@xzha-mac ~ % oc get catsrc -A
NAMESPACE               NAME                  DISPLAY               TYPE   PUBLISHER   AGE
openshift-marketplace   certified-operators   Certified Operators   grpc   Red Hat     56m
openshift-marketplace   community-operators   Community Operators   grpc   Red Hat     56m
openshift-marketplace   redhat-marketplace    Red Hat Marketplace   grpc   Red Hat     56m
openshift-marketplace   redhat-operators      Red Hat Operators     grpc   Red Hat     56m

1) make catsrc pod is pending

zhaoxia@xzha-mac ~ % oc patch catsrc certified-operators -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge
catalogsource.operators.coreos.com/certified-operators patched
zhaoxia@xzha-mac ~ % oc patch catsrc community-operators -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge
catalogsource.operators.coreos.com/community-operators patched
zhaoxia@xzha-mac ~ % oc patch catsrc redhat-marketplace -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge
catalogsource.operators.coreos.com/redhat-marketplace patched
zhaoxia@xzha-mac ~ % oc patch catsrc redhat-operators -p='{"spec":{"grpcPodConfig":{"nodeSelector":{"fake43642":"fake"}}}}' --type=merge
catalogsource.operators.coreos.com/redhat-operators patched

zhaoxia@xzha-mac ~ % oc get pod
NAME                                   READY   STATUS    RESTARTS      AGE
certified-operators-szdsx              0/1     Pending   0             2m28s
community-operators-q5hn5              0/1     Pending   0             2m18s
marketplace-operator-bbbc9755c-lpkpr   1/1     Running   5 (44m ago)   62m
redhat-marketplace-dhzx4               0/1     Pending   0             2m7s
redhat-operators-lhdw7                 0/1     Pending   0             117s

2) check alert

zhaoxia@xzha-mac ~ % curl -k -H "Authorization: Bearer $(oc create token prometheus-k8s -n openshift-monitoring)" https://$(oc get route prometheus-k8s -n openshift-monitoring -o=jsonpath='{.spec.host}')/api/v1/alerts| jq -r '.data.alerts[] | select (.labels.alertname == "OperatorHubSourceError")'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4305    0  4305    0     0   4527      0 --:--:-- --:--:-- --:--:--  4565
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.20:8443",
    "job": "catalog-operator-metrics",
    "name": "certified-operators",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-98ccdfbfc-d9bqf",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the certified-operators source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by certified-operators source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=certified-operators) to diagnose and repair.",
    "summary": "The certified-operators source is in non-ready state for more than 10 minutes."
  },
  "state": "pending",
  "activeAt": "2022-07-25T05:39:47.911111372Z",
  "value": "0e+00"
}
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.20:8443",
    "job": "catalog-operator-metrics",
    "name": "community-operators",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-98ccdfbfc-d9bqf",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the community-operators source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by community-operators source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=community-operators) to diagnose and repair.",
    "summary": "The community-operators source is in non-ready state for more than 10 minutes."
  },
  "state": "pending",
  "activeAt": "2022-07-25T05:39:47.911111372Z",
  "value": "0e+00"
}
{
  "labels": {
    "alertname": "OperatorHubSourceError",
    "container": "catalog-operator",
    "endpoint": "https-metrics",
    "exported_namespace": "openshift-marketplace",
    "instance": "10.128.0.20:8443",
    "job": "catalog-operator-metrics",
    "name": "redhat-marketplace",
    "namespace": "openshift-operator-lifecycle-manager",
    "pod": "catalog-operator-98ccdfbfc-d9bqf",
    "service": "catalog-operator-metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "Operators shipped via the redhat-marketplace source are not available for installation until the issue is fixed. Operators already installed from this source will not receive updates until issue is fixed. Inspect the status of the pod owned by redhat-marketplace source in the openshift-marketplace namespace (oc -n openshift-marketplace get pods -l olm.catalogSource=redhat-marketplace) to diagnose and repair.",
    "summary": "The redhat-marketplace source is in non-ready state for more than 10 minutes."
  },
  "state": "pending",
  "activeAt": "2022-07-25T05:39:47.911111372Z",
  "value": "0e+00"
}

LGTM, verified.
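A note for anyone repeating this verification: the `oc patch` calls send a JSON merge patch that sets a `nodeSelector` on the catalog pod. Presumably no node carries the `fake43642` label, so the pod cannot be scheduled and stays Pending. The Python sketch below only builds the command strings for the four catalog sources named in this report; it does not run `oc`, and the helper names are invented for illustration.

```python
import json
import shlex
from typing import List

# Catalog sources listed by `oc get catsrc` in this report.
CATALOG_SOURCES = [
    "certified-operators",
    "community-operators",
    "redhat-marketplace",
    "redhat-operators",
]


def unschedulable_patch(label_key: str = "fake43642", label_value: str = "fake") -> str:
    """Return a compact JSON merge patch that sets spec.grpcPodConfig.nodeSelector."""
    patch = {"spec": {"grpcPodConfig": {"nodeSelector": {label_key: label_value}}}}
    return json.dumps(patch, separators=(",", ":"))


def patch_commands(sources: List[str]) -> List[str]:
    """Build one `oc patch` command line per catalog source."""
    patch = shlex.quote(unschedulable_patch())  # single-quote the JSON for the shell
    return [
        f"oc patch catsrc {shlex.quote(name)} -p={patch} --type=merge"
        for name in sources
    ]


for command in patch_commands(CATALOG_SOURCES):
    print(command)
```

Because a merge patch only touches the keys it names, the rest of each CatalogSource spec is left as it was.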
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399