Bug 2100472 - TechPreview feature is not enabled, but find "failed to list *v1alpha1.AlertingRule: alertingrules.monitoring.openshift.io is forbidden" in cmo logs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.12.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
Docs Contact: Brian Burt
URL:
Whiteboard:
Duplicates: 2103033
Depends On:
Blocks: 2103127
Reported: 2022-06-23 13:15 UTC by Junqi Zhao
Modified: 2023-01-17 19:50 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-17 19:50:29 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-monitoring-operator pull 1701 | 0 | None | open | Bug 2100472: start alert controllers only when techpreview | 2022-06-23 14:16:38 UTC
Github | openshift cluster-monitoring-operator pull 1706 | 0 | None | open | Revert "Bug 2100472: start alert controllers only when techpreview" | 2022-07-04 14:08:31 UTC
Github | openshift cluster-monitoring-operator pull 1707 | 0 | None | open | Bug 2100472: fix alert controllers when not in techpreview | 2022-07-04 15:12:31 UTC
Red Hat Product Errata | RHSA-2022:7399 | 0 | None | None | None | 2023-01-17 19:50:45 UTC

Description Junqi Zhao 2022-06-23 13:15:09 UTC
Description of problem:
The MON-2552 PR https://github.com/openshift/cluster-monitoring-operator/pull/1675 is in payload 4.11.0-0.nightly-2022-06-22-190830.
The TechPreview feature set is not enabled, but the CMO logs contain "failed to list *v1alpha1.AlertingRule: alertingrules.monitoring.openshift.io is forbidden".

# oc get featuregate cluster -oyaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-06-23T00:39:48Z"
  generation: 1
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: a158e319-e88f-457f-b270-5b67f1b8c18c
  resourceVersion: "1321"
  uid: 31692f28-0688-4ba7-be51-0f4b1b83fff6
spec: {}
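
The empty "spec: {}" above means that no feature set is selected. A minimal sketch of checking this directly, assuming the feature set is read from .spec.featureSet; the helper name is_techpreview is hypothetical, and it does not cover features enabled individually through CustomNoUpgrade:

```shell
# Hypothetical helper: succeeds only when the given featureSet value
# selects TechPreview. An empty value corresponds to "spec: {}".
is_techpreview() {
  [ "$1" = "TechPreviewNoUpgrade" ]
}

# Against a live cluster (not run here):
#   is_techpreview "$(oc get featuregate cluster -o jsonpath='{.spec.featureSet}')" \
#     && echo "TechPreview enabled" || echo "TechPreview not enabled"
```

On the cluster in this report the jsonpath query would print an empty string, so the helper would report that TechPreview is not enabled.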

# oc -n openshift-monitoring get pod
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-main-0                                      6/6     Running   0          12h
alertmanager-main-1                                      6/6     Running   0          12h
cluster-monitoring-operator-b87447c68-b5bd6              2/2     Running   0          12h
kube-state-metrics-5475455998-n65nv                      3/3     Running   0          12h
node-exporter-c8x9p                                      2/2     Running   0          12h
node-exporter-fkbfr                                      2/2     Running   0          12h
node-exporter-gjpkb                                      2/2     Running   0          12h
node-exporter-kwgk6                                      2/2     Running   0          12h
node-exporter-kzt9q                                      2/2     Running   0          12h
node-exporter-vrl7p                                      2/2     Running   0          12h
openshift-state-metrics-8679b4d578-vpd4w                 3/3     Running   0          12h
prometheus-adapter-5f4bcc7778-pxplf                      1/1     Running   0          26m
prometheus-adapter-5f4bcc7778-srxrl                      1/1     Running   0          26m
prometheus-k8s-0                                         6/6     Running   0          12h
prometheus-k8s-1                                         6/6     Running   0          12h
prometheus-operator-658d9c456f-rp948                     2/2     Running   0          12h
prometheus-operator-admission-webhook-74f7bb977f-5vb2b   1/1     Running   0          12h
prometheus-operator-admission-webhook-74f7bb977f-thbrw   1/1     Running   0          12h
telemeter-client-9587b6dc7-k4crh                         3/3     Running   0          12h
thanos-querier-85f555d468-lr6rq                          6/6     Running   0          7h6m
thanos-querier-85f555d468-vs7jq                          6/6     Running   0          7h6m
$ oc -n openshift-monitoring logs -c cluster-monitoring-operator cluster-monitoring-operator-b87447c68-b5bd6  | grep "alertingrules.monitoring.openshift.io is forbidden"
W0623 12:31:12.562641       1 reflector.go:324] github.com/openshift/cluster-monitoring-operator/pkg/alert/rule_controller.go:113: failed to list *v1alpha1.AlertingRule: alertingrules.monitoring.openshift.io is forbidden: User "system:serviceaccount:openshift-monitoring:cluster-monitoring-operator" cannot list resource "alertingrules" in API group "monitoring.openshift.io" in the namespace "openshift-monitoring"
E0623 12:31:12.562668       1 reflector.go:138] github.com/openshift/cluster-monitoring-operator/pkg/alert/rule_controller.go:113: Failed to watch *v1alpha1.AlertingRule: failed to list *v1alpha1.AlertingRule: alertingrules.monitoring.openshift.io is forbidden: User "system:serviceaccount:openshift-monitoring:cluster-monitoring-operator" cannot list resource "alertingrules" in API group "monitoring.openshift.io" in the namespace "openshift-monitoring"
W0623 12:32:11.310486       1 reflector.go:324] github.com/openshift/cluster-monitoring-operator/pkg/alert/rule_controller.go:113: failed to list *v1alpha1.AlertingRule: alertingrules.monitoring.openshift.io is forbidden: User "system:serviceaccount:openshift-monitoring:cluster-monitoring-operator" cannot list resource "alertingrules" in API group "monitoring.openshift.io" in the namespace "openshift-monitoring"
....


# oc -n openshift-monitoring logs -c cluster-monitoring-operator cluster-monitoring-operator-b87447c68-b5bd6  | grep "alertingrules.monitoring.openshift.io is forbidden" | wc -l
1988
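
The count above covers a single resource. A minimal sketch that counts the "forbidden" messages per resource in one pass, so that the same check can cover other resources in the monitoring.openshift.io group; the helper name count_forbidden is hypothetical, and it assumes the log format shown above:

```shell
# Hypothetical helper: read CMO log text on stdin and print one line per
# monitoring.openshift.io resource in the form "<resource>: <count>".
count_forbidden() {
  grep -oE '[a-z]+\.monitoring\.openshift\.io is forbidden' \
    | sort | uniq -c | awk '{print $2 ": " $1}'
}

# Usage against a live cluster (not run here; <pod> is a placeholder):
#   oc -n openshift-monitoring logs -c cluster-monitoring-operator <pod> | count_forbidden
```

Empty output means that no such message was found.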


$ oc explain AlertingRule
the server doesn't have a resource type "AlertingRule"

$ oc get crd | grep -i monitoring
alertmanagerconfigs.monitoring.coreos.com                         2022-06-22T07:46:48Z
alertmanagers.monitoring.coreos.com                               2022-06-22T07:46:51Z
podmonitors.monitoring.coreos.com                                 2022-06-22T07:46:53Z
probes.monitoring.coreos.com                                      2022-06-22T07:46:55Z
prometheuses.monitoring.coreos.com                                2022-06-22T07:46:58Z
prometheusrules.monitoring.coreos.com                             2022-06-22T07:47:00Z
servicemonitors.monitoring.coreos.com                             2022-06-22T07:47:02Z
thanosrulers.monitoring.coreos.com                                2022-06-22T07:47:04Z

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-22-190830   True        False         12h     Cluster version is 4.11.0-0.nightly-2022-06-22-190830

How reproducible:
always

Steps to Reproduce:
1. See the description.

Actual results:
"failed to list *v1alpha1.AlertingRule: alertingrules.monitoring.openshift.io is forbidden" in cmo logs

Expected results:


Additional info:

Comment 5 Simon Pasquier 2022-07-04 09:25:17 UTC
The fix introduces a regression: platform alerts are no longer labeled with openshift_io_alert_source="platform". The fix is being reverted in https://github.com/openshift/cluster-monitoring-operator/pull/1706.

Comment 6 Simon Pasquier 2022-07-04 14:14:33 UTC
The previous PR was a revert, so the issue is still present.

Comment 11 Junqi Zhao 2022-07-06 03:28:27 UTC
Tested with 4.12.0-0.nightly-2022-07-05-225149. The TechPreview feature set is not enabled, and the CMO logs show no "forbidden" errors for alertingrules or alertrelabelconfigs.
#  oc get featuregate cluster -oyaml
...
spec: {}


# oc -n openshift-monitoring logs -c cluster-monitoring-operator cluster-monitoring-operator-7b6dc644c5-p45jk | grep "alertingrules.monitoring.openshift.io is forbidden"
no result

# oc -n openshift-monitoring logs -c cluster-monitoring-operator cluster-monitoring-operator-7b6dc644c5-p45jk | grep "alertrelabelconfigs.monitoring.openshift.io is forbidden"
no result

The regression described in comment 5 does not occur either: platform alerts still carry the openshift_io_alert_source="platform" label.
# token=`oc create token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v2/alerts' | jq
...
    "labels": {
      "alertname": "Watchdog",
      "namespace": "openshift-monitoring",
      "openshift_io_alert_source": "platform",
      "prometheus": "openshift-monitoring/k8s",
      "severity": "none"
    }
  },
...
    "labels": {
      "alertname": "AlertmanagerReceiversNotConfigured",
      "namespace": "openshift-monitoring",
      "openshift_io_alert_source": "platform",
      "prometheus": "openshift-monitoring/k8s",
      "severity": "warning"
    }
  }
]

Comment 12 Simon Pasquier 2022-07-06 07:23:58 UTC
*** Bug 2103033 has been marked as a duplicate of this bug. ***

Comment 16 Simon Pasquier 2022-12-14 09:19:02 UTC
Removing the requires_doc_text flag because the bug fix has already been backported to 4.11.z (see bug 2103127).

Comment 18 errata-xmlrpc 2023-01-17 19:50:29 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

