Bug 1836836

Summary: "bad response status 403 Forbidden" in prometheus container logs
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Monitoring
Assignee: Simon Pasquier <spasquie>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: low
Priority: low
Version: 4.5
CC: alegrand, anpicker, erooth, kakkoyun, lcosic, mloibl, pkrupa, spasquie, surbania
Keywords: Regression
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-05-20 15:46:02 UTC
Type: Bug
Attachments:
monitoring dump file, see logs here

Description Junqi Zhao 2020-05-18 09:52:17 UTC
Description of problem:
# oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0 | grep 403
level=error ts=2020-05-18T06:58:54.532Z caller=notifier.go:524 component=notifier alertmanager=https://10.131.0.11:9095/api/v2/alerts count=1 msg="Error sending alert" err="bad response status 403 Forbidden"
level=error ts=2020-05-18T06:58:56.733Z caller=notifier.go:524 component=notifier alertmanager=https://10.131.0.11:9095/api/v2/alerts count=1 msg="Error sending alert" err="bad response status 403 Forbidden"
level=error ts=2020-05-18T08:28:54.537Z caller=notifier.go:524 component=notifier alertmanager=https://10.128.2.12:9095/api/v2/alerts count=1 msg="Error sending alert" err="bad response status 403 Forbidden"
...
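
For reference, Prometheus exposes per-Alertmanager notification counters, so the failure rate can also be quantified from its own /metrics endpoint (a quick check, assuming Prometheus still listens on localhost:9090 inside the pod):
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s http://localhost:9090/metrics | grep prometheus_notifications_errors_total
prometheus_notifications_errors_total is labeled per alertmanager URL; steadily increasing values for the 10.131.0.11 and 10.128.2.12 endpoints would match the log errors above.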

# oc -n openshift-monitoring get pod -o wide | grep alertmanager
alertmanager-main-0                           5/5     Running   0          10h     10.129.2.5     ip-10-0-170-129.ap-south-1.compute.internal   <none>           <none>
alertmanager-main-1                           5/5     Running   0          10h     10.131.0.11    ip-10-0-141-41.ap-south-1.compute.internal    <none>           <none>
alertmanager-main-2                           5/5     Running   0          10h     10.128.2.12    ip-10-0-159-243.ap-south-1.compute.internal   <none>           <none>

Alertmanager itself seems to work fine when queried directly with the prometheus-k8s service account token:
# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c alertmanager alertmanager-main-1 -- curl -k -H "Authorization: Bearer $token" https://10.131.0.11:9095/api/v2/alerts | jq | tail
      "job": "kube-state-metrics",
      "job_name": "image-pruner-1589760000",
      "namespace": "openshift-image-registry",
      "pod": "kube-state-metrics-5dfb57cddc-mq4n8",
      "prometheus": "openshift-monitoring/k8s",
      "service": "kube-state-metrics",
      "severity": "warning"
    }
  }
]

# oc -n openshift-monitoring exec -c alertmanager alertmanager-main-1 -- curl -k -H "Authorization: Bearer $token" https://10.128.2.12:9095/api/v2/alerts | jq | tail
      "job": "kube-state-metrics",
      "job_name": "image-pruner-1589760000",
      "namespace": "openshift-image-registry",
      "pod": "kube-state-metrics-5dfb57cddc-mq4n8",
      "prometheus": "openshift-monitoring/k8s",
      "service": "kube-state-metrics",
      "severity": "warning"
    }
  }
]
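
The same request can also be retried from the prometheus container itself, which is where the 403 is reported (a sketch, assuming curl is available in the prometheus image; Prometheus sends alerts with the same prometheus-k8s service account token mounted into the pod):
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" https://10.131.0.11:9095/api/v2/alerts
A 200 response here, combined with intermittent 403s in the notifier logs, would suggest the problem is on the oauth-proxy authorization path rather than with the token itself.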


Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-05-17-201019

How reproducible:
Consistently, in recent builds

Steps to Reproduce:
1. See the description

Actual results:
"bad response status 403 Forbidden" in prometheus container logs

Expected results:
no error

Additional info:

Comment 1 Junqi Zhao 2020-05-18 09:56:32 UTC
Created attachment 1689566 [details]
monitoring dump file, see logs here

Comment 2 Simon Pasquier 2020-05-18 11:18:55 UTC
From alertmanager-main-0-alertmanager-proxy.log:

2020/05/17 23:37:26 http.go:107: HTTPS: listening on [::]:9095
2020/05/18 02:52:45 provider.go:394: authorizer reason: 
2020/05/18 02:52:51 provider.go:394: authorizer reason: 
2020/05/18 02:52:59 provider.go:394: authorizer reason: 
E0518 03:18:48.415073       1 webhook.go:197] Failed to make webhook authorizer request: subjectaccessreviews.authorization.k8s.io is forbidden: User "system:serviceaccount:openshift-monitoring:alertmanager-main" cannot create resource "subjectaccessreviews" in API group "authorization.k8s.io" at the cluster scope
2020/05/18 03:18:48 oauthproxy.go:782: requestauth: 10.128.0.47:38884 subjectaccessreviews.authorization.k8s.io is forbidden: User "system:serviceaccount:openshift-monitoring:alertmanager-main" cannot create resource "subjectaccessreviews" in API group "authorization.k8s.io" at the cluster scope
2020/05/18 06:00:09 provider.go:394: authorizer reason: 
2020/05/18 06:00:18 provider.go:394: authorizer reason:
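
The permission called out in the webhook error can be checked directly (a quick verification, assuming cluster-admin credentials):
# oc auth can-i create subjectaccessreviews.authorization.k8s.io --as=system:serviceaccount:openshift-monitoring:alertmanager-main
This should print "yes" on a healthy cluster; an unexpected "no" would confirm a missing RBAC binding, while an intermittent API-side failure (as in bug 1832825) may not reproduce with a one-off check.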

Comment 3 Simon Pasquier 2020-05-20 15:46:02 UTC
Closing as a duplicate because this is exactly the same error as the one returned by the Kubernetes API in bug 1832825.

*** This bug has been marked as a duplicate of bug 1832825 ***