Bug 1811492

Summary:	Show privilege forbidden errors from kube-controller-manager pod
Product:	OpenShift Container Platform	Reporter:	zhou ying <yinzhou>
Component:	kube-controller-manager	Assignee:	Tomáš Nožička <tnozicka>
Status:	CLOSED NOTABUG	QA Contact:	zhou ying <yinzhou>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	unspecified	CC:	aos-bugs, mfojtik
Target Milestone:	---
Target Release:	4.5.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	1811505		Environment:
Last Closed:	2020-03-10 09:28:40 UTC	Type:	Bug
Bug Blocks:	1811505

Description zhou ying 2020-03-09 03:45:20 UTC

Description of problem:
The kube-controller-manager pod logs show privilege "forbidden" errors.

Version-Release number of selected component (if applicable):
payload: 4.5.0-0.nightly-2020-03-06-190457


How reproducible:
always

Steps to Reproduce:
1. Check logs from po/kube-controller-manager.


Actual results:
1. See errors:
E0309 02:34:58.489770       1 webhook.go:109] Failed to make webhook authenticator request: tokenreviews.authentication.k8s.io is forbidden: User "system:kube-controller-manager" cannot create resource "tokenreviews" in API group "authentication.k8s.io" at the cluster scope
E0309 02:34:58.489897       1 authentication.go:104] Unable to authenticate the request due to an error: [invalid bearer token, tokenreviews.authentication.k8s.io is forbidden: User "system:kube-controller-manager" cannot create resource "tokenreviews" in API group "authentication.k8s.io" at the cluster scope]
E0309 02:34:58.514213       1 leaderelection.go:331] error retrieving resource lock kube-system/kube-controller-manager: configmaps "kube-controller-manager" is forbidden: User "system:kube-controller-manager" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
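
For triage, the errors above can be split into their klog header fields (severity, date, time, source location) and roughly sorted by cause. The following is only a minimal sketch: `parse_klog`, `classify`, and the label strings are hypothetical names made up for illustration, not part of any OpenShift or Kubernetes tooling.

```python
import re
from typing import NamedTuple, Optional

# klog header layout: severity letter, mmdd, hh:mm:ss.uuuuuu, thread id,
# file:line, then "] " followed by the message.
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^:\s]+:\d+)\] (?P<msg>.*)$"
)


class KlogLine(NamedTuple):
    severity: str
    month: int
    day: int
    time: str
    source: str
    message: str


def parse_klog(line: str) -> Optional[KlogLine]:
    """Parse one klog-formatted line; return None if it does not match."""
    m = KLOG_RE.match(line.strip())
    if m is None:
        return None
    return KlogLine(
        m["sev"], int(m["month"]), int(m["day"]), m["time"], m["src"], m["msg"]
    )


def classify(message: str) -> str:
    """Hypothetical coarse labels for the two error families in this report."""
    if "is forbidden" in message:
        return "rbac-forbidden"
    if "connection refused" in message:
        return "connection-refused"
    return "other"
```

Feeding the three lines above through `parse_klog` and `classify` would label all of them `rbac-forbidden`, which separates them from the `connection refused` errors reported later in this bug.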

Expected results:
1. No error.


Additional info:

Comment 1 Maciej Szulik 2020-03-09 11:20:53 UTC

When exactly did this happen? I'd like to see must-gather and the full logs attached; I don't see similar problems in my cluster.

Comment 2 zhou ying 2020-03-10 02:18:59 UTC

After I deleted the csr-signer secret from openshift-kube-controller-manager-operator, the kube-controller-manager pod reloaded the client CA, and then I saw these errors:

I0310 02:14:00.045985       1 tlsconfig.go:179] loaded client CA [7/"client-ca-bundle::/etc/kubernetes/static-pod-certs/configmaps/client-ca/ca-bundle.crt,request-header::/etc/kubernetes/static-pod-certs/configmaps/aggregator-client-ca/ca-bundle.crt"]: "aggregator-signer" [] issuer="<self>" (2020-03-10 00:36:23 +0000 UTC to 2020-03-11 00:36:23 +0000 UTC (now=2020-03-10 02:14:00.045979063 +0000 UTC))
I0310 02:14:00.046204       1 tlsconfig.go:201] loaded serving cert ["serving-cert::/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.crt::/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.key"]: "kube-controller-manager.openshift-kube-controller-manager.svc" [serving] validServingFor=[kube-controller-manager.openshift-kube-controller-manager.svc,kube-controller-manager.openshift-kube-controller-manager.svc.cluster.local] issuer="openshift-service-serving-signer@1583801568" (2020-03-10 00:53:08 +0000 UTC to 2022-03-10 00:53:09 +0000 UTC (now=2020-03-10 02:14:00.046191227 +0000 UTC))
I0310 02:14:00.046453       1 named_certificates.go:53] loaded SNI cert [0/"self-signed loopback"]: "apiserver-loopback-client@1583802300" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1583802299" (2020-03-10 00:04:59 +0000 UTC to 2021-03-10 00:04:59 +0000 UTC (now=2020-03-10 02:14:00.046442405 +0000 UTC))



E0310 02:15:29.428500       1 webhook.go:109] Failed to make webhook authenticator request: Post https://localhost:6443/apis/authentication.k8s.io/v1/tokenreviews: dial tcp [::1]:6443: connect: connection refused
E0310 02:15:29.428529       1 authentication.go:104] Unable to authenticate the request due to an error: [invalid bearer token, Post https://localhost:6443/apis/authentication.k8s.io/v1/tokenreviews: dial tcp [::1]:6443: connect: connection refused]

Comment 4 Tomáš Nožička 2020-03-10 09:28:40 UTC

KCM is wired to the local kube-apiserver, not through the load balancer. While kube-apiserver is rolling out, temporary connection errors are expected. This would only be a valid bug if KCM were constantly looping on those errors.
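
The distinction drawn in this comment (a short burst of errors during a rollout versus constant looping) could be checked roughly as sketched below. This is an assumption-laden illustration: `is_persistent` and the 5-minute `grace` default are hypothetical and do not come from this bug or from any OpenShift component.

```python
from datetime import datetime, timedelta
from typing import Iterable


def is_persistent(
    error_times: Iterable[datetime],
    last_log_time: datetime,
    grace: timedelta = timedelta(minutes=5),  # hypothetical threshold
) -> bool:
    """Return True if the errors look like constant looping rather than a blip.

    Errors count as persistent when they span more than `grace` and the most
    recent one falls within `grace` of the end of the captured log.
    """
    times = sorted(error_times)
    if not times:
        return False
    span = times[-1] - times[0]
    still_failing = last_log_time - times[-1] <= grace
    return span > grace and still_failing
```

With the two `connection refused` errors from comment 2, both logged at 02:15:29, the span is far below the threshold, so the check reports a transient condition.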