Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1785579

Summary:	Unable to oc rsh/logs/exec into pods; Unauthorized error occurs on OpenShift 4.2
Product:	OpenShift Container Platform	Reporter:	Anshul Verma <ansverma>
Component:	Node	Assignee:	Peter Hunt <pehunt>
Node sub component:	CRI-O	QA Contact:	Sunil Choudhary <schoudha>
Status:	CLOSED NOTABUG	Docs Contact:
Severity:	low
Priority:	low	CC:	aos-bugs, cshepher, gblomqui, icherapa, jokerman, maupadhy, mfojtik, oarribas, rphillips, rupatel, slaznick, ssonigra, sttts
Version:	4.2.z	Flags:	ansverma: needinfo-
Target Milestone:	---
Target Release:	4.5.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-05-14 19:01:45 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Anshul Verma 2019-12-20 10:45:06 UTC

Description of problem:

Unable to run oc logs/rsh/exec against any pod. The error that occurs is -
~~
oc rsh -n openshift-authentication oauth-openshift-6dfddc87cf-7dn7q  cat /run/secrets/kubernetes.io/serviceaccount/ca.crt > ingress-ca.crt
error: unable to upgrade connection: Unauthorized
~~

Sometimes the following error occurs instead -
~~
# oc logs ibm-block-csi-operator-76749b9685-g4fpd
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log ibm-block-csi-operator-76749b9685-g4fpd))
~~

I checked the openshift-authentication pod logs and found the following messages -
~~
I1114 07:11:35.830694       1 log.go:172] http: TLS handshake error from 10.131.0.1:44736: remote error: tls: bad certificate
I1114 07:49:12.538067       1 log.go:172] http: TLS handshake error from 10.128.0.1:55808: remote error: tls: unknown certificate
I1114 07:49:12.538202       1 log.go:172] http: TLS handshake error from 10.131.0.1:37778: remote error: tls: unknown certificate
I1114 07:49:12.546383       1 log.go:172] http: TLS handshake error from 10.128.0.1:55810: EOF
I1114 08:23:22.165048       1 log.go:172] http: TLS handshake error from 10.128.0.1:52108: remote error: tls: bad certificate
I1114 08:23:22.174039       1 log.go:172] http: TLS handshake error from 10.131.0.1:57172: remote error: tls: bad certificate
I1114 18:01:24.772893       1 log.go:172] http: TLS handshake error from 10.131.0.1:45426: EOF
I1115 07:53:10.787558       1 log.go:172] http: TLS handshake error from 10.128.0.1:45482: remote error: tls: unknown certificate
I1115 07:53:10.800616       1 log.go:172] http: TLS handshake error from 10.131.0.1:36036: EOF
~~
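To see at a glance which client addresses and failure reasons dominate in log output like the above, the lines can be tallied. The following is only a minimal sketch, not part of the original report: the function name `summarize_handshake_errors` is hypothetical, and the sample lines are copied from the log excerpt above.

```python
import re
from collections import Counter

# Matches the "TLS handshake error" part of a log line such as:
#   I1114 07:11:35.830694  1 log.go:172] http: TLS handshake error from 10.131.0.1:44736: remote error: tls: bad certificate
HANDSHAKE_RE = re.compile(
    r"TLS handshake error from (?P<ip>[0-9.]+):(?P<port>\d+): (?P<reason>.+)$"
)


def summarize_handshake_errors(lines):
    """Count handshake errors per (client IP, reason); the client port is ignored."""
    counts = Counter()
    for line in lines:
        match = HANDSHAKE_RE.search(line.strip())
        if match:
            counts[(match.group("ip"), match.group("reason"))] += 1
    return counts


# Sample lines taken from the log excerpt in this report.
sample = [
    "I1114 07:11:35.830694       1 log.go:172] http: TLS handshake error from 10.131.0.1:44736: remote error: tls: bad certificate",
    "I1114 07:49:12.538067       1 log.go:172] http: TLS handshake error from 10.128.0.1:55808: remote error: tls: unknown certificate",
    "I1114 07:49:12.546383       1 log.go:172] http: TLS handshake error from 10.128.0.1:55810: EOF",
    "I1114 08:23:22.174039       1 log.go:172] http: TLS handshake error from 10.131.0.1:57172: remote error: tls: bad certificate",
]

for (ip, reason), count in sorted(summarize_handshake_errors(sample).items()):
    print(f"{count:3d}  {ip:15s}  {reason}")
```

Feeding the full pod log through the same function (for example, by reading it from a saved file) would show whether the failures come from a small, fixed set of addresses.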

All the COs are in the Available state.
I checked the kubelet certs, and they appear to be valid.

There are no pending CSRs either -
~~
$ oc get csr
No resources found.
$ oc get nodes
NAME                  STATUS   ROLES    AGE    VERSION
master0.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
master1.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
master2.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
worker0.ocp.lou.com   Ready    worker   106d   v1.14.6+7e13ab9a7
worker1.ocp.lou.com   Ready    worker   106d   v1.14.6+7e13ab9a7
~~

There is no proxy in the environment.
This happened after upgrading from 4.1 to 4.2.

Please let me know if anything else is required.

Comment 2 Ryan Phillips 2019-12-20 15:05:23 UTC

Moving to the API team, since the kubelet seems to be Ready. Could you attach the must-gather logs?

I searched 4.3 CI builds for the error, and did not find any matches.

Comment 3 Standa Laznicka 2020-01-02 15:44:37 UTC

I can see that they are unable to successfully perform must-gather, but they somehow managed to get logs for certain pods. Would it be possible to get the openshift-apiserver pod logs, too?

Comment 10 Standa Laznicka 2020-01-07 07:52:28 UTC

Ok, it looks like there's no problem with either API server. I would like you to check that the kubelets are actually capable of connecting to the API servers. I am going to move this BZ back to the Node team so that they can tell you which CA file to use when attempting either `openssl s_client -connect <url> -CAfile <kubelet_ca_here>` or `curl --cacert <kubelet_ca_here> <url>`.
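For anyone who prefers to script this check, the same idea (verify the server certificate against one specific CA bundle) can be sketched with the Python standard library. This is an illustration under assumptions, not the procedure used in this bug: the function name `check_tls` is hypothetical, and the host, port, and CA file must still come from the Node team as described above.

```python
import socket
import ssl


def check_tls(host, port, cafile, timeout=5.0):
    """Try a TLS handshake with host:port, trusting only the CAs in cafile.

    Returns (ok, detail). ok is True only if the handshake succeeds and the
    server certificate chain and hostname verify against cafile.
    """
    try:
        # With cafile given, only that bundle is loaded, not the system CAs.
        context = ssl.create_default_context(cafile=cafile)
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, f"handshake succeeded ({tls.version()})"
    except ssl.SSLCertVerificationError as exc:
        return False, f"certificate verification failed: {exc.verify_message}"
    except (ssl.SSLError, OSError) as exc:
        # Covers other TLS failures, refused connections, timeouts,
        # and a missing or unreadable CA file.
        return False, f"connection failed: {exc}"
```

A call would look like `check_tls("<api_server_host>", 6443, "<kubelet_ca_here>")`, where port 6443 is an assumption about the API server endpoint. A `certificate verification failed` result would point at a CA mismatch rather than a network problem.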

Could you also please share when each kubelet last reported Ready?
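One way to collect that information is to read the `Ready` condition from each node object, for example from the output of `oc get nodes -o json`. The sketch below is only an illustration: the function name `ready_condition_times` is hypothetical, and the timestamps in the sample document are made up for the example, not taken from the affected cluster.

```python
import json


def ready_condition_times(nodes_json):
    """Map node name -> (status, lastHeartbeatTime, lastTransitionTime)
    for the node's Ready condition, given the JSON text of a node list."""
    result = {}
    for node in json.loads(nodes_json).get("items", []):
        name = node["metadata"]["name"]
        for cond in node.get("status", {}).get("conditions", []):
            if cond.get("type") == "Ready":
                result[name] = (
                    cond.get("status"),
                    cond.get("lastHeartbeatTime"),
                    cond.get("lastTransitionTime"),
                )
    return result


# Illustrative input only: the node name comes from this report,
# but the timestamps are invented for the example.
sample_nodes = json.dumps({
    "items": [{
        "metadata": {"name": "worker0.ocp.lou.com"},
        "status": {"conditions": [
            {"type": "MemoryPressure", "status": "False"},
            {"type": "Ready", "status": "True",
             "lastHeartbeatTime": "2020-01-07T07:50:00Z",
             "lastTransitionTime": "2019-12-01T10:00:00Z"},
        ]},
    }]
})

for name, (status, heartbeat, transition) in ready_condition_times(sample_nodes).items():
    print(f"{name}  Ready={status}  lastHeartbeat={heartbeat}  lastTransition={transition}")
```

`lastHeartbeatTime` shows when the kubelet last refreshed the condition, while `lastTransitionTime` shows when the node last changed into its current Ready state.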

Comment 23 Ryan Phillips 2020-05-14 19:01:45 UTC

Looks like this issue is resolved. Closing.

Comment 24 Sonigra Saurab 2020-09-10 12:06:22 UTC

Can we find out what the cause was and what the solution is for this bug? I see it is marked as closed, but there are no specific details regarding the solution. Do we have a KCS for this issue?