
Bug 1785579

Summary: Unable to oc rsh/logs/exec pods; Unauthorized error occurs on OpenShift 4.2
Product: OpenShift Container Platform
Component: Node
Sub Component: CRI-O
Version: 4.2.z
Reporter: Anshul Verma <ansverma>
Assignee: Peter Hunt <pehunt>
QA Contact: Sunil Choudhary <schoudha>
Status: CLOSED NOTABUG
Severity: low
Priority: low
CC: aos-bugs, cshepher, gblomqui, icherapa, jokerman, maupadhy, mfojtik, oarribas, rphillips, rupatel, slaznick, ssonigra, sttts
Flags: ansverma: needinfo-
Target Milestone: ---
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2020-05-14 19:01:45 UTC
Type: Bug

Description Anshul Verma 2019-12-20 10:45:06 UTC
Description of problem:

Unable to oc logs/rsh/exec into any pod. The following error occurs -
~~
oc rsh -n openshift-authentication oauth-openshift-6dfddc87cf-7dn7q  cat /run/secrets/kubernetes.io/serviceaccount/ca.crt > ingress-ca.crt
error: unable to upgrade connection: Unauthorized
~~

Sometimes the following error occurs -
~~
# oc logs ibm-block-csi-operator-76749b9685-g4fpd
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log ibm-block-csi-operator-76749b9685-g4fpd))
~~

I checked the openshift-authentication pod logs and found the following messages -
~~
I1114 07:11:35.830694       1 log.go:172] http: TLS handshake error from 10.131.0.1:44736: remote error: tls: bad certificate
I1114 07:49:12.538067       1 log.go:172] http: TLS handshake error from 10.128.0.1:55808: remote error: tls: unknown certificate
I1114 07:49:12.538202       1 log.go:172] http: TLS handshake error from 10.131.0.1:37778: remote error: tls: unknown certificate
I1114 07:49:12.546383       1 log.go:172] http: TLS handshake error from 10.128.0.1:55810: EOF
I1114 08:23:22.165048       1 log.go:172] http: TLS handshake error from 10.128.0.1:52108: remote error: tls: bad certificate
I1114 08:23:22.174039       1 log.go:172] http: TLS handshake error from 10.131.0.1:57172: remote error: tls: bad certificate
I1114 18:01:24.772893       1 log.go:172] http: TLS handshake error from 10.131.0.1:45426: EOF
I1115 07:53:10.787558       1 log.go:172] http: TLS handshake error from 10.128.0.1:45482: remote error: tls: unknown certificate
I1115 07:53:10.800616       1 log.go:172] http: TLS handshake error from 10.131.0.1:36036: EOF
~~

All the cluster operators (COs) are in the Available state.
I checked the kubelet certs; they seemed to be fine.
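
(For reference, a minimal sketch of one way to inspect the serving certificate a kubelet presents on port 10250; the node name is taken from the listing below, and this exact command is not recorded in the BZ.)
~~
# Print subject, issuer, and validity dates of the kubelet serving cert
openssl s_client -connect worker0.ocp.lou.com:10250 </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates
~~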

There are no pending CSRs either -
~~
$ oc get csr
No resources found.
$ oc get nodes
NAME                  STATUS   ROLES    AGE    VERSION
master0.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
master1.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
master2.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
worker0.ocp.lou.com   Ready    worker   106d   v1.14.6+7e13ab9a7
worker1.ocp.lou.com   Ready    worker   106d   v1.14.6+7e13ab9a7
~~
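
(Had `oc get csr` shown Pending entries, a common post-upgrade remediation is to approve them; a sketch:)
~~
# Approve all pending CSRs (only meaningful when Pending entries exist)
oc get csr -o name | xargs oc adm certificate approve
~~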

There is no proxy in the environment.
This happened after upgrading from 4.1 to 4.2.

Please do let me know if anything is required.

Comment 2 Ryan Phillips 2019-12-20 15:05:23 UTC
Moving to the API team, since the kubelet seems to be Ready. Could you attach the must-gather logs?

I searched 4.3 CI builds for the error, and did not find any matches.
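
(must-gather is collected with the standard command below; the output directory name varies:)
~~
# Gather cluster diagnostics into a local must-gather.local.* directory
oc adm must-gather
~~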

Comment 3 Standa Laznicka 2020-01-02 15:44:37 UTC
I can see that they are unable to successfully perform must-gather, but they somehow managed to get logs for certain pods. Would it be possible to get the openshift-apiserver pod logs, too?
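
(A sketch of one way to collect those, avoiding any assumption about pod labels:)
~~
# Dump logs from every pod in the openshift-apiserver namespace
for p in $(oc get pods -n openshift-apiserver -o name); do
  oc logs -n openshift-apiserver "$p" > "$(basename "$p").log"
done
~~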

Comment 10 Standa Laznicka 2020-01-07 07:52:28 UTC
Ok, looks like there's no problem with either API server. I would like you to check that the kubelets are actually capable of connecting to the API servers. I am going to move this BZ back to the Node team so that they tell you which CA file to use when attempting to do either of `openssl s_client -connect <url> -CAfile <kubelet_ca_here>` or `curl --cacert <kubelet_ca_here>`.
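
(A minimal sketch of the two checks, assuming the API URL api.ocp.lou.com:6443 derived from the node names above and a hypothetical CA bundle path; the actual CA file is what the Node team is being asked to identify:)
~~
API_URL=https://api.ocp.lou.com:6443
CA_FILE=/etc/kubernetes/kubelet-ca.crt   # hypothetical path; use the CA the Node team indicates

# TLS handshake check against the API server
openssl s_client -connect api.ocp.lou.com:6443 -CAfile "$CA_FILE" </dev/null

# Equivalent curl check; a clean TLS handshake is the point even if the path needs auth
curl --cacert "$CA_FILE" "$API_URL/healthz"
~~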

Could you also please share when each kubelet last reported Ready?
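
(One way to read that from the node status, as a sketch:)
~~
# Show each node's last Ready heartbeat time
oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].lastHeartbeatTime}{"\n"}{end}'
~~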

Comment 23 Ryan Phillips 2020-05-14 19:01:45 UTC
Looks like this issue is resolved. Closing.

Comment 24 Sonigra Saurab 2020-09-10 12:06:22 UTC
Do we get to know what the cause was and what the solution is for this bug? I see it is marked as closed, but there are no specific details regarding the solution. Do we have a KCS article for this issue?