Bug 1785579 - Unable oc rsh/logs/exec pods, Unauthorized error occurs on OpenShift 4.2
Summary: Unable oc rsh/logs/exec pods, Unauthorized error occurs on OpenShift 4.2
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.5.0
Assignee: Peter Hunt
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-12-20 10:45 UTC by Anshul Verma
Modified: 2023-09-07 21:19 UTC (History)
13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-14 19:01:45 UTC
Target Upstream Version:
Embargoed:
ansverma: needinfo-




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1886161 0 unspecified CLOSED Unable to visualize logs after upgrade cluster from 4.4.23 to 4.4.26 2021-02-22 00:41:40 UTC
Red Hat Knowledge Base (Solution) 5734741 0 None None None 2021-03-08 22:53:26 UTC

Description Anshul Verma 2019-12-20 10:45:06 UTC
Description of problem:

Unable to oc logs/rsh/exec into any pod. The error that occurs is:
~~
oc rsh -n openshift-authentication oauth-openshift-6dfddc87cf-7dn7q  cat /run/secrets/kubernetes.io/serviceaccount/ca.crt > ingress-ca.crt
error: unable to upgrade connection: Unauthorized
~~

Sometimes the following error occurs:
~~
# oc logs ibm-block-csi-operator-76749b9685-g4fpd
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log ibm-block-csi-operator-76749b9685-g4fpd))
~~

I checked the openshift-authentication pod logs and found the following messages:
~~
I1114 07:11:35.830694       1 log.go:172] http: TLS handshake error from 10.131.0.1:44736: remote error: tls: bad certificate
I1114 07:49:12.538067       1 log.go:172] http: TLS handshake error from 10.128.0.1:55808: remote error: tls: unknown certificate
I1114 07:49:12.538202       1 log.go:172] http: TLS handshake error from 10.131.0.1:37778: remote error: tls: unknown certificate
I1114 07:49:12.546383       1 log.go:172] http: TLS handshake error from 10.128.0.1:55810: EOF
I1114 08:23:22.165048       1 log.go:172] http: TLS handshake error from 10.128.0.1:52108: remote error: tls: bad certificate
I1114 08:23:22.174039       1 log.go:172] http: TLS handshake error from 10.131.0.1:57172: remote error: tls: bad certificate
I1114 18:01:24.772893       1 log.go:172] http: TLS handshake error from 10.131.0.1:45426: EOF
I1115 07:53:10.787558       1 log.go:172] http: TLS handshake error from 10.128.0.1:45482: remote error: tls: unknown certificate
I1115 07:53:10.800616       1 log.go:172] http: TLS handshake error from 10.131.0.1:36036: EOF
~~

All the cluster operators (COs) are in the Available state.
I checked the kubelet certs and they seemed to be fine.
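The kubelet cert check mentioned above can be sketched with openssl. This is a minimal local sketch: the real kubelet client cert path (e.g. /var/lib/kubelet/pki/kubelet-client-current.pem) is an assumption, so a throwaway self-signed cert stands in for it here.

```shell
# Generate a throwaway self-signed cert to stand in for a kubelet client
# cert (on a real node you would point at the actual kubelet cert file).
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/demo-key.pem -out /tmp/demo-kubelet.pem \
  -days 30 -subj "/CN=system:node:demo" 2>/dev/null

# Show the cert's validity window (notBefore / notAfter)
openssl x509 -in /tmp/demo-kubelet.pem -noout -dates

# Exit 0 if the cert is still valid 24h from now, non-zero otherwise
openssl x509 -in /tmp/demo-kubelet.pem -noout -checkend 86400 \
  && echo "cert ok" || echo "cert expiring soon"
```

On a cluster node, an expired or soon-to-expire kubelet client cert would make the API server reject the kubelet's TLS handshake, which matches the "Unauthorized" symptom on rsh/logs/exec.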

There are no pending CSRs either:
~~
$ oc get csr
No resources found.
$ oc get nodes
NAME                  STATUS   ROLES    AGE    VERSION
master0.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
master1.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
master2.ocp.lou.com   Ready    master   106d   v1.14.6+7e13ab9a7
worker0.ocp.lou.com   Ready    worker   106d   v1.14.6+7e13ab9a7
worker1.ocp.lou.com   Ready    worker   106d   v1.14.6+7e13ab9a7
~~

There is no proxy in the environment.
This happened after upgrading from 4.1 to 4.2.

Please let me know if anything else is required.

Comment 2 Ryan Phillips 2019-12-20 15:05:23 UTC
Moving to the API team, since the kubelet seems to be Ready. Could you attach the must-gather logs?

I searched 4.3 CI builds for the error, and did not find any matches.

Comment 3 Standa Laznicka 2020-01-02 15:44:37 UTC
I can see that they are unable to successfully perform must-gather, but they somehow managed to get logs for certain pods. Would it be possible to get the openshift-apiserver pod logs, too?

Comment 10 Standa Laznicka 2020-01-07 07:52:28 UTC
Ok, looks like there's no problem with either API server. I would like you to check that the kubelets are actually capable of connecting to the API servers. I am going to move this BZ back to the Node team so that they can tell you which CA file to use when attempting either `openssl s_client -connect <url> -CAfile <kubelet_ca_here>` or `curl --cacert <kubelet_ca_here>`.
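The connectivity check described in this comment can be sketched end to end with a throwaway TLS server standing in for the API server. Everything here (the port, the localhost server, and the self-signed cert acting as its own CA bundle) is an assumption for illustration; on a real cluster you would point s_client or curl at the API server URL with the kubelet's CA file.

```shell
# Throwaway self-signed cert: serves as both server cert and "CA bundle"
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/srv-key.pem -out /tmp/srv-cert.pem \
  -days 1 -subj "/CN=localhost" 2>/dev/null

# Stand-in "API server" listening on localhost:14433
openssl s_server -accept 14433 -key /tmp/srv-key.pem \
  -cert /tmp/srv-cert.pem -quiet &
server_pid=$!
sleep 1

# The actual check: does a TLS handshake verify against this CA file?
# "Verify return code: 0 (ok)" means the CA bundle matches the server cert;
# a mismatch would produce errors like the "bad certificate" lines above.
echo | openssl s_client -connect localhost:14433 \
  -CAfile /tmp/srv-cert.pem 2>/dev/null | grep "Verify return code"

kill "$server_pid" 2>/dev/null
```

A non-zero verify return code here would point at the kubelet trusting the wrong CA, which is consistent with the TLS handshake errors quoted in the description.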

Could you also please share when each kubelet last reported Ready?

Comment 23 Ryan Phillips 2020-05-14 19:01:45 UTC
Looks like this issue is resolved. Closing.

Comment 24 Sonigra Saurab 2020-09-10 12:06:22 UTC
Do we get to know the cause and the solution for this bug? I see it is marked as closed, but there are no specific details regarding the solution. Do we have a KCS article for this issue?

