1819688 – oc login produces "TLS handshake error from <ingress pod IP and port>: remote error: tls: bad certificate"


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	apiserver-auth
Sub Component:
Version:	4.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.5.0
Assignee:	Standa Laznicka
QA Contact:	Xingxing Xia
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-04-01 10:27 UTC by Xingxing Xia
Modified:	2024-03-25 15:47 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: `oc login` performed an HTTP request to decide which CA bundle (system trust store or kubeconfig CA) to use when connecting to the remote login server.
	Consequence: Every login attempt generated a "remote error: tls: bad certificate" line in the oauth-server logs.
	Fix: Retrieve the server certificate chain from an insecure TLS handshake and pick the correct CA outside the connection.
	Result: The oauth-server no longer logs "bad certificate" on login attempts.
Clone Of:
Environment:
Last Closed:	2020-07-13 17:24:45 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift oc pull 380	0	None	closed	Bug 1819688: login: choose the CAs based on the remote server cert	2021-02-10 11:49:23 UTC
Red Hat Product Errata	RHBA-2020:2409	0	None	None	None	2020-07-13 17:25:09 UTC

Description Xingxing Xia 2020-04-01 10:27:58 UTC

Description of problem:
When the kubeconfig contains cluster certificate-authority-data, each `oc login` can produce "TLS handshake error from <ingress pod IP and port>: remote error: tls: bad certificate" in the oauth-openshift pod logs.
Although this error is harmless and does not prevent a successful login, it may confuse customers, especially when many `oc login` operations run against the cluster, so it is worth suppressing. After all, the certificate-authority-data

Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-03-31-215957

How reproducible:
Always

Steps to Reproduce:
1. Launch a 4.4 environment and configure an htpasswd IDP
2. In terminal A, watch the oauth-openshift pod logs:
$ oc logs -f --tail=5 -n openshift-authentication oauth-openshift-b97947fcd-p8hxm

3. In another terminal B:
$ cp path/to/admin.kubeconfig path/to/admin.kubeconfig.copied
$ export KUBECONFIG=path/to/admin.kubeconfig.copied # this file contains cluster certificate-authority-data info
Repeatedly run:
$ oc login -u xxia1 -p redhat

4. In terminal B, try oc login with a kubeconfig file that does not have certificate-authority-data:
$ touch empty.kubeconfig
$ export KUBECONFIG=empty.kubeconfig
Repeatedly run:
$ oc login -u xxia1 -p redhat https://<api server>:6443 --insecure-skip-tls-verify

Actual results:
3. For each login with certificate-authority-data, terminal A shows one line like the following:
I0401 08:34:30.210254       1 log.go:172] http: TLS handshake error from 10.128.2.9:36202: remote error: tls: bad certificate
I0401 08:46:37.175267       1 log.go:172] http: TLS handshake error from 10.131.0.9:41046: remote error: tls: bad certificate
I0401 08:49:22.409251       1 log.go:172] http: TLS handshake error from 10.131.0.9:42866: remote error: tls: bad certificate

4. As stated in the Description above, this error should not be shown.

Expected results:
3. The error is harmless, but it may confuse customers. It is worth fixing so that it is no longer logged.

Additional info:
A check shows that the IPs above belong to ingress pods:
$ oc get po -A -o wide | grep -E "(10.131.0.9|10.128.2.9)"
openshift-ingress   router-default-656d77d7d8-gjrr6  1/1 Running  0 6h11m 10.128.2.9 ...
openshift-ingress   router-default-656d77d7d8-grmx4  1/1 Running  0 6h11m 10.131.0.9 ...

Comment 1 Standa Laznicka 2020-04-06 12:29:51 UTC

The message we see is caused by `oc` first trying to connect without the CA (that is where the bad certificate comes from) and only using the CA in a subsequent request.

Comment 4 Xingxing Xia 2020-04-22 03:02:21 UTC

Verified in oc 4.5.0-202004202137-8dda2e7 using the original steps: the issue is fixed and can no longer be reproduced.

Comment 6 errata-xmlrpc 2020-07-13 17:24:45 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

Comment 7 Standa Laznicka 2020-11-30 10:20:24 UTC

*** Bug 1901379 has been marked as a duplicate of this bug. ***
