Bug 1743840 - Bootstrap etcd-signer lacks localhost cert, leading to: x509: certificate is valid for api..., not localhost

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Installer
Sub Component:
Version:	4.2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	4.2.0
Assignee:	W. Trevor King
QA Contact:	David Sanz
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-08-20 19:15 UTC by W. Trevor King
Modified:	2019-10-16 06:36 UTC
CC List:	1 user
Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-10-16 06:36:40 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift installer pull 2242	0	'None'	closed	Bug 1743840: data/bootstrap/files/usr/local/bin/bootkube.sh.template: Localhost keys for etcd-signer	2020-12-10 21:49:29 UTC
Red Hat Product Errata	RHBA-2019:2922	0	None	None	None	2019-10-16 06:36:53 UTC

Description W. Trevor King 2019-08-20 19:15:46 UTC

Since the pivots to prefer loopback Kube-API access [1,2,3,etc.], the logs on the bootstrap machine have contained distracting errors like these reported in bug 1743661:

$ grep 'not localhost\|etcd-signer' journal-bootstrap.log
...
Aug 20 10:33:56 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com podman[8366]: 2019-08-20 10:33:56.090073216 +0000 UTC m=+2.644782091 container start d0dcc42a1335c1224df35a48a279f63f1cb7a03c94de5ebb29e2633e6ee6c429 (image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f20394d571ff9a28aed9366434521d221d8d743a6efe2a3d6c6ad242198a522e, name=etcd-signer)
Aug 20 10:33:58 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com openshift.sh[2867]: error: unable to recognize "./99_kubeadmin-password-secret.yaml": Get https://localhost:6443/api?timeout=32s: x509: certificate is valid for api.bm1.oc4, not localhost
Aug 20 10:34:01 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com approve-csr.sh[2870]: Unable to connect to the server: x509: certificate is valid for api.bm1.oc4, not localhost
...
Aug 20 10:43:55 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com openshift.sh[2867]: error: unable to recognize "./99_kubeadmin-password-secret.yaml": Get https://localhost:6443/api?timeout=32s: x509: certificate is valid for api.bm1.oc4, not localhost
Aug 20 10:43:59 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com podman[15272]: 2019-08-20 10:43:59.68789639 +0000 UTC m=+0.188325679 container died d0dcc42a1335c1224df35a48a279f63f1cb7a03c94de5ebb29e2633e6ee6c429 (image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f20394d571ff9a28aed9366434521d221d8d743a6efe2a3d6c6ad242198a522e, name=etcd-signer)
...

We should pass the localhost cert to etcd-signer so that clients connecting over loopback can complete the TLS handshake, and etcd-signer can then gracefully say "sorry, I'm not really a Kube API server".

[1]: https://github.com/openshift/installer/pull/2085
[2]: https://github.com/openshift/installer/pull/2086
[3]: https://github.com/openshift/cluster-bootstrap/pull/28
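
To illustrate the failure mode in the journal excerpt above, here is a minimal Python sketch of the name check a TLS client performs against a serving certificate. It is a deliberate simplification: it does exact, case-insensitive DNS-name matching only, with no wildcards or IP addresses. The function name check_hostname and the name lists are hypothetical and are not taken from the installer or from any TLS library. The list for the localhost cert in particular is an assumption about what such a cert would contain.

```python
def check_hostname(cert_names, host):
    """Return None if host matches one of cert_names, else an error string.

    Simplified assumption: exact, case-insensitive DNS-name comparison.
    Real TLS libraries also handle wildcards and IP addresses.
    """
    wanted = host.lower()
    if any(name.lower() == wanted for name in cert_names):
        return None
    # Error text modeled on the message seen in the bootstrap journal.
    return "x509: certificate is valid for {}, not {}".format(
        ", ".join(cert_names), host
    )


# Serving cert as reported in the journal: only the external API name.
before = check_hostname(["api.bm1.oc4"], "localhost")

# Hypothetical cert that covers localhost (assumed name list).
after = check_hostname(["localhost"], "localhost")

print(before)
print(after)
```

With only the external API name on the cert, a loopback client fails before any HTTP exchange happens, which is why the journal shows a TLS error rather than a response from the server.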

Comment 2 W. Trevor King 2019-08-21 03:07:12 UTC

I'm not sure what verification looks like for this bug. One option is to wait for a bootstrap failure on an installer that includes the new code and check the gathered tarball. Another is to collect the journal from a bootstrap machine before the installer cleans up the bootstrap resources, either directly via SSH or indirectly via 'openshift-install gather bootstrap ...'. You can ctrl-C the installer any time after "Waiting up to 30m0s for the Kubernetes API ..." and before "Destroying the bootstrap resources..." if you want more time to gather bootstrap logs. You can also do this during a UPI install if you want to avoid racing a running 'openshift-install create cluster'.

Comment 3 David Sanz 2019-08-21 09:58:20 UTC

No more x509 certificate errors during the bootstrap process; the only match is an unrelated kernel message:

[root@morenod-ocp-hk2bb-bootstrap ~]# journalctl | grep -i x509
Aug 21 09:49:25 localhost kernel: Asymmetric key parser 'x509' registered

Verified on:

./openshift-install v4.2.0-201908201819-dirty
built from commit c6f5ce37fc12f3637b53971a48b9bdb13cf45ec5
release image registry.svc.ci.openshift.org/ocp/release@sha256:55b073d15bf225317cf6c04eaddb528b5d056d75eb8ffb2b7937cbfc67d4ed26

RHCOS: 42.80.20190820.0
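
A plain grep for "x509" also matches the unrelated kernel line shown above, so a narrower check may be handy when scanning a gathered journal. The following is a minimal Python sketch under that assumption. The function name find_x509_name_errors and the pattern are illustrative, not part of the installer, and the sample lines are copied from this bug.

```python
import re

# Matches hostname-verification failures such as
# "x509: certificate is valid for api.bm1.oc4, not localhost".
X509_NAME_ERROR = re.compile(r"x509: certificate is valid for .+, not \S+")


def find_x509_name_errors(lines):
    """Return the journal lines that report an x509 hostname mismatch."""
    return [line for line in lines if X509_NAME_ERROR.search(line)]


sample = [
    "Aug 21 09:49:25 localhost kernel: Asymmetric key parser 'x509' registered",
    "Aug 20 10:34:01 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com approve-csr.sh[2870]: "
    "Unable to connect to the server: x509: certificate is valid for "
    "api.bm1.oc4, not localhost",
]

errors = find_x509_name_errors(sample)
print(len(errors))
```

The same function could be applied to the lines of a saved journal file; an empty result would correspond to the verified state reported in this comment.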

Comment 4 errata-xmlrpc 2019-10-16 06:36:40 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922
