Bug 1694079 - Add tool that can restore expired certificates in case a cluster was suspended for longer period of time
Summary: Add tool that can restore expired certificates in case a cluster was suspende...
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Master
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.1.0
Assignee: Tomáš Nožička
QA Contact: zhou ying
Duplicates: 1699470
Depends On:
Reported: 2019-03-29 13:06 UTC by Michal Fojtik
Modified: 2020-02-28 02:39 UTC (History)
CC List: 19 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-06-04 10:46:34 UTC
Target Upstream Version:

System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:0758  0        None      None    None     2019-06-04 10:46:42 UTC

Internal Links: 1701099

Description Michal Fojtik 2019-03-29 13:06:16 UTC
Description of problem:

This tool will allow users who suspended the cluster for longer than the
validity period of our certificates to restore the expired certificates.
This tool can be baked into a VM boot process, so when somebody wants to package
an OpenShift cluster into a VM, it will start with valid certificates.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 1 Seth Jennings 2019-03-29 14:14:35 UTC
*** Bug 1693951 has been marked as a duplicate of this bug. ***

Comment 2 Maciej Szulik 2019-04-09 14:27:20 UTC
*** Bug 1693951 has been marked as a duplicate of this bug. ***

Comment 3 Seth Jennings 2019-04-18 20:17:27 UTC
*** Bug 1699470 has been marked as a duplicate of this bug. ***

Comment 4 Xingxing Xia 2019-04-19 06:04:40 UTC
Adding the keyword considering the issue of bug 1699470 and the label of MSTR-363. Feel free to remove it if you disagree. Thanks.

Comment 6 Tomáš Nožička 2019-04-30 08:02:54 UTC
WIP is tracked here https://github.com/openshift/cluster-kube-apiserver-operator/pull/444

Comment 9 Michal Fojtik 2019-05-07 07:50:29 UTC
https://github.com/openshift/cluster-kube-apiserver-operator/pull/460 merged, moving to QA.

@Tomas, can you please work with QA on explaining how they can test the recovery procedure?

Comment 10 zhou ying 2019-05-08 09:36:26 UTC
Following this doc https://docs.google.com/document/d/1ONkxdDmQVLBNJrSJymfKPrndo7b4vgCA2zwL9xHYx6A/edit, I can recover the cluster; no issues found, will verify.

Comment 11 zhou ying 2019-05-08 09:36:56 UTC
Payload 4.1.0-0.nightly-2019-05-08-012425

Comment 12 Anjan 2019-05-31 12:55:00 UTC
So I tried the forced certificate rotation steps from the above gdoc; I'm not sure what I am doing wrong, but it does not rotate certs in the cluster.
I am using the libvirt build for testing this, and these are the steps that I followed:

# validity is 30 times the base (30*9000s = 270000s)
oc create -n openshift-config configmap unsupported-cert-rotation-config --from-literal='base=9000s'

# forcing rotation
oc get secret -A -o json | jq -r '.items[] | select(.metadata.annotations."auth.openshift.io/certificate-not-after" | .!=null and fromdateiso8601<='$( date --date='+1year' +%s )') | "-n \(.metadata.namespace) \(.metadata.name)"' | xargs -n3 oc patch secret -p='{"metadata": {"annotations": {"auth.openshift.io/certificate-not-after": null}}}'

# Wait ~ 5-10 minutes

# Make sure at least the apiserver serving cert has 15 min validity (change your cluster name based on your kubeconfig)
openssl s_client -connect api.tnozicka-1.devcluster.openshift.com:6443 | openssl x509 -noout -dates
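
As a sanity check on the arithmetic in the steps above, here is a minimal sketch that computes the expected validity for a given base. The "30 times the base" multiplier is taken from the comment in the steps, not from the operator source, and expected_validity_seconds is a hypothetical helper name.

```shell
# Sketch only: the multiplier of 30 comes from the comment in the steps above;
# expected_validity_seconds is a hypothetical helper, not an oc subcommand.
expected_validity_seconds() {
  local base="${1%s}"        # accept "9000s" or "9000"
  echo $(( 30 * base ))
}

expected_validity_seconds 9000s   # prints 270000 (75 hours)
expected_validity_seconds 30s     # prints 900 (15 minutes)
```

Under that rule, base=9000s gives roughly 75 hours of validity, while a 15-minute validity like the one mentioned in the verification step would correspond to base=30s.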

Actual output I got:
$ ./oc create -n openshift-config configmap unsupported-cert-rotation-config --from-literal='base=9000s'
configmap/unsupported-cert-rotation-config created

$ ./oc get secret -A -o json | jq -r '.items[] | select(.metadata.annotations."auth.openshift.io/certificate-not-after" | .!=null and fromdateiso8601<='$( date --date='+1year' +%s )') | "-n \(.metadata.namespace) \(.metadata.name)"' | xargs -n3 ./oc patch secret -p='{"metadata": {"annotations": {"auth.openshift.io/certificate-not-after": null}}}'
secret/kube-controller-manager-client-cert-key patched
secret/kube-scheduler-client-cert-key patched
secret/aggregator-client-signer patched
secret/kube-apiserver-to-kubelet-signer patched
secret/kube-control-plane-signer patched
secret/aggregator-client patched
secret/external-loadbalancer-serving-certkey patched
secret/internal-loadbalancer-serving-certkey patched
secret/kube-apiserver-cert-syncer-client-cert-key patched
secret/kube-apiserver-cert-syncer-client-cert-key-2 patched
secret/kube-apiserver-cert-syncer-client-cert-key-3 patched
secret/kube-apiserver-cert-syncer-client-cert-key-4 patched
secret/kube-apiserver-cert-syncer-client-cert-key-5 patched
secret/kube-apiserver-cert-syncer-client-cert-key-6 patched
secret/kubelet-client patched
secret/kubelet-client-2 patched
secret/kubelet-client-3 patched
secret/kubelet-client-4 patched
secret/kubelet-client-5 patched
secret/kubelet-client-6 patched
secret/localhost-serving-cert-certkey patched
secret/service-network-serving-certkey patched
secret/csr-signer patched
secret/csr-signer-signer patched
secret/kube-controller-manager-client-cert-key patched
secret/kube-controller-manager-client-cert-key-2 patched
secret/kube-controller-manager-client-cert-key-3 patched
secret/kube-controller-manager-client-cert-key-4 patched
secret/kube-controller-manager-client-cert-key-5 patched
secret/kube-scheduler-client-cert-key patched
secret/kube-scheduler-client-cert-key-2 patched
secret/kube-scheduler-client-cert-key-3 patched
secret/kube-scheduler-client-cert-key-4 patched
secret/kube-scheduler-client-cert-key-5 patched

# Inside the VM that the installer created, certs are only valid for 1 day
[core@crc-kmrrq-master-0 ~]$ sudo su
[root@crc-kmrrq-master-0 core]# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -dates
notBefore=May 31 11:14:00 2019 GMT
notAfter=Jun  1 11:05:06 2019 GMT
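
The validity window implied by the two dates above can be computed rather than read by eye. This is a minimal sketch, assuming GNU date (for -d parsing of the openssl date format); validity_seconds is a hypothetical helper name.

```shell
# Sketch only: assumes GNU date; validity_seconds is a hypothetical helper,
# not part of openssl or oc.
validity_seconds() {
  local not_before not_after
  not_before=$(date -u -d "$1" +%s)   # notBefore as epoch seconds
  not_after=$(date -u -d "$2" +%s)    # notAfter as epoch seconds
  echo $(( not_after - not_before ))
}

secs=$(validity_seconds "May 31 11:14:00 2019 GMT" "Jun  1 11:05:06 2019 GMT")
echo "${secs}s = $(( secs / 3600 ))h $(( secs % 3600 / 60 ))m"
# prints: 85866s = 23h 51m
```

The result is just under 24 hours, which matches the "valid for 1 day" observation for the kubelet client certificate.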


Comment 14 errata-xmlrpc 2019-06-04 10:46:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.


Comment 17 zhou ying 2020-02-28 02:39:20 UTC
Hi Praveen:

Yes, so far we only have the manual process, but the devs are coding the automation tool.
