Bug 1724189

Summary: Nodes in Not Ready state after adding API named certificate as per https://docs.openshift.com/container-platform/4.1/authentication/certificates/api-server.html
Product: OpenShift Container Platform
Reporter: Miheer Salunke <misalunk>
Component: kube-apiserver
Assignee: David Eads <deads>
Status: CLOSED ERRATA
QA Contact: Xingxing Xia <xxia>
Severity: high
Priority: unspecified
Version: 4.1.0
CC: agawand, aos-bugs, deads, gblomqui, jokerman, mfojtik, mfuruta, mirollin, mkim, mmccomas, openshift-bugs-escalate, rphillips, rushil, sanchezl, scheng, sjenning, xxia
Target Release: 4.2.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2019-10-16 06:32:41 UTC
Type: Bug
Bug Blocks: 1728877    

Comment 4 Luis Sanchez 2019-06-28 12:54:34 UTC
I need more details.
At minimum, please provide the certificate details (e.g. openssl x509 -in certificate.crt -text -noout) and the contents of apiserver/cluster (oc describe apiserver cluster).

Comment 5 Luis Sanchez 2019-06-28 14:32:30 UTC
Please provide the output of must-gather.

Using the openshift-must-gather binary:

  openshift-must-gather inspect clusteroperator/kube-apiserver

Or, using the must-gather image, e.g.:

  $ export KUBECONFIG=/path/to/kubeconfig
  $ export image=quay.io/openshift/origin-must-gather:latest # for example
  $ output_dir="${PWD}/must-gather.$(date --utc +%Y%m%d_%H%M%SZ)"
  $ mkdir -p "${output_dir}"
  $ docker run --rm --interactive --tty \
    --volume="${KUBECONFIG}:/root/.kube/config:z" \
    --volume="${output_dir}:/must-gather:z" \
    --workdir=/ \
    "${image}" \
    openshift-must-gather inspect clusteroperator/kube-apiserver

Comment 11 David Eads 2019-07-08 11:58:07 UTC
The node team can help get your kubelets honoring both old and new serving certs so that you can run pods to collect your data.

Node team: see comment 6 for what happened: https://bugzilla.redhat.com/show_bug.cgi?id=1724189#c6 . We have delivered code to stop people from doing this in the future, but the master kubelets need to trust the serving cert that is currently set so that they can read pods, which is required to roll out a new revision.

Comment 14 Ryan Phillips 2019-07-23 22:02:22 UTC
Restoring the control plane should be possible via the documented steps in [1]. Have these steps been attempted?

1. https://docs.openshift.com/container-platform/4.1/disaster_recovery/scenario-3-expired-certs.html

Comment 15 Miheer Salunke 2019-07-25 03:38:58 UTC
(In reply to Ryan Phillips from comment #14)
> Restoring the control plane should be possible via the documented steps in
> [1]. Have these steps been attempted?
> 1. https://docs.openshift.com/container-platform/4.1/disaster_recovery/scenario-3-expired-certs.html

No, I don't think so. But is it needed in this case? The certs don't seem to be expired.

I noticed https://docs.openshift.com/container-platform/4.1/authentication/certificates/api-server.html was updated with:
Do not provide a named certificate for the internal load balancer (host name api-int.<cluster_name>.<base_domain>). Doing so will leave your cluster in a degraded state.

If this means the API certificate can be deployed only with the external API hostname and no api-int SAN, that would be an acceptable solution for our use case.
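
To check a certificate for an api-int SAN before deploying it, something like the following could be used. This is only a sketch: has_api_int_san is a hypothetical helper name, certificate.crt is a placeholder path, and the check only looks for an explicit DNS SAN that starts with "api-int.". It does not detect a wildcard SAN that would also match the internal host name.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or openssl).
# Reads the text output of `openssl x509 -text -noout` on stdin and
# exits 0 if any DNS SAN entry starts with "api-int.", non-zero otherwise.
has_api_int_san() {
    grep -Eq 'DNS:api-int\.[^, ]+'
}

# Usage (certificate.crt is a placeholder path):
#   if openssl x509 -in certificate.crt -text -noout | has_api_int_san; then
#       echo "certificate has an api-int SAN; do not use it as a named certificate" >&2
#   fi
```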

Comment 20 Greg Blomquist 2019-08-26 19:08:15 UTC
Cloned (copied actually) to 4.1.z: https://bugzilla.redhat.com/show_bug.cgi?id=1728877

https://github.com/openshift/origin/pull/23297 has been merged into Origin master. Moving to MODIFIED for 4.2.0.

Comment 27 Xingxing Xia 2019-09-16 03:22:43 UTC
Chuan, you can use the root CA from http://file.rdu.redhat.com/~xxia/rootCA/ , or create your own root CA using https://github.com/giantswarm/grumpy/blob/instance_migration/gen_certs.sh#L8-L11 :
openssl genrsa -out certs/ca.key 2048
openssl req -new -x509 -key certs/ca.key -out certs/ca.crt -config certs/ca_config.txt
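
With a root CA like this in place, a test serving certificate for the external API host name could be issued roughly as follows. This is a sketch under assumptions: issue_test_cert is a hypothetical helper, the host name and output prefix are supplied by the caller, the 30-day validity is arbitrary, and the CA files are expected at certs/ca.key and certs/ca.crt as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: issue a short-lived test serving certificate signed
# by the CA in certs/ca.key and certs/ca.crt.
#   $1 = host name to place in the SAN (assumed to be the external API name)
#   $2 = output file prefix (writes PREFIX.key, PREFIX.csr, PREFIX.ext, PREFIX.crt)
issue_test_cert() {
    host=$1
    prefix=$2
    ext_file="${prefix}.ext"

    # Extension file carrying a single DNS SAN for the given host name.
    printf 'subjectAltName=DNS:%s\n' "$host" > "$ext_file"

    # Generate a key, create a CSR, then sign the CSR with the CA.
    openssl genrsa -out "${prefix}.key" 2048 &&
    openssl req -new -key "${prefix}.key" -subj "/CN=${host}" \
        -out "${prefix}.csr" &&
    openssl x509 -req -in "${prefix}.csr" \
        -CA certs/ca.crt -CAkey certs/ca.key -CAcreateserial \
        -days 30 -sha256 -extfile "$ext_file" -out "${prefix}.crt"
}

# Usage (the host name is a placeholder):
#   issue_test_cert api.mycluster.example.com server
#   openssl x509 -in server.crt -noout -text
```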

Comment 31 errata-xmlrpc 2019-10-16 06:32:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.