Bug 1735180 - master bootstrap credentials are not managed
Summary: master bootstrap credentials are not managed
Keywords:
Status: ASSIGNED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks: 1693951
Reported: 2019-07-31 18:56 UTC by David Eads
Modified: 2020-01-08 08:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 4271712 None None None 2019-09-09 20:41:10 UTC

Description David Eads 2019-07-31 18:56:33 UTC
During startup a kubelet...

 1. tries to use its current client-cert.  If that is missing or invalid, it...
 2. uses a bootstrap credential to make a CSR for a new client-cert.

This flow happens on initial startup and when clusters are restarted after being off for an extended period.  On the masters, step 2 fails after one day.
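
The two-step flow above can be sketched roughly as follows. This is a minimal illustration, not the kubelet's actual code: the `Credential` class, the `choose_startup_action` function, the returned strings, and the 30-day client-cert lifetime are all hypothetical. The only figure taken from this report is that step 2 stops working on the masters after one day, modelled here as a bootstrap credential that is valid for one day after install.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Credential:
    """Hypothetical model of a credential with a fixed expiry time."""
    name: str
    not_after: datetime

    def is_valid(self, now: datetime) -> bool:
        return now < self.not_after


def choose_startup_action(client_cert: Optional[Credential],
                          bootstrap: Optional[Credential],
                          now: datetime) -> str:
    """Return what a kubelet would do at startup in this simplified model."""
    # Step 1: use the current client-cert if it exists and is still valid.
    if client_cert is not None and client_cert.is_valid(now):
        return "use-client-cert"
    # Step 2: otherwise use the bootstrap credential to request a new
    # client-cert through a CSR.
    if bootstrap is not None and bootstrap.is_valid(now):
        return "submit-csr-with-bootstrap"
    # Neither credential is usable: the node cannot rejoin on its own.
    return "stuck"


if __name__ == "__main__":
    install = datetime(2019, 7, 31, tzinfo=timezone.utc)
    # One day matches the failure window described in this report.
    bootstrap = Credential("master-bootstrap", install + timedelta(days=1))
    # The 30-day client-cert lifetime is an assumption for illustration only.
    cert = Credential("kubelet-client", install + timedelta(days=30))

    for offset in (timedelta(hours=2), timedelta(days=2), timedelta(days=60)):
        print(offset, choose_startup_action(cert, bootstrap, install + offset))
```

In this model a master that stays off until its client-cert has lapsed has nothing left to fall back on, because the bootstrap credential expired long before.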

Master kubelets do not use the same bootstrap credentials as the rest of the cluster.  Because the bootstrap credential on the masters is initially a client-cert, we cannot extend its lifetime indefinitely: client-certs are not individually revocable.

The master kubelets should be updated sometime after the initial boot to use the same serviceaccount token that the rest of the nodes use.  Now that the MCO doesn't reboot a machine for every update, this should work.

To my knowledge, this is the only reason that clusters cannot be shut down shortly after installation.  It's also the only reason that step 9 exists here: https://docs.openshift.com/container-platform/4.1/disaster_recovery/scenario-3-expired-certs.html

Comment 1 Seth Jennings 2019-07-31 19:42:21 UTC
> Now that the MCO doesn't reboot a machine for every update, this should work.

This would be news to me if the MCD does this now.

If it does not, then we are looking at two reboots per node during install: one for the original pivot and one to apply the changed MC that includes the new bootstrap credentials.

Comment 2 David Eads 2019-07-31 19:53:42 UTC
I misunderstood how the kubelet CA updates were being handled.  If they are rebooting all the machines, I guess you face a similar choice here.

Regardless, this is the only thing I'm aware of that prevents an immediate shutdown of a cluster after installation.

Comment 3 Seth Jennings 2019-08-05 17:09:46 UTC
We really need the MCD to be more feature-rich to make this work.  In particular, we need to be able to reproject files changed in the MC without a reboot.  Rebooting the nodes twice during install is a disruptive change.

For this reason, I'm deferring to 4.3.  I've talked to Antonio and this functionality is a priority for MCO in 4.3.  I'll reference a Jira story tracking the progress when one exists.

Comment 4 Seth Jennings 2019-08-05 17:18:42 UTC
https://jira.coreos.com/browse/PROD-1025

Comment 5 Eric Rich 2019-09-09 20:44:59 UTC
Is this the root cause of bug 1693951?  https://bugzilla.redhat.com/show_bug.cgi?id=1693951

