Bug 1813894

Summary: stop adding service-ca to token secret in 4.6
Product: OpenShift Container Platform
Reporter: David Eads <deads>
Component: kube-apiserver
Assignee: Maru Newby <mnewby>
Status: CLOSED DEFERRED
QA Contact: scheng
Severity: high
Docs Contact:
Priority: high
Version: 4.5
CC: aos-bugs, mfojtik, pruan, sttts, tflannag, xxia
Target Milestone: ---
Keywords: UpcomingSprint
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Removed functionality
Doc Text:
The service-serving CA is no longer available in pods at /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt. This file has been deprecated since 4.1. Pods that currently consume the service-serving CA bundle from /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt must migrate to obtaining the CA bundle from a configMap annotated with service.beta.openshift.io/inject-cabundle=true.
Story Points: ---
Clone Of:
: 1843949
Environment:
Last Closed: 2020-07-31 16:38:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
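
For the migration described in the Doc Text above, a minimal sketch of the replacement ConfigMap follows. Only the annotation comes from this bug; the helper name, ConfigMap name, and namespace are hypothetical. The sketch assumes the service CA operator then injects the bundle into the annotated ConfigMap under the data key service-ca.crt.

```python
import json

# Annotation named in the Doc Text of this bug.
INJECT_ANNOTATION = "service.beta.openshift.io/inject-cabundle"


def ca_bundle_configmap(name: str, namespace: str) -> dict:
    """Build a ConfigMap manifest that requests CA bundle injection.

    No data is set here: the bundle is assumed to be filled in by the
    service CA operator once the annotated ConfigMap exists.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {INJECT_ANNOTATION: "true"},
        },
    }


if __name__ == "__main__":
    # Hypothetical name and namespace, for illustration only.
    manifest = ca_bundle_configmap("service-ca-bundle", "my-namespace")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be piped to `oc apply -f -`; the pod would then mount this ConfigMap as a volume instead of reading the file from the service account directory.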

Description David Eads 2020-03-16 12:41:09 UTC
With the goal of removing https://github.com/openshift/kubernetes/pull/116/commits/66d4751e4f866a9e51386eaac93bbdb3537f4813 in 4.6, the plan is:

1. Find the initial deprecation notice in the docs.
2. Make adding the service-ca off by default, with some ugly wiring (probably an env var) to turn it back on.
2.5. Write a controller in the operator that removes the service-ca from all secrets.
3. Create a new field in kcm.operator.openshift.io named `enableDeprecatedAndRemovedServiceCAKeyUntilNextRelease_ThisMakesClusterImpossibleToUpgrade`. The name is abusive and clear. People who set it should be very aware and not call us.
4. If the field is set, set the env var and mark the cluster upgradeable==false.

In 4.6, we can remove the code entirely because no one can be relying on it.
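
Steps 2.5 and 4 can be sketched as two pure functions. This is an illustration under stated assumptions, not the operator's actual code: the env var name and both function names are hypothetical, and the sketch assumes the controller strips secrets only while the deprecated field is unset.

```python
from typing import Dict, Tuple

# Key under which the service-serving CA appears in a token secret.
SERVICE_CA_KEY = "service-ca.crt"

# Hypothetical env var name; the bug only says "probably env var".
ENV_VAR = "ENABLE_DEPRECATED_SERVICE_CA_KEY"


def strip_service_ca(secret_data: Dict[str, str]) -> Tuple[Dict[str, str], bool]:
    """Step 2.5: return the secret data without the service-ca key.

    The second element reports whether anything was removed, so a
    controller could skip the update call when nothing changed.
    """
    if SERVICE_CA_KEY not in secret_data:
        return secret_data, False
    cleaned = {k: v for k, v in secret_data.items() if k != SERVICE_CA_KEY}
    return cleaned, True


def desired_state(deprecated_field_set: bool) -> dict:
    """Step 4: map the operator field to env wiring and upgradeability."""
    return {
        # Turn the old behavior back on only when the field is set.
        "env": {ENV_VAR: "true"} if deprecated_field_set else {},
        # Setting the field marks the cluster upgradeable==false.
        "upgradeable": not deprecated_field_set,
        # Assumption: secrets are stripped only while the field is unset.
        "strip_secrets": not deprecated_field_set,
    }
```

Keeping both steps free of API calls makes the decision logic easy to test; a real controller would wrap them in watch and update calls against the cluster.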

Comment 1 Maru Newby 2020-03-24 02:30:59 UTC
Corrected commit targeted for removal: https://github.com/openshift/kubernetes/commit/46562f3b5e34287b6ef79b92e54d9bee78ab735d

Comment 3 Stefan Schimanski 2020-05-06 11:03:47 UTC
*** Bug 1813892 has been marked as a duplicate of this bug. ***

Comment 4 Maru Newby 2020-05-07 22:46:46 UTC
A PR to remove the code has been posted; its merge should be deferred until 4.6:

https://github.com/openshift/origin/pull/24393

Comment 5 Maru Newby 2020-05-20 14:14:02 UTC
Still waiting on the updates to the following operators:

- openshift/cluster-kube-controller-manager-operator (submitted but blocked by persistent and unrelated test flake)
- openshift/cluster-samples-operator (coordinating with maintainers of jboss-container-images to get required upstream changes merged)

Comment 6 Maru Newby 2020-05-29 21:22:06 UTC
The cluster-samples-operator fix is still in progress, but testing the change is now possible.

The change can be tested by creating a pod in a 4.5 cluster and verifying the absence of the service-serving CA in the pod filesystem:

/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt

Comment 7 Maru Newby 2020-06-02 14:50:55 UTC
The openshift/library PR has merged: 

https://github.com/openshift/library/pull/219

Tomorrow (June 3rd), once a nightly job has made the necessary updates to the branch, cluster-samples-operator will be able to vendor the change. This vendoring change will need to merge to master and then be backported to 4.5.

Comment 8 Maru Newby 2020-06-08 02:35:22 UTC
The change can be tested by creating a pod in a 4.5 cluster and verifying the absence of the service-serving CA in service account token secrets and on the pod filesystem:

/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
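
A minimal sketch of the two checks described above, assuming the script runs inside the test pod and the secret's data has already been fetched as a dictionary. The function names and the `root` parameter are hypothetical; `root` exists only so the check can be exercised outside a pod.

```python
import os
import tempfile

# Path from this bug, relative to the filesystem root.
SERVICE_CA_FILE = "var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"


def service_ca_in_pod_filesystem(root: str = "/") -> bool:
    """Return True if the service-serving CA file exists under root."""
    return os.path.exists(os.path.join(root, SERVICE_CA_FILE))


def service_ca_in_token_secret(secret_data: dict) -> bool:
    """Return True if the token secret still carries the service-ca key."""
    return "service-ca.crt" in secret_data


if __name__ == "__main__":
    # Exercise the filesystem check against a throwaway directory tree.
    with tempfile.TemporaryDirectory() as root:
        print("empty tree:", service_ca_in_pod_filesystem(root))
        path = os.path.join(root, SERVICE_CA_FILE)
        os.makedirs(os.path.dirname(path))
        with open(path, "w") as handle:
            handle.write("placeholder\n")
        print("after creating the file:", service_ca_in_pod_filesystem(root))
```

With the change in place, both functions should return False for a newly created pod and its service account token secret.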

Comment 12 Maru Newby 2020-06-09 16:48:41 UTC
Per https://bugzilla.redhat.com/show_bug.cgi?id=1845188, this is a backwards-incompatible change that was insufficiently communicated. Deferring to 4.6; even then, making this change will depend on being able to avoid breaking customer workloads.

Comment 13 Maru Newby 2020-06-18 14:27:27 UTC
This change is already present in master (it was reverted for release-4.5) and awaits QA verification.

Comment 18 Maru Newby 2020-07-31 16:38:41 UTC
Given the lack of visibility into its customer impact, this change will not appear in 4.6. I dropped the removal PR as part of the 1.19 rebase.

I'm closing for now, and we can re-open if/when we can justify the time and energy required to keep this change from negatively impacting customers.