Bug 1946479 - In k8s 1.21 bump BoundServiceAccountTokenVolume is disabled by default
Summary: In k8s 1.21 bump BoundServiceAccountTokenVolume is disabled by default
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Sergiusz Urbaniak
QA Contact: Ke Wang
URL:
Whiteboard:
Depends On:
Blocks: 1977383
Reported: 2021-04-06 08:35 UTC by Maciej Szulik
Modified: 2021-07-27 22:58 UTC (History)
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1977383
Environment:
Last Closed: 2021-07-27 22:57:48 UTC
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-apiserver-operator pull 1142 | 0 | None | closed | Bug 1946479: inject SA tokens as projected volumes to KAS install and prune pods | 2021-06-10 09:15:43 UTC
Github | openshift cluster-kube-controller-manager-operator pull 531 | 0 | None | closed | Bug 1946479: manifests: use manual service account mounts | 2021-06-09 17:37:26 UTC
Github | openshift cluster-kube-scheduler-operator pull 355 | 0 | None | closed | Bug 1946479: use manual service account tokens | 2021-06-10 13:56:25 UTC
Github | openshift cluster-version-operator pull 585 | 0 | None | closed | Bug 1946479: prevent pod deployment deadlock due to custom SA projected volume injection | 2021-06-10 13:55:40 UTC
Github | openshift kubernetes pull 714 | 0 | None | closed | Bug 1946479: Re-enable BoundServiceAccountTokenVolume disabled by 1.21 rebase | 2021-06-10 13:55:40 UTC
Github | openshift kubernetes pull 786 | 0 | None | closed | Bug 1946479: UPSTREAM: 101950: Make watch order conformance test reliable | 2021-06-10 13:55:42 UTC
Github | openshift library-go pull 1100 | 0 | None | closed | Bug 1946479: pkg/operator/staticpod/controller/installer,pruner: use manual SA mounts | 2021-06-10 13:55:42 UTC
Github | openshift origin pull 26184 | 0 | None | closed | Bug 1946479: Skip cluster quota test while bound token projected volume is being re-enabled | 2021-06-10 13:55:43 UTC
Github | openshift origin pull 26200 | 0 | None | closed | Bug 1946479: Vendor watch consistency fix | 2021-06-08 17:29:54 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:58:04 UTC

Description Maciej Szulik 2021-04-06 08:35:46 UTC
In https://github.com/openshift/kubernetes/pull/641, which brings in k8s 1.21, I disabled the "BoundServiceAccountTokenVolume" feature
because it is causing some test failures. We need to investigate this functionality further before enabling it by default.

Comment 1 Maciej Szulik 2021-04-06 08:38:23 UTC
Origin PR disabling this test is https://github.com/openshift/origin/pull/26047

Comment 2 Maciej Szulik 2021-04-06 08:47:10 UTC
Detailed failure in https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_kubernetes/641/pull-ci-openshift-kubernetes-master-e2e-aws-serial/1377643088377286656


STEP: looking up the openshift registry URL
Apr  1 17:25:20.651: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 registry info'
STEP: obtaining bearer token for the test user
Apr  1 17:25:20.764: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 whoami -t'
STEP: granting the image-signer role to test user
Apr  1 17:25:20.829: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/kubeconfig-076331212 adm policy add-cluster-role-to-user system:image-signer e2e-test-registry-signing-r4vfg-user'
STEP: granting the anyuid scc to test user
Apr  1 17:25:21.029: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/kubeconfig-076331212 adm policy add-scc-to-user anyuid e2e-test-registry-signing-r4vfg-user'
STEP: preparing the image stream where the signed image will be pushed
Apr  1 17:25:21.184: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 create imagestream signed'
STEP: granting the image-auditor role to test user
Apr  1 17:25:21.327: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/kubeconfig-076331212 adm policy add-cluster-role-to-user system:image-auditor e2e-test-registry-signing-r4vfg-user'
Apr  1 17:25:21.537: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 run sign-and-push --labels name=sign-and-push --image image-registry.openshift-image-registry.svc:5000/e2e-test-registry-signing-r4vfg/signer:latest --restart Never --command -- /bin/bash -c sleep infinity'
Apr  1 17:25:40.719: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 cp /usr/bin/oc sign-and-push:/usr/bin/oc'
STEP: creating dummy GPG key
Apr  1 17:25:41.739: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 exec sign-and-push -- /bin/bash -c rm -f /dev/random; ln -sf /dev/urandom /dev/random && GNUPGHOME=/var/lib/origin/gnupg gpg2 --batch --gen-key dummy_key.conf'
STEP: logging as a test user
Apr  1 17:25:42.354: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 exec sign-and-push -- /bin/bash -c oc login https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT --token=sha256~4JhGL15vT82WsuC4p9sLDA --certificate-authority=/run/secrets/kubernetes.io/serviceaccount/ca.crt'
STEP: signing a just-built image and pushing it into openshift registry
Apr  1 17:25:42.776: INFO: Running 'oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 exec sign-and-push -- /bin/bash -c GNUPGHOME=/var/lib/origin/gnupg skopeo --debug --registries.d /this/does/not/exist copy --sign-by joe@foo.bar --src-creds=e2e-test-registry-signing-r4vfg-user:sha256~4JhGL15vT82WsuC4p9sLDA --dest-creds=e2e-test-registry-signing-r4vfg-user:sha256~4JhGL15vT82WsuC4p9sLDA --src-cert-dir=/run/secrets/kubernetes.io/serviceaccount --dest-cert-dir=/run/secrets/kubernetes.io/serviceaccount docker://image-registry.openshift-image-registry.svc:5000/e2e-test-registry-signing-r4vfg/signer:latest docker://image-registry.openshift-image-registry.svc:5000/e2e-test-registry-signing-r4vfg/signed:latest'
Apr  1 17:25:43.194: INFO: Error running /usr/bin/oc --namespace=e2e-test-registry-signing-r4vfg --kubeconfig=/tmp/configfile312401163 exec sign-and-push -- /bin/bash -c GNUPGHOME=/var/lib/origin/gnupg skopeo --debug --registries.d /this/does/not/exist copy --sign-by joe@foo.bar --src-creds=e2e-test-registry-signing-r4vfg-user:sha256~4JhGL15vT82WsuC4p9sLDA --dest-creds=e2e-test-registry-signing-r4vfg-user:sha256~4JhGL15vT82WsuC4p9sLDA --src-cert-dir=/run/secrets/kubernetes.io/serviceaccount --dest-cert-dir=/run/secrets/kubernetes.io/serviceaccount docker://image-registry.openshift-image-registry.svc:5000/e2e-test-registry-signing-r4vfg/signer:latest docker://image-registry.openshift-image-registry.svc:5000/e2e-test-registry-signing-r4vfg/signed:latest:
StdOut>

Comment 3 Maru Newby 2021-04-29 22:18:57 UTC
BoundServiceAccountTokenVolume replaces 

 - the creation of a token secret for each service account and the mounting of that secret for pods running under a given service account

with

 - the addition of a projected volume with a name prefix of 'kube-api-access-' that will mount a bound token sourced from the TokenRequest api


Reference: https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume
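
To make the replacement concrete, here is a minimal Python sketch of the volume shape the feature adds to a pod, as I understand it from the upstream reference above. The function names are hypothetical, the random suffix generator is a stand-in for whatever the admission plugin really uses, and the 3607-second expiration and 0644 mode are the upstream defaults as I recall them, so treat them as assumptions.

```python
import secrets


def bound_token_volume(expiration_seconds=3607):
    """Return a pod volume shaped like the one injected for bound tokens.

    Assumption: field values mirror the upstream documentation; the suffix
    generator below is illustrative, not the real one.
    """
    suffix = secrets.token_hex(3)[:5]
    return {
        "name": "kube-api-access-" + suffix,
        "projected": {
            "defaultMode": 0o644,
            "sources": [
                # Bound token obtained through the TokenRequest API.
                {"serviceAccountToken": {"path": "token",
                                         "expirationSeconds": expiration_seconds}},
                # Cluster root CA published as a configmap in each namespace.
                {"configMap": {"name": "kube-root-ca.crt",
                               "items": [{"key": "ca.crt", "path": "ca.crt"}]}},
                # Namespace of the pod via the downward API.
                {"downwardAPI": {"items": [{
                    "path": "namespace",
                    "fieldRef": {"apiVersion": "v1",
                                 "fieldPath": "metadata.namespace"}}]}},
            ],
        },
    }


def mounted_files(volume):
    """List the file names the projected volume would place in the mount."""
    files = []
    for source in volume["projected"]["sources"]:
        for spec in source.values():
            if "path" in spec:
                files.append(spec["path"])
            files.extend(item["path"] for item in spec.get("items", []))
    return sorted(files)
```

Calling mounted_files(bound_token_volume()) lists only ca.crt, namespace and token, so anything else that used to ride along in the legacy token secret is not part of this shape.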


The move to bound tokens represents a security improvement. Unlike legacy tokens, bound tokens are time-limited and audience-bound so that the scope of potential compromise is limited. Scaling is also improved, since bound tokens can be verified without api persistence.
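
As an illustration of "time-limited and audience-bound", the following Python sketch decodes the claims of a JWT-style token and checks the exp and aud claims. It deliberately does NOT verify the signature, so it is only a way to inspect a token, not to validate one. The helper names and the audience value used in the example are hypothetical.

```python
import base64
import json
import time


def decode_claims(token):
    """Decode the JWT payload WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def within_bounds(claims, audience, now=None):
    """True if the token has not expired and names the given audience."""
    now = time.time() if now is None else now
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        aud = [aud]
    return claims.get("exp", 0) > now and audience in aud


def make_unsigned_token(claims):
    """Build a structurally valid, unsigned token for illustration only."""
    def enc(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([enc({"alg": "none"}), enc(claims), ""])


# Example with a made-up audience and fixed timestamps.
example = make_unsigned_token(
    {"aud": ["https://kubernetes.default.svc"], "iat": 1000, "exp": 2000})
print(within_bounds(decode_claims(example),
                    "https://kubernetes.default.svc", now=1500))
```

A real verifier must also check the signature against the issuer's keys; the point here is only that expiry and audience travel inside the token itself.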

The switch causes problems for OpenShift, though, since some components rely on the service serving ca bundle being available at `/run/secrets/kubernetes.io/serviceaccount/service-ca.crt`. This file is present for legacy tokens because OpenShift's forked controller-manager adds the service serving ca to each service account token secret:

https://github.com/openshift/kubernetes/commit/095575f8679

Removal of the service serving ca from token secrets was attempted in the 4.6 timeframe. However, it was realized that there was no effective way of determining (let alone mediating) the customer impact of this kind of breaking change. Accordingly, the service serving ca bundle needs to continue to be mounted in all pods at `/run/secrets/kubernetes.io/serviceaccount/service-ca.crt`.
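
A rough way to see the impact from inside a pod is to check which of the expected files are present in the service account mount. The Python sketch below is only illustrative: the helper names are hypothetical, and the expected list combines the three standard files with the service-ca.crt path mentioned above. The demo builds a temporary directory that mimics a mount lacking the service serving ca.

```python
import os
import tempfile

SA_DIR = "/run/secrets/kubernetes.io/serviceaccount"
EXPECTED = ("token", "ca.crt", "namespace", "service-ca.crt")


def missing_files(mount_dir=SA_DIR, expected=EXPECTED):
    """Return the expected file names that are absent from mount_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(mount_dir, name))]


def demo():
    """Simulate a mount that lacks service-ca.crt and report what is missing."""
    with tempfile.TemporaryDirectory() as mount_dir:
        for name in ("token", "ca.crt", "namespace"):
            with open(os.path.join(mount_dir, name), "w") as handle:
                handle.write("placeholder\n")
        return missing_files(mount_dir)


print(demo())
```

Running missing_files() with no arguments inside a pod checks the real mount path instead of the simulated one.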

Potential fixes:

 - patch the kubelet to mount the service serving ca bundle for the volumes resulting from the BoundServiceAccountTokenVolume feature (current approach of https://github.com/openshift/kubernetes/pull/714)
 - copy the service ca bundle configmap to every namespace and configure their mounting in volumes resulting from the BoundServiceAccountTokenVolume feature
   - the configmap must exist in every namespace because projected volumes only support local object references
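
The second option can be sketched in Python as adding one more configmap source to the projected volume. This is a sketch under stated assumptions, not the implementation in any of the linked PRs: the function names are hypothetical, and the configmap name openshift-service-ca.crt is an assumed name for the per-namespace copy of the bundle.

```python
import copy


def sample_volume():
    """A trimmed stand-in for an injected kube-api-access-* volume."""
    return {
        "name": "kube-api-access-xxxxx",  # placeholder suffix
        "projected": {"sources": [
            {"serviceAccountToken": {"path": "token"}},
            {"configMap": {"name": "kube-root-ca.crt",
                           "items": [{"key": "ca.crt", "path": "ca.crt"}]}},
            {"downwardAPI": {"items": [{
                "path": "namespace",
                "fieldRef": {"apiVersion": "v1",
                             "fieldPath": "metadata.namespace"}}]}},
        ]},
    }


def with_service_ca(volume, configmap_name="openshift-service-ca.crt"):
    """Return a copy of the volume that also projects service-ca.crt.

    Assumption: configmap_name refers to a configmap in the pod's own
    namespace, since projected volumes only take local object references.
    """
    patched = copy.deepcopy(volume)
    patched["projected"]["sources"].append({
        "configMap": {
            "name": configmap_name,
            "items": [{"key": "service-ca.crt", "path": "service-ca.crt"}],
        }
    })
    return patched
```

Because the source is a plain configmap reference, the pod only starts cleanly if that configmap already exists in its namespace, which is why the copy to every namespace is part of this option.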

Comment 5 Sergiusz Urbaniak 2021-06-11 12:01:09 UTC
Unfortunately, it turns out this needs more work than anticipated; see https://bugzilla.redhat.com/show_bug.cgi?id=1970828.

Tentatively unsetting blocker because:

1. We want to reintroduce injection of the service-ca bundle into service account tokens (3.x behavior). This has not been the case so far in 4.x.
2. We want to be prepared for 1.22, which will rely on the BoundServiceAccountTokenVolume feature being enabled by default.

This leaves a window where we can properly introduce necessary changes in oauth-server.

Comment 6 Sergiusz Urbaniak 2021-06-11 15:05:40 UTC
Resetting the target to 4.8.0, as we do want BoundServiceAccountTokenVolume enabled in 1.21 already. https://github.com/openshift/oauth-server/pull/80 (tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1970828) should fix the oauth-server issues.

Comment 8 Wally 2021-06-11 18:33:43 UTC
Moving back to VERIFIED after opening https://bugzilla.redhat.com/show_bug.cgi?id=1971052 for test re-enablement.

Comment 10 Dan Kenigsberg 2021-06-30 09:59:29 UTC
Bug 1977179 in HostPathProvisioner (part of OpenShift Virtualization) was tickled by this late change. I do not know if many more components or user payloads are going to be similarly affected, but please allow more time and testing before backporting to 4.7.

Comment 13 errata-xmlrpc 2021-07-27 22:57:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

