Bug 2115527
Summary: | ServiceAccounts PATCH noise leads to Secret leakage | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | W. Trevor King <wking> |
Component: | Monitoring | Assignee: | Joao Marcal <jmarcal> |
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
Severity: | high | Docs Contact: | Brian Burt <bburt> |
Priority: | high | ||
Version: | 4.12 | CC: | anpicker, bburt, bluddy, hongkliu, jmarcal, juzhao, tremes, xxia |
Target Milestone: | --- | ||
Target Release: | 4.12.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | |
Clone Of: | 2095719 | Environment: | |
Last Closed: | 2023-01-17 19:54:29 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
W. Trevor King
2022-08-04 21:12:28 UTC
Observed this sequence of updates to the thanos-querier ServiceAccount (picked arbitrarily): $ kubectl get -n openshift-monitoring serviceaccount thanos-querier -o yaml -w apiVersion: v1 imagePullSecrets: - name: thanos-querier-dockercfg-rklg9 kind: ServiceAccount metadata: annotations: serviceaccounts.openshift.io/oauth-redirectreference.thanos-querier: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"thanos-querier"}}' creationTimestamp: "2022-08-04T20:59:16Z" labels: app.kubernetes.io/component: query-layer app.kubernetes.io/instance: thanos-querier app.kubernetes.io/name: thanos-query app.kubernetes.io/part-of: openshift-monitoring app.kubernetes.io/version: 0.27.0 name: thanos-querier namespace: openshift-monitoring resourceVersion: "28720" uid: 69268a0f-12b0-4b35-8512-91880e8cf819 secrets: - name: thanos-querier-dockercfg-rklg9 --- apiVersion: v1 kind: ServiceAccount metadata: annotations: serviceaccounts.openshift.io/oauth-redirectreference.thanos-querier: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"thanos-querier"}}' creationTimestamp: "2022-08-04T20:59:16Z" labels: app.kubernetes.io/component: query-layer app.kubernetes.io/instance: thanos-querier app.kubernetes.io/name: thanos-query app.kubernetes.io/part-of: openshift-monitoring app.kubernetes.io/version: 0.27.0 name: thanos-querier namespace: openshift-monitoring resourceVersion: "30882" uid: 69268a0f-12b0-4b35-8512-91880e8cf819 --- apiVersion: v1 kind: ServiceAccount metadata: annotations: openshift.io/create-dockercfg-secrets.pending-token: thanos-querier-token-fhw5x serviceaccounts.openshift.io/oauth-redirectreference.thanos-querier: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"thanos-querier"}}' creationTimestamp: "2022-08-04T20:59:16Z" labels: app.kubernetes.io/component: query-layer app.kubernetes.io/instance: thanos-querier app.kubernetes.io/name: thanos-query app.kubernetes.io/part-of: openshift-monitoring app.kubernetes.io/version: 0.27.0 name: thanos-querier namespace: openshift-monitoring resourceVersion: "30883" uid: 69268a0f-12b0-4b35-8512-91880e8cf819 --- apiVersion: v1 imagePullSecrets: - name: thanos-querier-dockercfg-lx4bt kind: ServiceAccount metadata: annotations: serviceaccounts.openshift.io/oauth-redirectreference.thanos-querier: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"thanos-querier"}}' creationTimestamp: "2022-08-04T20:59:16Z" labels: app.kubernetes.io/component: query-layer app.kubernetes.io/instance: thanos-querier app.kubernetes.io/name: thanos-query app.kubernetes.io/part-of: openshift-monitoring app.kubernetes.io/version: 0.27.0 name: thanos-querier namespace: openshift-monitoring resourceVersion: "30891" uid: 69268a0f-12b0-4b35-8512-91880e8cf819 secrets: - name: thanos-querier-dockercfg-lx4bt Effectively: 1) Initially, secrets and imagePullSecrets are both populated with the names of existing secrets. 2) Both fields are wiped out by an update. 3) The dockercfg secret controller in openshift-controller-manager sees the update and decides it needs to act because the serviceaccount doesn't have appropriate items in its secrets and imagePullSecrets lists. It annotates the serviceaccount with "openshift.io/create-dockercfg-secrets.pending-token" to indicate that it has generated a new token name. 4) The same controller creates two secrets and adds their names back to the serviceaccount. And here is the Secret leak showing up in the presubmit on the original pull request: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-monitoring-operator/1704/pull-ci-openshift-cluster-monitoring-operator-master-e2e-agnostic-upgrade/1544678570972942336/artifacts/e2e-agnostic-upgrade/gather-must-gather/artifacts/must-gather.tar | tar xOz registry-build03-ci-openshift-org-ci-op-s076v2wy-stable-sha256-662ef5efb9a8489a5283d1f94cbf470f763da37d4b284e30efdbc035fa5a6b1c/namespaces/openshift-monitoring/core/secrets.yaml | yaml2json | jq -r '.items[].metadata | .name + " " + .creationTimestamp' | sort ... telemeter-client-token-xh5tp 2022-07-06T15:01:09Z telemeter-client-token-xj6tr 2022-07-06T14:14:36Z thanos-querier-dockercfg-b6dmf 2022-07-06T15:15:42Z thanos-querier-dockercfg-d8fbm 2022-07-06T15:01:09Z thanos-querier-dockercfg-l7dt4 2022-07-06T14:15:37Z thanos-querier-dockercfg-t92dm 2022-07-06T15:24:19Z thanos-querier-dockercfg-vm6c2 2022-07-06T15:23:46Z thanos-querier-dockercfg-w2m2n 2022-07-06T15:14:54Z thanos-querier-grpc-tls-r8qd6n80gdn2 2022-07-06T14:15:38Z thanos-querier-kube-rbac-proxy 2022-07-06T14:15:36Z thanos-querier-kube-rbac-proxy-metrics 2022-07-06T14:15:36Z thanos-querier-kube-rbac-proxy-rules 2022-07-06T14:15:36Z thanos-querier-oauth-cookie 2022-07-06T14:15:36Z thanos-querier-tls 2022-07-06T14:13:27Z thanos-querier-token-d68sk 2022-07-06T15:24:19Z thanos-querier-token-gbwng 2022-07-06T15:01:09Z thanos-querier-token-ksczb 2022-07-06T15:23:46Z thanos-querier-token-m26kv 2022-07-06T14:15:36Z thanos-querier-token-sfjh6 2022-07-06T15:15:42Z thanos-querier-token-vj57h 2022-07-06T15:14:53Z I'm setting this high/high and blocker+, because leaking multiple Secrets into openshift-monitoring in batches every 15m or more frequently in ways that don't get garbage collected puts all 4.12 clusters at risk until we fix the leak, possibly by reverting [1]. [1]: https://github.com/openshift/cluster-monitoring-operator/pull/1704 tested with 4.12.0-0.nightly-2022-08-09-114621, no Secret leak now # kubectl get -n openshift-monitoring serviceaccount thanos-querier -o yaml -w apiVersion: v1 imagePullSecrets: - name: thanos-querier-dockercfg-rn7qz kind: ServiceAccount metadata: annotations: serviceaccounts.openshift.io/oauth-redirectreference.thanos-querier: '{"kind":"OAuthRedirectReference","apiVersion":"v1","reference":{"kind":"Route","name":"thanos-querier"}}' creationTimestamp: "2022-08-10T01:06:18Z" labels: app.kubernetes.io/component: query-layer app.kubernetes.io/instance: thanos-querier app.kubernetes.io/name: thanos-query app.kubernetes.io/part-of: openshift-monitoring app.kubernetes.io/version: 0.27.0 name: thanos-querier namespace: openshift-monitoring resourceVersion: "23367" uid: 2e6aeeb8-58de-4e85-b92b-7433442d3c78 secrets: - name: thanos-querier-dockercfg-rn7qz and # cat must-gather.local.3853711220397963982/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-9ac372d87be122862b54104f11de19bf6d23688c3c382546c3767c8677539ba6/namespaces/openshift-monitoring/core/secrets.yaml | yaml2json | jq -r '.items[].metadata | .name + " " + .creationTimestamp' | sort ..... telemeter-client 2022-08-10T01:07:12Z telemeter-client-dockercfg-q86jb 2022-08-10T01:08:50Z telemeter-client-kube-rbac-proxy-config 2022-08-10T01:07:12Z telemeter-client-tls 2022-08-10T01:07:12Z telemeter-client-token-72gz2 2022-08-10T01:08:50Z thanos-querier-dockercfg-rn7qz 2022-08-10T01:08:50Z thanos-querier-grpc-tls-ak3b6m82k2aqf 2022-08-10T01:06:20Z thanos-querier-kube-rbac-proxy 2022-08-10T01:06:17Z thanos-querier-kube-rbac-proxy-metrics 2022-08-10T01:06:17Z thanos-querier-kube-rbac-proxy-rules 2022-08-10T01:06:17Z thanos-querier-oauth-cookie 2022-08-10T01:06:17Z thanos-querier-tls 2022-08-10T01:06:18Z thanos-querier-token-6nmjg 2022-08-10T01:08:50Z *** Bug 2115376 has been marked as a duplicate of this bug. *** The issue is fixed in oc --context build02 get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.12.0-ec.1 True False 18h Cluster version is 4.12.0-ec.1 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399 |