Description of problem:

Found a large number of secrets in openshift-cluster-node-tuning-operator (the cluster has been serving for about 4 days).

# for i in `oc get project -o jsonpath='{.items..metadata.name}'`; do echo -n -e "Project: $i => Secret: "; oc get secret -n $i | wc -l; done | column -t
Project: default => Secret: 13
Project: kube-public => Secret: 10
Project: kube-system => Secret: 84
Project: openshift => Secret: 11
Project: openshift-ansible-service-broker => Secret: 22
Project: openshift-apiserver => Secret: 15
Project: openshift-apiserver-operator => Secret: 14
Project: openshift-authentication => Secret: 18
Project: openshift-authentication-operator => Secret: 14
Project: openshift-cloud-credential-operator => Secret: 10
Project: openshift-cluster-machine-approver => Secret: 13
Project: openshift-cluster-node-tuning-operator => Secret: 2066
Project: openshift-cluster-samples-operator => Secret: 13
Project: openshift-cluster-storage-operator => Secret: 13
Project: openshift-cluster-version => Secret: 10
Project: openshift-config => Secret: 17
Project: openshift-config-managed => Secret: 13
Project: openshift-console => Secret: 12
Project: openshift-console-operator => Secret: 13
Project: openshift-controller-manager => Secret: 14
Project: openshift-controller-manager-operator => Secret: 14
Project: openshift-dns => Secret: 13
Project: openshift-dns-operator => Secret: 13
Project: openshift-etcd => Secret: 10
Project: openshift-image-registry => Secret: 21
Project: openshift-infra => Secret: 70
Project: openshift-ingress => Secret: 17
Project: openshift-ingress-operator => Secret: 14
Project: openshift-kube-apiserver => Secret: 51
Project: openshift-kube-apiserver-operator => Secret: 20
Project: openshift-kube-controller-manager => Secret: 48
Project: openshift-kube-controller-manager-operator => Secret: 17
Project: openshift-kube-scheduler => Secret: 31
Project: openshift-kube-scheduler-operator => Secret: 14
Project: openshift-machine-api => Secret: 21
Project: openshift-machine-config-operator => Secret: 24
Project: openshift-marketplace => Secret: 26
Project: openshift-monitoring => Secret: 56
Project: openshift-multus => Secret: 13
Project: openshift-network-operator => Secret: 10
Project: openshift-node => Secret: 10
Project: openshift-operator-lifecycle-manager => Secret: 22
Project: openshift-operators => Secret: 10
Project: openshift-sdn => Secret: 16
Project: openshift-service-ca => Secret: 20
Project: openshift-service-ca-operator => Secret: 13
Project: openshift-service-catalog-apiserver => Secret: 15
Project: openshift-service-catalog-apiserver-operator => Secret: 14
Project: openshift-service-catalog-controller-manager => Secret: 14
Project: openshift-service-catalog-controller-manager-operator => Secret: 14
Project: openshift-template-service-broker => Secret: 21

The cluster has been running for more than 3 days and has been upgraded several times.

Version-Release number of selected component (if applicable):
The cluster was installed with 4.1.0-0.nightly-2019-05-17-110425, but we upgraded several times; it is currently running 4.1.0-0.nightly-2019-05-24-040103.

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:
After several days, the openshift-cluster-node-tuning-operator namespace has nearly 2000 secrets, far more than any other namespace.

Expected results:
The namespace should not accumulate so many secrets.
Additional info:

# oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE     VERSION             INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                    KERNEL-VERSION               CONTAINER-RUNTIME
dell-r730-063.dsal.lab.eng.rdu2.redhat.com   Ready    master   3d19h   v1.13.4+cb455d664   10.1.8.73     <none>        Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)   4.18.0-80.1.2.el8_0.x86_64   cri-o://1.13.9-1.rhaos4.1.gitd70609a.el8
dell-r730-064.dsal.lab.eng.rdu2.redhat.com   Ready    master   3d19h   v1.13.4+cb455d664   10.1.8.74     <none>        Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)   4.18.0-80.1.2.el8_0.x86_64   cri-o://1.13.9-1.rhaos4.1.gitd70609a.el8
dell-r730-065.dsal.lab.eng.rdu2.redhat.com   Ready    master   3d19h   v1.13.4+cb455d664   10.1.8.75     <none>        Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)   4.18.0-80.1.2.el8_0.x86_64   cri-o://1.13.9-1.rhaos4.1.gitd70609a.el8
dell-r730-066.dsal.lab.eng.rdu2.redhat.com   Ready    worker   3d19h   v1.13.4+cb455d664   10.1.8.76     <none>        Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)   4.18.0-80.1.2.el8_0.x86_64   cri-o://1.13.9-1.rhaos4.1.gitd70609a.el8
dell-r730-067.dsal.lab.eng.rdu2.redhat.com   Ready    worker   3d19h   v1.13.4+cb455d664   10.1.8.77     <none>        Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)   4.18.0-80.1.2.el8_0.x86_64   cri-o://1.13.9-1.rhaos4.1.gitd70609a.el8
dell-r730-068.dsal.lab.eng.rdu2.redhat.com   Ready    worker   3d17h   v1.13.4+54aa63688   10.1.8.78     <none>        Red Hat Enterprise Linux Server 7.6 (Maipo)                 3.10.0-957.el7.x86_64        cri-o://1.13.6-1.dev.rhaos4.1.gitee2e748.el7-dev
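(Not part of the original report, just an illustrative check, assuming the growth comes from the tuned service-account secrets as discussed later in this bug: counting only the tuned-* token and dockercfg secrets in the affected namespace confirms where the accumulation is happening.)

$ oc get secrets -n openshift-cluster-node-tuning-operator --no-headers \
    | awk '$1 ~ /^tuned-(token|dockercfg)-/' | wc -l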
Fix for release-4.1 branch: https://github.com/openshift/cluster-node-tuning-operator/pull/59
*** Bug 1716600 has been marked as a duplicate of this bug. ***
@jmencak - To safely delete the extraneous secrets, can all but the most recent tuned-token and tuned-dockercfg secrets be deleted?
(In reply to Mike Fiedler from comment #11)
> @jmencak - To safely delete the extraneous secrets, can all but the most
> recent tuned-token and tuned-dockercfg secrets be deleted?

Unfortunately, that wouldn't work. At this point, the suggested cleanup after an upgrade from a version affected by this bug to 4.1.1 (or a version that has the fix) is:

$ oc get secrets -n openshift-cluster-node-tuning-operator | awk '/^tuned-/ {print $1}' | xargs oc delete secrets
$ oc delete ds/tuned -n openshift-cluster-node-tuning-operator
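For anyone applying that cleanup, it may be worth previewing what would be removed first; also note that the delete step as written relies on the current project being openshift-cluster-node-tuning-operator, so passing -n explicitly avoids surprises. A hedged variant (not part of jmencak's instructions above):

$ oc get secrets -n openshift-cluster-node-tuning-operator | awk '/^tuned-/' | wc -l    # how many secrets would be deleted
$ oc get secrets -n openshift-cluster-node-tuning-operator | awk '/^tuned-/ {print $1}' \
    | xargs oc delete -n openshift-cluster-node-tuning-operator secrets
$ oc delete ds/tuned -n openshift-cluster-node-tuning-operator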
Checked with 4.1.0-0.nightly-2019-06-04-235906; a fresh installation no longer accumulates so many secrets.

And, as jmencak said, the existing thousands of secrets will not be cleaned up automatically.
@skordas - Please update the existing node tuning operator basic functionality automation to add a sanity check on the number of secrets in the project. Thanks a lot.
# oc get clusterversions
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-06-04-235906   True        False         20h     Cluster version is 4.1.0-0.nightly-2019-06-04-235906

# oc get secrets -n openshift-cluster-node-tuning-operator
NAME                                           TYPE                                  DATA   AGE
builder-dockercfg-dp8dm                        kubernetes.io/dockercfg               1      20h
builder-token-987xv                            kubernetes.io/service-account-token   4      20h
builder-token-skzqw                            kubernetes.io/service-account-token   4      20h
cluster-node-tuning-operator-dockercfg-zgwq9   kubernetes.io/dockercfg               1      20h
cluster-node-tuning-operator-token-djrfq       kubernetes.io/service-account-token   4      20h
cluster-node-tuning-operator-token-jnkcw       kubernetes.io/service-account-token   4      20h
default-dockercfg-j999p                        kubernetes.io/dockercfg               1      20h
default-token-26nlt                            kubernetes.io/service-account-token   4      20h
default-token-wrf2s                            kubernetes.io/service-account-token   4      20h
deployer-dockercfg-c4wzz                       kubernetes.io/dockercfg               1      20h
deployer-token-54k7s                           kubernetes.io/service-account-token   4      20h
deployer-token-cnf7s                           kubernetes.io/service-account-token   4      20h
tuned-dockercfg-t6j85                          kubernetes.io/dockercfg               1      20h
tuned-token-2p74v                              kubernetes.io/service-account-token   4      20h
tuned-token-4ptrb                              kubernetes.io/service-account-token   4      20h

@mifiedle Test case and automation are updated: https://github.com/openshift/svt/pull/591
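For context, the sanity check requested above could look roughly like the following; this is an illustrative sketch only (the actual implementation lives in the SVT PR linked above), and the threshold of 30 is an assumption based on the ~15 secrets a healthy namespace shows:

NS=openshift-cluster-node-tuning-operator
MAX=30                                                  # assumed upper bound, not an official number
COUNT=$(oc get secrets -n "$NS" --no-headers | wc -l)
if [ "$COUNT" -gt "$MAX" ]; then
  echo "FAIL: $COUNT secrets in $NS (expected <= $MAX)"
  exit 1
fi
echo "PASS: $COUNT secrets in $NS"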
(In reply to weiwei jiang from comment #13)
> Checked with 4.1.0-0.nightly-2019-06-04-235906, and new fresh installation
> cluster will not burst so many secrets.
>
> And as jmencak said, existed thousands of secrets will not be cleaned.

There is an upstream PR https://github.com/openshift/cluster-node-tuning-operator/pull/63 for automated removal of detached tuned secrets to perform the cleanup. Should the removal of secrets be tracked as part of this BZ, or should a new one be created?
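For reference, "detached" tuned secrets could be identified by hand roughly as follows; this is a sketch only, not the PR's implementation, and it keeps any secret still referenced by the tuned ServiceAccount while removing the rest:

$ NS=openshift-cluster-node-tuning-operator
$ KEEP=$(oc get sa tuned -n "$NS" -o jsonpath='{.secrets[*].name} {.imagePullSecrets[*].name}')
$ oc get secrets -n "$NS" -o name | sed 's|^secret/||' | grep '^tuned-' \
    | grep -vxF -f <(tr ' ' '\n' <<<"$KEEP" | grep -v '^$') \
    | xargs -r oc delete -n "$NS" secrets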
Re: comment 16. Opened https://bugzilla.redhat.com/show_bug.cgi?id=1718842 to track this and targeted it for 4.1.2
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:1382