Description of problem:

Based on investigation of https://bugzilla.redhat.com/show_bug.cgi?id=2015052 the following was found:

The kube-storage-version-migrator-operator is number two in the top 5 requesters in a "busy" cluster:

system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount 418144
system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator 156061
system:serviceaccount:openshift-controller-manager-operator:openshift-controller-manager-operator 134877
system:apiserver 133597

Its "cluster" resource is also number 4 among the most requested URIs:

/apis/authorization.k8s.io/v1/subjectaccessreviews?timeout=10s 104472
/apis/operator.openshift.io/v1/openshiftcontrollermanagers/cluster 96652
/apis/authentication.k8s.io/v1/tokenreviews 68198
/apis/operator.openshift.io/v1/kubestorageversionmigrators/cluster 64526

And:

kubestorageversionmigrators.v1.operator.openshift.io.yaml 7432

This means there were 7432 requests in 60 minutes from this operator.

This all suggests there might be a leak in the operator, as the assumption is that we should not see this many requests in an idle (although relatively big) cluster.

The must-gathers and other data can be found in the original bug; this one only tracks the kube storage migrator.

Version-Release number of selected component (if applicable):
4.8.z (but probably not limited to it)

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
The storage migrator operator produces a large number of API requests.

Expected results:
The storage migrator operator is quiet in an idle cluster that is past the upgrade.

Additional info:
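For a sense of scale, the figure quoted in the description can be turned into a rate. The following is a minimal sketch of the arithmetic, assuming the 7432 requests were spread evenly over the 60-minute window; the function name is illustrative only and does not come from the report.

```python
from typing import Tuple


def request_rate(total_requests: int, window_minutes: float) -> Tuple[float, float]:
    """Return (requests per minute, requests per second) for a given window."""
    per_minute = total_requests / window_minutes
    return per_minute, per_minute / 60.0


# Figures taken from the description: 7432 requests in 60 minutes.
per_minute, per_second = request_rate(7432, 60)
print(f"{per_minute:.1f} req/min, {per_second:.2f} req/s")
# prints "123.9 req/min, 2.06 req/s"
```

That is roughly two requests per second from a single operator, on an even-spread assumption.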
The request counts for kube-storage-version-migrator-operator and kubestorageversionmigrators/cluster are low over 3 hours.
@Luis Sanchez Is this count acceptable for verification?

oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-11-09-181140   True        False         3h34m   Cluster version is 4.10.0-0.nightly-2021-11-09-181140

for i in `oc get node|grep master|awk '{print $1}'`;do oc debug node/$i -- chroot /host bash -c "cat /var/log/kube-apiserver/audit*.log|jq -r '.user.username'|sort |uniq -c|sort -nr|grep kube-storage-version-migrator-operator";done
W1111 16:25:13.067792 58618 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/rgangwar-11de9-clq74-master-0-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
W1111 16:25:22.763541 58635 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/rgangwar-11de9-clq74-master-1-debug ...
To use host binaries, run `chroot /host`
   2322 system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator

Removing debug pod ...
W1111 16:25:41.543509 58669 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/rgangwar-11de9-clq74-master-2-debug ...
To use host binaries, run `chroot /host`
    110 system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator

for i in `oc get node|grep master|awk '{print $1}'`;do oc debug node/$i -- chroot /host bash -c "cat /var/log/kube-apiserver/audit*.log|jq -r '.requestURI'|sort |uniq -c|sort -nr|grep kubestorageversionmigrators/cluster";done
W1111 16:00:27.994058 56694 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/rgangwar-11de9-clq74-master-0-debug ...
To use host binaries, run `chroot /host`
     57 /apis/operator.openshift.io/v1/kubestorageversionmigrators/cluster

Removing debug pod ...
W1111 16:00:42.538018 56712 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
W1111 16:00:43.761608 56712 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/rgangwar-11de9-clq74-master-1-debug ...
To use host binaries, run `chroot /host`
      2 /apis/operator.openshift.io/v1/kubestorageversionmigrators/cluster

Removing debug pod ...
W1111 16:01:01.698012 56744 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/rgangwar-11de9-clq74-master-2-debug ...
To use host binaries, run `chroot /host`
      6 /apis/operator.openshift.io/v1/kubestorageversionmigrators/cluster/status
      1 /apis/operator.openshift.io/v1/kubestorageversionmigrators/cluster

Removing debug pod ...
@rgangwar: Looks good, but for more context, compare to all the requests. You can run this command remotely:

oc adm node-logs --role=master --path="kube-apiserver" | \
grep -v -E "(.terminating|.lock|termination.log)" | \
sed "s|^| kube-apiserver |" | \
xargs --max-args=3 bash -c 'oc adm node-logs $2 --path=$1/$3' bash | \
# grep 'namespaces/openshift-kube-storage-version-migrator' | \
jq -r '.user.username+" "+.useragent+" "+.verb+" "+.requestURI' | sort | uniq -c | sort -n |tail -n 10

> Note the commented out part of the command that can be used to limit the count to migrator related logs.
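The counting stage of that pipeline (`jq ... | sort | uniq -c | sort -n | tail -n 10`) can also be run offline on audit log lines that have already been collected. This is a minimal sketch, not the tooling used in this bug: the function name `top_requests` and the sample events are hypothetical, and it assumes each input line is one JSON audit event with `user.username`, `verb`, and `requestURI` fields. The optional `needle` argument stands in for the commented-out `grep`.

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple


def top_requests(lines: Iterable[str], needle: str = "", limit: int = 10) -> List[Tuple[int, str]]:
    """Count audit events per 'username verb requestURI' and return the busiest ones."""
    counts: Counter = Counter()
    for line in lines:
        # Optional substring filter, standing in for the commented-out grep.
        if needle and needle not in line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or truncated lines
        if not isinstance(event, dict):
            continue
        username = (event.get("user") or {}).get("username") or ""
        verb = event.get("verb") or ""
        uri = event.get("requestURI") or ""
        counts[" ".join([username, verb, uri])] += 1
    # Ascending by count, so the busiest entry comes last (like `sort -n | tail`).
    ranked = sorted(counts.items(), key=lambda item: item[1])
    return [(count, key) for key, count in ranked[-limit:]]


# Hypothetical sample events, for illustration only.
sample = [
    '{"user": {"username": "system:apiserver"}, "verb": "get", "requestURI": "/api/v1/namespaces/default"}',
    '{"user": {"username": "system:anonymous"}, "verb": "get", "requestURI": "/livez"}',
    '{"user": {"username": "system:apiserver"}, "verb": "get", "requestURI": "/api/v1/namespaces/default"}',
]
for count, key in top_requests(sample):
    print(f"{count:7d} {key}")
```

Passing `needle="namespaces/openshift-kube-storage-version-migrator"` would restrict the count to migrator-related events, as the note above describes.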
@Luis Sanchez Below is the output for your suggested query. I think the count is low. Can you please confirm?

oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-11-09-181140   True        False         11h     Cluster version is 4.10.0-0.nightly-2021-11-09-181140

rahulgangwar@rgangwar-mac hosts % oc adm node-logs --role=master --path="kube-apiserver"|grep -v -E "(.terminating|.lock|termination.log)"|sed "s|^| kube-apiserver |"|xargs bash -c 'oc adm node-logs $2 --path=$1/$3' bash|jq -r '.user.username+" "+.useragent+" "+.verb+" "+.requestURI' | sort | uniq -c | sort -n |tail -n 10
   1089 system:serviceaccount:openshift-authentication:oauth-openshift create /apis/authorization.k8s.io/v1/subjectaccessreviews?timeout=10s
   1173 system:anonymous get /.well-known/oauth-authorization-server
   1217 system:serviceaccount:openshift-apiserver:openshift-apiserver-sa create /apis/authorization.k8s.io/v1/subjectaccessreviews?timeout=10s
   1273 system:serviceaccount:openshift-apiserver:openshift-apiserver-sa get /api/v1/namespaces/default/services/docker-registry
   1398 system:serviceaccount:openshift-monitoring:prometheus-k8s get /metrics
   2119 system:anonymous get /livez
   2127 system:apiserver get /api/v1/namespaces/default
   2127 system:apiserver get /api/v1/namespaces/default/services/kubernetes
   2128 system:apiserver get /api/v1/namespaces/default/endpoints/kubernetes
   2128 system:apiserver get /apis/discovery.k8s.io/v1/namespaces/default/endpointslices/kubernetes

rahulgangwar@rgangwar-mac hosts % oc adm node-logs --role=master --path="kube-apiserver"|grep -v -E "(.terminating|.lock|termination.log)"|sed "s|^| kube-apiserver |"|xargs bash -c 'oc adm node-logs $2 --path=$1/$3' bash|grep 'namespaces/openshift-kube-storage-version-migrator'|jq -r '.user.username+" "+.useragent+" "+.verb+" "+.requestURI' | sort | uniq -c | sort -n |tail -n 10
      3 system:serviceaccount:kube-system:generic-garbage-collector get /apis/apps/v1/namespaces/openshift-kube-storage-version-migrator-operator/deployments/kube-storage-version-migrator-operator
      3 system:serviceaccount:openshift-insights:operator list /api/v1/namespaces/openshift-kube-storage-version-migrator-operator/serviceaccounts?limit=1000
      3 system:serviceaccount:openshift-insights:operator list /api/v1/namespaces/openshift-kube-storage-version-migrator/serviceaccounts?limit=1000
      3 system:serviceaccount:openshift-insights:operator list /apis/operators.coreos.com/v1alpha1/namespaces/openshift-kube-storage-version-migrator-operator/installplans?limit=500
      3 system:serviceaccount:openshift-insights:operator list /apis/operators.coreos.com/v1alpha1/namespaces/openshift-kube-storage-version-migrator/installplans?limit=500
     12 system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator get /api/v1/namespaces/openshift-kube-storage-version-migrator
     13 system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator get /api/v1/namespaces/openshift-kube-storage-version-migrator/serviceaccounts/kube-storage-version-migrator-sa
     19 system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator get /api/v1/namespaces/openshift-kube-storage-version-migrator-operator/configmaps/openshift-kube-storage-version-migrator-operator-lock?timeout=1m47s
     19 system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator update /api/v1/namespaces/openshift-kube-storage-version-migrator-operator/configmaps/openshift-kube-storage-version-migrator-operator-lock?timeout=1m47s
     28 system:serviceaccount:openshift-kube-storage-version-migrator-operator:kube-storage-version-migrator-operator get /apis/apps/v1/namespaces/openshift-kube-storage-version-migrator/deployments/migrator
The new numbers look good to me. Thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056