+++ This bug was initially created as a clone of Bug #1976379 +++
Created attachment 1794553
must-gather from cluster where this occurred
Description of problem:
Pod "cluster-version-operator-89bf5cdb5-4qhhh" in openshift-cluster-version namespace was not handled by the workload partitioning pod mutation logic. A warning was added to the pod:
workload.openshift.io/warning: only single-node clusters support workload partitioning
Version-Release number of selected component (if applicable): 4.8.0-0.nightly-2021-06-24-222938
How reproducible: unknown
Steps to Reproduce:
1. Cluster installed
2. "oc describe node" shows 20m CPU requests for this pod
Actual results:
openshift-cluster-version cluster-version-operator-89bf5cdb5-4qhhh 20m (0%) 0 (0%) 50Mi (0%) 0 (0%) 4h
Expected results:
openshift-cluster-version cluster-version-operator-89bf5cdb5-4qhhh 0 (0%) 0 (0%) 50Mi (0%) 0 (0%) 4h
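For anyone comparing the two rows above across many pods, here is a minimal Python sketch that pulls the CPU request out of one pod row of "oc describe node" output. It assumes the whitespace-separated column order Namespace, Name, CPU Requests, CPU Limits, Memory Requests, Memory Limits, Age; the helper name cpu_request is hypothetical and not part of any tool.

```python
def cpu_request(row: str) -> str:
    """Return the CPU Requests column from one pod row of `oc describe node`.

    Assumes whitespace-separated fields in this order: namespace, name,
    CPU requests, (pct), CPU limits, (pct), memory requests, (pct),
    memory limits, (pct), age.
    """
    fields = row.split()
    if len(fields) < 3:
        raise ValueError(f"unexpected row format: {row!r}")
    # Index 0 is the namespace, 1 is the pod name, 2 is the CPU request.
    return fields[2]


# Row copied from this report.
observed = (
    "openshift-cluster-version cluster-version-operator-89bf5cdb5-4qhhh "
    "20m (0%) 0 (0%) 50Mi (0%) 0 (0%) 4h"
)
print(cpu_request(observed))
```

A non-zero value for a management pod on a cluster with workload partitioning enabled would indicate the pod was not mutated.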
--- Additional comment from alukiano on 2021-06-27 11:31:08 UTC ---
Can you please provide the installer debug log?
Following the steps from the 4.9 clone (bug 1976379#c3), I tested the latest 4.8 non-SNO environment (4.8.0-0.nightly-2021-07-04-112043); the issue still exists.
Checked its o/k (openshift/kubernetes) commit:
oc adm release info --commits registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-07-04-112043 | grep hyperkube
hyperkube https://github.com/openshift/kubernetes f36aa364667...
https://github.com/openshift/kubernetes/blob/f36aa364667/openshift-kube-apiserver/admission/autoscaling/managementcpusoverride/admission.go#L183-L186 already contains the PR code. Thus moving back to ASSIGNED.
The problem is that this annotation was added on an SNO cluster with workload partitioning enabled, where it should not be.
It's OK to have this annotation on a pod in a non-SNO cluster.
Can you please verify the bug on an SNO cluster with workload partitioning enabled?
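To make the expected behavior concrete, here is a rough Python sketch of the decision as described in this comment. It is not the code in admission.go, and it ignores the other checks the plugin may perform (for example, whether workload partitioning is actually configured on the node). The function name and the "mutate"/"warn" labels are hypothetical; "SingleReplica" is assumed to be the control plane topology reported for single-node clusters.

```python
WARNING_KEY = "workload.openshift.io/warning"
WARNING_TEXT = "only single-node clusters support workload partitioning"


def handle_management_pod(control_plane_topology: str, annotations: dict) -> str:
    """Illustrative only: decide how to treat a management pod.

    Returns "mutate" on a single-node cluster. Otherwise the warning
    annotation is added to `annotations` in place and "warn" is returned,
    meaning the pod's resources are left unchanged.
    """
    if control_plane_topology == "SingleReplica":
        # SNO: the pod should be partitioned and must not carry the warning.
        return "mutate"
    # Non-SNO: keep the pod as it is and only record the warning.
    annotations[WARNING_KEY] = WARNING_TEXT
    return "warn"
```

Under this reading, the warning on a non-SNO cluster is expected, and the bug is only reproduced if the "warn" path is taken on an SNO cluster.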
(In reply to Artyom from comment #5)
Thanks for the clarification. Then it is better to have a QE colleague from the workload partitioning feature team verify this. Let me update.
Did you enable workload partitioning during the setup? You need to provide an additional machine config manifest to enable it.
Please see - https://github.com/openshift/enhancements/blob/master/enhancements/workload-partitioning/management-workload-partitioning.md#example-manifests
*** Bug 1982868 has been marked as a duplicate of this bug. ***
Neelesh closed bug 1982868 as a dup of this one, but while this bug is now VERIFIED, 4.7 -> 4.8 -> 4.7 rollback jobs are still failing. A recent failure, from 4.7.20-x86_64 to 4.8.0-0.ci-2021-07-19-070057 and back, still blocks with:
deployment openshift-etcd-operator/etcd-operator has a replica failure FailedCreate: pods "etcd-operator-7b677856dc-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride infrastructure resource has empty status.controlPlaneTopology or status.infrastructureTopology
Did we want to move this back to ASSIGNED until we get that sorted out? Or should I reopen bug 1982868 so we can handle it separately?
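The rejection message quoted above suggests that the admission plugin refuses pods while either topology field of the infrastructure resource is unset. A minimal Python sketch of such a check, assuming the resource's status is available as a plain dict; the function name is hypothetical and this is not the plugin's actual code.

```python
from typing import Optional


def topology_error(status: dict) -> Optional[str]:
    """Return an error string if either topology field is empty, else None."""
    control_plane = status.get("controlPlaneTopology")
    infrastructure = status.get("infrastructureTopology")
    if not control_plane or not infrastructure:
        # Same wording as the FailedCreate message in this comment.
        return (
            "infrastructure resource has empty "
            "status.controlPlaneTopology or status.infrastructureTopology"
        )
    return None
```

Running a check like this against the status of the cluster's infrastructure resource after the rollback would show whether the fields are what the rollback jobs are tripping over.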
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.