The 'value' field of a PriorityClass is immutable, so it's currently impossible to change the priority value of an existing class that is maintained by the CVO (the update is rejected and ultimately causes the upgrade to fail). This probably needs discussion with the kube team, but one option would be to delete+recreate the resource instead of updating it. We'd need to understand what implications that has for existing workloads that were using the PriorityClass when it gets deleted+recreated. Honestly the CVO answer here may be that it's not solvable, depending on what the kube team says, but investigation is warranted.
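For context, here is a minimal sketch of the update path that hits the rejection. This is illustrative only, not the CVO's actual resourceapply code; the package name, function name, and clientset plumbing are assumptions, and it just shows where the API server refuses the change:

package resourceapply // hypothetical package name, not the CVO's actual code

import (
    "context"

    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// bumpPriorityClassValue fetches the existing class, changes .value, and
// attempts an update, which the API server rejects with
// "Value: Forbidden: may not be changed in an update".
func bumpPriorityClassValue(ctx context.Context, client kubernetes.Interface, name string, desiredValue int32) error {
    existing, err := client.SchedulingV1().PriorityClasses().Get(ctx, name, metav1.GetOptions{})
    if err != nil {
        return err
    }
    existing.Value = desiredValue // mutating this field is what gets rejected
    _, err = client.SchedulingV1().PriorityClasses().Update(ctx, existing, metav1.UpdateOptions{})
    if apierrors.IsInvalid(err) {
        // This is where a delete+recreate fallback could be attempted instead
        // of failing the upgrade; see the sketch in the later comment.
    }
    return err
}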
Failure here: https://github.com/openshift/cluster-monitoring-operator/pull/1060

"Could not update priorityclass "openshift-user-critical" (325 of 669): the object is invalid, possibly due to local cluster configuration"
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-monitoring-operator/1060/pull-ci-openshift-cluster-monitoring-operator-release-4.7-e2e-agnostic-upgrade/1361797567247028224/artifacts/e2e-agnostic-upgrade/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-58d6f9db87-rmc6k_cluster-version-operator.log | grep -o 'error running apply for priorityclass.*' | sort | uniq -c
    242 error running apply for priorityclass "openshift-user-critical" (325 of 669): PriorityClass.scheduling.k8s.io "openshift-user-critical" is invalid: Value: Forbidden: may not be changed in an update.
      1 error running apply for priorityclass "openshift-user-critical" (325 of 669): context deadline exceeded

So the issue is that bumping the 'value' [1] of the PriorityClass manifest didn't work. Maybe we can delete the PriorityClass object and create a new one instead of attempting to update the existing resource (a rough sketch of that follows below)? But I don't know how the Kube-core components would handle us deleting a PriorityClass that is still in use by running pods. We need to sync with the Kube-core folks on this.

More broadly, it might be possible to get the CVO out of the PriorityClass management loop entirely and have a Kube-core team take charge of creating and managing the classes OpenShift needs. We can hash that out in the same discussion. This is complicated enough that it's unlikely to happen this sprint.

[1]: https://github.com/openshift/cluster-monitoring-operator/pull/1060/files#diff-e30e0ecd0267865f4d29cdc205349e93e870c4d5c334f4a1c71d2bf5ef270298L8
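A rough sketch of what the delete+recreate fallback could look like, assuming client-go; this is not actual CVO code, the package and function names are illustrative, and the pod-impact question above (what happens to pods referencing the class during the window between Delete and Create) is exactly what still needs investigation:

package resourceapply // hypothetical package name, not the CVO's actual code

import (
    "context"

    schedulingv1 "k8s.io/api/scheduling/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// recreatePriorityClass deletes the existing class and creates the desired
// one, sidestepping the immutable .value field.
func recreatePriorityClass(ctx context.Context, client kubernetes.Interface, desired *schedulingv1.PriorityClass) error {
    if err := client.SchedulingV1().PriorityClasses().Delete(ctx, desired.Name, metav1.DeleteOptions{}); err != nil && !apierrors.IsNotFound(err) {
        return err
    }
    _, err := client.SchedulingV1().PriorityClasses().Create(ctx, desired, metav1.CreateOptions{})
    return err
}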
[1] is the upstream request to allow these to be updated.

[1]: https://github.com/kubernetes/kubernetes/issues/99205
As pointed out in comment 0, the CVO could delete+recreate here to work around the current API-server behavior. But as comment 3 points out, adjusting the API-server to support the CVO's current update attempts is being worked on upstream. With bug 1934516 being resolved via a new priority class, I'm not aware of anything that makes this important enough to be worth adding delete+recreate to the CVO code. Marking DEFERRED; this use-case will work automatically once the API-server is adjusted.