Bug 1929741

Summary: CVO does not allow priorityclass updates
Product: OpenShift Container Platform
Reporter: Ben Parees <bparees>
Component: Cluster Version Operator
Assignee: Lalatendu Mohanty <lmohanty>
Status: CLOSED DEFERRED
QA Contact: Johnny Liu <jialiu>
Severity: low
Priority: unspecified
Version: 4.7
CC: aos-bugs, jack.ottofaro, jokerman, lmohanty, spasquie, wking
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2021-06-03 15:47:53 UTC
Type: Bug

Description Ben Parees 2021-02-17 14:42:00 UTC
A PriorityClass's value can't be mutated, so it's currently impossible to change the priority value of an existing class that is maintained by the CVO (the update is rejected by the API server and ultimately causes the upgrade to fail).
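
For illustration, the rejection is easy to reproduce against any cluster with oc (the class name here is hypothetical; the error text is the same one the CVO hits during upgrade):

$ oc create priorityclass example-priority --value=1000
$ oc patch priorityclass example-priority --type merge --patch '{"value": 2000}'
# rejected by the API server: "Value: Forbidden: may not be changed in an update"
$ oc delete priorityclass example-priority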

Probably needs discussion with the kube team, but one option would be to delete+recreate the resource instead of updating it (roughly sketched below).  But we need to understand what implications that has for existing workloads that were using the priorityclass when it gets deleted+recreated.
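
A rough manual equivalent of that delete+recreate approach, continuing the hypothetical class from above (pc.yaml is a local copy of the manifest):

$ oc get priorityclass example-priority -o yaml > pc.yaml
# edit pc.yaml: bump .value, drop server-set fields like resourceVersion and uid
$ oc delete priorityclass example-priority
$ oc create -f pc.yaml

One thing this sketch makes concrete: any pod created in the window between the delete and the create that references the class would be rejected at admission, which is part of the workload impact to understand.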

Honestly, the CVO answer here may be that it's not solvable, depending on what the kube team says, but investigation is warranted.

Comment 1 Jack Ottofaro 2021-02-17 14:50:55 UTC
Failure here: https://github.com/openshift/cluster-monitoring-operator/pull/1060

"Could not update priorityclass "openshift-user-critical" (325 of 669): the object is invalid, possibly due to local cluster configuration"

Comment 2 W. Trevor King 2021-02-17 18:25:51 UTC
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-monitoring-operator/1060/pull-ci-openshift-cluster-monitoring-operator-release-4.7-e2e-agnostic-upgrade/1361797567247028224/artifacts/e2e-agnostic-upgrade/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-58d6f9db87-rmc6k_cluster-version-operator.log | grep -o 'error running apply for priorityclass.*' | sort | uniq -c
    242 error running apply for priorityclass "openshift-user-critical" (325 of 669): PriorityClass.scheduling.k8s.io "openshift-user-critical" is invalid: Value: Forbidden: may not be changed in an update.
      1 error running apply for priorityclass "openshift-user-critical" (325 of 669): context deadline exceeded

So the issue is that bumping the 'value' [1] of the PriorityClass manifest didn't work.  Maybe we can delete the PriorityClass object and create a new one instead of attempting to update the existing resource?  But I don't know how the Kube-core components would handle us deleting a PriorityClass being used by running pods.  We need to round with the Kube-core folks on this.
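
For what it's worth, and worth confirming when we round with them: my understanding of the priority admission plugin is that a class's numeric value is copied into pod.spec.priority at admission time, so pods that are already running keep their priority even if the class is deleted.  Something like (pod name hypothetical):

$ oc get pod <pod-name> -n openshift-monitoring -o jsonpath='{.spec.priority}{"\n"}'
# prints the numeric value that was resolved from .spec.priorityClassName when the pod was admitted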

More broadly, it might be possible to get the CVO out of the PriorityClass management loop entirely and have a Kube-core team take charge of creating and managing the classes that OpenShift needs.  We can hash that out as well when rounding with the Kube-core teams.

This is complicated enough that it's unlikely to happen this sprint.

[1]: https://github.com/openshift/cluster-monitoring-operator/pull/1060/files#diff-e30e0ecd0267865f4d29cdc205349e93e870c4d5c334f4a1c71d2bf5ef270298L8

Comment 3 W. Trevor King 2021-02-24 18:58:55 UTC
[1] is the upstream request to allow these to be updated.

[1]: https://github.com/kubernetes/kubernetes/issues/99205

Comment 4 W. Trevor King 2021-06-03 15:47:53 UTC
As pointed out in comment 0, the CVO could delete+recreate here to work around the current API-server behavior.  But as comment 3 points out, adjusting the API-server to support the CVO's current update attempts is being worked on upstream.  With bug 1934516 being resolved via a new priority class, so I'm not aware of anything that makes this important enough to be worth delete+recreate in the CVO code.  Marking DEFERRED, and we'll work for this use-case automatically after the API-server is adjusted.