Bug 1929741
| Summary: | CVO does not allow priorityclass updates | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ben Parees <bparees> |
| Component: | Cluster Version Operator | Assignee: | Lalatendu Mohanty <lmohanty> |
| Status: | CLOSED DEFERRED | QA Contact: | Johnny Liu <jialiu> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.7 | CC: | aos-bugs, jack.ottofaro, jokerman, lmohanty, spasquie, wking |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-06-03 15:47:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Ben Parees
2021-02-17 14:42:00 UTC
Failure here https://github.com/openshift/cluster-monitoring-operator/pull/1060. "Could not update priorityclass "openshift-user-critical" (325 of 669): the object is invalid, possibly due to local cluster configuration" $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-monitoring-operator/1060/pull-ci-openshift-cluster-monitoring-operator-release-4.7-e2e-agnostic-upgrade/1361797567247028224/artifacts/e2e-agnostic-upgrade/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-58d6f9db87-rmc6k_cluster-version-operator.log | grep -o 'error running apply for priorityclass.*' | sort | uniq -c 242 error running apply for priorityclass "openshift-user-critical" (325 of 669): PriorityClass.scheduling.k8s.io "openshift-user-critical" is invalid: Value: Forbidden: may not be changed in an update. 1 error running apply for priorityclass "openshift-user-critical" (325 of 669): context deadline exceeded So the issue is that bumping the 'value' [1] of the PriorityClass manifest didn't work. Maybe we can delete the PriorityClass object and create a new one instead of attempting to update the existing resource? But I don't know how the Kube-core components would handle us deleting a PriorityClass being used by running pods. We need to round with the Kube-core folks on this. More broadly, it might be possible to get the CVO out of the PriorityClass management loop and have a Kube-core team in charge of creating and managing the classes needed by OpenShift. We can hash that out as well when rounding with the Kube-core teams. This is complicated enough that it's unlikely to happen this sprint. [1]: https://github.com/openshift/cluster-monitoring-operator/pull/1060/files#diff-e30e0ecd0267865f4d29cdc205349e93e870c4d5c334f4a1c71d2bf5ef270298L8 [1] is the upstream request to allow these to be updated. [1]: https://github.com/kubernetes/kubernetes/issues/99205 As pointed out in comment 0, the CVO could delete+recreate here to work around the current API-server behavior. But as comment 3 points out, adjusting the API-server to support the CVO's current update attempts is being worked on upstream. With bug 1934516 being resolved via a new priority class, so I'm not aware of anything that makes this important enough to be worth delete+recreate in the CVO code. Marking DEFERRED, and we'll work for this use-case automatically after the API-server is adjusted. |