Bug 1929741
Summary: | CVO does not allow priorityclass updates | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Ben Parees <bparees>
Component: | Cluster Version Operator | Assignee: | Lalatendu Mohanty <lmohanty>
Status: | CLOSED DEFERRED | QA Contact: | Johnny Liu <jialiu>
Severity: | low | Docs Contact: |
Priority: | unspecified | |
Version: | 4.7 | CC: | aos-bugs, jack.ottofaro, jokerman, lmohanty, spasquie, wking
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-06-03 15:47:53 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Ben Parees
2021-02-17 14:42:00 UTC
Failure here: https://github.com/openshift/cluster-monitoring-operator/pull/1060

"Could not update priorityclass "openshift-user-critical" (325 of 669): the object is invalid, possibly due to local cluster configuration"

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-monitoring-operator/1060/pull-ci-openshift-cluster-monitoring-operator-release-4.7-e2e-agnostic-upgrade/1361797567247028224/artifacts/e2e-agnostic-upgrade/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-58d6f9db87-rmc6k_cluster-version-operator.log | grep -o 'error running apply for priorityclass.*' | sort | uniq -c
    242 error running apply for priorityclass "openshift-user-critical" (325 of 669): PriorityClass.scheduling.k8s.io "openshift-user-critical" is invalid: Value: Forbidden: may not be changed in an update.
      1 error running apply for priorityclass "openshift-user-critical" (325 of 669): context deadline exceeded

So the issue is that bumping the 'value' [1] of the PriorityClass manifest didn't work. Maybe we can delete the PriorityClass object and create a new one instead of attempting to update the existing resource? But I don't know how the Kube-core components would handle us deleting a PriorityClass that is in use by running pods. We need to coordinate with the Kube-core folks on this.

More broadly, it might be possible to get the CVO out of the PriorityClass management loop entirely and have a Kube-core team in charge of creating and managing the classes needed by OpenShift. We can hash that out as well when we talk to the Kube-core teams. This is complicated enough that it's unlikely to happen this sprint.

[1]: https://github.com/openshift/cluster-monitoring-operator/pull/1060/files#diff-e30e0ecd0267865f4d29cdc205349e93e870c4d5c334f4a1c71d2bf5ef270298L8

[1] is the upstream request to allow these values to be updated.

[1]: https://github.com/kubernetes/kubernetes/issues/99205

As pointed out in comment 0, the CVO could delete+recreate here to work around the current API-server behavior. But as comment 3 points out, adjusting the API-server to support the CVO's current update attempts is being worked on upstream. With bug 1934516 being resolved via a new priority class, I'm not aware of anything that makes this important enough to be worth adding delete+recreate to the CVO code. Marking DEFERRED; this use-case will start working automatically once the API-server is adjusted.
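For illustration only, here is a minimal sketch of the delete+recreate workaround discussed in comment 0, written against client-go. The package and function names are hypothetical and this is not the CVO's actual apply code; it only shows the shape of the idea: when the immutable "value" field of the existing PriorityClass differs from the desired manifest, delete the object and recreate it instead of updating in place.

```go
// Package sketch: hypothetical illustration, not the CVO's actual code.
package sketch

import (
	"context"
	"fmt"

	schedulingv1 "k8s.io/api/scheduling/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// replacePriorityClass recreates a PriorityClass whose immutable "value"
// field needs to change, since the API server rejects in-place updates of
// that field ("Value: Forbidden: may not be changed in an update").
func replacePriorityClass(ctx context.Context, client kubernetes.Interface, desired *schedulingv1.PriorityClass) error {
	pcs := client.SchedulingV1().PriorityClasses()

	existing, err := pcs.Get(ctx, desired.Name, metav1.GetOptions{})
	switch {
	case apierrors.IsNotFound(err):
		// Nothing to delete; fall through to the Create below.
	case err != nil:
		return fmt.Errorf("get priorityclass %s: %w", desired.Name, err)
	case existing.Value == desired.Value:
		// The immutable field already matches; an ordinary update of the
		// mutable fields (not shown here) would be sufficient.
		return nil
	default:
		// "value" differs and cannot be changed in place, so delete first.
		if err := pcs.Delete(ctx, desired.Name, metav1.DeleteOptions{}); err != nil && !apierrors.IsNotFound(err) {
			return fmt.Errorf("delete priorityclass %s: %w", desired.Name, err)
		}
	}

	if _, err := pcs.Create(ctx, desired, metav1.CreateOptions{}); err != nil {
		return fmt.Errorf("create priorityclass %s: %w", desired.Name, err)
	}
	return nil
}
```

One caveat, in line with the concern raised in comment 0 about deleting a class that is in use: between the Delete and the Create there is a window in which new pods referencing the class would fail admission, while already-running pods keep the priority they were admitted with.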