This bug was initially created as a copy of Bug #1891108.

I am copying this bug because, per https://bugzilla.redhat.com/show_bug.cgi?id=1891108#c8 and https://bugzilla.redhat.com/show_bug.cgi?id=1891108#c11, the issue still occurs on upgrade, so a new bug is needed to track it.

+++ This bug was initially created as a clone of Bug #1891107 +++
+++ This bug was initially created as a clone of Bug #1891106 +++

priority & fairness: Increase the concurrency share of the workload-low priority level

Carry upstream PR: https://github.com/kubernetes/kubernetes/pull/95259

All workloads running under a service account (except those distinguished by p&f with a logically higher matching precedence) match the `service-accounts` flow schema and are assigned to the `workload-low` priority level, and thus have only `20` concurrency shares (~10% of the total). On the other hand, the `global-default` flow schema is assigned to the `global-default` priority level and thus has `100` concurrency shares (~50% of the total). If I am not mistaken, `global-default` goes pretty much unused, since only workloads running as a user (not a service account) fall into this category, which is not very common.

Workloads with service accounts do not have enough concurrency shares and may starve. Increase the concurrency shares of `workload-low` from `20` to `100` and reduce those of `global-default` from `100` to `20`.

We have been asking customers to apply the patch manually: https://bugzilla.redhat.com/show_bug.cgi?id=1883589#c56

> oc patch prioritylevelconfiguration workload-low --type=merge -p '{"spec":{"limited":{"assuredConcurrencyShares": 100}}}'
> oc patch prioritylevelconfiguration global-default --type=merge -p '{"spec":{"limited":{"assuredConcurrencyShares": 20}}}'

This fix will remove the need for the manual patch.
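As a rough illustration of what these shares mean: the apiserver turns each limited priority level's assuredConcurrencyShares into an actual concurrency value in proportion to the sum of shares across all limited levels. The sketch below assumes a server concurrency limit of 600 and a total of 200 shares purely for illustration; only the 20 vs. 100 share values come from this bug.

```shell
#!/bin/sh
# Hedged sketch of the assured concurrency calculation:
#   ACV(level) = ceil( SCL * ACS(level) / sum of ACS over limited levels )
# SCL=600 and total_shares=200 are illustrative assumptions, not values
# taken from this bug.

scl=600          # assumed server concurrency limit
total_shares=200 # assumed sum of assuredConcurrencyShares

acv() { # acv <shares> -> ceil(scl * shares / total_shares)
  echo $(( (scl * $1 + total_shares - 1) / total_shares ))
}

echo "workload-low before: $(acv 20)"   # 20 shares  (~10%) -> 60
echo "workload-low after:  $(acv 100)"  # 100 shares (~50%) -> 300
```

Under these assumptions, swapping the shares of `workload-low` and `global-default` roughly quintuples the concurrency available to service-account traffic.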
kewang,

Okay, so I have opened an upstream PR to auto-update the p&f bootstrap configuration objects: https://github.com/kubernetes/kubernetes/pull/98028. This should resolve this BZ.

I have also opened a test PR against o/k 4.7 so you can do an early upgrade test and verify that this PR resolves the issue. Please also go through the PR description and come up with a test plan. Do let me know if you have any questions.

> o/k PR: https://github.com/openshift/kubernetes/pull/563

This will go into a 4.7 z-stream, is that correct?
Setting it to the 4.7.z release.
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority.

If you have further information on the current state of the bug, please update it; otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

Additionally, you can add LifecycleFrozen to Keywords if you think this bug should never be marked as stale. Please consult with the bug assignee before you do that.
This bug's PR is dev-approved and not yet merged, so I'm following issue DPTP-660 to do the pre-merge verification for the QE pre-merge verification goal of issue OCPQE-815, using the bot to build an image with the open PR.

Here are the verification steps:

1. Fresh-install an OCP 4.6 cluster.
2. Upgrade to 4.7 using the image built with the PR.

$ oc get clusterversion -o json | jq ".items[0].status.history"
[
  {
    "completionTime": "2021-09-22T11:47:56Z",
    "image": "registry.build01.ci.openshift.org/ci-ln-fwqtwkt/release:latest",
    "startedTime": "2021-09-22T10:46:14Z",
    "state": "Completed",
    "verified": false,
    "version": "4.7.0-0.ci.test-2021-09-22-071911-ci-ln-fwqtwkt-latest"
  },
  {
    "completionTime": "2021-09-22T09:00:58Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:a3a26cf19be8b991ab94337580bd693857474f07c961f180c6ba67683ab91b8c",
    "startedTime": "2021-09-22T08:35:03Z",
    "state": "Completed",
    "verified": false,
    "version": "4.6.45"
  }
]

$ oc get FlowSchema
NAME                                PRIORITYLEVEL                       MATCHINGPRECEDENCE   DISTINGUISHERMETHOD   AGE     MISSINGPL
exempt                              exempt                              1                    <none>                5h52m   False
openshift-apiserver-sar             exempt                              2                    ByUser                5h47m   False
openshift-oauth-apiserver-sar       exempt                              2                    ByUser                5h47m   False
probes                              exempt                              2                    <none>                3h32m   False
system-leader-election              leader-election                     100                  ByUser                5h52m   False
workload-leader-election            leader-election                     200                  ByUser                5h52m   False
openshift-sdn                       system                              500                  ByUser                3h7m    False
system-nodes                        system                              500                  ByUser                5h52m   False
kube-controller-manager             workload-high                       800                  ByNamespace           5h52m   False
kube-scheduler                      workload-high                       800                  ByNamespace           5h52m   False
kube-system-service-accounts        workload-high                       900                  ByNamespace           5h52m   False
openshift-apiserver                 workload-high                       1000                 ByUser                5h47m   False
openshift-controller-manager        workload-high                       1000                 ByUser                5h47m   False
openshift-oauth-apiserver           workload-high                       1000                 ByUser                5h47m   False
openshift-oauth-server              workload-high                       1000                 ByUser                5h47m   False
openshift-apiserver-operator        openshift-control-plane-operators   2000                 ByUser                5h47m   False
openshift-authentication-operator   openshift-control-plane-operators   2000                 ByUser                5h47m   False
openshift-etcd-operator             openshift-control-plane-operators   2000                 ByUser                5h47m   False
openshift-kube-apiserver-operator   openshift-control-plane-operators   2000                 ByUser                5h47m   False
openshift-monitoring-metrics        workload-high                       2000                 ByUser                5h47m   False
service-accounts                    workload-low                        9000                 ByUser                5h52m   False
global-default                      global-default                      9900                 ByUser                5h52m   False
catch-all                           catch-all                           10000                ByUser                5h52m   False

$ oc get prioritylevelconfiguration workload-low -o jsonpath='{.spec.limited.assuredConcurrencyShares}'
100
$ oc get prioritylevelconfiguration global-default -o jsonpath='{.spec.limited.assuredConcurrencyShares}'
20

So the bug is pre-merge verified. After the PR gets merged, the bug will be moved to VERIFIED by the bot automatically or, if that does not work, by me manually.
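The two manual jsonpath checks above can be wrapped in a small script so the verification is repeatable after each upgrade. This is a sketch, not part of the original verification; the expected values (100/20) come from the fix, and everything else is standard oc/jsonpath usage.

```shell
#!/bin/sh
# Hedged sketch: compare the live assuredConcurrencyShares against the
# values expected after the fix (workload-low=100, global-default=20).

check_shares() { # check_shares <level> <actual> <expected>
  if [ "$2" = "$3" ]; then
    echo "ok: $1 has $2 assuredConcurrencyShares"
  else
    echo "FAIL: $1 has $2 assuredConcurrencyShares, want $3" >&2
    return 1
  fi
}

# On a live cluster (requires oc and a kubeconfig):
# check_shares workload-low \
#   "$(oc get prioritylevelconfiguration workload-low \
#        -o jsonpath='{.spec.limited.assuredConcurrencyShares}')" 100
# check_shares global-default \
#   "$(oc get prioritylevelconfiguration global-default \
#        -o jsonpath='{.spec.limited.assuredConcurrencyShares}')" 20
```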
The LifecycleStale keyword was removed because the needinfo? flag was reset. The bug assignee was notified.
Based on the above comment, https://bugzilla.redhat.com/show_bug.cgi?id=1926724#c6, the bug was verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.38 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4802