Bug 1997787
| Summary: | Descheduler default for evict pods with PVCs is incorrect | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Paige Rubendall <prubenda> |
| Component: | kube-scheduler | Assignee: | Mike Dame <mdame> |
| Status: | CLOSED ERRATA | QA Contact: | RamaKasturi <knarra> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.9 | CC: | aos-bugs, maszulik, mdame, mfojtik, prubenda |
| Target Milestone: | --- | ||
| Target Release: | 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:49:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Paige Rubendall
2021-08-25 19:23:17 UTC
The problem is that we made `ignorePvcPods=true` by default [1][2], so the current default behavior in OCP is to not evict PVC pods. In that case I think we made a logical mistake in introducing this profile as "DoNotEvict..", since that name implies the pods will be evicted by default. So we have some options:

1. Enable this profile by default in the operator (to preserve default behavior).
2. Add another profile "EvictPodsWithPVC" that sets `ignorePvcPods=false`, then deprecate the "DoNotEvict.." profile. This inverts the upstream default behavior, but preserves ours.
3. Remove the change from [2] that enables this by default. This is a change in the current behavior.

@Maciej which do you think would be best from a product standpoint?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1911782
[2] https://github.com/openshift/cluster-kube-descheduler-operator/pull/168

I opened https://github.com/openshift/cluster-kube-descheduler-operator/pull/217 to go with option 2. We don't need to deprecate the current profile since it hasn't been released yet, so we can instead pivot it to disable the default behavior.

I like option 2, although your PR only replaces the profile, without deprecation, am I right?

@Maciej, yeah, since we haven't released the new profile yet we don't need to deprecate it, we can just switch it.

@mikedame one question related to the bug here: does it mean there will be a lot of pod evictions, since a customer environment might have a lot of pods created with PVCs? Would this not be a problem when 4.9 is released, as this alters the previous config from the 4.8 version of the descheduler?

This maintains the current behavior of the descheduler. Pods with PVCs will not be evicted by default. Users can enable this option to make PVC pods eligible for eviction, though they will still need to meet the criteria of one of the other profiles first.

Hello Mike, I tried to verify the bug here, but we are hitting the bug [1] again, which we fixed some time back. Is this expected? Could you please help confirm?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1911782

Thanks,
kasturi

@Rama spoke offline, but for reference here too: the EvictPodsWithPVC profile now *does* evict pods that have a PVC (whether or not that PVC is using local storage does not matter). The default behavior for the operator is to *not* evict them. Setting this profile allows users to toggle this option now. Hope that clears it up.

Thanks Mike, that helps!
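For reference, a minimal sketch of how the profile discussed above might be enabled on the operator's custom resource. This is an illustration, not taken from the bug: the `deschedulingIntervalSeconds` value and the choice of `TopologyAndDuplicates` as the accompanying profile are assumptions, and the exact fields should be checked against the operator's CRD for the installed version.

```yaml
# Hypothetical KubeDescheduler resource; values are illustrative assumptions.
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  deschedulingIntervalSeconds: 3600   # assumed interval, not from the bug
  profiles:
  - TopologyAndDuplicates             # assumed: a strategy profile that selects eviction candidates
  - EvictPodsWithPVC                  # makes pods with PVCs eligible for eviction
```

Per the thread, `EvictPodsWithPVC` only makes PVC pods eligible; a pod still has to meet the criteria of one of the other enabled profiles before it is evicted. Omitting `EvictPodsWithPVC` keeps the default of not evicting such pods.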
Verified the bug with the build below. When the EvictPodsWithPVC profile is set, pods created with PVCs are evicted, even if the PVC uses local storage. By default, pods created with a PVC are not evicted.
[knarra@knarra ~]$ oc get csv -n openshift-kube-descheduler-operator
NAME DISPLAY VERSION REPLACES PHASE
clusterkubedescheduleroperator.4.9.0-202109030207 Kube Descheduler Operator 4.9.0-202109030207 Succeeded
[knarra@knarra ~]$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-0.nightly-2021-09-06-004132 True False 29h Cluster version is 4.9.0-0.nightly-2021-09-06-004132
Configmap of the kubedescheduler cluster when neither EvictPodsWithPVC nor EvictPodsWithLocalStorage is set:
[knarra@knarra ~]$ oc get configmap cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    ignorePvcPods: true
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:
            exclude:
When EvictPodsWithPVC is set, the configmap is:
[knarra@knarra ~]$ oc get configmap cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    ignorePvcPods: false
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:
When both EvictPodsWithPVC and EvictPodsWithLocalStorage are set:
[knarra@knarra ~]$ oc get configmap cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    evictLocalStoragePods: true
    ignorePvcPods: false
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:
            exclude:
            - kube-system
Based on the above, moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759