Bug 1997787 - Descheduler default for evict pods with PVCs is incorrect
Summary: Descheduler default for evict pods with PVCs is incorrect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-scheduler
Version: 4.9
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: 4.9.0
Assignee: Mike Dame
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-08-25 19:23 UTC by Paige Rubendall
Modified: 2021-10-18 17:49 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:49:05 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-kube-descheduler-operator pull 217 0 None None None 2021-09-01 15:54:54 UTC
Red Hat Bugzilla 1989724 1 medium CLOSED Descheduler operator should expose options for pods with PVCs and Local Storage 2021-10-18 17:44:37 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:49:19 UTC

Description Paige Rubendall 2021-08-25 19:23:17 UTC
Description of problem:
When creating the descheduler operator, the ignore pods with pvc is always set whether the DoNotEvictPodsWithPVC profile is set or not 

Having this set to false by default would align with the upstream kuberenetes operator: https://github.com/kubernetes-sigs/descheduler/#policy-and-strategies

Version-Release number of selected component (if applicable): 4.9.0-0.nightly-2021-08-25-061033


How reproducible: 100%


Steps to Reproduce:
1. Create cluster 
2. Install Descheduler via OperatorHub
3. Add Kube Descheduler setting (add 1 profiles, ie AffinityAndTaints) 
4. Go to configmaps section of console and look at data-> policy.yaml section



Actual results:
Configmap shows ignorePvcPods: true by default even when DoNotEvictPodsWithPVC profile is not set 

Expected results:
Configmap shows ignorePvcPods: false by default when DoNotEvictPodsWithPVC profile is not set 


Additional info:



% oc get csv -n openshift-kube-descheduler-operator
NAME                                                DISPLAY                            VERSION              REPLACES                          PHASE
clusterkubedescheduleroperator.4.9.0-202108240325   Kube Descheduler Operator          4.9.0-202108240325                                     Succeeded


% oc get kubedescheduler cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  creationTimestamp: "2021-08-25T19:20:15Z"
  generation: 1
  name: cluster
  namespace: openshift-kube-descheduler-operator
  resourceVersion: "171104"
  uid: acb855e3-ad47-4381-9f34-f1d3b94a5141
spec:
  deschedulingIntervalSeconds: 300
  logLevel: Normal
  managementState: Managed
  operatorLogLevel: Normal
  profiles:
  - AffinityAndTaints
status:
  conditions:
  - lastTransitionTime: "2021-08-25T19:20:15Z"
    status: "False"
    type: TargetConfigControllerDegraded
  generations:
  - group: apps
    hash: ""
    lastGeneration: 1
    name: cluster
    namespace: openshift-kube-descheduler-operator
    resource: deployments
  readyReplicas: 0



 % oc get configmap cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    ignorePvcPods: true
    kind: DeschedulerPolicy
    strategies:
      RemovePodsViolatingInterPodAntiAffinity:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:
            exclude:
            - kube-system
            - openshift-....
            include: null
          thresholdPriority: null
          thresholdPriorityClassName: ""
      RemovePodsViolatingNodeAffinity:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:
            exclude:
            - kube-system
            - openshift-.....
            include: null
          thresholdPriority: null
          thresholdPriorityClassName: ""
kind: ConfigMap
metadata:
  creationTimestamp: "2021-08-25T19:20:15Z"
  name: cluster
  namespace: openshift-kube-descheduler-operator
  ownerReferences:
  - apiVersion: v1
    kind: KubeDescheduler
    name: cluster
    uid: acb855e3-ad47-4381-9f34-f1d3b94a5141
  resourceVersion: "171084"
  uid: c1cc9ffb-4f6d-49c3-9a77-10dce53fc725

Comment 1 Mike Dame 2021-09-01 15:25:27 UTC
The problem is that we made `ignorePvcPods=true` by default [1][2]. So the current default behavior in OCP is to not evict PVC pods.

In that case I think we made a logical mistake in introducing this profile as "DoNotEvict..", since that implies the pods will be evicted by default. So we have some options:
1. Enable this profile by default in the operator (to preserve default behavior)
2. Add another profile "EvictPodsWithPVC" that sets `ignorePvcPods=false`, then deprecate "DoNotEvict.." profile. This inverts the upstream default behavior, but preserves ours
3. Remove the change from [2] that enables this by default. This is a change in the current behavior

@Maciej which do you think would be best from a product standpoint?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1911782
[2] https://github.com/openshift/cluster-kube-descheduler-operator/pull/168

Comment 2 Mike Dame 2021-09-01 15:54:10 UTC
I opened https://github.com/openshift/cluster-kube-descheduler-operator/pull/217 to go with option 2. We don't need to deprecate the current profile since it hasn't been released yet, so we can instead pivot it to disable the default behavior

Comment 3 Maciej Szulik 2021-09-02 14:00:59 UTC
I like option 2, although your PR only replaces the profile, without deprecation, am I right?

Comment 4 Mike Dame 2021-09-02 15:27:17 UTC
@Maciej, yeah, since we haven't released the new profile yet we don't need to deprecate it, we can just switch it

Comment 7 RamaKasturi 2021-09-03 05:13:25 UTC
@mikedame one question related to the bug here, it means there will be lot of pod evictions right since a customer env might have lot of pods created with pvcs ? Would not this be a problem when 4.9 release as this alters the previous config from 4.8 version of descheduler.

Comment 8 Mike Dame 2021-09-03 15:28:13 UTC
This maintains the current behavior of the descheduler. Pods with PVCs will not be evicted by default. Users can enable this option to make PVC pods eligible for eviction, though they will still need to meet the criteria of one of the other profiles first.

Comment 9 RamaKasturi 2021-09-06 13:07:00 UTC
Hello mike,

   I tried to verify the bug here but we are hitting back the bug [1] which we fixed some time back. Is this expected? Could you please help confirm ? 

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1911782

Thanks
kasturi

Comment 11 Mike Dame 2021-09-07 13:30:00 UTC
@Rama

spoke offline, but for reference here too: the EvictPodsWithPVC profile now *does* evict pods that have a PVC (whether or not that PVC is using local storage does not matter)

The default behavior for the operator is to *not* evict them. Setting this profile allows users to toggle this option now. Hope that clears it up

Comment 12 RamaKasturi 2021-09-07 13:33:26 UTC
Thanks Mike, that helps !!

Verified bug with the build below and i see that when EvictPodsWithPVC profile is set i see that pods which are created with pvcs are evicted even if the pvc is being created by local storage. By default pods created with pvc does not get evicted.

[knarra@knarra ~]$ oc get csv -n openshift-kube-descheduler-operator
NAME                                                DISPLAY                            VERSION              REPLACES                          PHASE
clusterkubedescheduleroperator.4.9.0-202109030207   Kube Descheduler Operator          4.9.0-202109030207                                     Succeeded

[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-09-06-004132   True        False         29h     Cluster version is 4.9.0-0.nightly-2021-09-06-004132

configmap of kubedescheduler cluster when EvictPodsWithPVC & EvictPodsWithLocalStorage are set.

[knarra@knarra ~]$ oc get configmap cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    ignorePvcPods: true
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:
            exclude:


When set EvictPodsWithPVC below is the yaml file:

[knarra@knarra ~]$ oc get configmap cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    ignorePvcPods: false
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:

When both EvictpodsWithPVC & EvictPodsWithLocalStorage is present

[knarra@knarra ~]$ oc get configmap cluster -n openshift-kube-descheduler-operator -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    evictLocalStoragePods: true
    ignorePvcPods: false
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true
        params:
          includeSoftConstraints: false
          namespaces:
            exclude:
            - kube-system


Based on the above moving the bug to verified state.

Comment 14 errata-xmlrpc 2021-10-18 17:49:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759


Note You need to log in before you can comment on or make changes to this bug.