Bug 1877319

Summary: Descheduler can't evict a pod that violates the RemovePodsViolatingInterPodAntiAffinity strategy against a static pod
Product: OpenShift Container Platform
Reporter: zhou ying <yinzhou>
Component: kube-scheduler
Assignee: Mike Dame <mdame>
Status: CLOSED ERRATA
QA Contact: zhou ying <yinzhou>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.6
CC: aos-bugs, mfojtik
Target Milestone: ---
Keywords: Reopened
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 16:39:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description zhou ying 2020-09-09 11:50:23 UTC
Description of problem:
The descheduler can't evict a pod that violates the RemovePodsViolatingInterPodAntiAffinity strategy against a static pod.

Version-Release number of selected component (if applicable):
[root@dhcp-140-138 ~]# oc get csv
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.6.0-202009041839.p0   Kube Descheduler Operator   4.6.0-202009041839.p0              Succeeded

How reproducible:
Always

Steps to Reproduce:
1. Set the descheduler strategy as follows:
  strategies:
  - name: RemovePodsViolatingInterPodAntiAffinity
2. Create a static pod as follows:
[root@zy09096-7kmqr-compute-0 manifests]# cat static-web.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: static-web
  namespace: zhouyt1
  labels:
    role: myrole
spec:
  priorityClassName: system-node-critical
  containers:
    - name: web
      image: nginx
      ports:
        - name: web
          containerPort: 80
          protocol: TCP

3. Create a deployment as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: test
  name: test
spec:
  replicas: 6
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: key
                operator: In
                values:
                - value
            topologyKey: kubernetes.io/hostname
      containers:
      - image: openshift/hello-openshift
        name: hello-openshift
4. Make sure that some pods belonging to the deployment are running on the same node as the static pod:
[zhouying@dhcp-140-138 ~]$ oc get po -o wide
NAME                                 READY   STATUS              RESTARTS   AGE   IP            NODE                      NOMINATED NODE   READINESS GATES
static-web-zy09096-7kmqr-compute-0   1/1     Running             0          26m   10.129.2.64   zy09096-7kmqr-compute-0   <none>           <none>
test-76d4c6bbd6-qtks7                0/1     ContainerCreating   0          5s    <none>        zy09096-7kmqr-compute-0   <none>           <none>
...

5. Add a label to the static pod:
   `oc label pod/static-web-zy09096-7kmqr-compute-0 key=value`

6. Check the descheduler pod logs.
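
The co-location check in step 4 can also be done programmatically. The following is a minimal Python sketch, assuming the `oc get po -o wide` text layout shown above (NAME in the first column, NODE in the seventh, no spaces inside a column value). The helper names `pods_by_node` and `colocated_with` are hypothetical and not part of any tool.

```python
# Sample copied from the `oc get po -o wide` output in step 4.
SAMPLE = """\
NAME                                 READY   STATUS              RESTARTS   AGE   IP            NODE                      NOMINATED NODE   READINESS GATES
static-web-zy09096-7kmqr-compute-0   1/1     Running             0          26m   10.129.2.64   zy09096-7kmqr-compute-0   <none>           <none>
test-76d4c6bbd6-qtks7                0/1     ContainerCreating   0          5s    <none>        zy09096-7kmqr-compute-0   <none>           <none>
...
"""


def pods_by_node(wide_output):
    """Group pod names by node from `oc get po -o wide` text output."""
    nodes = {}
    for line in wide_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and any line too short to hold a NODE column.
        if len(fields) < 7 or fields[0] == "NAME":
            continue
        name, node = fields[0], fields[6]
        nodes.setdefault(node, []).append(name)
    return nodes


def colocated_with(wide_output, static_pod):
    """Return the other pods scheduled on the same node as static_pod."""
    for pods in pods_by_node(wide_output).values():
        if static_pod in pods:
            return [p for p in pods if p != static_pod]
    return []


if __name__ == "__main__":
    print(colocated_with(SAMPLE, "static-web-zy09096-7kmqr-compute-0"))
```

A non-empty result means at least one deployment pod shares the node with the static pod, which is the precondition for step 5.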

Actual results:
6. The descheduler pod logs show: 
I0909 11:43:31.608359       1 evictions.go:230] Pod static-web-zy09096-7kmqr-compute-0 in namespace zhouyt1 is not evictable: Pod lacks an eviction annotation and fails the following checks: [pod is critical, pod is a mirror pod, pod has higher priority than specified priority class threshold]

Expected results:
6. The descheduler should evict the pod test-76d4c6bbd6-qtks7, which violates the RemovePodsViolatingInterPodAntiAffinity strategy.
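
The expected behavior can be modeled with a small Python sketch. This is a simplified illustration, not the descheduler's code: the `Pod` class and the functions `violates_anti_affinity` and `pods_to_evict` are hypothetical, the anti-affinity rule is reduced to a label key with forbidden values, the node is the only topology domain, and the short node names and the third pod are invented for the example. The point it illustrates is that a non-evictable pod (here the static pod) should still count when checking whether another pod violates its anti-affinity rule.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    node: str
    labels: dict = field(default_factory=dict)
    # Simplified required anti-affinity: label key -> set of forbidden values.
    anti_affinity: dict = field(default_factory=dict)
    evictable: bool = True


def violates_anti_affinity(pod, pods):
    """True if another pod on the same node carries a label that `pod` forbids."""
    for other in pods:
        if other is pod or other.node != pod.node:
            continue
        for key, values in pod.anti_affinity.items():
            if other.labels.get(key) in values:
                return True
    return False


def pods_to_evict(pods):
    """Evictability limits the eviction candidates, not the pods they are checked against."""
    return [p.name for p in pods if p.evictable and violates_anti_affinity(p, pods)]


# Mirrors the `key In [value]` rule from the deployment in step 3.
RULE = {"key": {"value"}}
PODS = [
    # Static pod after step 5: labeled key=value, never evictable.
    Pod("static-web-zy09096-7kmqr-compute-0", "compute-0",
        {"role": "myrole", "key": "value"}, evictable=False),
    Pod("test-76d4c6bbd6-qtks7", "compute-0", {"app": "test"}, RULE),
    # Hypothetical replica on another node; it violates nothing.
    Pod("test-on-other-node", "compute-1", {"app": "test"}, RULE),
]

if __name__ == "__main__":
    print(pods_to_evict(PODS))
```

Under these assumptions only the deployment pod that shares a node with the labeled static pod is selected, and the static pod itself is never a candidate.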

Additional info:

Comment 3 Mike Dame 2020-09-09 13:06:19 UTC
The descheduler does not evict static pods (or any pod without an ownerref), see https://github.com/kubernetes-sigs/descheduler/#pod-evictions. In addition,

> pod is critical, pod is a mirror pod, pod has higher priority than specified priority class threshold

These also indicate why the pod was not evicted.
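
For illustration, the three checks named in that log line can be sketched in Python. This is a simplified model under stated assumptions, not the descheduler's implementation: the function `failed_checks` and the flat `pod` dictionary are hypothetical, only the three checks from the log message are modeled (others, such as the owner reference check, are omitted), and the exact comparison rules are an assumption. The annotation keys and the system-critical priority floor of 2000000000 are the standard Kubernetes and descheduler values.

```python
SYSTEM_CRITICAL_PRIORITY = 2_000_000_000  # floor for system-critical priority classes
MIRROR_ANNOTATION = "kubernetes.io/config.mirror"  # set by the kubelet on mirror pods
EVICT_ANNOTATION = "descheduler.alpha.kubernetes.io/evict"  # opt-in eviction override


def failed_checks(pod, priority_threshold):
    """Return the reasons a pod is not evictable (empty list means evictable).

    `pod` is a hypothetical flat dict with optional "priority" and
    "annotations" keys, not a real Kubernetes API object.
    """
    annotations = pod.get("annotations", {})
    # A pod carrying the eviction annotation bypasses the checks below.
    if EVICT_ANNOTATION in annotations:
        return []
    failures = []
    priority = pod.get("priority", 0)
    if priority >= SYSTEM_CRITICAL_PRIORITY:
        failures.append("pod is critical")
    if MIRROR_ANNOTATION in annotations:
        failures.append("pod is a mirror pod")
    if priority >= priority_threshold:
        failures.append("pod has higher priority than specified priority class threshold")
    return failures


# The static pod from the report: system-node-critical resolves to 2000001000.
STATIC_WEB = {
    "priority": 2_000_001_000,
    "annotations": {MIRROR_ANNOTATION: "placeholder-hash"},
}

if __name__ == "__main__":
    for reason in failed_checks(STATIC_WEB, SYSTEM_CRITICAL_PRIORITY):
        print(reason)
```

With the threshold assumed to equal the system-critical floor, the static pod fails all three checks, matching the reasons listed in the log line.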

Comment 4 Mike Dame 2020-09-09 13:10:47 UTC
Sorry, I misread and see that you're referring to the other pod which should be evicted (not the static pod). My mistake; re-opening this to look further.

Comment 5 Mike Dame 2020-09-09 13:24:52 UTC
Upstream PR opened at https://github.com/kubernetes-sigs/descheduler/pull/395. Again, sorry for the misunderstanding!

Comment 9 zhou ying 2020-09-16 10:25:04 UTC
Confirmed with the latest version; the issue has been fixed:

[root@dhcp-140-138 ~]# oc get csv
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.6.0-202009152100.p0   Kube Descheduler Operator   4.6.0-202009152100.p0              Succeeded


[zhouying@dhcp-140-138 ~]$ oc get po -o wide
NAME                                                       READY   STATUS              RESTARTS   AGE   IP            NODE                                            NOMINATED NODE   READINESS GATES
static-web-yinzhou-manu-ks7tr-worker-usgovvirginia-blsr8   1/1     Running             0          55s   10.131.0.16   yinzhou-manu-ks7tr-worker-usgovvirginia-blsr8   <none>           <none>
test-76d4c6bbd6-95zb7                                      0/1     ContainerCreating   0          5s    <none>        yinzhou-manu-ks7tr-worker-usgovvirginia-rkm7h   <none>           <none>
test-76d4c6bbd6-bdqdp                                      0/1     ContainerCreating   0          5s    <none>        yinzhou-manu-ks7tr-worker-usgovvirginia-blsr8   <none>           <none>
test-76d4c6bbd6-f7hc9                                      0/1     ContainerCreating   0          5s    <none>        yinzhou-manu-ks7tr-worker-usgovvirginia-rkm7h   <none>           <none>
test-76d4c6bbd6-tf4rx                                      0/1     ContainerCreating   0          4s    <none>        yinzhou-manu-ks7tr-worker-usgovvirginia-blsr8   <none>           <none>
test-76d4c6bbd6-ttw2f                                      0/1     ContainerCreating   0          5s    <none>        yinzhou-manu-ks7tr-worker-usgovvirginia-vxw9c   <none>           <none>
test-76d4c6bbd6-xs7cf                                      0/1     ContainerCreating   0          5s    <none>        yinzhou-manu-ks7tr-worker-usgovvirginia-vxw9c   <none>           <none>


See the logs from the descheduler operator:

I0916 10:07:23.680087       1 evictions.go:117] Evicted pod: "test-76d4c6bbd6-tf4rx" in namespace "zhouyt1" (InterPodAntiAffinity)
I0916 10:07:23.680432       1 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"zhouyt1", Name:"test-76d4c6bbd6-tf4rx", UID:"c93319bd-9ed0-49eb-b90f-e106bd74bc73", APIVersion:"v1", ResourceVersion:"72149", FieldPath:""}): type: 'Normal' reason: 'Descheduled' pod evicted by sigs.k8s.io/descheduler (InterPodAntiAffinity)
I0916 10:07:23.772011       1 evictions.go:117] Evicted pod: "test-76d4c6bbd6-bdqdp" in namespace "zhouyt1" (InterPodAntiAffinity)
I0916 10:07:23.772133       1 pod_antiaffinity.go:72] "Processing node" node="yinzhou-manu-ks7tr-worker-usgovvirginia-rkm7h"
I0916 10:07:23.772411       1 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"zhouyt1", Name:"test-76d4c6bbd6-bdqdp", UID:"3e55429e-1cdc-4a1d-b9c7-6c29825294fa", APIVersion:"v1", ResourceVersion:"72152", FieldPath:""}): type: 'Normal' reason: 'Descheduled' pod evicted by sigs.k8s.io/descheduler (InterPodAntiAffinity)
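
When verifying a fix like this, the eviction lines can be pulled out of the log text with a short script. The sketch below assumes the `Evicted pod: "<name>" in namespace "<ns>" (<strategy>)` message format seen in the lines above, which may differ in other descheduler versions. The function name `evicted_pods` is hypothetical.

```python
import re

# Matches the evictions.go message format shown in the log excerpt above.
EVICTED = re.compile(
    r'Evicted pod: "(?P<pod>[^"]+)" in namespace "(?P<ns>[^"]+)" \((?P<strategy>[^)]+)\)'
)

# Two lines copied from the log excerpt above, plus the "Processing node" line.
LOG = '''\
I0916 10:07:23.680087       1 evictions.go:117] Evicted pod: "test-76d4c6bbd6-tf4rx" in namespace "zhouyt1" (InterPodAntiAffinity)
I0916 10:07:23.772011       1 evictions.go:117] Evicted pod: "test-76d4c6bbd6-bdqdp" in namespace "zhouyt1" (InterPodAntiAffinity)
I0916 10:07:23.772133       1 pod_antiaffinity.go:72] "Processing node" node="yinzhou-manu-ks7tr-worker-usgovvirginia-rkm7h"
'''


def evicted_pods(log_text):
    """Return (namespace, pod, strategy) tuples for each eviction log line."""
    return [(m["ns"], m["pod"], m["strategy"]) for m in EVICTED.finditer(log_text)]


if __name__ == "__main__":
    for namespace, pod, strategy in evicted_pods(LOG):
        print(namespace, pod, strategy)
```

Both evicted pods are the deployment replicas that ran on the same node as the static pod in the `oc get po -o wide` output above.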

Comment 11 errata-xmlrpc 2020-10-27 16:39:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196