Description of problem:
'oc rollout pause' deletes 'Running' pods in certain situations, depending on the values of replicas, maxSurge, and maxUnavailable. The rule I found is:

- If the number of Running pods is larger than the number of unavailable pods, the Running pods are not deleted by 'oc rollout pause'. For example, with replicas=5, maxSurge=1, maxUnavailable=1, after editing the deployment to use a nonexistent image and pausing, the pods are: 4 Running and 2 ErrImagePull.
- If the number of Running pods is smaller than the number of unavailable pods, the Running pods are deleted by 'oc rollout pause'. For example, with replicas=5, maxSurge=1, maxUnavailable=4, after editing the deployment to use a nonexistent image there are 1 Running and 5 ErrImagePull pods. After running 'oc rollout pause', only the 5 ErrImagePull pods remain; the Running pod has been deleted.

Version-Release number of selected component (if applicable):
openshift v3.6.28
kubernetes v1.5.2+43a9be4
etcd 3.1.0

How reproducible:
Always

Steps to Reproduce:
1. Create the deployment (# oc create -f hello-openshift.yaml) with this file:

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: hello-openshift
    spec:
      replicas: 2
      template:
        metadata:
          labels:
            app: hello-openshift
        spec:
          containers:
          - name: hello-openshift
            image: openshift/hello-openshift
            ports:
            - containerPort: 80
      strategy:
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 1
        type: RollingUpdate

2. After step 1, there are 2 pods in Running status:

    # oc get pods
    NAME                               READY     STATUS    RESTARTS   AGE
    hello-openshift-2049296760-c9d37   1/1       Running   0          6s
    hello-openshift-2049296760-ftpgz   1/1       Running   0          6s

3. Change the image to 'nonexist' in the deployment:

    # oc edit deployment
    spec:
      containers:
      - image: openshift/hello-openshift   ===change to==> nonexist

    [root@host-8-x-x ~]# oc edit deployment
    deployment "hello-openshift" edited
    # oc get pods
    NAME                               READY     STATUS         RESTARTS   AGE
    hello-openshift-2049296760-c9d37   1/1       Running        0          1m
    hello-openshift-2758264543-mr9q8   0/1       ErrImagePull   0          32s
    hello-openshift-2758264543-s3t4b   0/1       ErrImagePull   0          32s

4. Pause the rollout:

    # oc rollout pause deployment/hello-openshift
    deployment "hello-openshift" paused
    # oc get pods
    NAME                               READY     STATUS             RESTARTS   AGE
    hello-openshift-2758264543-mr9q8   0/1       ImagePullBackOff   0          53s
    hello-openshift-2758264543-s3t4b   0/1       ImagePullBackOff   0          53s

   The Running pod 'hello-openshift-2049296760-c9d37' is missing.

5. Resume the deployment. The Running pod still cannot be found:

    # oc rollout resume deployment/hello-openshift
    deployment "hello-openshift" resumed
    [root@host-8-x-x ~]# oc get pods
    NAME                               READY     STATUS             RESTARTS   AGE
    hello-openshift-2758264543-133sf   0/1       ImagePullBackOff   0          14m
    hello-openshift-2758264543-39t5j   0/1       ImagePullBackOff   0          14m
    hello-openshift-2758264543-8hk0j   0/1       ImagePullBackOff   0          14m
    hello-openshift-2758264543-f3h7q   0/1       ImagePullBackOff   0          14m
    hello-openshift-2758264543-rtffw   0/1       ImagePullBackOff   0          14m

Actual results:
'oc rollout pause' deletes Running pods in certain situations (depending on the values of replicas, maxSurge, and maxUnavailable).

Expected results:
'oc rollout pause' should not change the number of Running and unavailable pods.
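The pod counts quoted in the description follow from simple arithmetic on replicas, maxSurge, and maxUnavailable. The sketch below is only an illustration of that arithmetic, not the deployment controller's code. The function name stalled_rollout_counts is hypothetical, and it assumes integer (not percentage) values and that none of the new pods ever becomes available.

```python
def stalled_rollout_counts(replicas, max_surge, max_unavailable):
    """Expected (running_old, failing_new) pod counts for a rolling update
    whose new pods never become available (for example, a bad image).

    Hypothetical helper for illustration; assumes integer strategy values.
    """
    # The rollout must keep at least this many pods available, and only
    # the old Running pods are available when the new image cannot be pulled.
    min_available = replicas - max_unavailable
    # Old and new pods together may not exceed this ceiling.
    max_total = replicas + max_surge
    # Whatever room is left under the ceiling is filled with new pods.
    failing_new = max_total - min_available
    return min_available, failing_new


# The three configurations mentioned in this report.
print(stalled_rollout_counts(5, 1, 1))  # replicas=5, maxSurge=1, maxUnavailable=1
print(stalled_rollout_counts(5, 1, 4))  # replicas=5, maxSurge=1, maxUnavailable=4
print(stalled_rollout_counts(2, 1, 1))  # replicas=2, maxSurge=1, maxUnavailable=1
```

For the three configurations in this report the helper gives 4 Running and 2 failing, 1 Running and 5 failing, and 1 Running and 2 failing, which matches the pod listings observed before 'oc rollout pause'. Pausing is expected to leave these counts unchanged.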
Upstream PR: https://github.com/kubernetes/kubernetes/pull/44439
Verified in OCP 3.6 env:
openshift v3.6.62
kubernetes v1.6.1+5115d708d7
etcd 3.1.0

1. # oc create -f hello-openshift.yaml
   deployment "hello-openshift" created

2. # oc get deployment
   NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
   hello-openshift   2         2         2            0           4s
   # oc get rs
   NAME                         DESIRED   CURRENT   READY     AGE
   hello-openshift-2382698553   2         2         1         7s
   # oc get pods
   NAME                               READY     STATUS    RESTARTS   AGE
   hello-openshift-2382698553-fmlhn   1/1       Running   0          14s
   hello-openshift-2382698553-jllrn   1/1       Running   0          14s

3. # oc edit deployment
   deployment "hello-openshift" edited

4. # oc get pods
   NAME                               READY     STATUS             RESTARTS   AGE
   hello-openshift-1944458656-045bp   0/1       ImagePullBackOff   0          27s
   hello-openshift-1944458656-dk11t   0/1       ImagePullBackOff   0          27s
   hello-openshift-2382698553-jllrn   1/1       Running            0          1m

5. # oc rollout pause deployment/hello-openshift
   deployment "hello-openshift" paused

6. # oc get pods
   NAME                               READY     STATUS             RESTARTS   AGE
   hello-openshift-1944458656-045bp   0/1       ImagePullBackOff   0          49s
   hello-openshift-1944458656-dk11t   0/1       ImagePullBackOff   0          49s
   hello-openshift-2382698553-jllrn   1/1       Running            0          1m

7. # oc rollout resume deployment/hello-openshift
   deployment "hello-openshift" resumed
   # oc get pods
   NAME                               READY     STATUS         RESTARTS   AGE
   hello-openshift-1944458656-045bp   0/1       ErrImagePull   0          1m
   hello-openshift-1944458656-dk11t   0/1       ErrImagePull   0          1m
   hello-openshift-2382698553-jllrn   1/1       Running        0          1m
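The verification above amounts to comparing pod status counts before and after 'oc rollout pause'. A minimal sketch of that comparison, assuming the usual whitespace-separated 'oc get pods' table with STATUS as the third column; the helper name count_by_status is hypothetical and the two listings are copied from steps 4 and 6 of this comment:

```python
from collections import Counter


def count_by_status(oc_get_pods_output):
    """Count pods per STATUS value in the text printed by 'oc get pods'.

    Hypothetical helper; assumes columns NAME READY STATUS RESTARTS AGE.
    """
    counts = Counter()
    for line in oc_get_pods_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a full pod row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        counts[fields[2]] += 1
    return counts


before_pause = """
NAME READY STATUS RESTARTS AGE
hello-openshift-1944458656-045bp 0/1 ImagePullBackOff 0 27s
hello-openshift-1944458656-dk11t 0/1 ImagePullBackOff 0 27s
hello-openshift-2382698553-jllrn 1/1 Running 0 1m
"""

after_pause = """
NAME READY STATUS RESTARTS AGE
hello-openshift-1944458656-045bp 0/1 ImagePullBackOff 0 49s
hello-openshift-1944458656-dk11t 0/1 ImagePullBackOff 0 49s
hello-openshift-2382698553-jllrn 1/1 Running 0 1m
"""

# The fix holds if pausing did not remove any Running pod.
running_preserved = (
    count_by_status(after_pause)["Running"]
    == count_by_status(before_pause)["Running"]
)
print(running_preserved)
```

With the listings from this verification run, the Running count is 1 both before and after the pause, so the check passes.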