1441984 – 'oc rollout pause' operation will erase 'Running' pods

Summary: 'oc rollout pause' operation will erase 'Running' pods

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	openshift-controller-manager
Sub Component:
Version:	3.6.0
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	3.6.z
Assignee:	Michal Fojtik
QA Contact:	zhou ying
Docs Contact:
URL:
Whiteboard:	workloads
Depends On:
Blocks:	1459008

Reported:	2017-04-13 08:53 UTC by ge liu
Modified:	2019-11-21 18:37 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1459008
Environment:
Last Closed:	2019-11-21 18:37:54 UTC
Target Upstream Version:
Embargoed:

Description ge liu 2017-04-13 08:53:24 UTC

Description of problem:
'oc rollout pause' erases 'Running' pods in certain situations, depending on the values of replicas, maxSurge, and maxUnavailable. The rule I found is as follows.

If the number of Running pods is larger than the number of unavailable pods, the Running pods are not erased by 'oc rollout pause'. For example, with replicas=5, maxSurge=1, maxUnavailable=1: after editing the deployment to use a nonexistent image and pausing, there are 4 Running and 2 ErrImagePull pods.

If the number of Running pods is smaller than the number of unavailable pods, the Running pods are erased by 'oc rollout pause'. For example, with replicas=5, maxSurge=1, maxUnavailable=4: after editing the deployment to use a nonexistent image, there are 1 Running and 5 ErrImagePull pods. After running 'oc rollout pause', only the 5 ErrImagePull pods remain; the Running pod has been erased.
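The pod counts quoted above, before the pause, follow from the usual rolling-update bounds: at least replicas - maxUnavailable pods stay available, and at most replicas + maxSurge pods exist in total. The sketch below is only a simplified model of that arithmetic, not the controller's actual code. The function name is hypothetical, and it assumes integer (not percentage) values and new pods that never become ready.

```python
from typing import Tuple


def stuck_rollout_pod_counts(replicas: int, max_surge: int, max_unavailable: int) -> Tuple[int, int]:
    """Return (old_running, new_not_ready) for a rolling update whose
    new pods never become ready (for example, a nonexistent image).

    Simplified model; assumes integer maxSurge / maxUnavailable values.
    """
    # Old pods are scaled down only to the availability floor,
    # because no new pod ever becomes available to replace them.
    old_running = max(replicas - max_unavailable, 0)
    # New pods fill the remaining room under the surge ceiling,
    # but never exceed the desired replica count.
    new_not_ready = min(replicas + max_surge - old_running, replicas)
    return old_running, new_not_ready


# The two examples from the description, plus the reproducer (replicas=2).
print(stuck_rollout_pod_counts(5, 1, 1))  # (4, 2)
print(stuck_rollout_pod_counts(5, 1, 4))  # (1, 5)
print(stuck_rollout_pod_counts(2, 1, 1))  # (1, 2)
```

Under this model, pausing the rollout should leave both numbers unchanged; the bug is that the old Running pods disappear instead.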


Version-Release number of selected component (if applicable):

openshift v3.6.28
kubernetes v1.5.2+43a9be4
etcd 3.1.0

How reproducible:
Always

Steps to Reproduce:
1. Create a deployment (# oc create -f hello-openshift.yaml) from this file:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: hello-openshift
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: hello-openshift
    spec:
      containers:
      - name: hello-openshift
        image: openshift/hello-openshift
        ports:
        - containerPort: 80
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate

2. After step 1, there are 2 pods in Running status:

# oc get pods
NAME                               READY     STATUS    RESTARTS   AGE
hello-openshift-2049296760-c9d37   1/1       Running   0          6s
hello-openshift-2049296760-ftpgz   1/1       Running   0          6s

3. Change the image to 'nonexist' in the deployment:

# oc edit deployment

spec: 
  containers:
    - image: openshift/hello-openshift     ===change to==> nonexist

[root@host-8-x-x ~]# oc edit deployment
deployment "hello-openshift" edited
# oc get pods
NAME                               READY     STATUS         RESTARTS   AGE
hello-openshift-2049296760-c9d37   1/1       Running        0          1m
hello-openshift-2758264543-mr9q8   0/1       ErrImagePull   0          32s
hello-openshift-2758264543-s3t4b   0/1       ErrImagePull   0          32s


4. # oc rollout pause deployment/hello-openshift
deployment "hello-openshift" paused
# oc get pods
NAME                               READY     STATUS             RESTARTS   AGE
hello-openshift-2758264543-mr9q8   0/1       ImagePullBackOff   0          53s
hello-openshift-2758264543-s3t4b   0/1       ImagePullBackOff   0          53s

The Running pod 'hello-openshift-2049296760-c9d37' is gone.

5. Resume the deployment; the Running pod is still missing:

# oc rollout resume deployment/hello-openshift
deployment "hello-openshift" resumed
[root@host-8-x-x ~]# oc get pods
NAME                               READY     STATUS             RESTARTS   AGE
hello-openshift-2758264543-133sf   0/1       ImagePullBackOff   0          14m
hello-openshift-2758264543-39t5j   0/1       ImagePullBackOff   0          14m
hello-openshift-2758264543-8hk0j   0/1       ImagePullBackOff   0          14m
hello-openshift-2758264543-f3h7q   0/1       ImagePullBackOff   0          14m
hello-openshift-2758264543-rtffw   0/1       ImagePullBackOff   0          14m
[root@host-8-x-x~]# 

Actual results:

'oc rollout pause' erases Running pods in certain situations, depending on the values of replicas, maxSurge, and maxUnavailable.


Expected results:

'oc rollout pause' should not change the number of Running and unavailable pods.
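One way to check this expectation mechanically is to compare the set of Running pods in the 'oc get pods' output captured before and after the pause. The following is a minimal sketch with a hypothetical helper name; it assumes the default column layout (NAME, READY, STATUS, ...) and uses the outputs from steps 3 and 4 above as sample input.

```python
def running_pods(oc_get_pods_output: str) -> set:
    """Return the names of pods whose STATUS column is 'Running'.

    Assumes the default 'oc get pods' table layout shown in this report.
    """
    names = set()
    for line in oc_get_pods_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and any line too short to hold a status.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, _ready, status = fields[:3]
        if status == "Running":
            names.add(name)
    return names


before_pause = """
NAME                               READY     STATUS         RESTARTS   AGE
hello-openshift-2049296760-c9d37   1/1       Running        0          1m
hello-openshift-2758264543-mr9q8   0/1       ErrImagePull   0          32s
hello-openshift-2758264543-s3t4b   0/1       ErrImagePull   0          32s
"""

after_pause = """
NAME                               READY     STATUS             RESTARTS   AGE
hello-openshift-2758264543-mr9q8   0/1       ImagePullBackOff   0          53s
hello-openshift-2758264543-s3t4b   0/1       ImagePullBackOff   0          53s
"""

# Pods that were Running before the pause but are not Running afterwards.
lost = running_pods(before_pause) - running_pods(after_pause)
print(lost)  # {'hello-openshift-2049296760-c9d37'}
```

With the fix in place, the same comparison should yield an empty set.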

Comment 1 Wang Haoran 2017-04-13 13:42:36 UTC

Upstream PR: https://github.com/kubernetes/kubernetes/pull/44439

Comment 5 ge liu 2017-05-03 02:34:40 UTC

Verified in OCP 3.6 env:

openshift v3.6.62
kubernetes v1.6.1+5115d708d7
etcd 3.1.0

1. # oc create -f hello-openshift.yaml 
deployment "hello-openshift" created

2. # oc get deployment
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
hello-openshift   2         2         2            0           4s
# oc get rs
NAME                         DESIRED   CURRENT   READY     AGE
hello-openshift-2382698553   2         2         1         7s
# oc get pods
NAME                               READY     STATUS    RESTARTS   AGE
hello-openshift-2382698553-fmlhn   1/1       Running   0          14s
hello-openshift-2382698553-jllrn   1/1       Running   0          14s

3. # oc edit deployment
deployment "hello-openshift" edited

4. # oc get pods
NAME                               READY     STATUS             RESTARTS   AGE
hello-openshift-1944458656-045bp   0/1       ImagePullBackOff   0          27s
hello-openshift-1944458656-dk11t   0/1       ImagePullBackOff   0          27s
hello-openshift-2382698553-jllrn   1/1       Running            0          1m

5. # oc rollout pause deployment/hello-openshift
deployment "hello-openshift" paused

6. # oc get pods
NAME                               READY     STATUS             RESTARTS   AGE
hello-openshift-1944458656-045bp   0/1       ImagePullBackOff   0          49s
hello-openshift-1944458656-dk11t   0/1       ImagePullBackOff   0          49s
hello-openshift-2382698553-jllrn   1/1       Running            0          1m

7. # oc rollout resume deployment/hello-openshift
deployment "hello-openshift" resumed
# oc get pods
NAME                               READY     STATUS         RESTARTS   AGE
hello-openshift-1944458656-045bp   0/1       ErrImagePull   0          1m
hello-openshift-1944458656-dk11t   0/1       ErrImagePull   0          1m
hello-openshift-2382698553-jllrn   1/1       Running        0          1m
