Bug 1333129 - Cannot scale up a pod while a deployment is not completed
Summary: Cannot scale up a pod while a deployment is not completed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-controller-manager
Version: 3.2.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.2.1
Assignee: Michail Kargakis
QA Contact: zhou ying
URL:
Whiteboard:
Duplicates: 1306720 1353834
Depends On:
Blocks:
 
Reported: 2016-05-04 17:38 UTC by Cesar Wong
Modified: 2016-09-27 09:32 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-27 09:32:06 UTC
Target Upstream Version:


Attachments (Terms of Use)
scale request (314.34 KB, image/png), 2016-05-04 17:38 UTC, Cesar Wong
event log (246.06 KB, image/png), 2016-05-04 18:42 UTC, Cesar Wong
scaling (2.83 MB, image/jpeg), 2016-05-31 03:52 UTC, zhou ying


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1933 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.3 Release Advisory 2016-09-27 13:24:36 UTC

Description Cesar Wong 2016-05-04 17:38:25 UTC
Created attachment 1153959 [details]
scale request

Description of problem:

After a deployment has created a new pod, and deleted the old pod, clicking on the 
up arrow to scale up the pod results in a message saying 'Scaling up...' but nothing
happens.

Version-Release number of selected component (if applicable):
3.2

How reproducible:
Always

Steps to Reproduce:
1. create an app with new-app: 'oc new-app https://github.com/csrwng/simple-ruby.git'
2. after the build completes, and an initial deployment has happened, start a new
   build.
3. on the overview page, wait for the new deployment to create the new pod, and delete the previous pod.
4. immediately after the previous pod disappears, click on the up arrow to scale up.

Actual results:

The pod says 'Scaling...' but nothing happens.

Expected results:

The pod scales up successfully.

Additional info:

Comment 1 Cesar Wong 2016-05-04 18:42:38 UTC
Created attachment 1153982 [details]
event log

Comment 2 Jessica Forrester 2016-05-04 19:05:37 UTC
This looks like a race condition in the deployment controller; the UI is just updating the scale resource on the DC. The key seems to be the timing: if you scale up right after the old deployment's pod disappears, the new deployment has already scaled up, but the deployment config still reports the deployment as "in progress".

Comment 3 zhou ying 2016-05-05 06:24:34 UTC
We can reproduce by command:
`oc deploy simple-ruby --latest; oc scale dc/simple-ruby --replicas=5`

Comment 4 Michail Kargakis 2016-05-05 10:04:14 UTC
That happens for two reasons: 1) the deployment runs in a separate process with the desired replica size fixed, so the deployment needs to complete before it can be scaled, and 2) even after the deployment process finishes and the pods should be scaled up to dc.spec.replicas, we have hacked the controller to restore dc.spec.replicas back to rc.spec.replicas, because we need to support older clients that scale a deploymentconfig directly. For now, do not try to scale the DC while a deployment is in flight; set the replica count before the deployment starts or after it completes.
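The interaction described in (2) can be modeled with a tiny sketch. This is not Origin code; the classes and the reconcile function are hypothetical stand-ins that only illustrate why a scale issued mid-deployment is lost: the controller copies the RC's (fixed) replica count back onto the DC, overwriting the user's change.

```python
# Hypothetical model of the behavior described above (NOT actual
# openshift/origin controller code; names are invented for illustration).

class DeploymentConfig:
    def __init__(self, replicas):
        self.spec_replicas = replicas

class ReplicationController:
    def __init__(self, replicas):
        self.spec_replicas = replicas

def reconcile_in_flight(dc, rc):
    # The "hack" from comment 4: while a deployment is in flight, the
    # controller restores dc.spec.replicas from the RC's replica count.
    dc.spec_replicas = rc.spec_replicas

dc = DeploymentConfig(replicas=1)
rc = ReplicationController(replicas=1)  # new deployment pinned at 1 replica

dc.spec_replicas = 5       # user scales the DC mid-deployment
reconcile_in_flight(dc, rc)  # controller overwrites the change

print(dc.spec_replicas)    # 1 -- the scale request is lost
```

This is why the workaround is to set the replica count before the deployment starts or after it completes: outside the in-flight window, the reconcile step no longer clobbers dc.spec.replicas.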

Comment 5 Samuel Padgett 2016-05-05 13:58:49 UTC
I'll disable the scaling controls during a deployment.

Comment 6 Michail Kargakis 2016-05-05 14:30:53 UTC
Thanks Sam!

Comment 7 Samuel Padgett 2016-05-05 17:07:28 UTC
https://github.com/openshift/origin/pull/8761

Comment 8 Samuel Padgett 2016-05-05 18:07:22 UTC
Pull request in the origin/master merge queue.

Comment 10 zhou ying 2016-05-31 03:51:19 UTC
Confirmed with AMI devenv-rhel7_4294. While the deployment is in flight, the scale controls are disabled. However, if I scale up immediately after the deployment completes and the controls are re-enabled, the pod says "scaling to x...", and even after waiting a long time the scale does not succeed.
Please see the attachments.
openshift v1.3.0-alpha.1-41-g681170a
kubernetes v1.3.0-alpha.1-331-g0522e63
etcd 2.3.0

Comment 11 zhou ying 2016-05-31 03:52:48 UTC
Created attachment 1163019 [details]
scaling

Comment 12 Samuel Padgett 2016-05-31 12:23:32 UTC
(In reply to zhou ying from comment #10)
> Confirmed with ami devenv-rhel7_4294, When the deployment in-flight, the
> scale was disabled, but when the deployment completed, and the scale enable
> immediately scale up, will meet : the pod saying:scaling to x ..., but wait
> for a long time , the scale not succeed.

There are several reasons this could happen, and it might not be a bug. Can you check that you're not at your pod quota, and browse the events page to see if there are any warnings?

Comment 13 Samuel Padgett 2016-07-05 13:08:42 UTC
yinzhou@redhat.com Any update? Do you still see the problem?

Comment 14 zhou ying 2016-07-06 06:32:16 UTC
Confirmed with AMI devenv-rhel7_4530; I can no longer reproduce this issue in the browser.

But by command:

[root@ip-172-18-2-106 amd64]# oc get po
NAME              READY     STATUS      RESTARTS   AGE
ruby-ex-1-build   0/1       Completed   0          7m
ruby-ex-3-hzkj1   1/1       Running     0          2m
ruby-ex-3-qpgrg   1/1       Running     0          2m
ruby-ex-3-unfuj   1/1       Running     0          2m
[root@ip-172-18-2-106 amd64]# oc deploy ruby-ex --latest ; oc scale dc/ruby-ex --replicas=5
Started deployment #4
Use 'oc logs -f dc/ruby-ex' to track its progress.
deploymentconfig "ruby-ex" scaled

[root@ip-172-18-2-106 amd64]# oc get po
NAME              READY     STATUS      RESTARTS   AGE
ruby-ex-1-build   0/1       Completed   0          12m
ruby-ex-4-26fsh   1/1       Running     0          4m
ruby-ex-4-hn8o8   1/1       Running     0          4m
ruby-ex-4-jiigp   1/1       Running     0          4m

Comment 15 Samuel Padgett 2016-07-06 13:01:19 UTC
Michail, Dan (Mace), do we want to guard against this problem when scaling with the CLI?

Comment 16 Samuel Padgett 2016-07-18 14:23:09 UTC
Reassigning since the web console side is fixed. See comment #14.

Comment 17 Michail Kargakis 2016-07-20 08:46:28 UTC
*** Bug 1353834 has been marked as a duplicate of this bug. ***

Comment 18 Michail Kargakis 2016-07-20 08:53:29 UTC
*** Bug 1306720 has been marked as a duplicate of this bug. ***

Comment 19 Michal Fojtik 2016-07-20 11:17:19 UTC
We will probably just display a warning to users in the CLI; after talking with Michalis, we don't want to prevent them from scaling.

Comment 20 Michal Fojtik 2016-08-11 07:39:27 UTC
Cesar, Sam: I don't think we can show a warning in the CLI in a reasonable way, as `oc scale` comes from upstream. We would have to create a 'smarter' wrapper that checks the state of the DC, and I'm not 100% convinced that refactor is worth doing just to gain the warning.

I'm in favor of closing this as the UI portion is now fixed. WDYT?
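For reference, the 'smarter' wrapper idea could look roughly like the sketch below. This was never implemented; the wrapper functions are invented for illustration. It assumes the deployer-managed RCs carry the `openshift.io/deployment-config.name` label and the `openshift.io/deployment.phase` annotation (New/Pending/Running while in flight), which is how OpenShift 3.x marks them.

```python
# Hypothetical sketch of a warning wrapper around `oc scale` (never
# shipped). Function names are invented; the label/annotation names are
# the ones OpenShift 3.x sets on deployer-managed RCs.
import json
import subprocess

IN_FLIGHT_PHASES = {"New", "Pending", "Running"}

def any_in_flight(phases):
    """Pure check: is any deployment phase still in progress?"""
    return any(p in IN_FLIGHT_PHASES for p in phases)

def dc_in_flight(name):
    """Query `oc` for the DC's RCs and inspect their deployment phase."""
    out = subprocess.check_output(
        ["oc", "get", "rc",
         "-l", "openshift.io/deployment-config.name=" + name,
         "-o", "json"])
    rcs = json.loads(out)["items"]
    phases = [rc["metadata"]["annotations"].get("openshift.io/deployment.phase")
              for rc in rcs]
    return any_in_flight(phases)

def scale_with_warning(name, replicas):
    """Warn, but do not block, if a deployment is in flight (per comment 19)."""
    if dc_in_flight(name):
        print("warning: a deployment is in flight; "
              "the scale may be overwritten")
    subprocess.check_call(
        ["oc", "scale", "dc/" + name, "--replicas=" + str(replicas)])
```

As comment 20 notes, carrying a wrapper like this downstream would mean diverging from the upstream `oc scale`, which is why the idea was dropped in favor of closing the bug with the web console fix.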

Comment 21 Cesar Wong 2016-08-11 13:52:33 UTC
Michal, I'm ok with closing it as well.

Comment 22 Michal Fojtik 2016-08-11 13:55:45 UTC
Setting ON_QA so QA can close this.

Comment 23 zhou ying 2016-08-12 03:26:04 UTC
Confirmed with the latest 3.3 env; the issue is fixed in the browser.
openshift version
openshift v3.3.0.18
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

In the browser, the scale-up arrow is disabled while a deployment is in progress.

Comment 25 errata-xmlrpc 2016-09-27 09:32:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933

