Bug 1369314 - Dev preview scale up button doesn't always work [NEEDINFO]
Summary: Dev preview scale up button doesn't always work
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Deployments
Version: 3.x
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Michail Kargakis
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-23 06:34 UTC by bugreport398
Modified: 2017-02-16 22:12 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-16 22:12:39 UTC
Target Upstream Version:
mkargaki: needinfo? (mfojtik)


Attachments
video of clicking scale up after git push triggered build and deploy, then navigating away and back, then clicking scale up again. (170.43 KB, application/octet-stream)
2016-08-23 06:34 UTC, bugreport398

Description bugreport398 2016-08-23 06:34:24 UTC
Created attachment 1193156 [details]
video of clicking scale up after git push triggered build and deploy, then navigating away and back, then clicking scale up again.

Description of problem:

Possible bug in dev preview interface (Firefox 48.0 for Ubuntu).

The Scale up button doesn't work in certain circumstances.

Version-Release number of selected component (if applicable):

Dev Preview

How reproducible:  

I don't know.  

Steps to Reproduce:

1. Start with a Python 2.7 application.

2. Scale down the pod to 0 through the interface.

3. Do a git push.

4. Wait until the build has finished, then click the Dismiss button on the notification.

5. Wait until the deployment finishes.

6. Click the Scale up button (it animates as if it is scaling up, but it isn't actually scaling).

7. If you navigate to Home and then click on the project name, you will see that no scaling progress is displayed and the pod count is still 0.

8. Clicking the Scale up button again makes the pod scale up.

So perhaps the on-click event is not working as desired after the build has finished?
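
For anyone retracing steps 6-8, a minimal CLI check of what the button should have changed (the project and app names below are placeholders, not from the report):

# confirm whether the deployment config actually scaled
oc get dc myapp -n myproject -o jsonpath='{.spec.replicas} {.status.replicas}{"\n"}'
# and whether any pod came up (assuming the standard deploymentconfig label is present)
oc get pods -n myproject -l deploymentconfig=myapp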

Actual results:

The pod does not scale up.  

Expected results:

The pod scales up.  

Additional info:

Comment 1 Samuel Padgett 2016-08-23 14:33:30 UTC
I'm able to reproduce. I see the web console making the scale request, and there are no errors from the API server. I don't believe it's a web console bug. Changing the component to deployments.

PUT /oapi/v1/namespaces/sgp/deploymentconfigs/ldp-js/scale

{  
   "apiVersion":"extensions/v1beta1",
   "kind":"Scale",
   "metadata":{  
      "name":"ldp-js",
      "namespace":"sgp",
      "creationTimestamp":"2016-08-08T13:25:31Z"
   },
   "spec":{  
      "replicas":1
   }
}
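
For reference, a roughly equivalent scale from the CLI (a sketch using the names in the payload above, not part of the original comment):

oc scale dc/ldp-js --replicas=1 -n sgp
# read back the resulting replica count
oc get dc ldp-js -n sgp -o jsonpath='{.spec.replicas}{"\n"}'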

Comment 2 Michal Fojtik 2016-08-24 16:00:50 UTC
Does the DC have ConfigChange trigger?
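
A quick way to answer this from the CLI (a sketch using the DC name from comment 1, not part of the original comment):

oc get dc ldp-js -n sgp -o jsonpath='{.spec.triggers[*].type}{"\n"}'
# prints e.g. "ConfigChange ImageChange" when both triggers are configured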

Comment 3 Samuel Padgett 2016-08-24 16:25:43 UTC
Michal, the one I tested did.

Comment 4 Michail Kargakis 2016-08-25 09:10:23 UTC
Triggers have nothing to do with scaling. We haven't changed anything in this area since 3.1. I tried to reproduce, but with no luck. Sam, can you post the output of `oc get rc -o yaml` when you hit this?

Comment 5 Michal Fojtik 2016-08-25 13:14:45 UTC
I'm also not able to reproduce this on latest master. Moving this off the blocker list; we can consider backporting a potential fix to 3.2.

Comment 6 Samuel Padgett 2016-08-25 13:32:08 UTC
Michail, I saw a conflict updating the DC in the events after reproducing yesterday (even though the scale request succeeded). Have you tried on 3.2? It only happens sometimes. When I looked at the RC, both spec and status replicas were 0.

I'll try to reproduce today.

Comment 7 Samuel Padgett 2016-08-25 13:50:47 UTC
Here is the event:

Cannot update deployment sgp/ldp-js-31 status to Pending: replicationcontrollers "ldp-js-31" cannot be updated: the object has been modified; please apply your changes to the latest version and try again
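
For anyone retracing this, the event can be pulled with something like the following (namespace and RC name as above):

oc get events -n sgp --sort-by=.lastTimestamp | grep ldp-js-31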

Comment 9 Michail Kargakis 2016-08-26 12:25:13 UTC
The update conflict is not a problem. Are you using the scale subresource of the deployment config or the replication controller in the UI? If the former, which should be the correct thing to do, then there is a place in the deploymentconfig controller where we mutate replicas back to the replicas of the replication controller for backwards-compatibility reasons. It may be that we use that path for some requests, which would be totally wrong.
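
To make the distinction concrete, a hedged sketch of the two paths (names from comment 1, not part of the original comment):

# through the deployment config's scale subresource -- the path the console request above uses
oc scale dc/ldp-js --replicas=1 -n sgp
# directly against the backing replication controller, bypassing the DC
oc scale rc/ldp-js-31 --replicas=1 -n sgp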

Comment 10 Samuel Padgett 2016-08-26 17:37:45 UTC
Michail, we're using the scale subresource. See the request above in comment #1.

Comment 11 Michail Kargakis 2016-08-29 14:09:10 UTC
I have managed to reproduce this on 3.2 and find the root cause.


I0829 10:26:26.129482   20773 controller.go:226] Synced deploymentConfig "test/ruby-ex" replicas from 1 to 0 based on the latest/active deployment "test/ruby-ex-2" which was scaled directly or has not previously been synced

oc get rc -o yaml | grep deployment.replicas:
openshift.io/deployment.replicas: ""

The deployer pod controller sets an empty openshift.io/deployment.replicas annotation, forcing the deployment config controller to reconcile the deployment config back to the replica count of the replication controller (zero). I'm still not sure why this happens, because I can reproduce it with some deployments (e.g. ruby-ex from `oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git`) but not with others (e.g. the `database` config from the sample-app example).
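
Based on that, a hedged diagnostic sketch (RC and namespace names taken from the log line above; the annotate step is only a possible manual workaround, not a confirmed fix):

# inspect the annotation the deploymentconfig controller keys off
oc get rc ruby-ex-2 -n test -o jsonpath='{.metadata.annotations.openshift\.io/deployment\.replicas}{"\n"}'
# an empty value makes the controller sync the DC back to the RC's replica count (0);
# setting it to the intended count may avoid that reconcile
oc annotate rc ruby-ex-2 -n test openshift.io/deployment.replicas=1 --overwrite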

I cannot reproduce this on 3.3, since we have revamped the deployment controllers and dropped the deployer pod controller. Note that this is not a regression in 3.2 (it has worked like that since 3.1). Do we need a fix for 3.2? When will Online move to 3.3?

Comment 12 Abhishek Gupta 2016-11-01 20:56:29 UTC
Online is now on 3.3.1 in DevPreview production.

Comment 13 Li Zhe 2016-11-02 05:58:14 UTC
Tested on dev-preview-stg (OpenShift Master v3.3.1.3) with Python 2.7 and Ruby 2.2; cannot find the bug anymore.
oc get rc -o yaml
...
      openshift.io/deployer-pod.name: ruby-ex-2-deploy
      openshift.io/deployment-config.latest-version: "2"
      openshift.io/deployment-config.name: ruby-ex
      openshift.io/deployment.phase: Complete
      openshift.io/deployment.replicas: "1"
      openshift.io/deployment.status-reason: caused by an image change

...

Can verify now

