Bug 1323129

Summary: Failed to scale dc after scaling rc
Product: OpenShift Container Platform
Component: openshift-controller-manager
Version: 3.2.0
Reporter: Wang Haoran <haowang>
Assignee: Michal Fojtik <mfojtik>
QA Contact: zhou ying <yinzhou>
CC: aos-bugs, wmeng, xiuwang
Status: CLOSED EOL
Severity: low
Priority: medium
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2019-08-23 12:48:19 UTC

Description Wang Haoran 2016-04-01 10:52:00 UTC
Description of problem:
After scaling the rc and then scaling the dc, the replica count does not change; the dc must be scaled a second time.

Version-Release number of selected component (if applicable):
openshift v3.2.0.9

How reproducible:
Always
Steps to Reproduce:
1. Create a project, then create an app:
   oc new-app openshift/perl:5.20 --code=https://github.com/openshift/sti-perl -l app=test-perl --context-dir=5.20/test/sample-test-app/
2. Scale the rc:
   oc scale rc sti-perl-1 --replicas=2
3. Scale the dc:
   oc scale dc sti-perl --replicas=3
4. Check the replica count:
   oc describe dc sti-perl

Actual results:
[haoran@cheetah tmp.yGvvR8sRWW]$ oc describe rc sti-perl
Name:		sti-perl-1
Namespace:	haowang
Image(s):	172.30.10.62:5000/haowang/sti-perl@sha256:b448ad7c2c31d21444d29d59123ff88b5249c74a46dbb1a318faf65e46183962
Selector:	app=test-perl,deployment=sti-perl-1,deploymentconfig=sti-perl
Labels:		app=test-perl,openshift.io/deployment-config.name=sti-perl
Replicas:	2 current / 2 desired
Pods Status:	2 Running / 0 Waiting / 0 Succeeded / 0 Failed
No volumes.
Events:
  FirstSeen	LastSeen	Count	From				SubobjectPath	Type		Reason			Message
  ---------	--------	-----	----				-------------	--------	------			-------
  7m		7m		1	{replication-controller }			Normal		SuccessfulCreate	Created pod: sti-perl-1-7akcm
  6m		6m		1	{replication-controller }			Normal		SuccessfulCreate	Created pod: sti-perl-1-o2z0n
  3m		3m		1	{replication-controller }			Normal		SuccessfulCreate	Created pod: sti-perl-1-lngax
  2m		2m		1	{replication-controller }			Normal		SuccessfulCreate	Created pod: sti-perl-1-5lb74
  1m		1m		1	{replication-controller }			Normal		SuccessfulDelete	Deleted pod: sti-perl-1-5lb74
  1m		1m		1	{replication-controller }			Normal		SuccessfulDelete	Deleted pod: sti-perl-1-lngax



Expected results:
The replica count should be 3.

Additional info:
Running step 3 a second time restores the expected replica count.

Comment 1 Solly Ross 2016-04-01 16:00:12 UTC
This appears to be a timing issue: when the RC's replica count is updated, the replica count in the DC is not synced for a while. When the DC's replica count is then updated, the DC controller sees the differing replica counts, assumes that the RC has been updated independently, and syncs the RC's count (2) back to the DC, treating the RC's count as canonical.

If scaling the RC poked the DC's sync loop, this would be less of a problem: the RC->DC sync would occur immediately after the RC scale, so the subsequent DC scale would see consistent replica counts.
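The race described in this comment can be sketched as a toy simulation. This is not OpenShift code; the classes, fields, and the `last_synced` bookkeeping are invented purely to illustrate why the DC's count appears to be clobbered and why a second scale succeeds:

```python
# Toy model (NOT OpenShift code) of the RC/DC sync race described above.

class DC:
    def __init__(self, replicas):
        self.replicas = replicas

class RC:
    def __init__(self, replicas):
        self.replicas = replicas
        self.last_synced = replicas  # count the DC last pushed to this RC

def sync(dc, rc):
    """One pass of a simplified DC controller loop: if the RC's count
    differs from what the DC last pushed, assume the RC was scaled
    directly and adopt its count; otherwise drive the RC from the DC."""
    if rc.replicas != rc.last_synced:
        dc.replicas = rc.replicas    # RC treated as canonical: DC clobbered
    else:
        rc.replicas = dc.replicas    # normal direction: DC drives the RC
    rc.last_synced = rc.replicas

dc, rc = DC(1), RC(1)
rc.replicas = 2          # step 2: oc scale rc sti-perl-1 --replicas=2
dc.replicas = 3          # step 3: oc scale dc sti-perl --replicas=3
sync(dc, rc)             # controller wakes up; the DC scale appears lost
print(dc.replicas, rc.replicas)   # 2 2

dc.replicas = 3          # running step 3 a second time now works
sync(dc, rc)
print(dc.replicas, rc.replicas)   # 3 3
```

The second scale succeeds because by then the controller's last-synced count matches the RC, so the sync runs in the normal DC->RC direction.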

Comment 2 Andy Goldstein 2016-04-01 16:43:56 UTC
Scaling a ReplicationController that is managed by a DeploymentConfig can lead to unpredictable behavior such as this timing issue. The correct action is to scale the DeploymentConfig.

Lowering severity.

One possible solution would be to print a warning when a user tries to scale the RC instead of the DC and, rather than scaling the RC, go ahead and scale the DC.
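That warn-and-redirect suggestion could be sketched roughly as follows. This is a hypothetical illustration, not `oc` code: the `scale` function and its `owner_dc` parameter are invented stand-ins for the client-side check the comment proposes:

```python
# Hypothetical sketch of the proposed `oc scale` behavior: if the target
# RC is managed by a DeploymentConfig, warn and scale the DC instead.

import sys

def scale(kind, name, replicas, owner_dc=None):
    """owner_dc names the managing DeploymentConfig, if any (invented
    parameter; in reality this would be looked up from the RC's
    deploymentconfig annotation/label)."""
    if kind == "rc" and owner_dc is not None:
        print(f"warning: {name} is managed by deploymentconfig/{owner_dc}; "
              f"scaling the deploymentconfig instead", file=sys.stderr)
        return scale("dc", owner_dc, replicas)
    return (kind, name, replicas)  # stand-in for issuing the real API call

# An attempt to scale the RC is redirected to its owning DC:
print(scale("rc", "sti-perl-1", 2, owner_dc="sti-perl"))  # ('dc', 'sti-perl', 2)
```

With this behavior, step 2 of the reproduction would scale the DC directly, and the race in Comment 1 could not occur.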