Bug 1324888 - after scaling a replicationController, pods get scheduled to nodes with SchedulingDisabled
Summary: after scaling a replicationController, pods get scheduled to nodes with Sched...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.1.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Jan Chaloupka
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-04-07 13:50 UTC by Christoph Görn
Modified: 2016-04-12 17:20 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-12 06:21:12 UTC
Target Upstream Version:



Description Christoph Görn 2016-04-07 13:50:09 UTC
Description of problem:
After a node is marked as unschedulable and its pods have been evacuated, scaling a replicationController (up or down) results in pods getting scheduled on the node that is unschedulable.

Version-Release number of selected component (if applicable):
atomic-openshift-3.1.0.4-1.git.15.5e061c3.el7aos.x86_64

How reproducible:
Always. Evacuate pods from a node, then scale a replicationController.

Steps to Reproduce:
[root@test-master-0 ~]# oc -n testing-2 get pods -o wide 
NAME                             READY     STATUS    RESTARTS   AGE       NODE
mongodb-1-0zxcg                  1/1       Running   0          1m        test-node-primary-0.example.com
mongodb-1-60t65                  1/1       Running   0          46s       test-node-primary-2.example.com
mongodb-1-ru4ru                  1/1       Running   0          1m        test-node-primary-2.example.com
mongodb-1-wk558                  1/1       Running   0          17h       test-node-primary-1.example.com
nodejs-mongodb-example-6-6mmze   1/1       Running   2          20h       test-node-primary-1.example.com
nodejs-mongodb-example-6-e6j0w   1/1       Running   0          3m        test-node-primary-0.example.com

[root@test-master-0 ~]# oadm manage-node test-node-primary-2.example.com --schedulable=false
NAME                                             LABELS                                                                                              STATUS                     AGE
test-node-primary-2.example.com   kubernetes.io/hostname=test-node-primary-2.example.com,region=primary,zone=default   Ready,SchedulingDisabled   16d

[root@test-master-0 ~]# oadm manage-node test-node-primary-2.example.com --evacuate 

Migrating these pods on node: test-node-primary-2.example.com

NAME              READY     STATUS    RESTARTS   AGE
mongodb-1-60t65   1/1       Running   0          1m
mongodb-1-ru4ru   1/1       Running   0         2m

[root@test-master-0 ~]# oc -n testing-2 get pods -o wide 
NAME                             READY     STATUS    RESTARTS   AGE       NODE
mongodb-1-0zxcg                  1/1       Running   0          2m        test-node-primary-0.example.com
mongodb-1-g5o4r                  1/1       Running   0          22s       test-node-primary-1.example.com
mongodb-1-rxrdb                  1/1       Running   0          22s       test-node-primary-0.example.com
mongodb-1-wk558                  1/1       Running   0          17h       test-node-primary-1.example.com
nodejs-mongodb-example-6-6mmze   1/1       Running   2          20h       test-node-primary-1.example.com
nodejs-mongodb-example-6-e6j0w   1/1       Running   0          4m        test-node-primary-0.example.com
[root@test-master-0 ~]# oc -n testing-2 scale --replicas=3 rc mongodb-1
replicationcontroller "mongodb-1" scaled

[root@test-master-0 ~]# oc -n testing-2 get pods -o wide 
NAME                             READY     STATUS    RESTARTS   AGE       NODE
mongodb-1-rxrdb                  1/1       Running   0          1m        test-node-primary-0.example.com
mongodb-1-v4l7b                  1/1       Running   0          26s       test-node-primary-2.example.com
mongodb-1-wk558                  1/1       Running   0          17h       test-node-primary-1.example.com
nodejs-mongodb-example-6-6mmze   1/1       Running   2          20h       test-node-primary-1.example.com
nodejs-mongodb-example-6-e6j0w   1/1       Running   0          4m        test-node-primary-0.example.com

[root@test-master-0 ~]# oc get nodes
NAME                                             LABELS                                                                                              STATUS                     AGE
test-master-0.example.com         kubernetes.io/hostname=test-master-0.example.com,region=master,zone=default          Ready,SchedulingDisabled   16d
test-node-infra-0.example.com     kubernetes.io/hostname=test-node-infra-0.example.com,region=infra,zone=default       Ready                      16d
test-node-infra-1.example.com     kubernetes.io/hostname=test-node-infra-1.example.com,region=infra,zone=default       Ready                      16d
test-node-primary-0.example.com   kubernetes.io/hostname=test-node-primary-0.example.com,region=primary,zone=default   Ready                      16d
test-node-primary-1.example.com   kubernetes.io/hostname=test-node-primary-1.example.com,region=primary,zone=default   Ready                      16d
test-node-primary-2.example.com   kubernetes.io/hostname=test-node-primary-2.example.com,region=primary,zone=default   Ready,SchedulingDisabled   16d


Actual results:
Pods get scheduled on test-node-primary-2.example.com.

Expected results:
No pods are scheduled on test-node-primary-2.example.com, since it is marked SchedulingDisabled.
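For reference, the expected behavior is that the scheduler filters out any node whose spec has unschedulable=true before placing replicas. The sketch below is a minimal, simplified model of that filtering predicate (the dict shapes and the `schedulable_nodes` helper are illustrative assumptions, not the actual OpenShift scheduler code; only the `unschedulable` field name mirrors the Kubernetes node spec):

```python
# Minimal sketch of the node-filtering predicate the scheduler is expected
# to apply: nodes that are not Ready, or that are marked unschedulable
# (SchedulingDisabled), must never be offered as placement candidates.

def schedulable_nodes(nodes):
    """Return names of nodes that are Ready and not marked unschedulable."""
    return [
        n["name"]
        for n in nodes
        if n.get("ready", False) and not n.get("unschedulable", False)
    ]

nodes = [
    {"name": "test-node-primary-0.example.com", "ready": True},
    {"name": "test-node-primary-1.example.com", "ready": True},
    {"name": "test-node-primary-2.example.com", "ready": True,
     "unschedulable": True},
]

# test-node-primary-2 should not appear among the candidates:
print(schedulable_nodes(nodes))
# → ['test-node-primary-0.example.com', 'test-node-primary-1.example.com']
```

In the reported behavior, mongodb-1-v4l7b lands on test-node-primary-2.example.com anyway, which would correspond to the scheduler skipping (or racing past) this check during the scale-up.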

Additional info:

[root@test-master-0 ~]# oc -n testing-2 describe pod mongodb-1-v4l7b
Name:				mongodb-1-v4l7b
Namespace:			testing-2
Image(s):			registry.access.redhat.com/rhscl/mongodb-26-rhel7:latest
Node:				test-node-primary-2.example.com/10.19.0.249
Start Time:			Thu, 07 Apr 2016 07:56:06 +0000
Labels:				deployment=mongodb-1,deploymentconfig=mongodb,name=mongodb
Status:				Running
Reason:				
Message:			
IP:				10.1.5.70
Replication Controllers:	mongodb-1 (3/3 replicas created)
Containers:
  mongodb:
    Container ID:	docker://65299d2d4e6362dc7c8ff09c6b9fa02a7b95809f798bd76b4e8bd92061bd20f8
    Image:		registry.access.redhat.com/rhscl/mongodb-26-rhel7:latest
    Image ID:		docker://19c92ed464ccfaa085af8ed8cca18edfa242c337c9fcab6c9c7dd8b5cb2b9c3c
    QoS Tier:
      cpu:	BestEffort
      memory:	Guaranteed
    Limits:
      memory:	512Mi
    Requests:
      memory:		512Mi
    State:		Running
      Started:		Thu, 07 Apr 2016 07:56:09 +0000
    Ready:		True
    Restart Count:	0
    Environment Variables:
      MONGODB_USER:		userLUV
      MONGODB_PASSWORD:		SgNJjwXpFoaO2Cm2
      MONGODB_DATABASE:		sampledb
      MONGODB_ADMIN_PASSWORD:	YgfBUE2gH4CpNKNc
Conditions:
  Type		Status
  Ready 	True 
Volumes:
  default-token-uu427:
    Type:	Secret (a secret that should populate this volume)
    SecretName:	default-token-uu427
Events:
  FirstSeen	LastSeen	Count	From								SubobjectPath				Reason			Message
  ─────────	────────	─────	────								─────────────				──────			───────
  10m		10m		1	{scheduler }												Scheduled		Successfully assigned mongodb-1-v4l7b to test-node-primary-2.example.com
  7m		7m		1	{scheduler }												FailedScheduling	Failed for reason Region and possibly others
  7m		7m		1	{kubelet test-node-primary-2.example.com}	implicitly required container POD	Pulled			Container image "openshift3/ose-pod:v3.1.0.4" already present on machine
  7m		7m		1	{kubelet test-node-primary-2.example.com}	implicitly required container POD	Started			Started with docker id 1ab9cda791de
  7m		7m		1	{kubelet test-node-primary-2.example.com}	implicitly required container POD	Created			Created with docker id 1ab9cda791de
  6m		6m		1	{kubelet test-node-primary-2.example.com}	spec.containers{mongodb}		Pulled			Container image "registry.access.redhat.com/rhscl/mongodb-26-rhel7:latest" already present on machine
  6m		6m		1	{kubelet test-node-primary-2.example.com}	spec.containers{mongodb}		Created			Created with docker id 65299d2d4e63
  6m		6m		1	{kubelet test-node-primary-2.example.com}	spec.containers{mongodb}		Started			Started with docker id 65299d2d4e63

Comment 1 Andy Goldstein 2016-04-08 18:45:18 UTC
I haven't been able to reproduce this.

Comment 2 Christoph Görn 2016-04-12 06:21:12 UTC
As the environment has died, I can't replicate this. I will reopen if I can reproduce it on a new environment. Thanks for the help!

Comment 3 Jan Chaloupka 2016-04-12 17:20:56 UTC
Tried to reproduce the issue on origin v1.1.6 and ose v3.1.1.6, with the same result: not able to reproduce it.

