Bug 1413037

Summary: Scheduler doesn't consider a node that was removed and then re-added by the scaleup playbook when scheduling a pod.
Product: OpenShift Container Platform Reporter: Miheer Salunke <misalunk>
Component: Node Assignee: Avesh Agarwal <avagarwa>
Status: CLOSED WORKSFORME QA Contact: DeShuai Ma <dma>
Severity: high Docs Contact:
Priority: high    
Version: 3.3.0 CC: aos-bugs, avagarwa, jokerman, mmccomas, sjenning
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-01-26 20:43:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 1 Miheer Salunke 2017-01-13 12:59:19 UTC
Description of problem:

node991, which existed earlier, was deleted with oc delete node. An entry for node991 was then added to the inventory file under the [new node] section, and the scaleup.yaml playbook was run. The playbook did add node991: it appears in the oc get nodes output with "Ready" status, which means it is schedulable. However, when we schedule a pod on node991 using a nodeselector for node991, the pod stays in Pending state because of the events in [0], where node991 does not appear among the nodes the scheduler evaluated.

Case: https://access.redhat.com/support/cases/#/case/01764010

[0]->

      FirstSeen     LastSeen        Count   From                    SubobjectPath   Type            Reason                  Message
      ---------     --------        -----   ----                    -------------   --------        ------                  -------
      12d           1h              10      {default-scheduler }                    Warning         FailedScheduling        pod (logging-es-546wrcu2-2-i0qvc) failed to fit in any node
    fit failure on node (node990.example.com): MatchNodeSelector
    fit failure on node (node930.example.com): CheckServiceAffinity
    fit failure on node (node931.example.com): CheckServiceAffinity
    fit failure on node (node992.example.com): MatchNodeSelector
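
The gap in [0] can be checked mechanically by comparing the nodes named in the FailedScheduling message against the nodes reported as Ready. The following is only a minimal sketch in Python: the helper names are hypothetical, the Ready list is hard-coded rather than read from oc get nodes, and the full hostname node991.example.com is an assumption based on the naming of the other nodes.

```python
import re

# Event message copied from the FailedScheduling output in [0].
EVENT_MESSAGE = """\
pod (logging-es-546wrcu2-2-i0qvc) failed to fit in any node
fit failure on node (node990.example.com): MatchNodeSelector
fit failure on node (node930.example.com): CheckServiceAffinity
fit failure on node (node931.example.com): CheckServiceAffinity
fit failure on node (node992.example.com): MatchNodeSelector
"""

# Assumed list of Ready nodes; in practice this would come from `oc get nodes`.
# node991.example.com is an assumed hostname, not taken from the report.
READY_NODES = [
    "node930.example.com",
    "node931.example.com",
    "node990.example.com",
    "node991.example.com",
    "node992.example.com",
]

# Matches lines such as "fit failure on node (name): Predicate".
FIT_FAILURE = re.compile(r"fit failure on node \(([^)]+)\):\s*(\S+)")


def parse_fit_failures(message):
    """Return {node_name: failed_predicate} for each 'fit failure' line."""
    return {m.group(1): m.group(2) for m in FIT_FAILURE.finditer(message)}


def nodes_not_evaluated(ready_nodes, message):
    """Return the Ready nodes that the event message does not mention, sorted."""
    return sorted(set(ready_nodes) - set(parse_fit_failures(message)))


if __name__ == "__main__":
    for node, predicate in sorted(parse_fit_failures(EVENT_MESSAGE).items()):
        print(f"{node}: {predicate}")
    print("Ready but not evaluated:", nodes_not_evaluated(READY_NODES, EVENT_MESSAGE))
```

A Ready node that is absent from the fit-failure list was never evaluated by the scheduler at all, which is different from a node that was evaluated and rejected by a predicate such as MatchNodeSelector.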



Version-Release number of selected component (if applicable):
OCP 3.3.0

How reproducible:
Customer side

Steps to Reproduce:
  - oc delete node
  - reinstall the OS on the deleted node
  - run the scaleup playbook

Actual results:
The scheduler doesn't consider node991 for scheduling the pod.

Expected results:
The scheduler should consider node991 for scheduling the pod.

Additional info:

Comment 2 Miheer Salunke 2017-01-13 12:59:54 UTC
Also the setup has 2 masters and 3 etcd servers.