Description of problem:
After automated cluster provisioning and deployment, scaling up an application pod resulted in pods running on both workers and masters. After investigation it became clear that the masters are not tainted properly (see #1827996).

How reproducible:
Constantly

Steps to Reproduce:
1. Provision and deploy a cluster using automation
2. Create an application pod:
# oc new-project httpd-proj
# oc new-app httpd
# oc expose dc/httpd
3. Add pod replicas:
# oc scale dc httpd --replicas=3

Actual results:
Application pods are running on both workers and masters

Expected results:
Application pods should run on workers only

Additional info:
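For reference, one way to verify where the replicas actually landed (namespace and dc names taken from the reproduce steps above; illustrative, not part of the original report):

```
# Show each httpd pod together with the node it was scheduled onto
oc get pods -n httpd-proj -o wide

# List the nodes carrying the master role, to spot misplaced pods
oc get nodes -l node-role.kubernetes.io/master
```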
Can you include what your install-config.yaml looks like (minus secrets)? This is likely intentional: if you specify 0 compute replicas, the masters must be made schedulable. Generally we recommend deploying with a minimum of 2 compute replicas and 3 control plane replicas.
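For illustration, a minimal sketch of the relevant install-config.yaml fields (excerpt only; a real install-config has many more required fields):

```
compute:
- name: worker
  replicas: 2        # > 0, so masters should stay unschedulable
controlPlane:
  name: master
  replicas: 3
```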
Could you provide a must-gather (oc adm must-gather) from this cluster? Given your compute replicas, the masters should not be schedulable.
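For anyone following along, collecting the diagnostics looks roughly like this (destination directory name is arbitrary):

```
# Gather cluster state into a local directory
oc adm must-gather --dest-dir=./must-gather

# Package it for upload
tar czf must-gather.tar.gz must-gather/
```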
Setting target release to current development version (4.5) for investigation. Where fixes (if any) are required/requested for prior versions, cloned BZs will be created when appropriate.
Here is must-gather https://drive.google.com/file/d/1V6JQR1S5Z7WqBNHjwDLqL_eyCbjoCxZI/view?usp=sharing
From your must-gather:

```
./cluster-scoped-resources/config.openshift.io/schedulers.yaml:
  mastersSchedulable: false
```

Masters are not schedulable. I'm not sure why your httpd app would end up on the masters in that case; moving this over to the kube-scheduler folks for their feedback.
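To see the mismatch directly on a live cluster, one could compare the scheduler config against the taints actually registered on the nodes:

```
# mastersSchedulable as configured
oc get scheduler cluster -o jsonpath='{.spec.mastersSchedulable}'

# Taints actually present on each node
oc get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
```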
From what I see, the spec of each and every node is empty:

spec: {}

kube-scheduler-operator is not in the business of setting those taints, and kube-scheduler will react according to what is set, so I don't see a problem here from this perspective. I took a quick peek into https://github.com/openshift/machine-config-operator/blob/6c690dafbbea5ab76c2e197239e1b70e386e753b/templates/master/01-master-kubelet/vsphere/units/kubelet.yaml which should register a node with the appropriate taints, but I'll let them check this one out.
Antonio, why have you moved this back to be assigned to me?

Maciej, you've linked to a vsphere template. This is baremetal:
https://github.com/openshift/machine-config-operator/blob/6c690dafbbea5ab76c2e197239e1b70e386e753b/templates/master/01-master-kubelet/baremetal/units/kubelet.yaml

We rely on this functionality:
https://github.com/openshift/installer/blob/master/pkg/asset/manifests/scheduler.go#L73
> Maciej, you've linked to a vsphere template. This is baremetal:
> https://github.com/openshift/machine-config-operator/blob/6c690dafbbea5ab76c2e197239e1b70e386e753b/templates/master/01-master-kubelet/baremetal/units/kubelet.yaml

Good point, thx!
(In reply to Stephen Benjamin from comment #8)
> Antonio, why have you moved this back to be assigned to me?
>
> Maciej, you've linked to a vsphere template. This is baremetal:
> https://github.com/openshift/machine-config-operator/blob/6c690dafbbea5ab76c2e197239e1b70e386e753b/templates/master/01-master-kubelet/baremetal/units/kubelet.yaml
>
> We rely on this functionality:
> https://github.com/openshift/installer/blob/master/pkg/asset/manifests/scheduler.go#L73

My bad, this should go to vSphere.
Sorry, this fell through the cracks. Looking at the install-config, it's for baremetal; not sure why this is specified as vSphere.

Just created manifests with the latest 4.5 to be sure:

apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  creationTimestamp: null
  name: cluster
spec:
  mastersSchedulable: false
  policy:
    name: ""
status: {}

And on a recently built cluster:

$ oc -o yaml get node jcallen-cfsz2-master-0
spec:
  providerID: vsphere://423b8b69-7c78-d68b-7f2d-711fcbb3cfd6
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master

And from the must-gather, cluster-scoped-resources/config.openshift.io/schedulers.yaml:

$ cat schedulers.yaml
---
apiVersion: config.openshift.io/v1
items:
- apiVersion: config.openshift.io/v1
  kind: Scheduler
  ...
  spec:
    mastersSchedulable: false
Lubov, can you provide some more info here?
Created attachment 1690512 [details] master description
It is baremetal. Attached output of "oc -o yaml get node master-0-0": no taints there.
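Not a fix for the root cause, but as a stopgap the missing taint could presumably be applied by hand (node name taken from the attachment above):

```
# Manually apply the taint the kubelet failed to register
oc adm taint nodes master-0-0 node-role.kubernetes.io/master=:NoSchedule

# Delete already-misplaced pods so they reschedule onto workers
oc delete pods -n httpd-proj --field-selector spec.nodeName=master-0-0
```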
From an MCO perspective the pools are not degraded, and configs seem to be applied correctly. If this is indeed a baremetal cluster with nodes not tainted correctly, and those taints are set in https://github.com/openshift/installer/blob/master/pkg/asset/manifests/scheduler.go#L73 and not in the MCO templates, can the baremetal installer team look more closely at this?

Looking at cluster-scoped-resources/config.openshift.io/schedulers.yaml in the must-gather I see:

apiVersion: config.openshift.io/v1
items:
- apiVersion: config.openshift.io/v1
  kind: Scheduler
  metadata:
    creationTimestamp: "2020-04-26T10:11:02Z"
    generation: 1
    name: cluster
    resourceVersion: "1121"
    selfLink: /apis/config.openshift.io/v1/schedulers/cluster
    uid: b285072f-de86-4e15-92b2-6ad64b4ae59c
  spec:
    mastersSchedulable: false

This doesn't seem like a bug that the MCO team should be owning.
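For completeness, the pool state referenced here can be checked with the standard MCO resources:

```
# Pools report DEGRADED=True when configs fail to apply
oc get machineconfigpools

# Rendered config currently applied to the master pool
oc get machineconfigpool master -o jsonpath='{.status.configuration.name}'
```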
Digging into this a little more to get background, I found PR https://github.com/openshift/machine-config-operator/pull/846, where I see:

```
For IPI baremetal, we need to support the platform in MCO. This PR also overrides the kubelet config to remove the NoSchedule taint.
```

Further down:

```
Baremetal IPI environment is not installable without removing the NoSchedule taint from the masters.
```

I don't know whether all of this means the baremetal template needs to be updated to add:

--register-with-taints=node-role.kubernetes.io/master=:NoSchedule

as the kubelet.yaml does in the base/vsphere/openstack templates, or whether the installer needs to do something else. Reassigning to @Stephen as he's more familiar with these templates & installer functionality and can reassign as appropriate within the baremetal team for investigation.
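To make the flag concrete, here is a rough, abridged sketch of how the taint registration appears in the MCO kubelet systemd unit templates (paraphrased, not the literal file contents; binary path and all other flags are assumptions/omitted):

```
name: kubelet.service
enabled: true
contents: |
  [Service]
  ExecStart=/usr/bin/hyperkube kubelet \
      --register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
      ...
```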
@Stephen Can you PTAL
Stephen is out for a few days so I took a look. I think we should be relying on the installer telling the scheduler to make the masters schedulable only in the case where there aren't any workers defined: https://github.com/openshift/installer/pull/2004

That relies on some MCO changes which landed in https://github.com/openshift/machine-config-operator/pull/937

However, I think we missed this PR that removes the customized baremetal kubelet conf: https://github.com/openshift/machine-config-operator/pull/993

That got incorrectly closed without merging, and was never revisited for review. So I think we need to revive that PR and it should resolve this issue?
Thanks for picking this up, @Steven!
I've opened a new PR to address this: https://github.com/openshift/machine-config-operator/pull/1817
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196