To land the console pods on the masters, we intend to make the masters schedulable and add a taint that prevents normal pods from running on them.

This needs to be done both on upgrade from 3.7 to 3.8 and on clean 3.9 installs. This should be a blocker for 3.9.
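For illustration, the taint half of this could look like the sketch below. The exact taint key and effect are not settled in this bug, so node-role.kubernetes.io/master and NoSchedule are assumptions, and <master-node> is a placeholder:

# oc adm taint nodes <master-node> node-role.kubernetes.io/master=true:NoSchedule

The console pods would then carry a matching toleration in their pod spec so that they, unlike normal pods, can still land on the tainted master:

tolerations:
- key: node-role.kubernetes.io/master
  operator: Exists
  effect: NoSchedule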
It's been requested that we transition to labeling nodes in a defined manner like this: node-role.kubernetes.io/{master,node,infra}=true. A node could carry all three if it were an all-in-one installation. See https://trello.com/c/7m7A7Vpu/579-5-standardize-on-rolenodekubernetesio-masternodeinfratrue

For 3.9, let's just make sure that all masters are labeled node-role.kubernetes.io/master=true.
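As a sketch, applying that label by hand would look like this (<master-node> is a placeholder):

# oc label node <master-node> node-role.kubernetes.io/master=true

With kubernetes 1.9, any node-role.kubernetes.io/<role> label also shows up in the ROLES column of oc get nodes, which is what the verification output later in this bug relies on.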
PR to mark master nodes: https://github.com/openshift/openshift-ansible/pull/6849
Just labeling the master and setting a nodeSelector for the console namespace is not going to keep other pods off of the master. User pods can land on the labeled master if the nodeSelector from their project/deployment/pod/etc. spec is "" (or if the cluster defaultNodeSelector is "" or not specified). An OOTB configuration that allows user pods on the master is probably not desirable.
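As a concrete sketch of that failure mode: a project created with an explicitly empty node selector overrides the cluster defaultNodeSelector, so its pods are free to land on a schedulable, labeled master (the project name here is hypothetical):

# oc adm new-project unrestricted-project --node-selector=""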
(In reply to Mike Fiedler from comment #4)
> Just labeling the master and setting a nodeSelector for the console
> namespace is not going to keep other pods off of the master.

Agreed; https://docs.openshift.com/container-platform/3.7/admin_guide/scheduling/scheduler.html#constraining-pod-placement-nodeselector would probably be helpful here. The other option is tainting the node - https://docs.openshift.com/container-platform/3.7/admin_guide/scheduling/taints_tolerations.html#admin-guide-taints - but it seems that could be worked around as well, since any pod can add the matching toleration.
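As a sketch of the first option, the cluster-wide default from the linked docs is set via projectConfig in master-config.yaml; the selector value here is an assumption:

projectConfig:
  defaultNodeSelector: "node-role.kubernetes.io/node=true"

That keeps pods from projects with no explicit selector off the masters, but a project-level node selector (including an empty one, per comment #4) can still override it, which is why tainting is also being considered.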
Created https://github.com/openshift/openshift-ansible/pull/6932 to taint masters (unless there are no dedicated nodes)
*** Bug 1540038 has been marked as a duplicate of this bug. ***
Changed the scope of this bug: it is now only about making masters schedulable. https://github.com/openshift/openshift-ansible/pull/6949
Fix is available in openshift-ansible-3.9.0-0.36.0.git.0.da68f13.el7
Verified this bug with openshift-ansible-3.9.0-0.36.0.git.0.da68f13.el7.noarch, and it PASSES. Master nodes are now schedulable:

# oc get nodes
NAME             STATUS    ROLES     AGE       VERSION
192.168.100.10   Ready     <none>    1d        v1.9.1+a0ce1bc657
192.168.100.15   Ready     master    1d        v1.9.1+a0ce1bc657
192.168.100.17   Ready     master    1d        v1.9.1+a0ce1bc657
192.168.100.6    Ready     master    1d        v1.9.1+a0ce1bc657
192.168.100.8    Ready     <none>    1d        v1.9.1+a0ce1bc657

The "taint" change would introduce some other issues; those will be tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1539691
(In reply to Johnny Liu from comment #10)
> Verified this bug with openshift-ansible-3.9.0-0.36.0.git.0.da68f13.el7.noarch, and PASS.

Forgot to say: this was for a fresh install.
Upgrade to OCP v3.9 already marks all master hosts as schedulable.
Version: openshift-ansible-3.9.0-0.38.0.git.0.57e1184.el7.noarch

# oc version
oc v3.9.0-0.38.0
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://qe-jliu-rpm-master-etcd-1:8443
openshift v3.9.0-0.38.0
kubernetes v1.9.1+a0ce1bc657

# oc get node --show-labels
NAME                                 STATUS    ROLES     AGE       VERSION             LABELS
qe-jliu-rpm-master-etcd-1            Ready     <none>    4h        v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-jliu-rpm-master-etcd-1,openshift-infra=apiserver,role=node
qe-jliu-rpm-node-registry-router-1   Ready     <none>    4h        v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-jliu-rpm-node-registry-router-1,registry=enabled,role=node,router=enabled

After upgrade, the master was schedulable, but the master label was not added, so assigning the bug back.
Created https://github.com/openshift/openshift-ansible/pull/7020 to fix it
Fix available in openshift-ansible-3.9.0-0.39.0.git.0.fea6997.el7
Fixed in openshift-ansible-3.9.0-0.41.0.git.0.8290c01.el7.noarch.

After upgrade to OCP v3.9, the master is the same as on a fresh install -- schedulable and with the right label, node-role.kubernetes.io/master=true:

# oc get nodes --show-labels
NAME             STATUS    ROLES     AGE       VERSION             LABELS
172.16.120.124   Ready     <none>    7h        v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=regionOne,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/hostname=172.16.120.124,registry=enabled,role=node,router=enabled
172.16.120.82    Ready     master    7h        v1.9.1+a0ce1bc657   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=regionOne,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/hostname=172.16.120.82,node-role.kubernetes.io/master=true,role=node
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3748