Description of problem:
On OpenShift 4.4.5, we have observed application pods being scheduled onto master nodes.
Version-Release number of selected component (if applicable):
OpenShift 4.4.5 bare-metal IPI (upgraded from 4.4.3)
How reproducible:
Every time this workload is created.
Steps to Reproduce:
1. Ensure that master nodes have the "master" role and only the "master" role.
2. Ensure that the `.spec.mastersSchedulable` field of the `Scheduler/cluster` object is `false`.
3. Create application workload (via Helm 3).
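The preconditions in steps 1 and 2 can be checked from saved cluster output. The following is a minimal sketch, assuming the node list and the scheduler config were dumped to JSON beforehand (for example with `oc get nodes -o json` and `oc get schedulers.config.openshift.io cluster -o json`); the function names are hypothetical and not part of any OpenShift tooling.

```python
ROLE_PREFIX = "node-role.kubernetes.io/"


def node_roles(node):
    """Return the sorted role names taken from a node's role labels."""
    labels = node.get("metadata", {}).get("labels") or {}
    return sorted(k[len(ROLE_PREFIX):] for k in labels if k.startswith(ROLE_PREFIX))


def masters_with_extra_roles(node_list):
    """Return {node name: roles} for master nodes that carry any role besides 'master'."""
    result = {}
    for node in node_list.get("items", []):
        roles = node_roles(node)
        if "master" in roles and roles != ["master"]:
            result[node["metadata"]["name"]] = roles
    return result


def masters_schedulable(scheduler):
    """Return the value of .spec.mastersSchedulable, treating a missing field as False."""
    return bool(scheduler.get("spec", {}).get("mastersSchedulable", False))
```

An empty result from `masters_with_extra_roles` together with `masters_schedulable(...) == False` corresponds to the state described in steps 1 and 2.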
Actual results:
Some of the newly-created pods have been scheduled to master nodes.
Expected results:
All of the newly-created pods are scheduled to *worker* nodes.
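To tell the actual result from the expected one, the pods that landed on master nodes can be listed from saved output. This is a rough sketch, assuming a pod list in the JSON shape produced by `oc get pods -A -o json` and a set of master node names gathered separately; `pods_on_masters` is a hypothetical helper name.

```python
def pods_on_masters(pod_list, master_names):
    """Return (namespace, pod name, node name) for every pod bound to a master node."""
    hits = []
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName")
        if node in master_names:
            meta = pod.get("metadata", {})
            hits.append((meta.get("namespace"), meta.get("name"), node))
    return hits
```

With the expected behavior, the result contains none of the application's pods. Platform pods that tolerate the master taint will still appear, so the output needs to be filtered by the application's namespace.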
@Amit, could your team handle the verification of this BZ?
I opened https://bugzilla.redhat.com/show_bug.cgi?id=1870665 and https://github.com/openshift/machine-config-operator/pull/2016 to backport the MCO fix to 4.5
Client Version: 4.6.0-0.nightly-2020-08-24-110601
Server Version: 4.6.0-0.nightly-2020-08-24-110601
Kubernetes Version: v1.19.0-rc.2+3e083ac-dirty
The same problem was fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1828250, and the fix was backported to 4.5 (https://bugzilla.redhat.com/show_bug.cgi?id=1846503) and 4.4 (https://bugzilla.redhat.com/show_bug.cgi?id=1849217).
I re-verified just now.
I re-verified on 4.6 for the current change to ensure it is still working.
In 4.4 (4.4.0-0.nightly-2020-07-18-033102) it has worked since the verification of https://bugzilla.redhat.com/show_bug.cgi?id=1849217. As far as I understand, this fix is in 4.4.13 and above. I did not re-verify 4.4 this time.
To update this: I think the discussion in https://bugzilla.redhat.com/show_bug.cgi?id=1828250 (which describes a similar issue) might have found the root cause of how the taints were removed after an upgrade. However, I don't see any updates to the faulty logic that reconciles the mastersSchedulable field, so while the two bugs are related, our fix is still necessary.
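The symptom discussed above (taints missing from masters while `mastersSchedulable` is `false`) can be spotted with a small check. This is only a sketch of the expected invariant, not the operator's reconcile logic; it assumes the master taint has key `node-role.kubernetes.io/master` and effect `NoSchedule`, and that the node list comes from `oc get nodes -o json`. The helper names are hypothetical.

```python
MASTER_KEY = "node-role.kubernetes.io/master"  # assumed label and taint key for masters


def has_master_noschedule_taint(node):
    """Return True if the node carries the master NoSchedule taint."""
    taints = node.get("spec", {}).get("taints") or []
    return any(
        t.get("key") == MASTER_KEY and t.get("effect") == "NoSchedule"
        for t in taints
    )


def untainted_masters(node_list):
    """Return names of master-labelled nodes that lack the master NoSchedule taint."""
    names = []
    for node in node_list.get("items", []):
        labels = node.get("metadata", {}).get("labels") or {}
        if MASTER_KEY in labels and not has_master_noschedule_taint(node):
            names.append(node["metadata"]["name"])
    return names
```

If `mastersSchedulable` is `false`, a non-empty result from `untainted_masters` would be consistent with the behavior reported here.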
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.