Description of problem:
`oc describe pod` isn't listing all tolerations for node-exporter pods. The node-exporter DaemonSet in openshift-monitoring specifies:

# oc get ds node-exporter -n openshift-monitoring -o json | jq .spec.template.spec.tolerations
[
  {
    "operator": "Exists"
  }
]

Generated pods have several tolerations. `oc get pod -o yaml` shows eight, including the one above. `oc describe pod` only shows seven. It omits this one, which can be misleading.

Version-Release number of selected component (if applicable):
# oc version
Client Version: v4.2.15
Server Version: 4.2.16
Kubernetes Version: v1.14.6+8bbaf43

How reproducible:
Easily!

Steps to Reproduce:
1. oc describe $(oc get pods -n openshift-monitoring -o name | grep node-exporter- | head -1) -n openshift-monitoring | grep -A10 Tolerations

Actual results:
Tolerations list from `oc describe` includes only 7 entries.

Expected results:
List should mention { "operator": "Exists" } entry.

Additional info:
The generated pod has a total of 8 tolerations, but `oc describe` only shows seven.

# oc describe $(oc get pods -n openshift-monitoring -o name | grep node-exporter- | head -1) -n openshift-monitoring | grep -A10 Tolerations
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type    Reason     Age   From   Message
  ----    ------     ----  ----   -------

# oc get pod node-exporter-mfdbf -o json | jq .spec.tolerations
[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/memory-pressure",
    "operator": "Exists"
  },
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/pid-pressure",
    "operator": "Exists"
  },
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/unschedulable",
    "operator": "Exists"
  },
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/network-unavailable",
    "operator": "Exists"
  },
  {
    "operator": "Exists"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "operator": "Exists"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/unreachable",
    "operator": "Exists"
  },
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/disk-pressure",
    "operator": "Exists"
  }
]
Moving this to 4.5, for now.
Jan, this will require an upstream fix. Once you have it open, feel free to move this BZ to 4.6, although we might consider bringing in several upstream fixes in one bigger batch, like we did last time. It's up to you, depending on how many there will be.
For a toleration to be printed, it has to have at least the .Value or .Effect field set. Currently, the .Operator field is not taken into account.

Upstream documentation at https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/ says:

```
An empty key with operator Exists matches all keys, values and effects which means this will tolerate everything.

tolerations:
- operator: "Exists"

An empty effect matches all effects with key key.

tolerations:
- key: "key"
  operator: "Exists"
```

Thus, displaying {"operator": "Exists"} is still valid.

Additionally, the documentation says:

```
The default value for operator is Equal.

A toleration “matches” a taint if the keys are the same and the effects are the same, and:
- the operator is Exists (in which case no value should be specified), or
- the operator is Equal and the values are equal.
```

Thus, it's OK to skip printing the operator for the tolerations that are already displayed, as is done now:

```
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
```

However, in the case of {"operator": "Exists"} we might just display:

```
Tolerations:     op=Exists
```
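To make the proposal concrete, here is a minimal Python sketch of the printing rule described in this comment. It is an illustration only, not the kubectl/oc source: the function names `format_toleration` and `describe_tolerations` are hypothetical, the input is assumed to be the plain list returned by `jq .spec.tolerations`, and sorting by the rendered string is an assumption made to mimic the ordering seen in the describe output.

```python
from typing import Dict, List


def format_toleration(tol: Dict[str, str]) -> str:
    """Render one toleration as key[=value][:effect] (hypothetical helper)."""
    key = tol.get("key", "")
    value = tol.get("value", "")
    effect = tol.get("effect", "")
    # The upstream documentation quoted above states the default operator is Equal.
    operator = tol.get("operator", "Equal")

    text = key
    if value:
        text += f"={value}"
    if effect:
        text += f":{effect}"

    # A bare {"operator": "Exists"} tolerates everything, so give it a
    # visible label instead of rendering it as an empty string.
    if operator == "Exists" and not key and not value and not effect:
        text = "op=Exists"
    return text


def describe_tolerations(tolerations: List[Dict[str, str]]) -> List[str]:
    """Return one rendered line per toleration, sorted (assumed ordering)."""
    return sorted(format_toleration(t) for t in tolerations)


if __name__ == "__main__":
    pod_tolerations = [
        {"operator": "Exists"},
        {
            "effect": "NoSchedule",
            "key": "node.kubernetes.io/disk-pressure",
            "operator": "Exists",
        },
    ]
    for line in describe_tolerations(pod_tolerations):
        print(line)
```

With this rule, every entry in the pod's toleration list yields exactly one non-empty line, so the eight tolerations from the report would produce eight lines rather than seven.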
Upstream PR: https://github.com/kubernetes/kubernetes/pull/91024
Waiting for the next oc rebase
Resolved through https://github.com/openshift/oc/pull/491
[root@dhcp-140-138 ~]# oc get po/node-exporter-gbzzp -n openshift-monitoring -o json | jq .spec.tolerations
[
  {
    "operator": "Exists"
  }
]

[root@dhcp-140-138 ~]# oc describe $(oc get pods -n openshift-monitoring -o name | grep node-exporter- | head -1) -n openshift-monitoring | grep -A10 Tolerations
Tolerations:     op=Exists

[root@dhcp-140-138 ~]# oc version --client -o yaml
clientVersion:
  buildDate: "2020-08-03T19:18:12Z"
  compiler: gc
  gitCommit: a695d74ef1aee9a3f605d38dd8b6fae2062b63fc
  gitTreeState: clean
  gitVersion: 4.6.0-202008031851.p0-a695d74
  goVersion: go1.13.4
  major: ""
  minor: ""
  platform: linux/amd64

So I will verify it.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196