Bug 1898510 - NFD worker pods not scheduled on a 3-node master/worker cluster
Summary: NFD worker pods not scheduled on a 3-node master/worker cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Feature Discovery Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Carlos Eduardo Arango Gutierrez
QA Contact: Walid A.
URL:
Whiteboard:
Depends On: 1897346
Blocks:
Reported: 2020-11-17 11:40 UTC by OpenShift BugZilla Robot
Modified: 2021-02-01 15:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-01 15:16:13 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                       Private  Priority  Status  Summary                                             Last Updated
Github                  openshift cluster-nfd-operator pull 120  0        None      closed  Bug 1898510: This is an manual cherry-pick of #110  2021-01-25 15:28:03 UTC
Red Hat Product Errata  RHBA-2021:0238                           0        None      None    None                                                2021-02-01 15:16:20 UTC

Description OpenShift BugZilla Robot 2020-11-17 11:40:35 UTC
+++ This bug was initially created as a clone of Bug #1897346 +++

Description of problem:

On a 4.6 converged master/worker cluster, NFD workers are not deployed.
We tolerate all taints and repel the NFD worker from master nodes.
In the converged master/worker case, we also need to make sure that the NFD worker is scheduled on nodes that carry both the master and worker labels.
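
The scheduling problem described above can be illustrated with a small model of how Kubernetes evaluates required node affinity: selector terms are ORed together, and the expressions inside one term are ANDed. This is only a sketch. The two rule sets (REPEL_MASTER and ALLOW_CONVERGED) are hypothetical and are not taken from the operator's manifests or from the referenced pull request; the role label keys are the standard node-role.kubernetes.io ones.

```python
# Sketch only: models required node-affinity evaluation
# (terms are ORed, expressions inside a term are ANDed).
# The rule sets below are assumptions for illustration,
# not the operator's actual DaemonSet spec.

MASTER = "node-role.kubernetes.io/master"
WORKER = "node-role.kubernetes.io/worker"


def matches(expr, labels):
    """Evaluate one match expression; only Exists/DoesNotExist are modeled."""
    present = expr["key"] in labels
    if expr["operator"] == "Exists":
        return present
    if expr["operator"] == "DoesNotExist":
        return not present
    raise ValueError(f"unsupported operator: {expr['operator']}")


def schedulable(terms, labels):
    """A node qualifies if any term has all of its expressions satisfied."""
    return any(all(matches(e, labels) for e in term) for term in terms)


# Hypothetical "before": repel the worker pod from any node labeled master.
REPEL_MASTER = [[{"key": MASTER, "operator": "DoesNotExist"}]]

# Hypothetical "after": additionally accept any node labeled worker.
ALLOW_CONVERGED = REPEL_MASTER + [[{"key": WORKER, "operator": "Exists"}]]

# A converged node carries both role labels (values are empty strings).
converged = {MASTER: "", WORKER: ""}
print(schedulable(REPEL_MASTER, converged))     # False
print(schedulable(ALLOW_CONVERGED, converged))  # True
```

Under these assumed rules, a master-only node is still rejected and a worker-only node is still accepted; only the converged node changes outcome.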

Comment 3 Walid A. 2021-01-25 22:04:11 UTC
Verified on OCP 4.6.0-0.nightly-2021-01-25-060359.

# oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h3m
cloud-credential                           4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h51m
cluster-autoscaler                         4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h43m
config-operator                            4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h43m
console                                    4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h5m
csi-snapshot-controller                    4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h33m
dns                                        4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h42m
etcd                                       4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h41m
image-registry                             4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h33m
ingress                                    4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h32m
insights                                   4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h43m
kube-apiserver                             4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h37m
kube-controller-manager                    4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h36m
kube-scheduler                             4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h38m
kube-storage-version-migrator              4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h12m
machine-api                                4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h37m
machine-approver                           4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h42m
machine-config                             4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h41m
marketplace                                4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h4m
monitoring                                 4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h32m
network                                    4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h43m
node-tuning                                4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h43m
openshift-apiserver                        4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h33m
openshift-controller-manager               4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h31m
openshift-samples                          4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h32m
operator-lifecycle-manager                 4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h42m
operator-lifecycle-manager-catalog         4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h42m
operator-lifecycle-manager-packageserver   4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h5m
service-ca                                 4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h43m
storage                                    4.6.0-0.nightly-2021-01-25-060359   True        False         False      5h11m

# oc get nodes
NAME                                         STATUS   ROLES           AGE     VERSION
ip-10-0-136-167.us-east-2.compute.internal   Ready    master,worker   5h45m   v1.19.0+1833054
ip-10-0-173-7.us-east-2.compute.internal     Ready    master,worker   5h45m   v1.19.0+1833054
ip-10-0-209-236.us-east-2.compute.internal   Ready    master,worker   5h45m   v1.19.0+1833054

# oc get pods -n test-nfd
NAME                            READY   STATUS    RESTARTS   AGE
nfd-master-52dj6                1/1     Running   0          15m
nfd-master-bcldh                1/1     Running   0          15m
nfd-master-jwp2d                1/1     Running   0          15m
nfd-operator-64f66f8476-6swsh   1/1     Running   0          18m
nfd-worker-fvkbs                1/1     Running   1          15m
nfd-worker-qhzvv                1/1     Running   1          15m
nfd-worker-znm5w                1/1     Running   0          15m

# oc describe node | grep feature
                    feature.node.kubernetes.io/cpu-cpuid.ADX=true
                    feature.node.kubernetes.io/cpu-cpuid.AESNI=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX2=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512BW=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512CD=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512DQ=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512F=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512VL=true
                    feature.node.kubernetes.io/cpu-cpuid.FMA3=true
                    feature.node.kubernetes.io/cpu-cpuid.MPX=true
                    feature.node.kubernetes.io/cpu-hardware_multithreading=true
                    feature.node.kubernetes.io/custom-rdma.available=true
                    feature.node.kubernetes.io/kernel-selinux.enabled=true
                    feature.node.kubernetes.io/kernel-version.full=4.18.0-193.40.1.el8_2.x86_64
                    feature.node.kubernetes.io/kernel-version.major=4
                    feature.node.kubernetes.io/kernel-version.minor=18
                    feature.node.kubernetes.io/kernel-version.revision=0
                    feature.node.kubernetes.io/pci-1d0f.present=true
                    feature.node.kubernetes.io/storage-nonrotationaldisk=true
                    feature.node.kubernetes.io/system-os_release.ID=rhcos
                    feature.node.kubernetes.io/system-os_release.VERSION_ID=4.6
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=6
                    nfd.node.kubernetes.io/feature-labels:
                    feature.node.kubernetes.io/cpu-cpuid.ADX=true
                    feature.node.kubernetes.io/cpu-cpuid.AESNI=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX2=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512BW=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512CD=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512DQ=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512F=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512VL=true
                    feature.node.kubernetes.io/cpu-cpuid.FMA3=true
                    feature.node.kubernetes.io/cpu-cpuid.MPX=true
                    feature.node.kubernetes.io/cpu-hardware_multithreading=true
                    feature.node.kubernetes.io/custom-rdma.available=true
                    feature.node.kubernetes.io/kernel-selinux.enabled=true
                    feature.node.kubernetes.io/kernel-version.full=4.18.0-193.40.1.el8_2.x86_64
                    feature.node.kubernetes.io/kernel-version.major=4
                    feature.node.kubernetes.io/kernel-version.minor=18
                    feature.node.kubernetes.io/kernel-version.revision=0
                    feature.node.kubernetes.io/pci-1d0f.present=true
                    feature.node.kubernetes.io/storage-nonrotationaldisk=true
                    feature.node.kubernetes.io/system-os_release.ID=rhcos
                    feature.node.kubernetes.io/system-os_release.VERSION_ID=4.6
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=6
                    nfd.node.kubernetes.io/feature-labels:
                    feature.node.kubernetes.io/cpu-cpuid.ADX=true
                    feature.node.kubernetes.io/cpu-cpuid.AESNI=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX2=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512BW=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512CD=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512DQ=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512F=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX512VL=true
                    feature.node.kubernetes.io/cpu-cpuid.FMA3=true
                    feature.node.kubernetes.io/cpu-cpuid.MPX=true
                    feature.node.kubernetes.io/cpu-hardware_multithreading=true
                    feature.node.kubernetes.io/custom-rdma.available=true
                    feature.node.kubernetes.io/kernel-selinux.enabled=true
                    feature.node.kubernetes.io/kernel-version.full=4.18.0-193.40.1.el8_2.x86_64
                    feature.node.kubernetes.io/kernel-version.major=4
                    feature.node.kubernetes.io/kernel-version.minor=18
                    feature.node.kubernetes.io/kernel-version.revision=0
                    feature.node.kubernetes.io/pci-1d0f.present=true
                    feature.node.kubernetes.io/storage-nonrotationaldisk=true
                    feature.node.kubernetes.io/system-os_release.ID=rhcos
                    feature.node.kubernetes.io/system-os_release.VERSION_ID=4.6
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=6
                    nfd.node.kubernetes.io/feature-labels:

Comment 5 errata-xmlrpc 2021-02-01 15:16:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.15 extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0238

