Description of problem:
nfd-worker pods deployed on an IPv6 OpenShift cluster keep crashing.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Applied this operator on an environment with IPv6. It deployed correctly on the masters, but the nfd-worker pods keep crashing. When I look at the logs inside the container, I can see:

2020/04/08 10:44:05 Sendng labeling request nfd-master
2020/04/08 10:44:05 failed to set node labels: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: address fd02::d19a:12000: too many colons in address"

This seems to be a problem with IPv6 address validation.

Actual results:

Expected results:

Additional info:
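For reference, "too many colons in address" is the error Go's net package returns when an IPv6 literal is joined to a port without square brackets. The sketch below reproduces that error with the address from the log and shows the bracketed form produced by net.JoinHostPort. It is only an illustration: joinServerAddress and validDialAddress are hypothetical helper names, not taken from the NFD source, and the assumption that the address was built by plain string concatenation is mine.

```go
package main

import (
	"fmt"
	"net"
)

// joinServerAddress is a hypothetical helper (not from the NFD source).
// net.JoinHostPort wraps the host in square brackets when it contains a
// colon, which is what an IPv6 literal needs.
func joinServerAddress(host, port string) string {
	return net.JoinHostPort(host, port)
}

// validDialAddress is a hypothetical helper that reports whether addr can
// be split into host and port the way the Go dialer expects.
func validDialAddress(addr string) bool {
	_, _, err := net.SplitHostPort(addr)
	return err == nil
}

func main() {
	// Assumption: the failing address was produced by naive concatenation.
	naive := "fd02::d19a" + ":" + "12000"
	if _, _, err := net.SplitHostPort(naive); err != nil {
		fmt.Println("naive:", err)
	}

	// The bracketed form parses cleanly.
	fixed := joinServerAddress("fd02::d19a", "12000")
	fmt.Println("fixed:", fixed, "valid:", validDialAddress(fixed))

	// A service name is unaffected by the helper.
	fmt.Println("name:", joinServerAddress("nfd-master", "12000"))
}
```

A host name such as nfd-master passes through unchanged, so the same helper can be used whether the server is given by name or by IP literal.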
Setting target release to 4.6.0 (active development branch). Where any fixes are required/requested to be backported, clones targeting those z-stream releases will be created.
With Eduardo's image we now see node-feature-discovery working with IPv6 (see the IPs being used below):

[stack@openshift-master-0 ~]$ oc describe pods/nfd-worker-jvppc -n openshift-nfd
Name:         nfd-worker-jvppc
Namespace:    openshift-nfd
Priority:     0
Node:         worker-1.ostest.test.metalkube.org/fd2e:6f44:5dd8:c956::18
Start Time:   Mon, 18 Jan 2021 15:53:09 -0500
Labels:       app=nfd-worker
              controller-revision-hash=6b9f8bfd77
              pod-template-generation=1
Annotations:  openshift.io/scc: nfd-worker
Status:       Running
IP:           fd2e:6f44:5dd8:c956::18
IPs:
  IP:           fd2e:6f44:5dd8:c956::18
Controlled By:  DaemonSet/nfd-worker
Containers:
  nfd-worker:
    Container ID:  cri-o://69557baa8f927c78884c6dd797c0adb9bbafdd86832e8d38178643f5a8732eb8
    Image:         virthost.ostest.test.metalkube.org:5000/localimages/origin-node-feature-discovery:4.7
    Image ID:      virthost.ostest.test.metalkube.org:5000/localimages/origin-node-feature-discovery@sha256:75929c498301af285a8dcca4b17a45d5b53062c28b3a672a07b791be371757a1
    Port:          <none>
    Host Port:     <none>
    Command:
      nfd-worker
    Args:
      --sleep-interval=60s
      --server=nfd-master:12000
    State:          Running
      Started:      Mon, 18 Jan 2021 15:53:15 -0500
    Ready:          True
    Restart Count:  0
    Environment:
      NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /etc/kubernetes/node-feature-discovery from config (rw)
      /etc/kubernetes/node-feature-discovery/features.d from nfd-features (rw)
      /etc/kubernetes/node-feature-discovery/source.d from nfd-hooks (rw)
      /host-boot from host-boot (ro)
      /host-etc/os-release from host-os-release (ro)
      /host-sys from host-sys (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nfd-worker-token-l6j5r (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  host-boot:
    Type:          HostPath (bare host directory volume)
    Path:          /boot
    HostPathType:
  host-os-release:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/os-release
    HostPathType:
  host-sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:
  nfd-hooks:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/node-feature-discovery/source.d
    HostPathType:
  nfd-features:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/node-feature-discovery/features.d
    HostPathType:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nfd-worker
    Optional:  false
  nfd-worker-token-l6j5r:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nfd-worker-token-l6j5r
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     :NoSchedule op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:          <none>
[stack@openshift-master-0 ~]$
*** Bug 1913878 has been marked as a duplicate of this bug. ***
Verified that we could deploy the NFD operator and an instance of the nfd-master-server operand successfully on an OCP 4.7.0-fc.3 disconnected IPv6 bare-metal cluster. The NFD operator image and operand images were mirrored to the cluster's local registry and deployed from the NFD master GitHub repo.

$ oc version
Client Version: 4.7.0-fc.3
Server Version: 4.7.0-fc.3
Kubernetes Version: v1.20.0+d9c52cc

$ oc get pods -n openshift-nfd
NAME                            READY   STATUS    RESTARTS   AGE
nfd-master-dmps2                1/1     Running   0          2d17h
nfd-master-gz6q2                1/1     Running   0          2d17h
nfd-master-zkt8h                1/1     Running   0          2d17h
nfd-operator-59bf958694-58dzn   1/1     Running   0          2d17h
nfd-worker-4r8cc                1/1     Running   0          2d17h
nfd-worker-w9s8v                1/1     Running   0          2d17h

$ oc get pods -n openshift-nfd -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP                    NODE                                                     NOMINATED NODE   READINESS GATES
nfd-master-dmps2                1/1     Running   0          2d17h   fd01:0:0:3::b1        master-0-1.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com   <none>           <none>
nfd-master-gz6q2                1/1     Running   0          2d17h   fd01:0:0:1::84        master-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com   <none>           <none>
nfd-master-zkt8h                1/1     Running   0          2d17h   fd01:0:0:2::a2        master-0-2.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com   <none>           <none>
nfd-operator-59bf958694-58dzn   1/1     Running   0          2d17h   fd01:0:0:2::a0        master-0-2.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com   <none>           <none>
nfd-worker-4r8cc                1/1     Running   0          2d17h   fd2e:6f44:5dd8::13f   worker-0-1.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com   <none>           <none>
nfd-worker-w9s8v                1/1     Running   0          2d17h   fd2e:6f44:5dd8::123   worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com   <none>           <none>

$ oc describe node | grep feature
feature.node.kubernetes.io/cpu-cpuid.ADX=true
feature.node.kubernetes.io/cpu-cpuid.AESNI=true
feature.node.kubernetes.io/cpu-cpuid.AVX=true
feature.node.kubernetes.io/cpu-cpuid.AVX2=true
feature.node.kubernetes.io/cpu-cpuid.FMA3=true
feature.node.kubernetes.io/cpu-cpuid.HLE=true
feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR=true
feature.node.kubernetes.io/cpu-cpuid.IBPB=true
feature.node.kubernetes.io/cpu-cpuid.RTM=true
feature.node.kubernetes.io/cpu-cpuid.STIBP=true
feature.node.kubernetes.io/cpu-cpuid.VMX=true
feature.node.kubernetes.io/custom-rdma.available=true
feature.node.kubernetes.io/kernel-selinux.enabled=true
feature.node.kubernetes.io/kernel-version.full=4.18.0-240.10.1.el8_3.x86_64
feature.node.kubernetes.io/kernel-version.major=4
feature.node.kubernetes.io/kernel-version.minor=18
feature.node.kubernetes.io/kernel-version.revision=0
feature.node.kubernetes.io/pci-1013.present=true
feature.node.kubernetes.io/pci-1af4.present=true
feature.node.kubernetes.io/system-os_release.ID=rhcos
feature.node.kubernetes.io/system-os_release.RHEL_VERSION=8.3
feature.node.kubernetes.io/system-os_release.VERSION_ID=4.7
feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=7
nfd.node.kubernetes.io/feature-labels:
feature.node.kubernetes.io/cpu-cpuid.ADX=true
feature.node.kubernetes.io/cpu-cpuid.AESNI=true
feature.node.kubernetes.io/cpu-cpuid.AVX=true
feature.node.kubernetes.io/cpu-cpuid.AVX2=true
feature.node.kubernetes.io/cpu-cpuid.FMA3=true
feature.node.kubernetes.io/cpu-cpuid.HLE=true
feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR=true
feature.node.kubernetes.io/cpu-cpuid.IBPB=true
feature.node.kubernetes.io/cpu-cpuid.RTM=true
feature.node.kubernetes.io/cpu-cpuid.STIBP=true
feature.node.kubernetes.io/cpu-cpuid.VMX=true
feature.node.kubernetes.io/custom-rdma.available=true
feature.node.kubernetes.io/kernel-selinux.enabled=true
feature.node.kubernetes.io/kernel-version.full=4.18.0-240.10.1.el8_3.x86_64
feature.node.kubernetes.io/kernel-version.major=4
feature.node.kubernetes.io/kernel-version.minor=18
feature.node.kubernetes.io/kernel-version.revision=0
feature.node.kubernetes.io/pci-1013.present=true
feature.node.kubernetes.io/pci-1af4.present=true
feature.node.kubernetes.io/system-os_release.ID=rhcos
feature.node.kubernetes.io/system-os_release.RHEL_VERSION=8.3
feature.node.kubernetes.io/system-os_release.VERSION_ID=4.7
feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=7
nfd.node.kubernetes.io/feature-labels:

$ oc describe node worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com
Name:               worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    feature.node.kubernetes.io/cpu-cpuid.ADX=true
                    feature.node.kubernetes.io/cpu-cpuid.AESNI=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX=true
                    feature.node.kubernetes.io/cpu-cpuid.AVX2=true
                    feature.node.kubernetes.io/cpu-cpuid.FMA3=true
                    feature.node.kubernetes.io/cpu-cpuid.HLE=true
                    feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR=true
                    feature.node.kubernetes.io/cpu-cpuid.IBPB=true
                    feature.node.kubernetes.io/cpu-cpuid.RTM=true
                    feature.node.kubernetes.io/cpu-cpuid.STIBP=true
                    feature.node.kubernetes.io/cpu-cpuid.VMX=true
                    feature.node.kubernetes.io/custom-rdma.available=true
                    feature.node.kubernetes.io/kernel-selinux.enabled=true
                    feature.node.kubernetes.io/kernel-version.full=4.18.0-240.10.1.el8_3.x86_64
                    feature.node.kubernetes.io/kernel-version.major=4
                    feature.node.kubernetes.io/kernel-version.minor=18
                    feature.node.kubernetes.io/kernel-version.revision=0
                    feature.node.kubernetes.io/pci-1013.present=true
                    feature.node.kubernetes.io/pci-1af4.present=true
                    feature.node.kubernetes.io/system-os_release.ID=rhcos
                    feature.node.kubernetes.io/system-os_release.RHEL_VERSION=8.3
                    feature.node.kubernetes.io/system-os_release.VERSION_ID=4.7
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
                    feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=7
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.openshift.io/os_id=rhcos
Annotations:        k8s.ovn.org/l3-gateway-config:
                      {"default":{"mode":"shared","interface-id":"br-ex_worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com","mac-address":"52:54:00:3d:3a:d6...
                    k8s.ovn.org/node-chassis-id: 3ecd516e-b673-4247-b8c9-e00e644e3b22
                    k8s.ovn.org/node-local-nat-ip: {"default":["fd99::821b"]}
                    k8s.ovn.org/node-mgmt-port-mac-address: 0e:55:5a:82:7b:1e
                    k8s.ovn.org/node-primary-ifaddr: {"ipv6":"fd2e:6f44:5dd8::123/128"}
                    k8s.ovn.org/node-subnets: {"default":"fd01:0:0:5::/64"}
                    machine.openshift.io/machine: openshift-machine-api/ocp-edge-cluster-jaco-k2gjl-worker-0-24p6z
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-88487793ccf13d4f83d751a51ad678bb
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-88487793ccf13d4f83d751a51ad678bb
                    machineconfiguration.openshift.io/reason:
                    machineconfiguration.openshift.io/state: Done
                    nfd.node.kubernetes.io/extended-resources:
                    nfd.node.kubernetes.io/feature-labels:
                      cpu-cpuid.ADX,cpu-cpuid.AESNI,cpu-cpuid.AVX,cpu-cpuid.AVX2,cpu-cpuid.FMA3,cpu-cpuid.HLE,cpu-cpuid.HYPERVISOR,cpu-cpuid.IBPB,cpu-cpuid.RTM,...
                    nfd.node.kubernetes.io/worker.version: 1.15
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 25 Jan 2021 22:28:37 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com
  AcquireTime:     <unset>
  RenewTime:       Mon, 01 Feb 2021 16:19:02 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 01 Feb 2021 16:14:11 +0000   Mon, 25 Jan 2021 22:28:35 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 01 Feb 2021 16:14:11 +0000   Mon, 25 Jan 2021 22:28:35 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 01 Feb 2021 16:14:11 +0000   Mon, 25 Jan 2021 22:28:35 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Mon, 01 Feb 2021 16:14:11 +0000   Mon, 25 Jan 2021 22:29:16 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  fd2e:6f44:5dd8::123
  Hostname:    worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com
Capacity:
  cpu:                8
  ephemeral-storage:  52660Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16390284Ki
  pods:               250
Allocatable:
  cpu:                7500m
  ephemeral-storage:  48622469038
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             15239308Ki
  pods:               250
System Info:
  Machine ID:                 e9733bffaa8c4bb39a954ef27df6bba0
  System UUID:                e9733bff-aa8c-4bb3-9a95-4ef27df6bba0
  Boot ID:                    b35aaec2-8054-4056-84ba-9ec112c63f83
  Kernel Version:             4.18.0-240.10.1.el8_3.x86_64
  OS Image:                   Red Hat Enterprise Linux CoreOS 47.83.202101171239-0 (Ootpa)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  cri-o://1.20.0-0.rhaos4.7.gitd9f17c8.el8.42
  Kubelet Version:            v1.20.0+d9c52cc
  Kube-Proxy Version:         v1.20.0+d9c52cc
ProviderID:                   baremetalhost:///openshift-machine-api/openshift-worker-0-0/293c9b55-57cb-4c6d-bf4a-89d37614a43e
Non-terminated Pods:          (30 in total)
  Namespace  Name  CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------  ----  ------------  ----------  ---------------  -------------  ---
  openshift-cluster-node-tuning-operator  tuned-59q4x  10m (0%)  0 (0%)  50Mi (0%)  0 (0%)  6d17h
  openshift-dns  dns-default-f2lrb  65m (0%)  0 (0%)  131Mi (0%)  0 (0%)  6d17h
  openshift-image-registry  image-registry-5f57fbb64f-xlj9c  100m (1%)  0 (0%)  256Mi (1%)  0 (0%)  2d18h
  openshift-image-registry  node-ca-zcx52  10m (0%)  0 (0%)  10Mi (0%)  0 (0%)  6d17h
  openshift-ingress-canary  ingress-canary-5lccs  10m (0%)  0 (0%)  20Mi (0%)  0 (0%)  6d17h
  openshift-ingress  router-default-5b47bd97f6-5bjf7  100m (1%)  0 (0%)  256Mi (1%)  0 (0%)  2d18h
  openshift-kni-infra  coredns-worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com  200m (2%)  0 (0%)  400Mi (2%)  0 (0%)  6d17h
  openshift-kni-infra  keepalived-worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com  200m (2%)  0 (0%)  400Mi (2%)  0 (0%)  6d17h
  openshift-kni-infra  mdns-publisher-worker-0-0.ocp-edge-cluster-jacot2-0.qe.lab.redhat.com  100m (1%)  0 (0%)  200Mi (1%)  0 (0%)  6d17h
  openshift-kube-storage-version-migrator  migrator-56998ccbc5-9m27v  100m (1%)  0 (0%)  200Mi (1%)  0 (0%)  2d18h
  openshift-machine-config-operator  machine-config-daemon-cc2dk  40m (0%)  0 (0%)  100Mi (0%)  0 (0%)  6d17h
  openshift-marketplace  redhat-operator-index-2nwqj  10m (0%)  0 (0%)  50Mi (0%)  0 (0%)  2d18h
  openshift-monitoring  alertmanager-main-0  8m (0%)  0 (0%)  270Mi (1%)  0 (0%)  2d18h
  openshift-monitoring  alertmanager-main-1  8m (0%)  0 (0%)  270Mi (1%)  0 (0%)  2d18h
  openshift-monitoring  alertmanager-main-2  8m (0%)  0 (0%)  270Mi (1%)  0 (0%)  2d18h
  openshift-monitoring  grafana-76ccdf9487-rgjqv  5m (0%)  0 (0%)  120Mi (0%)  0 (0%)  2d18h
  openshift-monitoring  kube-state-metrics-56b4768c7-hpbxv  4m (0%)  0 (0%)  120Mi (0%)  0 (0%)  2d18h
  openshift-monitoring  node-exporter-t7nlc  9m (0%)  0 (0%)  210Mi (1%)  0 (0%)  6d17h
  openshift-monitoring  openshift-state-metrics-68f5786bbb-bzsjk  3m (0%)  0 (0%)  190Mi (1%)  0 (0%)  2d18h
  openshift-monitoring  prometheus-k8s-0  76m (1%)  0 (0%)  1204Mi (8%)  0 (0%)  2d18h
  openshift-monitoring  prometheus-k8s-1  76m (1%)  0 (0%)  1204Mi (8%)  0 (0%)  2d18h
  openshift-monitoring  thanos-querier-64c9b86458-4r4mj  9m (0%)  0 (0%)  92Mi (0%)  0 (0%)  2d18h
  openshift-monitoring  thanos-querier-64c9b86458-wl4qp  9m (0%)  0 (0%)  92Mi (0%)  0 (0%)  2d18h
  openshift-multus  multus-6tw5h  10m (0%)  0 (0%)  150Mi (1%)  0 (0%)  6d17h
  openshift-multus  network-metrics-daemon-hrt6c  20m (0%)  0 (0%)  120Mi (0%)  0 (0%)  6d17h
  openshift-network-diagnostics  network-check-source-8b577f64-nlbgd  10m (0%)  0 (0%)  50Mi (0%)  0 (0%)  2d18h
  openshift-network-diagnostics  network-check-target-b6lhj  10m (0%)  0 (0%)  150Mi (1%)  0 (0%)  6d17h
  openshift-nfd  nfd-worker-w9s8v  0 (0%)  0 (0%)  0 (0%)  0 (0%)  2d17h
  openshift-ovn-kubernetes  ovnkube-node-zc8mc  30m (0%)  0 (0%)  620Mi (4%)  0 (0%)  6d17h
  openshift-ovn-kubernetes  ovs-node-wzx9z  100m (1%)  0 (0%)  300Mi (2%)  0 (0%)  6d17h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1340m (17%)   0 (0%)
  memory             7505Mi (50%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 extras and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5635