Description of problem:
OCP 4.11: nmstate-operator installed cluster on POWER shows issues while adding new dhcp interface.

Version-Release number of selected component (if applicable):
[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc version
Client Version: 4.11.0-0.nightly-ppc64le-2022-05-04-095738
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.nightly-ppc64le-2022-05-04-095738
Kubernetes Version: v1.23.3+d464c70

Nmstate Operator Version: kubernetes-nmstate-operator.4.11.0-202205020057

How reproducible:
Always

Steps to Reproduce:
1. Deploy kubernetes-nmstate-operator
2. Create an interface on nodes in the cluster by applying a NodeNetworkConfigurationPolicy

cat dhcp-nncp.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: env33
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0
  desiredState:
    interfaces:
    - name: env33
      description: dhcp routing on env33
      type: ethernet
      state: up
      ipv4:
        dhcp: true
        enabled: true

# oc apply -f dhcp-nncp.yaml
nodenetworkconfigurationpolicy.nmstate.io/env33 configured

Actual results:
[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc get nncp
NAME    STATUS
env33   Degraded

Expected results:
[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc get nncp
NAME    STATUS
env33   Available

Additional info:
[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc describe pod nmstate-handler-9rrsn -n openshift-nmstate
Name:                 nmstate-handler-9rrsn
Namespace:            openshift-nmstate
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 worker-0/9.114.99.222
Start Time:           Tue, 10 May 2022 03:08:42 -0400
Labels:               app=kubernetes-nmstate
                      component=kubernetes-nmstate-handler
                      controller-revision-hash=58d848db6f
                      name=nmstate-handler
                      pod-template-generation=1
Annotations:          description: kubernetes-nmstate-handler configures and presents node networking, reconciling declerative NNCP and reports with NNS and NNCE
                      openshift.io/scc: privileged
Status:               Running
IP:                   9.114.99.222
IPs:
  IP:           9.114.99.222
Controlled By:  DaemonSet/nmstate-handler
Containers:
  nmstate-handler:
    Container ID:  cri-o://eeccc7097e81808ab953fc506446842cfb5e899116371877d23ac0b4f9ffd50d
    Image:         registry.redhat.io/openshift4/ose-kubernetes-nmstate-handler-rhel8@sha256:1a657eba807505606d04591cc827c920ab0f1a0d26efe19fbd5e74387e20b90e
    Image ID:      registry.redhat.io/openshift4/ose-kubernetes-nmstate-handler-rhel8@sha256:1a657eba807505606d04591cc827c920ab0f1a0d26efe19fbd5e74387e20b90e
    Port:          <none>
    Host Port:     <none>
    Command:
      manager
    Args:
      --zap-time-encoding=iso8601
    State:          Running
      Started:      Tue, 10 May 2022 03:08:48 -0400
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   100Mi
    Readiness:  exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:
      WATCH_NAMESPACE:
      POD_NAME:                         nmstate-handler-9rrsn (v1:metadata.name)
      COMPONENT:                         (v1:metadata.labels['app.kubernetes.io/component'])
      PART_OF:                           (v1:metadata.labels['app.kubernetes.io/part-of'])
      VERSION:                           (v1:metadata.labels['app.kubernetes.io/version'])
      MANAGED_BY:                        (v1:metadata.labels['app.kubernetes.io/managed-by'])
      OPERATOR_NAME:                    nmstate
      NODE_NAME:                         (v1:spec.nodeName)
      ENABLE_PROFILER:                  False
      PROFILER_PORT:                    6060
      NMSTATE_INSTANCE_NODE_LOCK_FILE:  /var/k8s_nmstate/handler_lock
    Mounts:
      /run/dbus/system_bus_socket from dbus-socket (rw)
      /run/openvswitch/db.sock from ovs-socket (rw)
      /var/k8s_nmstate from nmstate-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hslfx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   True
  PodScheduled      True
Volumes:
  dbus-socket:
    Type:          HostPath (bare host directory volume)
    Path:          /run/dbus/system_bus_socket
    HostPathType:  Socket
  nmstate-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /var/k8s_nmstate
    HostPathType:
  ovs-socket:
    Type:          HostPath (bare host directory volume)
    Path:          /run/openvswitch/db.sock
    HostPathType:
  kube-api-access-hslfx:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/arch=ppc64le
                 kubernetes.io/os=linux
Tolerations:     op=Exists
Events:
  Type     Reason        Age   From             Message
  ----     ------        ----  ----             -------
  Warning  NodeNotReady  158m  node-controller  Node is not ready
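When several policies exist, the STATUS column of `oc get nncp` can be filtered to list only the ones that are not healthy. The following is a minimal sketch, not part of the original report: the function name `degraded_policies` is hypothetical, and it assumes the tabular layout shown in the report (NAME first, STATUS second).

```shell
# Hypothetical helper: print the names of policies whose STATUS column
# is anything other than "Available". Reads `oc get nncp` output on stdin.
degraded_policies() {
  # NR > 1 skips the header row; $1 is NAME, $2 is STATUS.
  awk 'NR > 1 && $2 != "Available" { print $1 }'
}

# Example against a live cluster (not run here):
#   oc get nncp | degraded_policies
```

For each name it prints, the per-node enactments listed by `oc get nnce` are the next place to look.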
We'll need a must-gather from the cluster so we can see the logs that would indicate what went wrong here.
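For reference, a must-gather is collected with `oc adm must-gather`. The wrapper below is only a sketch: the function name `collect_must_gather`, the default directory name, and the `OC` override variable are illustrative assumptions, and no nmstate-specific gather image is implied.

```shell
# Hypothetical wrapper around `oc adm must-gather`.
# $1: destination directory (defaults to ./must-gather).
# OC can be set to point at a different client binary.
collect_must_gather() {
  "${OC:-oc}" adm must-gather --dest-dir="${1:-./must-gather}"
}

# Example (not run here):
#   collect_must_gather ./must-gather-env33
```

The resulting directory can then be archived and attached to the bug.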
must-gather logs. https://drive.google.com/file/d/1QTSrO7qmfwOizXhiTcanetAvvYmfKTWT/view?usp=sharing
Hello @sbiragda, thanks for providing the must-gathers. It looks like you're running into an issue that was fixed in the last few days (https://bugzilla.redhat.com/show_bug.cgi?id=2078573). Could you please retry with the latest build?
Making Comment 3 un-private as Shweta is a Partner Engineer and therefore could not see the private comment regarding re-try.
Thanks, @danili @cstabler

Retried with 4.11.0-0.nightly-ppc64le-2022-06-11-114807 build. Issue is not seen.
The nmstate operator version is 4.11.0-202206011509

# oc version
Client Version: 4.11.0-0.nightly-ppc64le-2022-06-11-114807
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.nightly-ppc64le-2022-06-11-114807
Kubernetes Version: v1.24.0+cb71478

# cat dhcp-nncp.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: env33
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0
  desiredState:
    interfaces:
    - name: env33
      description: dhcp routing on env33
      type: ethernet
      state: up
      ipv4:
        dhcp: true
        enabled: true

# oc apply -f dhcp-nncp.yaml
nodenetworkconfigurationpolicy.nmstate.io/env33 configured

# oc get nncp
NAME    STATUS      REASON
env33   Available   SuccessfullyConfigured
[kni@provisionhost-0-0 ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-11-054027   True        False         20h     Cluster version is 4.11.0-0.nightly-2022-06-11-054027

[kni@provisionhost-0-0 ~]$ oc get csv -A
NAMESPACE           NAME                                              DISPLAY                       VERSION               REPLACES   PHASE
openshift-nmstate   kubernetes-nmstate-operator.4.11.0-202206011509   Kubernetes NMState Operator   4.11.0-202206011509              Succeeded

[kni@provisionhost-0-0 ~]$ vi policy.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: enp0s3-up
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0-0.ocp-edge-cluster-0.qe.lab.redhat.com
  desiredState:
    interfaces:
    - name: enp0s3
      description: dhcp routing on enp0s3
      type: ethernet
      state: up
      ipv4:
        dhcp: true
        enabled: true

[kni@provisionhost-0-0 ~]$ oc apply -f policy.yaml
nodenetworkconfigurationpolicy.nmstate.io/enp0s3-up configured

[kni@provisionhost-0-0 ~]$ oc get nnce -w
NAME                                                        STATUS        REASON
worker-0-0.ocp-edge-cluster-0.qe.lab.redhat.com.enp0s3-up   Progressing   ConfigurationProgressing
worker-0-0.ocp-edge-cluster-0.qe.lab.redhat.com.enp0s3-up   Available     SuccessfullyConfigured
^C

[kni@provisionhost-0-0 ~]$ oc get nncp
NAME        STATUS      REASON
enp0s3-up   Available   SuccessfullyConfigured
[kni@provisionhost-0-0 ~]$
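Instead of watching `oc get nnce -w` by hand and interrupting it, a verification script can block until the policy reports Available. This is a sketch under assumptions: it relies on the policy exposing an `Available` condition (as the STATUS column suggests), and the function name `wait_for_policy`, the 2m default timeout, and the `OC` override variable are hypothetical.

```shell
# Hypothetical helper: block until the named NNCP has condition Available.
# $1: policy name, $2: optional timeout (default 2m).
# OC can be set to point at a different client binary.
wait_for_policy() {
  "${OC:-oc}" wait nncp "$1" --for=condition=Available --timeout="${2:-2m}"
}

# Example (not run here):
#   oc apply -f policy.yaml && wait_for_policy enp0s3-up
```

`oc wait` exits non-zero when the timeout expires, so the helper can gate later steps of a test run.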
I reproduced the issue with kubernetes-nmstate v4.11.0.202205102228, so I assume that the problem is not specific to the POWER platform (which we don't have in our environment).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069