2084081 – nmstate-operator installed cluster on POWER shows issues while adding new dhcp interface

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.11
Hardware:	ppc64le
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.11.0
Assignee:	Christoph Stäbler
QA Contact:	Aleksandra Malykhin
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-05-11 10:46 UTC by shweta
Modified:	2022-08-10 11:11 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-08-10 11:11:18 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2022:5069	0	None	None	None	2022-08-10 11:11:45 UTC

Description shweta 2022-05-11 10:46:38 UTC

Description of problem:
OCP 4.11: on a POWER (ppc64le) cluster with the nmstate-operator installed, adding a new DHCP interface through a NodeNetworkConfigurationPolicy leaves the policy in the Degraded state.

Version-Release number of selected component (if applicable):

[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc version
Client Version: 4.11.0-0.nightly-ppc64le-2022-05-04-095738
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.nightly-ppc64le-2022-05-04-095738
Kubernetes Version: v1.23.3+d464c70

Nmstate Operator Version:  kubernetes-nmstate-operator.4.11.0-202205020057


How reproducible: Always


Steps to Reproduce:
1. Deploy kubernetes-nmstate-operator
2. Create an interface on nodes in the cluster by applying a NodeNetworkConfigurationPolicy

# cat dhcp-nncp.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: env33
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0
  desiredState:
    interfaces:
    - name: env33
      description: dhcp routing on env33
      type: ethernet
      state: up
      ipv4:
        dhcp: true
        enabled: true

# oc apply -f dhcp-nncp.yaml
nodenetworkconfigurationpolicy.nmstate.io/env33 configured


Actual results:
[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc get nncp
NAME    STATUS
env33   Degraded

Expected results:
[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc get nncp
NAME    STATUS
env33   Available
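
The Actual/Expected comparison above can be checked with a small script. This is a minimal sketch, assuming the plain table layout of `oc get nncp` shown in this report (a header row, then NAME and STATUS columns). The function name `nncp_not_available` is hypothetical and is not part of `oc` or kubernetes-nmstate.

```shell
# Hypothetical helper (not part of oc or kubernetes-nmstate): reads the table
# printed by `oc get nncp` on stdin and prints the name of every policy whose
# STATUS column is anything other than "Available".
nncp_not_available() {
  # Skip the header row (NR == 1); column 1 is NAME, column 2 is STATUS.
  # NF >= 2 ignores blank or incomplete lines.
  awk 'NR > 1 && NF >= 2 && $2 != "Available" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get nncp | nncp_not_available
```

For the output under "Actual results" this would print `env33`; for the output under "Expected results" it would print nothing.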

Additional info:
[root@rdr-shw-nm-3c3f-bastion-0 ~]# oc describe pod  nmstate-handler-9rrsn -n openshift-nmstate
Name:                 nmstate-handler-9rrsn
Namespace:            openshift-nmstate
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 worker-0/9.114.99.222
Start Time:           Tue, 10 May 2022 03:08:42 -0400
Labels:               app=kubernetes-nmstate
                      component=kubernetes-nmstate-handler
                      controller-revision-hash=58d848db6f
                      name=nmstate-handler
                      pod-template-generation=1
Annotations:          description: kubernetes-nmstate-handler configures and presents node networking, reconciling declerative NNCP and reports with NNS and NNCE
                      openshift.io/scc: privileged
Status:               Running
IP:                   9.114.99.222
IPs:
  IP:           9.114.99.222
Controlled By:  DaemonSet/nmstate-handler
Containers:
  nmstate-handler:
    Container ID:  cri-o://eeccc7097e81808ab953fc506446842cfb5e899116371877d23ac0b4f9ffd50d
    Image:         registry.redhat.io/openshift4/ose-kubernetes-nmstate-handler-rhel8@sha256:1a657eba807505606d04591cc827c920ab0f1a0d26efe19fbd5e74387e20b90e
    Image ID:      registry.redhat.io/openshift4/ose-kubernetes-nmstate-handler-rhel8@sha256:1a657eba807505606d04591cc827c920ab0f1a0d26efe19fbd5e74387e20b90e
    Port:          <none>
    Host Port:     <none>
    Command:
      manager
    Args:
      --zap-time-encoding=iso8601
    State:          Running
      Started:      Tue, 10 May 2022 03:08:48 -0400
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   100Mi
    Readiness:  exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:
      WATCH_NAMESPACE:
      POD_NAME:                         nmstate-handler-9rrsn (v1:metadata.name)
      COMPONENT:                         (v1:metadata.labels['app.kubernetes.io/component'])
      PART_OF:                           (v1:metadata.labels['app.kubernetes.io/part-of'])
      VERSION:                           (v1:metadata.labels['app.kubernetes.io/version'])
      MANAGED_BY:                        (v1:metadata.labels['app.kubernetes.io/managed-by'])
      OPERATOR_NAME:                    nmstate
      NODE_NAME:                         (v1:spec.nodeName)
      ENABLE_PROFILER:                  False
      PROFILER_PORT:                    6060
      NMSTATE_INSTANCE_NODE_LOCK_FILE:  /var/k8s_nmstate/handler_lock
    Mounts:
      /run/dbus/system_bus_socket from dbus-socket (rw)
      /run/openvswitch/db.sock from ovs-socket (rw)
      /var/k8s_nmstate from nmstate-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hslfx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   True
  PodScheduled      True
Volumes:
  dbus-socket:
    Type:          HostPath (bare host directory volume)
    Path:          /run/dbus/system_bus_socket
    HostPathType:  Socket
  nmstate-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /var/k8s_nmstate
    HostPathType:
  ovs-socket:
    Type:          HostPath (bare host directory volume)
    Path:          /run/openvswitch/db.sock
    HostPathType:
  kube-api-access-hslfx:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   Burstable
Node-Selectors:              beta.kubernetes.io/arch=ppc64le
                             kubernetes.io/os=linux
Tolerations:                 op=Exists
Events:
  Type     Reason        Age   From             Message
  ----     ------        ----  ----             -------
  Warning  NodeNotReady  158m  node-controller  Node is not ready
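
In the describe output above, the pod reports Ready as False while ContainersReady is True, alongside a NodeNotReady event. A minimal sketch for pulling the failing conditions out of such output, assuming the text layout of `oc describe pod` as pasted in this report (a `Conditions:` section with indented Type/Status rows, ended by the next unindented key). The function name `pod_false_conditions` is hypothetical.

```shell
# Hypothetical helper: reads `oc describe pod` text on stdin and prints the
# Type of every pod condition whose Status is "False".
pod_false_conditions() {
  awk '
    /^Conditions:/           { in_cond = 1; next }  # start of the section
    in_cond && /^[^ ]/       { in_cond = 0 }        # next top-level key ends it
    in_cond && $2 == "False" { print $1 }           # indented "Type  Status" rows
  '
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc describe pod nmstate-handler-9rrsn -n openshift-nmstate | pod_false_conditions
```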

Comment 1 Ben Nemec 2022-05-11 15:39:01 UTC

We'll need a must-gather from the cluster so we can see the logs that would indicate what went wrong here.

Comment 2 shweta 2022-05-13 08:55:06 UTC

must-gather logs.
https://drive.google.com/file/d/1QTSrO7qmfwOizXhiTcanetAvvYmfKTWT/view?usp=sharing

Comment 3 Christoph Stäbler 2022-05-18 14:20:24 UTC

Hello @sbiragda ,
thanks for providing the must-gather. It looks like you're running into an issue that was fixed in the last few days (https://bugzilla.redhat.com/show_bug.cgi?id=2078573).
Could you please retry with the latest build?

Comment 4 Dan Li 2022-05-26 15:28:07 UTC

Making Comment 3 un-private as Shweta is a Partner Engineer and therefore could not see the private comment regarding re-try.

Comment 6 shweta 2022-06-14 10:25:15 UTC

Thanks, @danili @cstabler
Retried with 4.11.0-0.nightly-ppc64le-2022-06-11-114807 build.

Issue is not seen.

The nmstate operator version is 4.11.0-202206011509

# oc version
Client Version: 4.11.0-0.nightly-ppc64le-2022-06-11-114807
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.nightly-ppc64le-2022-06-11-114807
Kubernetes Version: v1.24.0+cb71478

# cat dhcp-nncp.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: env33
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0
  desiredState:
    interfaces:
    - name: env33
      description: dhcp routing on env33
      type: ethernet
      state: up
      ipv4:
        dhcp: true
        enabled: true

# oc apply -f dhcp-nncp.yaml
nodenetworkconfigurationpolicy.nmstate.io/env33 configured

# oc get nncp
NAME    STATUS      REASON
env33   Available   SuccessfullyConfigured

Comment 8 Aleksandra Malykhin 2022-06-16 07:27:29 UTC

[kni@provisionhost-0-0 ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-11-054027   True        False         20h     Cluster version is 4.11.0-0.nightly-2022-06-11-054027
[kni@provisionhost-0-0 ~]$ oc get csv -A
NAMESPACE                              NAME                                              DISPLAY                       VERSION               REPLACES   PHASE
openshift-nmstate                      kubernetes-nmstate-operator.4.11.0-202206011509   Kubernetes NMState Operator   4.11.0-202206011509              Succeeded

[kni@provisionhost-0-0 ~]$ vi policy.yaml

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: enp0s3-up
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0-0.ocp-edge-cluster-0.qe.lab.redhat.com
  desiredState:
    interfaces:
    - name: enp0s3
      description: dhcp routing on enp0s3
      type: ethernet
      state: up
      ipv4:
        dhcp: true
        enabled: true


[kni@provisionhost-0-0 ~]$ oc apply -f policy.yaml
nodenetworkconfigurationpolicy.nmstate.io/enp0s3-up configured
[kni@provisionhost-0-0 ~]$ oc get nnce -w
NAME                                                        STATUS        REASON
worker-0-0.ocp-edge-cluster-0.qe.lab.redhat.com.enp0s3-up   Progressing   ConfigurationProgressing
worker-0-0.ocp-edge-cluster-0.qe.lab.redhat.com.enp0s3-up   Available     SuccessfullyConfigured
^C[kni@provisionhost-0-0 ~]$ oc get nncp
NAME        STATUS      REASON
enp0s3-up   Available   SuccessfullyConfigured
[kni@provisionhost-0-0 ~]$
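
The `oc get nnce -w` stream above prints one row per status change, so the last row for each enactment is its current state. A minimal sketch that reduces such a stream to the final status per enactment, assuming the NAME/STATUS/REASON table layout shown in this comment. The function name `nnce_final_status` is hypothetical.

```shell
# Hypothetical helper: reads captured `oc get nnce -w` output on stdin and
# prints "<enactment> <last seen status>" for each enactment.
nnce_final_status() {
  awk '
    $1 == "NAME" || NF < 2 { next }     # skip header rows and blank lines
    { last[$1] = $2 }                   # later rows overwrite earlier ones
    END { for (n in last) print n, last[n] }
  '
}

# Usage on a saved capture of the watch output:
#   nnce_final_status < nnce-watch.txt
```

With more than one enactment in the capture, the order of the printed rows is not guaranteed.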

Comment 9 Aleksandra Malykhin 2022-06-16 08:34:18 UTC

I reproduced the issue with kubernetes-nmstate v4.11.0.202205102228.
So I assume that the problem is not specific to the POWER platform (which we do not have in our environment).

Comment 12 errata-xmlrpc 2022-08-10 11:11:18 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
