Bug 1887398

Summary: [openshift-cnv][CNV] nodes need to exist and be labeled first, *before* the NodeNetworkConfigurationPolicy is applied
Product: Container Native Virtualization (CNV) Reporter: Andreas Karis <akaris>
Component: Networking    Assignee: Quique Llorente <ellorent>
Status: CLOSED ERRATA QA Contact: Ofir Nash <onash>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 2.5.0    CC: aos-bugs, cnv-qe-bugs, eparis, fpan, jokerman, onash, phoracek
Target Milestone: ---   
Target Release: 2.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kubernetes-nmstate-handler-container-v2.6.0-9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Cloned To: 1887592 (view as bug list)    Environment:
Last Closed: 2021-03-10 11:18:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1887592    
Attachments:
Description Flags
NNCP Yaml Example none

Description Andreas Karis 2020-10-12 11:25:15 UTC
Description of problem:

I just tried this out in my lab, and what I see is that the nodes need to exist and be labeled *before* the NodeNetworkConfigurationPolicy is applied. If the NodeNetworkConfigurationPolicy is applied first and the node only receives its label afterwards, the policy is apparently never applied.

With the nodes labeled first, my policies look like this and get applied:
~~~
[root@openshift-jumpserver-0 ~]# cat nnp.yaml 
---
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: worker0-policy
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ''
    region: worker0-region
(...)
~~~

See "Additional info". If I invert steps a) and b), running b) first and then a), the configuration is not applied for me.
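For illustration, the failing order can be sketched as a dry run. The `echo`s only print each command (node name and file name are taken from the example above); drop them to run against a real cluster:

```shell
# Dry run of the inverted (failing) order: the policy is applied before the
# node carries the matching label, so the enactment never progresses past
# NodeSelectorNotMatching.
{
  echo oc apply -f nnp.yaml                                                # b) policy created first
  echo oc label node openshift-worker-0.example.com region=worker0-region  # a) label added afterwards
  echo oc get nnce                                                         # status check
} > failing-order.txt
cat failing-order.txt
```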

Version-Release number of selected component (if applicable):
CNV 2.5.0

How reproducible:
Reproduced in my lab; see "Additional info".

Steps to Reproduce:
1. Create a NodeNetworkConfigurationPolicy whose nodeSelector requires a label (e.g. region=worker0-region) that no node carries yet, and apply it with oc apply -f nnp.yaml.
2. Label a worker node afterwards so that it matches the nodeSelector.
3. Check the policy and enactment status.

Actual results:
The policy is not applied to the node that was labeled after the policy was created.

Expected results:
The policy is re-evaluated and applied once a node matching the nodeSelector exists.

Additional info:

So as an example in my lab - this here worked:

a) Label nodes:
~~~
for i in 0 1 2 ; do oc label node openshift-worker-$i.example.com region=worker$i-region; done
~~~
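The same labeling loop can be written as a dry run that only prints the generated oc commands, so they can be reviewed before being executed (remove the `echo` to run them for real):

```shell
# Generate the three `oc label` commands without executing them; each loop
# iteration prints one command, and the full list is saved for review.
for i in 0 1 2; do
  echo oc label node "openshift-worker-$i.example.com" "region=worker$i-region"
done > label-commands.txt
cat label-commands.txt
```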

b) Create NodeNetworkConfigurationPolicy:
~~~
cat <<'EOF' > nnp.yaml
---
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: worker0-policy
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ''
    region: worker0-region
  desiredState:
    interfaces:
    - name: enp4s0f0
      type: ethernet
      state: up
      mtu: 9000
    - name: enp4s0f0.906
      type: vlan
      state: up
      vlan:
        base-iface: enp4s0f0
        id: 906
      mtu: 9000
    - name: br-vm
      description: Linux bridge
      type: linux-bridge
      state: up
      bridge:
        options:
          stp:
            enabled: false
        port:
        - name: enp4s0f0.906
      mtu: 9000
---
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: worker1-policy
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ''
    region: worker1-region
  desiredState:
    interfaces:
    - name: enp4s0f1
      type: ethernet
      state: up
      mtu: 9000
    - name: enp4s0f1.906
      type: vlan
      state: up
      vlan:
        base-iface: enp4s0f1
        id: 906
      mtu: 9000
    - name: br-vm
      description: Linux bridge
      type: linux-bridge
      state: up
      bridge:
        options:
          stp:
            enabled: false
        port:
        - name: enp4s0f1.906
      mtu: 9000
---
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: worker2-policy
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ''
    region: worker2-region
  desiredState:
    interfaces:
    - name: enp5s0f0
      type: ethernet
      state: up
      mtu: 9000
    - name: enp5s0f0.906
      type: vlan
      state: up
      vlan:
        base-iface: enp5s0f0
        id: 906
      mtu: 9000
    - name: br-vm
      description: Linux bridge
      type: linux-bridge
      state: up
      bridge:
        options:
          stp:
            enabled: false
        port:
        - name: enp5s0f0.906
      mtu: 9000
EOF
oc apply -f nnp.yaml
~~~

c) List status:
~~~
[root@openshift-jumpserver-0 ~]# oc get -f nnp.yaml 
NAME             STATUS
worker0-policy   SuccessfullyConfigured
worker1-policy   SuccessfullyConfigured
worker2-policy   SuccessfullyConfigured
[root@openshift-jumpserver-0 ~]# oc get NodeNetworkConfigurationEnactment
NAME                                            STATUS
openshift-master-0.example.com.worker0-policy   NodeSelectorNotMatching
openshift-master-0.example.com.worker1-policy   NodeSelectorNotMatching
openshift-master-0.example.com.worker2-policy   NodeSelectorNotMatching
openshift-master-1.example.com.worker0-policy   NodeSelectorNotMatching
openshift-master-1.example.com.worker1-policy   NodeSelectorNotMatching
openshift-master-1.example.com.worker2-policy   NodeSelectorNotMatching
openshift-master-2.example.com.worker0-policy   NodeSelectorNotMatching
openshift-master-2.example.com.worker1-policy   NodeSelectorNotMatching
openshift-master-2.example.com.worker2-policy   NodeSelectorNotMatching
openshift-worker-0.example.com.worker0-policy   SuccessfullyConfigured
openshift-worker-0.example.com.worker1-policy   NodeSelectorNotMatching
openshift-worker-0.example.com.worker2-policy   NodeSelectorNotMatching
openshift-worker-1.example.com.worker0-policy   NodeSelectorNotMatching
openshift-worker-1.example.com.worker1-policy   SuccessfullyConfigured
openshift-worker-1.example.com.worker2-policy   NodeSelectorNotMatching
openshift-worker-2.example.com.worker0-policy   NodeSelectorNotMatching
openshift-worker-2.example.com.worker1-policy   NodeSelectorNotMatching
openshift-worker-2.example.com.worker2-policy   SuccessfullyConfigured
~~~
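The enactment table above can be summarized per status with a short awk one-liner. Here the `oc get nnce` output is reproduced verbatim in a file to keep the example self-contained; on a live cluster, pipe `oc get nnce --no-headers` into the same awk command:

```shell
# Count enactments by status: field 2 of each line is the STATUS column.
cat <<'EOF' > nnce.txt
openshift-master-0.example.com.worker0-policy   NodeSelectorNotMatching
openshift-master-0.example.com.worker1-policy   NodeSelectorNotMatching
openshift-master-0.example.com.worker2-policy   NodeSelectorNotMatching
openshift-master-1.example.com.worker0-policy   NodeSelectorNotMatching
openshift-master-1.example.com.worker1-policy   NodeSelectorNotMatching
openshift-master-1.example.com.worker2-policy   NodeSelectorNotMatching
openshift-master-2.example.com.worker0-policy   NodeSelectorNotMatching
openshift-master-2.example.com.worker1-policy   NodeSelectorNotMatching
openshift-master-2.example.com.worker2-policy   NodeSelectorNotMatching
openshift-worker-0.example.com.worker0-policy   SuccessfullyConfigured
openshift-worker-0.example.com.worker1-policy   NodeSelectorNotMatching
openshift-worker-0.example.com.worker2-policy   NodeSelectorNotMatching
openshift-worker-1.example.com.worker0-policy   NodeSelectorNotMatching
openshift-worker-1.example.com.worker1-policy   SuccessfullyConfigured
openshift-worker-1.example.com.worker2-policy   NodeSelectorNotMatching
openshift-worker-2.example.com.worker0-policy   NodeSelectorNotMatching
openshift-worker-2.example.com.worker1-policy   NodeSelectorNotMatching
openshift-worker-2.example.com.worker2-policy   SuccessfullyConfigured
EOF
awk '{count[$2]++} END {for (s in count) print s, count[s]}' nnce.txt
```

As expected, exactly one enactment per worker policy is SuccessfullyConfigured; all other node/policy combinations report NodeSelectorNotMatching.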

And I get a different configuration per "region":
~~~
[root@openshift-jumpserver-0 ~]# for i in 0 1 2 ; do ssh -i ~/.ssh/gss-stack-tools core@openshift-worker-$i.example.com "hostname ; ip a ls | egrep 'enp|br-vm'" ; done
openshift-worker-0.example.com
4: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
7: enp4s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
8: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
9: enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
73: br-vm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
74: enp4s0f0.906@enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master br-vm state UP group default qlen 1000
openshift-worker-1.example.com
3: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
7: enp4s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
8: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
9: enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
63: br-vm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
64: enp4s0f1.906@enp4s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master br-vm state UP group default qlen 1000
openshift-worker-2.example.com
6: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
7: enp4s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
8: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
9: enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
27: br-vm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
28: enp5s0f0.906@enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master br-vm state UP group default qlen 1000
~~~
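The per-interface MTU can be pulled out of the `ip a` output with awk. As a self-contained example using openshift-worker-0's listing from above (on a live node, pipe `ip a ls` into the same awk command):

```shell
# Extract "interface mtu" pairs: in `ip a` link lines, field 2 is the
# interface name (with a trailing colon) and field 5 is the MTU value.
cat <<'EOF' > ipa.txt
4: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
7: enp4s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
8: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
9: enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
73: br-vm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
74: enp4s0f0.906@enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master br-vm state UP group default qlen 1000
EOF
awk '$4 == "mtu" { name = $2; sub(/:$/, "", name); print name, $5 }' ipa.txt > mtus.txt
cat mtus.txt
```

This confirms that only the interfaces named in worker0-policy (enp4s0f0, its VLAN, and br-vm) carry MTU 9000.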

Comment 2 Ofir Nash 2021-01-03 13:51:14 UTC
Verified.

Nodes:
[cnv-qe-jenkins@network03-9n2rz-executor bug-nncp-label-nodes]$ oc get nodes
NAME                             STATUS   ROLES    AGE   VERSION
network03-9n2rz-master-0         Ready    master   12d   v1.20.0+87544c5
network03-9n2rz-master-1         Ready    master   12d   v1.20.0+87544c5
network03-9n2rz-master-2         Ready    master   12d   v1.20.0+87544c5
network03-9n2rz-worker-0-4rp7p   Ready    worker   12d   v1.20.0+87544c5
network03-9n2rz-worker-0-jrjr9   Ready    worker   12d   v1.20.0+87544c5
network03-9n2rz-worker-0-rnbm5   Ready    worker   12d   v1.20.0+87544c5


Scenario checked:
1. Verify worker nodes don't have 'region' label:
oc label node network03-9n2rz-worker-0-4rp7p region-
oc label node network03-9n2rz-worker-0-jrjr9 region-
oc label node network03-9n2rz-worker-0-rnbm5 region-

2. Create nncp (attached nncp example).

3. Apply nncp - `oc apply -f nncp.yaml`

4. List status: NoMatchingNode, since the nodes have not been labeled yet.
[cnv-qe-jenkins@network03-9n2rz-executor bug-nncp-label-nodes]$ oc get -f nnp.yaml 
NAME             STATUS
worker0-policy   NoMatchingNode
worker1-policy   NoMatchingNode
worker2-policy   NoMatchingNode

5. Label Worker Nodes with region (as defined in the nncp nodeSelector we applied):
oc label node network03-9n2rz-worker-0-4rp7p region=worker0-region
oc label node network03-9n2rz-worker-0-jrjr9 region=worker1-region
oc label node network03-9n2rz-worker-0-rnbm5 region=worker2-region

6. Check Status of nncp and nnce after labeling the nodes:
[cnv-qe-jenkins@network03-9n2rz-executor bug-nncp-label-nodes]$ oc get -f nnp.yaml 
NAME             STATUS
worker0-policy   SuccessfullyConfigured
worker1-policy   SuccessfullyConfigured
worker2-policy   SuccessfullyConfigured

[cnv-qe-jenkins@network03-9n2rz-executor bug-nncp-label-nodes]$ oc get nnce
NAME                                            STATUS
network03-9n2rz-master-0.worker0-policy         NodeSelectorNotMatching
network03-9n2rz-master-0.worker1-policy         NodeSelectorNotMatching
network03-9n2rz-master-0.worker2-policy         NodeSelectorNotMatching
network03-9n2rz-master-1.worker0-policy         NodeSelectorNotMatching
network03-9n2rz-master-1.worker1-policy         NodeSelectorNotMatching
network03-9n2rz-master-1.worker2-policy         NodeSelectorNotMatching
network03-9n2rz-master-2.worker0-policy         NodeSelectorNotMatching
network03-9n2rz-master-2.worker1-policy         NodeSelectorNotMatching
network03-9n2rz-master-2.worker2-policy         NodeSelectorNotMatching
network03-9n2rz-worker-0-4rp7p.worker0-policy   SuccessfullyConfigured
network03-9n2rz-worker-0-4rp7p.worker1-policy   NodeSelectorNotMatching
network03-9n2rz-worker-0-4rp7p.worker2-policy   NodeSelectorNotMatching
network03-9n2rz-worker-0-jrjr9.worker0-policy   NodeSelectorNotMatching
network03-9n2rz-worker-0-jrjr9.worker1-policy   SuccessfullyConfigured
network03-9n2rz-worker-0-jrjr9.worker2-policy   NodeSelectorNotMatching
network03-9n2rz-worker-0-rnbm5.worker0-policy   NodeSelectorNotMatching
network03-9n2rz-worker-0-rnbm5.worker1-policy   NodeSelectorNotMatching
network03-9n2rz-worker-0-rnbm5.worker2-policy   SuccessfullyConfigured

Comment 3 Ofir Nash 2021-01-03 13:53:12 UTC
Created attachment 1744082 [details]
NNCP Yaml Example

Comment 6 errata-xmlrpc 2021-03-10 11:18:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799