Bug 1913215 - Unable to configure vlan interface via NodeNetworkConfigurationPolicy on a baremetal IPI deployment
Summary: Unable to configure vlan interface via NodeNetworkConfigurationPolicy on a baremetal IPI deployment
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Petr Horáček
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-06 09:54 UTC by Marius Cornea
Modified: 2021-02-03 12:14 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-03 12:14:36 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
nmstate_pods.log (33.76 KB, text/plain)
2021-01-06 09:54 UTC, Marius Cornea

Description Marius Cornea 2021-01-06 09:54:11 UTC
Created attachment 1744841 [details]
nmstate_pods.log

Description of problem:

I am trying to configure the following vlan interface via NodeNetworkConfigurationPolicy:


---
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: test-vlan
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  desiredState:
    interfaces:
    - name: enp4s0.375
      type: vlan
      state: up
      vlan:
        base-iface: enp4s0
        id: 375

The configuration fails with the following error message:

{"level":"info","ts":1609926055.297617,"logger":"enactmentstatus","msg":"status: {DesiredState:interfaces:\n- name: enp4s0.375\n  state: up\n  type: vlan\n  vlan:\n    base-iface: enp4s0\n    id: 375\n PolicyGeneration:1 Conditions:[{Type:Failing Status:True Reason:FailedToConfigure Message:error reconciling NodeNetworkConfigurationPolicy at desired state apply: , failed to execute nmstatectl set --no-commit --timeout 480: 'exit status 1' '' 'Traceback (most recent call last):\n  File \"/usr/bin/nmstatectl\", line 11, in <module>\n    load_entry_point('nmstate==0.3.4', 'console_scripts', 'nmstatectl')()\n  File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 67, in main\n    return args.func(args)\n  File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 267, in apply\n    args.save_to_disk,\n  File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 289, in apply_state\n    save_to_disk=save_to_disk,\n  File \"/usr/lib/python3.6/site-packages/libnmstate/netapplier.py\", line 69, in apply\n    net_state = NetState(desired_state, current_state, save_to_disk)\n  File \"/usr/lib/python3.6/site-packages/libnmstate/net_state.py\", line 40, in __init__\n    save_to_disk,\n  File \"/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py\", line 106, in __init__\n    self._pre_edit_validation_and_cleanup()\n  File \"/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py\", line 128, in _pre_edit_validation_and_cleanup\n    self._validate_over_booked_slaves()\n  File \"/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py\", line 423, in _validate_over_booked_slaves\n    f\"Interface {iface.name} slave {slave_name} is \"\nlibnmstate.error.NmstateValueError: Interface br-ex slave enp5s0 is already enslaved by interface bond0\n' LastHeartbeatTime:2021-01-06 09:40:55.297551833 +0000 UTC m=+480.815653246 LastTransitionTime:2021-01-06 09:40:55.297551833 +0000 UTC m=+480.815653246} {Type:Available 
Status:False Reason:FailedToConfigure Message: LastHeartbeatTime:2021-01-06 09:40:55.297552154 +0000 UTC m=+480.815653547 LastTransitionTime:2021-01-06 09:40:55.297552154 +0000 UTC m=+480.815653547} {Type:Progressing Status:False Reason:FailedToConfigure Message: LastHeartbeatTime:2021-01-06 09:40:55.297552411 +0000 UTC m=+480.815653802 LastTransitionTime:2021-01-06 09:40:55.297552411 +0000 UTC m=+480.815653802} {Type:Matching Status:True Reason:AllSelectorsMatching Message:All policy selectors are matching the node LastHeartbeatTime:2021-01-06 09:40:52 +0000 UTC LastTransitionTime:2021-01-06 09:40:52 +0000 UTC} {Type:Aborted Status:False Reason:SuccessfullyConfigured Message: LastHeartbeatTime:2021-01-06 09:40:55.297552791 +0000 UTC m=+480.815654181 LastTransitionTime:2021-01-06 09:40:55.297552791 +0000 UTC m=+480.815654181}]}","enactment":"worker-0-0.test-vlan"}

Note that the requested NNCP configuration does not involve the enp5s0 interface mentioned in the error message.
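The failure reason is buried inside the enactment status blob in the log above. A short script can pull it out; this sketch assumes the condition field names shown in that log (Type/Status/Message rendered as lowercase keys in the API), not a pinned contract:

```python
# Sketch: extract the failing condition from an enactment's conditions list.
# The list shape mirrors the log above; key names are assumptions based on
# how the NNCE status renders in YAML/JSON.

def failing_message(conditions):
    """Return the message of the first Failing=True condition, or None."""
    for cond in conditions:
        if cond.get("type") == "Failing" and cond.get("status") == "True":
            return cond.get("message")
    return None

conditions = [
    {"type": "Failing", "status": "True",
     "reason": "FailedToConfigure",
     "message": ("Interface br-ex slave enp5s0 is "
                 "already enslaved by interface bond0")},
    {"type": "Available", "status": "False", "reason": "FailedToConfigure"},
]

print(failing_message(conditions))
```

Run against the enactment above, this surfaces the NmstateValueError message without wading through the heartbeat timestamps.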


This is the networking layout on the nodes:

enp4s0: standalone NIC
bond0: enp5s0, enp6s0
br-ex: OVS bridge created during deployment, includes bond0


nmcli con
NAME                UUID                                  TYPE           DEVICE 
ovs-if-br-ex        e0b6fe95-b4c6-4bbb-8f15-50ad1bd6b718  ovs-interface  br-ex  
Wired connection 1  9ca44b7c-265d-3fe3-bc51-7e52d84ab74c  ethernet       enp4s0 
br-ex               49a80196-d3df-42d9-ac1b-33282d94ae8d  ovs-bridge     br-ex  
ovs-if-phys0        22fb5643-4768-4fff-839a-122a0868a6c5  bond           bond0  
ovs-port-br-ex      3c390181-35c4-4f6b-9fe6-464f10210121  ovs-port       br-ex  
ovs-port-phys0      69aaa7d4-450d-40f6-80f9-696ebbc6bc72  ovs-port       bond0  
System enp5s0       9310e179-14b6-430a-6843-6491c047d532  ethernet       enp5s0 
System enp6s0       b43fa2aa-5a85-7b0a-9a20-469067dba6d6  ethernet       enp6s0 
bond0               ad33d8b0-1f7b-cab9-9447-ba07f855b143  bond           --     


[core@worker-0-0 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:35:d4:fc brd ff:ff:ff:ff:ff:ff
    inet6 fd00:1101::552a:b19:e27a:4e9/128 scope global dynamic noprefixroute 
       valid_lft 2045sec preferred_lft 2045sec
    inet6 fe80::83db:b124:3a5d:20fd/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: enp5s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:a5:49:43 brd ff:ff:ff:ff:ff:ff
4: enp6s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:a5:49:43 brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 52:54:00:a5:49:43 brd ff:ff:ff:ff:ff:ff
7: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 42:0e:c2:af:fe:9d brd ff:ff:ff:ff:ff:ff
8: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 52:54:00:a5:49:43 brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.123/24 brd 192.168.123.255 scope global dynamic noprefixroute br-ex
       valid_lft 2229sec preferred_lft 2229sec
    inet 192.168.123.10/32 scope global br-ex
       valid_lft forever preferred_lft forever
    inet6 fe80::acb7:e581:b640:6c69/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[...]

Version-Release number of selected component (if applicable):
registry-proxy.engineering.redhat.com/rh-osbs/iib:35589
python3-libnmstate-0.3.4-17.el8_3.noarch
nmstate-0.3.4-17.el8_3.noarch

How reproducible:
100%

Steps to Reproduce:

1. Deploy OCP 4.7 via the baremetal IPI flow. Nodes have the following network layout: one NIC used for the provisioning network, and two NICs grouped in a bond used for the control plane network

2. Deploy CNV 2.6

3. Create the following NNCP

---
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: test-vlan
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  desiredState:
    interfaces:
    - name: enp4s0.375
      type: vlan
      state: up
      vlan:
        base-iface: enp4s0
        id: 375
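The policy's desired state is a single VLAN interface entry. The shape of that entry, including the conventional `<base-iface>.<id>` name, can be sketched as below; `build_vlan_iface` is a hypothetical helper for illustration, not part of nmstate's API:

```python
# Sketch of the VLAN desired-state entry the policy above asks for.
# build_vlan_iface is a hypothetical helper that mirrors the YAML fields.

def build_vlan_iface(base_iface, vlan_id):
    if not 1 <= vlan_id <= 4094:  # valid 802.1Q VLAN ID range
        raise ValueError(f"VLAN id {vlan_id} out of range 1-4094")
    return {
        "name": f"{base_iface}.{vlan_id}",  # conventional <base>.<id> naming
        "type": "vlan",
        "state": "up",
        "vlan": {"base-iface": base_iface, "id": vlan_id},
    }

iface = build_vlan_iface("enp4s0", 375)
print(iface["name"])  # enp4s0.375
```

Serialized to YAML, this is exactly the `interfaces` entry in the NNCP above.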

Actual results:

The NNCP fails to apply; the enactment reports FailedToConfigure.

Expected results:

The NNCP is applied successfully and the enp4s0.375 VLAN interface is created.

Additional info:

Attaching nmstate pods logs.

Comment 1 Petr Horáček 2021-01-07 08:19:48 UTC
Readable traceback:

{"level":"info","ts":1609926055.297617,"logger":"enactmentstatus","msg":"status: {DesiredState:interfaces:
- name: enp4s0.375
  state: up
  type: vlan
  vlan:
    base-iface: enp4s0
    id: 375
 PolicyGeneration:1 Conditions:[{Type:Failing Status:True Reason:FailedToConfigure Message:error reconciling NodeNetworkConfigurationPolicy at desired state apply: , failed to execute nmstatectl set --no-commit --timeout 480: 'exit status 1' '' 'Traceback (most recent call last):
  File \"/usr/bin/nmstatectl\", line 11, in <module>
    load_entry_point('nmstate==0.3.4', 'console_scripts', 'nmstatectl')()
  File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 67, in main
    return args.func(args)
  File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 267, in apply
    args.save_to_disk,
  File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 289, in apply_state
    save_to_disk=save_to_disk,
  File \"/usr/lib/python3.6/site-packages/libnmstate/netapplier.py\", line 69, in apply
    net_state = NetState(desired_state, current_state, save_to_disk)
  File \"/usr/lib/python3.6/site-packages/libnmstate/net_state.py\", line 40, in __init__
    save_to_disk,
  File \"/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py\", line 106, in __init__
    self._pre_edit_validation_and_cleanup()
  File \"/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py\", line 128, in _pre_edit_validation_and_cleanup
    self._validate_over_booked_slaves()
  File \"/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py\", line 423, in _validate_over_booked_slaves
    f\"Interface {iface.name} slave {slave_name} is \"
libnmstate.error.NmstateValueError: Interface br-ex slave enp5s0 is already enslaved by interface bond0
' LastHeartbeatTime:2021-01-06 09:40:55.297551833 +0000 UTC m=+480.815653246 LastTransitionTime:2021-01-06 09:40:55.297551833 +0000 UTC m=+480.815653246} {Type:Available Status:False Reason:FailedToConfigure Message: LastHeartbeatTime:2021-01-06 09:40:55.297552154 +0000 UTC m=+480.815653547 LastTransitionTime:2021-01-06 09:40:55.297552154 +0000 UTC m=+480.815653547} {Type:Progressing Status:False Reason:FailedToConfigure Message: LastHeartbeatTime:2021-01-06 09:40:55.297552411 +0000 UTC m=+480.815653802 LastTransitionTime:2021-01-06 09:40:55.297552411 +0000 UTC m=+480.815653802} {Type:Matching Status:True Reason:AllSelectorsMatching Message:All policy selectors are matching the node LastHeartbeatTime:2021-01-06 09:40:52 +0000 UTC LastTransitionTime:2021-01-06 09:40:52 +0000 UTC} {Type:Aborted Status:False Reason:SuccessfullyConfigured Message: LastHeartbeatTime:2021-01-06 09:40:55.297552791 +0000 UTC m=+480.815654181 LastTransitionTime:2021-01-06 09:40:55.297552791 +0000 UTC m=+480.815654181}]}","enactment":"worker-0-0.test-vlan"}
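The failing check (`_validate_over_booked_slaves`) rejects a desired state in which one port is claimed by two controllers. A simplified sketch of that failure mode follows; this is an illustration only, not libnmstate's actual implementation:

```python
# Simplified sketch of the over-booked-port check that raises in the
# traceback above. Illustration only, not libnmstate's implementation.

def validate_single_controller(controllers):
    """controllers maps a controller iface to the ports it claims.

    Raise ValueError if any port is claimed by more than one controller.
    """
    claimed = {}
    for controller, ports in controllers.items():
        for port in ports:
            if port in claimed:
                raise ValueError(
                    f"Interface {controller} slave {port} is "
                    f"already enslaved by interface {claimed[port]}"
                )
            claimed[port] = controller

# Per the error message, nmstate's derived view had bond0 owning
# enp5s0/enp6s0 while br-ex was also seen as claiming enp5s0,
# which trips the check even though the NNCP never touches enp5s0.
state = {"bond0": ["enp5s0", "enp6s0"], "br-ex": ["enp5s0"]}
try:
    validate_single_controller(state)
except ValueError as err:
    print(err)  # Interface br-ex slave enp5s0 is already enslaved by interface bond0
```

This matches the symptom: the bug is in how nmstate derived the current state of the OVS bridge/bond stack, not in the VLAN policy itself.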

It appears to be the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1913248

Setting the priority to urgent until we at least find a workaround.

Comment 3 Marius Cornea 2021-02-03 12:14:36 UTC
This is fixed on registry-proxy.engineering.redhat.com/rh-osbs/iib:42945

oc -n openshift-cnv exec -it nmstate-handler-mp4np -- rpm -q nmstate
nmstate-0.3.4-22.el8_3.noarch


oc get NodeNetworkConfigurationPolicy test-vlan -o yaml
apiVersion: nmstate.io/v1beta1
kind: NodeNetworkConfigurationPolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"nmstate.io/v1alpha1","kind":"NodeNetworkConfigurationPolicy","metadata":{"annotations":{},"name":"test-vlan"},"spec":{"desiredState":{"interfaces":[{"name":"enp4s0.375","state":"up","type":"vlan","vlan":{"base-iface":"enp4s0","id":375}}]},"nodeSelector":{"node-role.kubernetes.io/worker":""}}}
    nmstate.io/webhook-mutating-timestamp: "1612354322527864547"
  creationTimestamp: "2021-02-03T12:12:02Z"
  generation: 1
  managedFields:
  - apiVersion: nmstate.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
      f:spec:
        .: {}
        f:desiredState:
          .: {}
          f:interfaces: {}
        f:nodeSelector:
          .: {}
          f:node-role.kubernetes.io/worker: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2021-02-03T12:12:02Z"
  - apiVersion: nmstate.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:conditions: {}
    manager: manager
    operation: Update
    time: "2021-02-03T12:12:15Z"
  name: test-vlan
  resourceVersion: "117844"
  selfLink: /apis/nmstate.io/v1beta1/nodenetworkconfigurationpolicies/test-vlan
  uid: 29c79f14-3a01-4e81-acf8-0e358fc2ca0e
spec:
  desiredState:
    interfaces:
    - name: enp4s0.375
      state: up
      type: vlan
      vlan:
        base-iface: enp4s0
        id: 375
  nodeSelector:
    node-role.kubernetes.io/worker: ""
status:
  conditions:
  - lastHearbeatTime: "2021-02-03T12:12:25Z"
    lastTransitionTime: "2021-02-03T12:12:25Z"
    message: 2/2 nodes successfully configured
    reason: SuccessfullyConfigured
    status: "True"
    type: Available
  - lastHearbeatTime: "2021-02-03T12:12:25Z"
    lastTransitionTime: "2021-02-03T12:12:25Z"
    reason: SuccessfullyConfigured
    status: "False"
    type: Degraded

