Bug 1901859 - NodeNetworkConfigurationPolicy failed to retrieve default gw - create VLAN interface
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.5.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.8.0
Assignee: Quique Llorente
QA Contact: Meni Yakove
URL:
Whiteboard:
Duplicates: 1879458
Depends On:
Blocks:
Reported: 2020-11-26 09:38 UTC by Robert Bohne
Modified: 2022-08-09 10:08 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 14:21:17 UTC
Target Upstream Version:
Embargoed:


Attachments
NodeNetworkState (4.83 KB, text/plain), 2020-11-26 09:38 UTC, Robert Bohne
NodeNetworkConfigurationEnactment (2.34 KB, text/plain), 2020-11-26 09:39 UTC, Robert Bohne
NodeNetworkConfigurationEnactment-with-ip (13.35 KB, text/plain), 2020-11-26 09:39 UTC, Robert Bohne
NodeNetworkState-with-ip (4.83 KB, text/plain), 2020-11-26 09:40 UTC, Robert Bohne
NodeNetworkState-with-ip-false (4.83 KB, text/plain), 2020-11-26 10:57 UTC, Robert Bohne
NodeNetworkConfigurationEnactment-with-ip-false (2.40 KB, text/plain), 2020-11-26 10:58 UTC, Robert Bohne


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:2920 0 None None None 2021-07-27 14:22:27 UTC

Description Robert Bohne 2020-11-26 09:38:29 UTC
Created attachment 1733676 [details]
NodeNetworkState

Description of problem:

We want to create a VLAN interface:

oc apply -f - <<EOF
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: vlan-ens3f1-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: "ocp-master1"
  desiredState:
    interfaces:
    - name: ens3f1.602
      description: VLAN using ens3f1
      type: vlan
      state: up
      vlan:
        base-iface: ens3f1
        id: 602
EOF

and got the error: "error reconciling NodeNetworkConfigurationPolicy at desired state apply: , rolling back desired state configuration: failed runnig probes after network changes: failed to retrieve default gw at runProbes: timed out waiting for the condition"

During the installation, we created a VLAN interface via Live ISO for the OpenShiftSDN:

[root@ocp-lb ~]# oc debug node/ocp-master1
Creating debug namespace/openshift-debug-node-m6wfb ...
Starting pod/ocp-master1-debug ...
To use host binaries, run `chroot /host`
Pod IP: 172.29.26.52
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# nmcli con show
NAME                UUID                                  TYPE      DEVICE
VLAN connection 1   561dcb98-d2e2-49cb-9485-8d43bfbdd1d1  vlan      ens2f1.602
Wired connection 1  00a6d56a-4cd5-3e15-ba80-7a9358522368  ethernet  --
Wired connection 2  b7bab4a4-99f7-379e-bda7-acd3e1e2396d  ethernet  --
Wired connection 3  b59db2e4-6794-3d45-ba16-4e32aa2ea89a  ethernet  --
Wired connection 4  4172378f-325c-386f-865b-c043539360dd  ethernet  --
Wired connection 5  2a8f681d-0481-3382-856f-4a85859b06ab  ethernet  --
Wired connection 6  f6bad38f-7e2c-3153-b8a8-a232ee3fec8d  ethernet  --


While the NodeNetworkConfigurationPolicy is progressing, the new connection shows up:
sh-4.4# nmcli con show
NAME                UUID                                  TYPE      DEVICE
VLAN connection 1   561dcb98-d2e2-49cb-9485-8d43bfbdd1d1  vlan      ens2f1.602
ens3f1.602          64cebaa5-5d79-435d-af1d-849c5545cd6a  vlan      ens3f1.602
...

Attached NodeNetworkState and NodeNetworkConfigurationEnactment

I tried adding
        ipv4:
          enabled: true
          dhcp: false
to my policy, which produced a different error:

[root@ocp-lb ~]# oc get nnce/ocp-master1.vlan-ens3f1-policy -o json | jq '.status.conditions[0].message' -r
error reconciling NodeNetworkConfigurationPolicy at desired state apply: , failed to execute nmstatectl set --no-commit --timeout 480: 'exit status 1' '' '2020-11-26 09:29:31,434 root         DEBUG    Checkpoint /org/freedesktop/NetworkManager/Checkpoint/13 created for all devices: 480
2020-11-26 09:29:31,435 root         DEBUG    Adding new interfaces: ['ens3f1.602']
2020-11-26 09:29:31,437 root         DEBUG    Editing interfaces: ['ens2f1.602']
2020-11-26 09:29:31,439 root         DEBUG    Executing NM action: func=add_connection_async
2020-11-26 09:29:31,455 root         DEBUG    Connection adding succeeded: dev=ens3f1.602
2020-11-26 09:29:31,455 root         DEBUG    Executing NM action: func=commit_changes_async
2020-11-26 09:29:31,459 root         DEBUG    Connection update succeeded: dev=ens2f1.602
2020-11-26 09:29:31,459 root         DEBUG    Executing NM action: func=_safe_modify_async
2020-11-26 09:29:31,463 root         DEBUG    Device reapply succeeded: dev=ens2f1.602
2020-11-26 09:29:31,463 root         DEBUG    Executing NM action: func=safe_activate_async
2020-11-26 09:29:31,465 root         DEBUG    Connection activation initiated: dev=ens3f1.602, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2020-11-26 09:29:31,484 root         DEBUG    Connection activation succeeded: dev=ens3f1.602, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev-state=<enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>, state-flags=<flags NM_ACTIVATION_STATE_FLAG_LAYER2_READY | NM_ACTIVATION_STATE_FLAG_IP4_READY | NM_ACTIVATION_STATE_FLAG_IP6_READY of type NM.ActivationStateFlags>
2020-11-26 09:29:31,484 root         DEBUG    NM action queue exhausted, quiting mainloop
2020-11-26 09:29:36,867 root         DEBUG    Checkpoint /org/freedesktop/NetworkManager/Checkpoint/13 rollback executed: dbus.Dictionary({dbus.String('/org/freedesktop/NetworkManager/Devices/92'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/451'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/96'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/118'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/67'): db
us.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/49'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/3'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/81'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/127'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/102'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/57'): dbus.UInt32(0), dbus.String('/org/freedesktop/Network
Manager/Devices/68'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/91'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/65'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/93'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/95'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/82'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/99'): dbus.UInt32(0), dbus.String('/
org/freedesktop/NetworkManager/Devices/35'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/30'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/34'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/39'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/449'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/64'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/134'): dbus.
UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/128'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/452'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/72'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/24'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/141'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/142'): dbus.UInt32(0), dbus.String('/org/freedesktop/Network
Manager/Devices/53'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/4'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/122'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/146'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/63'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/148'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/9'): dbus.UInt32(0), dbus.String('
/org/freedesktop/NetworkManager/Devices/33'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/98'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/140'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/20'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/136'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/55'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/7'): dbus.
UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/110'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/108'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/143'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/130'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/145'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/25'): dbus.UInt32(0), dbus.String('/org/freedesktop/Networ
kManager/Devices/74'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/23'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/104'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/97'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/29'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/22'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/50'): dbus.UInt32(0), dbus.String(
'/org/freedesktop/NetworkManager/Devices/101'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/150'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/459'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/137'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/28'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/8'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/54'): db
us.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/117'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/36'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/103'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/6'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/115'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/66'): dbus.UInt32(0), dbus.String('/org/freedesktop/Networ
kManager/Devices/73'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/71'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/51'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/26'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/138'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/58'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/47'): dbus.UInt32(0), dbus.String(
'/org/freedesktop/NetworkManager/Devices/144'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/120'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/52'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/121'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/21'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/32'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/135'): d
bus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/133'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/119'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/1'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/5'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/100'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/129'): dbus.UInt32(0), dbus.String('/org/freedesktop/Netwo
rkManager/Devices/107'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/139'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/113'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/458'): dbus.UInt32(0)}, signature=dbus.Signature('su'))
Traceback (most recent call last):
  File "/usr/bin/nmstatectl", line 11, in <module>
    load_entry_point('nmstate==0.2.6', 'console_scripts', 'nmstatectl')()
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 59, in main
    return args.func(args)
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 221, in apply
    return apply_state(statedata, args.verify, args.commit, args.timeout)
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 241, in apply_state
    rollback_timeout=timeout,
  File "/usr/lib/python3.6/site-packages/libnmstate/deprecation.py", line 40, in wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/libnmstate/nm/nmclient.py", line 96, in wrapped
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 73, in apply
    state.State(desired_state), verify_change, commit, rollback_timeout
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 175, in _apply_ifaces_state
    _verify_change(desired_state)
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 221, in _verify_change
    verifiable_desired_state.verify_interfaces(current_state)
  File "/usr/lib/python3.6/site-packages/libnmstate/state.py", line 331, in verify_interfaces
    self._assert_interfaces_equal(other_state)
  File "/usr/lib/python3.6/site-packages/libnmstate/state.py", line 751, in _assert_interfaces_equal
    current_state.interfaces[ifname],
libnmstate.error.NmstateVerificationError:
desired
=======
---
name: ens3f1.602
type: vlan
state: up
description: VLAN using ens3f1
ipv4:
  dhcp: false
  enabled: true
ipv6:
  enabled: false
mac-address: D4:F5:EF:1A:25:78
mtu: 1500
vlan:
  base-iface: ens3f1
  id: 602

current
=======
---
name: ens3f1.602
type: vlan
state: up
description: VLAN using ens3f1
ipv4:
  enabled: false
ipv6:
  enabled: false
mac-address: D4:F5:EF:1A:25:78
mtu: 1500
vlan:
  base-iface: ens3f1
  id: 602

difference
==========
--- desired
+++ current
@@ -4,8 +4,7 @@
 state: up
 description: VLAN using ens3f1
 ipv4:
-  dhcp: false
-  enabled: true
+  enabled: false
 ipv6:
   enabled: false
 mac-address: D4:F5:EF:1A:25:78


'

Attached NodeNetworkState-with-ip and NodeNetworkConfigurationEnactment-with-ip


Version-Release number of selected component (if applicable):

CNV: 2.5.1
OCP: 4.6.4

How reproducible:


Steps to Reproduce:
1. Apply NodeNetworkConfigurationPolicy from above.

Actual results:

Interface ens3f1.602 is missing


Expected results:

The ens3f1.602 interface is created.


Additional info:

The final setup would be a Linux bridge for VMs, attached to ens3f1.602.

We followed the official documentation: https://docs.openshift.com/container-platform/4.6/virt/node_network/virt-updating-node-network-config.html#virt-example-vlan-nncp_virt-updating-node-network-config

Comment 1 Robert Bohne 2020-11-26 09:39:22 UTC
Created attachment 1733677 [details]
NodeNetworkConfigurationEnactment

Comment 2 Robert Bohne 2020-11-26 09:39:41 UTC
Created attachment 1733678 [details]
NodeNetworkConfigurationEnactment-with-ip

Comment 3 Robert Bohne 2020-11-26 09:40:12 UTC
Created attachment 1733679 [details]
NodeNetworkState-with-ip

Comment 4 Robert Bohne 2020-11-26 09:40:52 UTC
Here is the NNCP with IP:

oc apply -f - <<EOF
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: vlan-ens3f1-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: "ocp-master1"
  desiredState:
    interfaces:
    - name: ens3f1.602
      description: VLAN using ens3f1
      type: vlan
      state: up
      ipv4:
          enabled: true
          dhcp: false
      vlan:
        base-iface: ens3f1
        id: 602
EOF

Comment 5 Robert Bohne 2020-11-26 09:53:42 UTC
The nns says ens3f1 is down, but ip link show reports state UP?

[root@ocp-lb ~]# oc debug node/ocp-master1
Creating debug namespace/openshift-debug-node-qmms2 ...
Starting pod/ocp-master1-debug ...
To use host binaries, run `chroot /host`
Pod IP: 172.29.26.52
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4#  ip link show dev ens3f1
3: ens3f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether d4:f5:ef:1a:25:78 brd ff:ff:ff:ff:ff:ff

[root@ocp-lb ~]# oc get nns ocp-master1 -o jsonpath="{.status.currentState.interfaces[?(@.name=='ens3f1')]}"  | jq
{
  "ethernet": {
    "auto-negotiation": false,
    "duplex": "full",
    "speed": 10000,
    "sr-iov": {
      "total-vfs": 0,
      "vfs": []
    }
  },
  "ipv4": {
    "enabled": false
  },
  "ipv6": {
    "enabled": false
  },
  "mac-address": "D4:F5:EF:1A:25:78",
  "mtu": 1500,
  "name": "ens3f1",
  "state": "down",
  "type": "ethernet"
}

Comment 6 Petr Horáček 2020-11-26 09:56:35 UTC
Hello Robert,

Could you try it with the following? "ipv4: {enabled: true}" requires an IP to be set on the interface. Since you disabled DHCP and set no static address, there is none.

      ipv4:
          enabled: false

This would create the interface without an IP.
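
The rule above can be sketched as a small pre-flight check in Python. This is only an illustration: the helper name ipv4_needs_address is hypothetical and is not part of nmstate or kubernetes-nmstate. It encodes just what this comment states, namely that ipv4 enabled with neither DHCP nor a static address leaves nothing to configure.

```python
def ipv4_needs_address(iface: dict) -> bool:
    """Return True when ipv4 is enabled but has no way to obtain an address.

    Hypothetical helper for illustration; not an nmstate API.
    """
    ipv4 = iface.get("ipv4", {})
    enabled = ipv4.get("enabled", False)
    has_dhcp = ipv4.get("dhcp", False)
    has_static = bool(ipv4.get("address"))
    return bool(enabled) and not has_dhcp and not has_static


# The stanza from the failing policy: enabled, DHCP off, no static address.
failing = {"name": "ens3f1.602", "ipv4": {"enabled": True, "dhcp": False}}
# The alternative suggested in this comment: ipv4 disabled entirely.
suggested = {"name": "ens3f1.602", "ipv4": {"enabled": False}}

print(ipv4_needs_address(failing))    # True
print(ipv4_needs_address(suggested))  # False
```

Running such a check against each interface in desiredState before applying a policy would flag the stanza that led to the NmstateVerificationError in the description.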

Comment 7 Robert Bohne 2020-11-26 10:54:04 UTC
Hello Petr,

I tried it and it failed too:

oc apply -f - <<EOF
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: vlan-ens3f1-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: "ocp-master1"
  desiredState:
    interfaces:
    - name: ens3f1.602
      description: VLAN using ens3f1
      type: vlan
      state: up
      ipv4:
          enabled: false
          dhcp: false
      vlan:
        base-iface: ens3f1
        id: 602
EOF

Error: 

[root@ocp-lb ~]# oc get nodenetworkconfigurationenactment.nmstate.io/ocp-master1.vlan-ens3f1-policy -o jsonpath='{.status.conditions[?(@.type=="Failing")].message}'
error reconciling NodeNetworkConfigurationPolicy at desired state apply: , rolling back desired state configuration: failed runnig probes after network changes: failed to retrieve default gw at runProbes: timed out waiting for the condition

I will attach NodeNetworkState-with-ip-false and NodeNetworkConfigurationEnactment-with-ip-false in a second.

Comment 8 Robert Bohne 2020-11-26 10:57:53 UTC
Created attachment 1733695 [details]
NodeNetworkState-with-ip-false

Comment 9 Robert Bohne 2020-11-26 10:58:14 UTC
Created attachment 1733696 [details]
NodeNetworkConfigurationEnactment-with-ip-false

Comment 10 Petr Horáček 2020-11-26 12:24:03 UTC
Robert, would you please provide the log of knmstate from the recent run? It should hopefully show us whether nmstatectl did something to the enp2s0 default interface (it did in your original setup, but that may have been due to DHCP).

Comment 12 Petr Horáček 2020-11-26 13:09:48 UTC
So before the configuration, we have the management IP set on ens2f1.602, and we also have a 0.0.0.0/0 route on it:

    - ipv4:
        address:
        - ip: 172.29.26.52
          prefix-length: 24
        dhcp: false
        enabled: true

      - destination: 0.0.0.0/0
        metric: 402
        next-hop-address: 172.29.26.33
        next-hop-interface: ens2f1.602
        table-id: 254

After the configuration, when we run our connectivity probes, we still see the static IP, but we don't have any default route set:

      "ipv4": {
        "address": [
          {
            "ip": "172.29.26.52",
            "prefix-length": 24
          }
        ],
        "dhcp": false,
        "enabled": true
      },

The new connection has DHCP clearly disabled:

      "ipv4": {
        "dhcp": false,
        "enabled": false
      },

-------------

It seems that nmstatectl removed the default GW even though your default interface was not explicitly touched.
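
The before/after comparison above can be expressed as a short Python sketch. The route entry is copied from the state quoted in this comment; the function name default_routes is hypothetical, and how the route lists are pulled out of a NodeNetworkState is left out on purpose.

```python
def default_routes(routes: list) -> list:
    """Return the IPv4 default routes (destination 0.0.0.0/0) from a route list."""
    return [r for r in routes if r.get("destination") == "0.0.0.0/0"]


# Route entry as reported before the policy was applied.
before = [
    {
        "destination": "0.0.0.0/0",
        "metric": 402,
        "next-hop-address": "172.29.26.33",
        "next-hop-interface": "ens2f1.602",
        "table-id": 254,
    }
]
# After the policy was applied, no default route was reported.
after = []

# Default routes that existed before but are gone afterwards.
lost = [r for r in default_routes(before) if r not in default_routes(after)]
for route in lost:
    print("default route lost via", route["next-hop-interface"])
```

With the data above, the single default route via ens2f1.602 is reported as lost, which matches the "failed to retrieve default gw" probe failure.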

We have a workaround for this: setting the default GW explicitly in the policy. However, this is not the first time we have seen this issue, so we should investigate it properly.

Quique, would you find some time to assist Robert with the workaround? But please don't let him get away until we get a proper fix for this bug <.<

Comment 19 Quique Llorente 2020-11-27 11:39:43 UTC
I have been able to ensure that nmstate does not remove the VLAN's default gw by adding the whole config to the policy.

The values are from my env, so you will have to extrapolate.

apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: vlan-bug
spec:
  nodeSelector:
    kubernetes.io/hostname: "node02"
  desiredState:
    routes:
      config:
      - destination: 0.0.0.0/0
        next-hop-address: 192.168.66.2
        next-hop-interface: eth0.602
    interfaces:
    - name: eth0.602
      description: VLAN using ens3f1
      type: vlan
      state: up
      ipv4:
        address:
        - ip: 172.29.25.52
          prefix-length: 24
        dhcp: false
        enabled: true
      vlan:
        base-iface: eth0
        id: 602
    - name: eth1.602
      description: VLAN using ens3f1
      type: vlan
      state: up
      ipv4:
        address:
        - ip: 172.29.26.52
          prefix-length: 24
        dhcp: false
        enabled: true
      vlan:
        base-iface: eth1
        id: 602

Comment 20 Robert Bohne 2020-12-04 20:28:25 UTC
After some investigation together with Quique - thanks again! 

The workaround is to add the primary interface to the NNCP; its static IP address is inherited from the existing NetworkManager configuration.


apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: network-config-ocp-master1
spec:
  nodeSelector:
    kubernetes.io/hostname: "ocp-master1"
  desiredState:
    interfaces:
    - name: ens2f1.602                  # Primary interface, configured during installation via live ISO + nmcli/nmtui (static IP).
      description: VLAN using ens2f1
      type: vlan
      state: up
      ipv4:
        dhcp: false
        enabled: true                   # Enabled because the configuration is inherited from the existing NM config.
      vlan:
        base-iface: ens2f1
        id: 602
    - name: ens3f1.602
      description: VLAN using ens3f1
      type: vlan
      state: up
      ipv4:
        dhcp: false
        enabled: false
      vlan:
        base-iface: ens3f1
        id: 602
    - name: br1
      description: Linux bridge with ens3f1.602 as a port
      type: linux-bridge
      state: up
      ipv4:
        enabled: false
        dhcp: false
      bridge:
        options:
          stp:
            enabled: false
        port:
          - name: ens3f1.602


@Quique sadly I don't have access to the customer env. anymore. If I remember correctly you were able to reproduce the bug. Thank you very much again for your support!

Comment 21 Gris Ge 2020-12-18 03:58:56 UTC
CNV seems to ignore the metric of the route (default gateway), but nmstate does not.
This causes nmstate to merge the desired state with the current state and generate two default gateways.

nmstate-1.0.0-1.el8 supports multiple gateways, so there is no problem there.

Comment 22 Petr Horáček 2020-12-18 08:21:29 UTC
Gris, this indeed explains the issues with the route setup workaround. However, that issue was secondary. The main problem is that configuration of a VLAN interface (as described in https://bugzilla.redhat.com/show_bug.cgi?id=1901859#c0) deletes the default route of the system. IIUIC there is no connection between the two interfaces and VLAN config should not affect host's connectivity. Any clue what might have caused this? I understand if it is impossible to figure this out without a system reproducing it.

Comment 23 Gris Ge 2020-12-18 09:41:12 UTC
No idea.

Let me backport the multiple gateway support into nmstate-0.3 and we try again there.

Comment 24 Gris Ge 2020-12-22 05:12:19 UTC
I took a second look at this bug.

Initially, the failure in https://bugzilla.redhat.com/show_bug.cgi?id=1901859#c0 was caused by ipv4 being enabled without an IP address.

Then, the multiple default gateways were caused by the default gateway in the desired state having metric 402, which differs from the running config. Hence nmstate treats it as two default gateways, which leads to the NmstateNotImplementedError.
A better way to set a default gateway would be:

```yml
routes:
  config:
  - destination: 0.0.0.0/0
    state: absent
  - destination: 0.0.0.0/0
    next-hop-address: 192.0.2.1
    next-hop-interface: eth1
```
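
The metric mismatch described above can be illustrated with a small Python sketch. It is not nmstate's actual merge code: route_key and merge_routes are hypothetical helpers, and the sketch only assumes that a route's identity includes its metric. The gateway address and interface are taken from earlier in this bug; the running-config metric is not stated here, so it is simply left out.

```python
def route_key(route: dict) -> tuple:
    """Identity of a route in this sketch: every field, metric included."""
    return (
        route["destination"],
        route["next-hop-address"],
        route["next-hop-interface"],
        route.get("metric"),
    )


def merge_routes(current: list, desired: list) -> list:
    """Union of current and desired routes, de-duplicated by route_key."""
    merged = {route_key(r): r for r in current}
    merged.update({route_key(r): r for r in desired})
    return list(merged.values())


# Running config: the metric value is not given in this report, so it is omitted.
current = [{"destination": "0.0.0.0/0", "next-hop-address": "172.29.26.33",
            "next-hop-interface": "ens2f1.602"}]
# Desired state: same gateway, but with metric 402.
desired = [{"destination": "0.0.0.0/0", "next-hop-address": "172.29.26.33",
            "next-hop-interface": "ens2f1.602", "metric": 402}]

merged = merge_routes(current, desired)
print(len(merged))  # 2
```

Because the two entries differ only in metric, the merge keeps both, which is the "two default gateways" situation. Marking the existing 0.0.0.0/0 route as absent first, as in the YAML above, avoids carrying the old entry into the merge.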

I tried it in my VM. nmstate-0.3 does not remove the default gateway when adding a new VLAN, provided the default gateway was created by NetworkManager.

IP addresses and routes created by other tools (like the ip command or kernel/dracut options) are not supported by nmstate-0.3 yet. 1.0 in RHEL 8.4 should work well there.

To continue debugging this issue, the nmstate logs and the pre-nmstate network state would help.

Comment 25 Quique Llorente 2020-12-22 09:43:35 UTC
(In reply to Gris Ge from comment #24)
> I take a second look on this bug.
> 
> Initially, the comment
> https://bugzilla.redhat.com/show_bug.cgi?id=1901859#c0 is caused by ipv4
> enabled without IP address.
> 
> Then, the multiple default gateway is caused by in desire state the default
> gateways has metric 402 which is different from running config, hence
> nmstate treat it as two default gateway which leads to the
> NmstateNotImplementedError error.
> The better way to set a default gateway should be:
> 
> ```yml
> routes:
>   config:
>   - destination: 0.0.0.0/0
>     state: absent
>   - destination: 0.0.0.0/0
>     next-hop-address: 192.0.2.1
>     next-hop-interface: eth1
> ```
> 
> I tried in my VM. The nmstate-0.3 does not remove the default gateways when
> adding new vlan when the default gateways was created by NetworkManager.
> 
> For ip address and routes created by other tool(like ip command or
> kernel/dracut option), nmstate-0.3 does not support it yet. 1.0 in RHEL 8.4
> should works well there.

But it has been configured with NetworkManager, as stated in https://bugzilla.redhat.com/show_bug.cgi?id=1901859#c13, so nmstate should be aware of it.

> 
> To continue debuging this issue, the nmstate logs and pre-nmstate network
> state could helps.

We don't have this env anymore. We have a similar bz, https://bugzilla.redhat.com/show_bug.cgi?id=1879458; maybe we can get the needed info there.

Also, it would be nice to retest this once the duplicate default gw issue in nmstate (https://bugzilla.redhat.com/show_bug.cgi?id=1909729) is fixed, to see what happens.

Comment 26 Petr Horáček 2021-02-03 13:40:05 UTC
@fge thanks for your help here. Would you please check Quique's questions and suggestions in the comment above?

Comment 27 Petr Horáček 2021-02-11 13:20:23 UTC
*** Bug 1879458 has been marked as a duplicate of this bug. ***

Comment 30 Gris Ge 2021-02-24 03:52:36 UTC
Hi Quique,


The fix for bug 1909729 has been shipped in RHEL 8.3.0.z. Could you check again whether it fixes this bug?

Comment 31 Quique Llorente 2021-02-24 10:46:28 UTC
(In reply to Gris Ge from comment #30)
> Hi Quique,
> 
> 
> The bug 1909729 has been shipped to RHEL 8.3.0.z. Could check again whether
> it fix this bug or not?

@ysegev is going to verify it.

Comment 32 Yossi Segev 2021-02-24 11:50:06 UTC
Hi Gris,

Can you please specify in which nmstate (or knmstate-handler) version it was fixed, so we can be sure the fix exists on our cluster before trying to verify?

Thanks.

Comment 33 Gris Ge 2021-02-25 04:17:51 UTC
Hi Yossi:

nmstate-0.3.4-25.el8_3

Comment 36 Yossi Segev 2021-03-04 13:07:37 UTC
Verified on a cluster with the following versions:
nmstate-0.3.4-25.el8_3.noarch
kubernetes-nmstate-handler-container-v4.8.0-3 (6a66d2c9e338103d5573289afdeb856c4d1f2b86669206851eef189ea5d0e88f)
OCP: 4.8.0-0.nightly-2021-03-04-014703
CNV: 4.8.0

Verified by running the original scenario from the bug description (with adjustments to the cluster in use - selected worker hostname and NIC name), by applying this NNCP:
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: vlan-ens8-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: "network01-9rfjb-worker-0-jscck"
  desiredState:
    interfaces:
    - name: ens8.602
      description: VLAN using ens8
      type: vlan
      state: up
      vlan:
        base-iface: ens8
        id: 602

Results:
1. NNCP successfully configured.
2. No error message.
3. Configured VLAN interface (ens8.602) exists on the selected node.

Comment 39 errata-xmlrpc 2021-07-27 14:21:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.8.0 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2920

