Bug 1724901

Summary: Random failure on creating/removing OVS bridge
Product: Red Hat Enterprise Linux 8 Reporter: Jianlin Shi <jishi>
Component: nmstateAssignee: Gris Ge <fge>
Status: CLOSED CURRENTRELEASE QA Contact: Mingyu Shi <mshi>
Severity: medium Docs Contact:
Priority: medium    
Version: 8.1CC: jiji, network-qe, pasik, phoracek, till
Target Milestone: rc   
Target Release: 8.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-02-20 08:56:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1787989    
Bug Blocks:    

Description Jianlin Shi 2019-06-28 04:02:52 UTC
Description of problem:
fail to add ovs-bridge when add and delete ovs-bridge repeatedly

Version-Release number of selected component (if applicable):
nmstate-0.0.7-1.el8.noarch

How reproducible:
Always

Steps to Reproduce:
#!/bin/bash

ip link del veth0
ip link del veth1
ip link add veth0 type veth peer name veth1
ip link set veth1 up
sleep 2

cat > ovs_bridge_create.yaml << EOF
---
interfaces:
  - name: veth0
    type: ethernet
    state: up
  - name: ovs0
    type: ovs-interface
    state: up
    ipv4:
      enabled: true
      address:
        - ip: 192.0.2.1
          prefix-length: 24
  - name: ovs-br0
    type: ovs-bridge
    state: up
    bridge:
      options:
        fail-mode: ''
        mcast-snooping-enable: false
        rstp: false
        stp: true
      port:
        - name: veth0
          type: system
        - name: ovs0
          type: internal
EOF

cat > ovs_bridge_delete.yaml << EOF
---
interfaces:
  - name: ovs-br0
    type: ovs-bridge
    state: absent
    bridge:
      options:
        fail-mode: ''
        mcast-snooping-enable: false
        rstp: false
        stp: true
      port:
        - name: veth0
          type: system
        - name: ovs0
          type: internal
EOF

for x in `seq 1 100`; do nmstatectl set ovs_bridge_create.yaml || break; nmstatectl set ovs_bridge_delete.yaml || break; done; echo $x 

Actual results:
create or delete would fail

Expected results:
no fail

Additional info:

[root@kvm-06-guest02 ovs]# rpm -q openvswitch
openvswitch-2.9.0-3.el8+7.x86_64
[root@kvm-06-guest02 ovs]# rpm -q NetworkManager
NetworkManager-1.20.0-0.1.el8.x86_64
[root@kvm-06-guest02 ovs]# uname -a
Linux kvm-06-guest02.hv2.lab.eng.bos.redhat.com 4.18.0-107.el8.x86_64 #1 SMP Fri Jun 14 13:46:34 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Comment 1 Till Maas 2019-11-19 09:40:02 UTC
This seems to be a lot better with NM from the master branch in the upstream CI.

Comment 2 Mingyu Shi 2019-12-16 07:19:00 UTC
In the test recently, it seems OK (but not always) when test with debug/kasan kernel, but very easy to get failed with normal kernel:

[root@hp-dl385g7-04 /mnt/tests/kernel/networking/nmstate/libnmstate]2# rpm -q nmstate NetworkManager openvswitch
nmstate-0.2.0-2.el8.noarch
NetworkManager-1.22.0-0.2.el8.x86_64
openvswitch-2.9.0-3.el8+7.x86_64

Do as following:
ip netns add client
ip netns add server
ip link add vethc type veth peer name veth0_c_p netns client
ip link add veths type veth peer name veth0_s_p netns server
ip netns exec client ip link set lo up
ip netns exec client ip link set veth0_c_p up
ip netns exec client ip addr add 192.168.0.1/24 dev veth0_c_p
ip netns exec client ip addr add 2000::1/64 dev veth0_c_p
ip netns exec server ip link set lo up
ip netns exec server ip link set veth0_s_p up
ip netns exec server ip addr add 192.168.0.2/24 dev veth0_s_p
ip netns exec server ip addr add 2000::2/64 dev veth0_s_p
nmcli device set vethc managed yes
nmcli device set veths managed yes
sleep 5

	cat > ovs-bridge-add.yaml << EOF
---
interfaces:
  - name: vethc
    type: ethernet
    state: up
  - name: veths
    type: ethernet
    state: up
  - name: ovs0
    type: ovs-interface
    state: up
    ipv4:
      enabled: true
      address:
        - ip: 192.168.0.254
          prefix-length: 24
    ipv6:
      enabled: true
      address:
        - ip: 2000::a
          prefix-length: 64
  - name: ovs-br0
    type: ovs-bridge
    state: up
    bridge:
      options:
        fail-mode: ''
        mcast-snooping-enable: false
        rstp: false
        stp: true
      port:
        - name: ovs0
          type: internal
        - name: vethc
          type: system
        - name: veths
          type: system
EOF

nmstatectl set ovs-bridge-add.yaml
2019-12-16 02:10:49,241 root         DEBUG    Checkpoint /org/freedesktop/NetworkManager/Checkpoint/1 created for all devices: 60
2019-12-16 02:10:49,241 root         DEBUG    Adding new interfaces: ['ovs-br0', 'ovs0']
2019-12-16 02:10:49,244 root         DEBUG    Connection settings for ConnectionSetting.create:
id: ovs-br0
iface: ovs-br0
uuid: 9649c21b-fb45-4951-99a8-799bd843bb3d
type: ovs-bridge
autoconnect: True
autoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>
2019-12-16 02:10:49,246 root         DEBUG    Connection settings for ConnectionSetting.create:
id: ovs0
iface: ovs0
uuid: 41e9da00-c810-42a8-bbbc-34b7024b4343
type: ovs-interface
autoconnect: True
autoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>
2019-12-16 02:10:49,248 root         DEBUG    Connection settings for ConnectionSetting.create:
id: ovs-port-ovs0
iface: ovs-port-ovs0
uuid: 9feda49e-b4da-4853-a317-ac25245dc32f
type: ovs-port
autoconnect: True
autoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>
2019-12-16 02:10:49,253 root         DEBUG    Editing interfaces: ['vethc', 'veths']
2019-12-16 02:10:49,253 root         DEBUG    Connection settings for ConnectionSetting.create:
id: vethc
iface: vethc
uuid: 6d1e5e09-6e68-4b6d-9ecc-a05d3afca27c
type: 802-3-ethernet
autoconnect: True
autoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>
2019-12-16 02:10:49,255 root         DEBUG    Connection settings for ConnectionSetting.create:
id: veths
iface: veths
uuid: ae98d8ad-3126-4d7d-8472-3f389b14238c
type: 802-3-ethernet
autoconnect: True
autoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>
2019-12-16 02:10:49,256 root         DEBUG    Connection settings for ConnectionSetting.create:
id: ovs-port-vethc
iface: ovs-port-vethc
uuid: 6b953778-9c8e-40a9-8e7e-852047525f3e
type: ovs-port
autoconnect: True
autoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>
2019-12-16 02:10:49,257 root         DEBUG    Connection settings for ConnectionSetting.create:
id: ovs-port-veths
iface: ovs-port-veths
uuid: ca74fffe-b2ba-437d-8cc2-c6a43af9febb
type: ovs-port
autoconnect: True
autoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>
2019-12-16 02:10:49,261 root         DEBUG    Executing NM action: func=add_connection_async
2019-12-16 02:10:49,277 root         DEBUG    Connection adding succeeded: dev=ovs-br0
2019-12-16 02:10:49,277 root         DEBUG    Executing NM action: func=add_connection_async
2019-12-16 02:10:49,293 root         DEBUG    Connection adding succeeded: dev=ovs0
2019-12-16 02:10:49,294 root         DEBUG    Executing NM action: func=add_connection_async
2019-12-16 02:10:49,307 root         DEBUG    Connection adding succeeded: dev=ovs-port-ovs0
2019-12-16 02:10:49,307 root         DEBUG    Executing NM action: func=add_connection_async
2019-12-16 02:10:49,324 root         DEBUG    Connection adding succeeded: dev=vethc
2019-12-16 02:10:49,325 root         DEBUG    Executing NM action: func=add_connection_async
2019-12-16 02:10:49,339 root         DEBUG    Connection adding succeeded: dev=veths
2019-12-16 02:10:49,340 root         DEBUG    Executing NM action: func=add_connection_async
2019-12-16 02:10:49,352 root         DEBUG    Connection adding succeeded: dev=ovs-port-vethc
2019-12-16 02:10:49,352 root         DEBUG    Executing NM action: func=add_connection_async
2019-12-16 02:10:49,541 root         DEBUG    Connection adding succeeded: dev=ovs-port-veths
2019-12-16 02:10:49,542 root         DEBUG    Executing NM action: func=safe_activate_async
2019-12-16 02:10:49,560 root         DEBUG    Connection activation initiated: dev=ovs-br0, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2019-12-16 02:10:49,780 root         DEBUG    Connection activation succeeded: dev=ovs-br0, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>, dev-state= <enum NM_DEVICE_STATE_IP_CONFIG of type NM.DeviceState>
2019-12-16 02:10:49,781 root         DEBUG    Executing NM action: func=safe_activate_async
2019-12-16 02:10:49,834 root         DEBUG    Connection activation initiated: dev=ovs-port-ovs0, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2019-12-16 02:10:49,904 root         DEBUG    Connection activation succeeded: dev=ovs-port-ovs0, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev-state= <enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>
2019-12-16 02:10:49,904 root         DEBUG    Executing NM action: func=safe_activate_async
2019-12-16 02:10:49,924 root         DEBUG    Connection activation initiated: dev=ovs-port-veths, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2019-12-16 02:10:49,980 root         DEBUG    Connection activation succeeded: dev=ovs-port-veths, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev-state= <enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>
2019-12-16 02:10:49,980 root         DEBUG    Executing NM action: func=safe_activate_async
2019-12-16 02:10:49,993 root         DEBUG    Connection activation initiated: dev=ovs-port-vethc, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2019-12-16 02:10:50,028 root         DEBUG    Connection activation succeeded: dev=ovs-port-vethc, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev-state= <enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>
2019-12-16 02:10:50,029 root         DEBUG    Executing NM action: func=safe_activate_async
2019-12-16 02:10:50,066 root         DEBUG    Connection activation initiated: dev=ovs0, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2019-12-16 02:10:50,205 root         DEBUG    Connection activation succeeded: dev=ovs0, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev-state= <enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>
2019-12-16 02:10:50,206 root         DEBUG    Executing NM action: func=_safe_modify_async
2019-12-16 02:10:50,216 root         DEBUG    Device reapply failed on vethc: error=nm-device-error-quark: Can't reapply changes to 'connection.autoconnect-slaves' setting (3)
Fallback to device activation
2019-12-16 02:10:50,227 root         DEBUG    Connection activation initiated: dev=vethc, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2019-12-16 02:10:50,279 root         DEBUG    Connection activation succeeded: dev=vethc, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev-state= <enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>
2019-12-16 02:10:50,280 root         DEBUG    Executing NM action: func=_safe_modify_async
2019-12-16 02:10:50,289 root         DEBUG    Device reapply failed on veths: error=nm-device-error-quark: Can't reapply changes to 'connection.autoconnect-slaves' setting (3)
Fallback to device activation
2019-12-16 02:10:50,303 root         DEBUG    Connection activation initiated: dev=veths, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2019-12-16 02:10:50,349 root         DEBUG    Connection activation succeeded: dev=veths, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev-state= <enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>
2019-12-16 02:10:50,350 root         DEBUG    NM action queue exhausted, quiting mainloop
2019-12-16 02:10:50,566 root         DEBUG    Checkpoint /org/freedesktop/NetworkManager/Checkpoint/1 rollback executed: dbus.Dictionary({dbus.String('/org/freedesktop/NetworkManager/Devices/8'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/9'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/1'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/12'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/3'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/2'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/6'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/10'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/4'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/5'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/7'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/11'): dbus.UInt32(0)}, signature=dbus.Signature('su'))
Traceback (most recent call last):
  File "/usr/bin/nmstatectl", line 11, in <module>
    load_entry_point('nmstate==0.2.0', 'console_scripts', 'nmstatectl')()
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 59, in main
    return args.func(args)
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 217, in apply
    statedata, args.verify, args.commit, args.timeout
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 237, in apply_state
    checkpoint = libnmstate.apply(state, verify_change, commit, timeout)
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 66, in apply
    state.State(desired_state), verify_change, commit, rollback_timeout
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 153, in _apply_ifaces_state
    _verify_change(desired_state)
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 200, in _verify_change
    desired_state.verify_interfaces(current_state)
  File "/usr/lib/python3.6/site-packages/libnmstate/state.py", line 327, in verify_interfaces
    self._assert_interfaces_equal(other_state)
  File "/usr/lib/python3.6/site-packages/libnmstate/state.py", line 634, in _assert_interfaces_equal
    current_state.interfaces[ifname],
libnmstate.error.NmstateVerificationError: 
desired
=======
---
name: ovs-br0
type: ovs-bridge
state: up
bridge:
  options:
    fail-mode: ''
    mcast-snooping-enable: false
    rstp: false
    stp: true
  port:
  - name: ovs0
    type: internal
  - name: vethc
    type: system
  - name: veths
    type: system
ipv4: {}
ipv6:
  enabled: false

current
=======
---
name: ovs-br0
type: ovs-bridge
state: up
bridge:
  options:
    fail-mode: ''
    mcast-snooping-enable: false
    rstp: false
    stp: true
  port:
  - name: ovs0
  - name: vethc
  - name: veths
ipv4: {}
ipv6:
  enabled: false

difference
==========
--- desired
+++ current
@@ -10,11 +10,8 @@
     stp: true
   port:
   - name: ovs0
-    type: internal
   - name: vethc
-    type: system
   - name: veths
-    type: system
 ipv4: {}
 ipv6:
   enabled: false


Almost 100% failed.

Comment 3 Gris Ge 2020-02-12 03:55:30 UTC
All the know failure has been fixed in upstream of NetworkManager except:


Bug 1787989 - NM OVS: Activating of OVS internal port fail with NM_DEVICE_STATE_UNMANAGED

Comment 4 Gris Ge 2020-02-20 08:56:32 UTC
This has been fixed by NetworkManager-1.22.8-1.el8.

No failure found in 1000 round.