Description of problem: Create bridge: resources.resource 2019-10-06 19:35:54 INFO Posting {'apiVersion': 'nmstate.io/v1alpha1', 'kind': 'NodeNetworkConfigurationPolicy', 'metadata': {'name': 'linux-bridge-test-connectivity'}, 'spec': {'desiredState': {'interfaces': [{'name': 'br1test', 'type': 'linux-bridge', 'state': 'up', 'bridge': {'options': {'stp': {'enabled': False}}, 'port': []}}]}}} on the node the bridge created without vlan_filtering 1 cat /sys/class/net/br1test/bridge/vlan_filtering 0 Version-Release number of selected component (if applicable): /container-native-virtualization-kubernetes-nmstate-handler-rhel8:v2.1.0-7
Looks like we are affected by the "unmanaged" issue we have upstream https://nmstate.atlassian.net/browse/NMSTATE-247 2019-10-06 20:24:43,041 root ERROR NM main-loop aborted: Connection activation failed on connection_id br1test: error=nm-manager-error-quark: Connection 'br1test' is not available on device br1test because device is strictly unmanaged (2) 2019-10-06 20:24:43,047 root DEBUG Checkpoint /org/freedesktop/NetworkManager/Checkpoint/8 rollback executed: dbus.Dictionary({dbus.String('/org/freedesktop/NetworkManager/Devices/3'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/50'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/5'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/28'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/20'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/11'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/1'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/15'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/23'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/47'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/27'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/6'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/16'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/7'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/48'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/51'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/2'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/13'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/49'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/14'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/21'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/12'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/18'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkMa We have fix it this upstream, with a simple retry mechanism, so we don't wait for reconcile. Let's release new version of kubernetes-nmstate d/s and test it for CNV 2.2
Quique, please include a link to PR when you move a bug to POST. (If the PR is already merged, the BZ should go to MODIFIED.)
This is the PR that retry if nmstatectl set fails https://github.com/nmstate/kubernetes-nmstate/pull/196 There is a bug at NetworkManager and it's fixed at NetworkManager 1.20.
Meni, this was a bug in older version of NetworkManager. RHCOS 4.3 should have it fixed (NetworkManager 1.20). Would you please verify that and close this bug?
QE will verify the bug once get an available OCP4.3 + CNV2.2 build
Yan, did you mean to mark 1771519 as blocking (it blocks this bug, not the other way around)? Or maybe I just don't understand BugZilla tags properly.
@Petr, yes, we failed to setup OCP4.3 because there was a change in rhcos image delivery. But I received a mail just now, seems we have fix for that now, I will have a try and update to you.
Test on OCP4.3 CNV2.2 4.3.0-0.nightly-2019-11-21-122827 container-native-virtualization-kubernetes-nmstate-handler-rhel8:v2.2.0-8 The bridge create ot below error for nmstate-handler {"level":"info","ts":1574924853.6925921,"logger":"client","msg":"2 retries left after nmstatectl set command error: failed to execute nmstatectl set --no-commit --timeout 240: 'exit status 1' '' '2019-11-28 07:06:52,000 root DEBUG Checkpoint /org/freedesktop/NetworkManager/Checkpoint/4 created for all devices: 240\n2019-11-28 07:06:52,000 root DEBUG Adding new interfaces: ['brext']\n2019-11-28 07:06:52,091 root DEBUG Connection settings for ConnectionSetting.create:\nid: brext\niface: brext\nuuid: 1b9101c2-b931-4906-9780-2f644a8c6eef\ntype: bridge\nautoconnect: True\nautoconnect_slaves: <enum NM_SETTING_CONNECTION_AUTOCONNECT_SLAVES_YES of type NM.SettingConnectionAutoconnectSlaves>\n2019-11-28 07:06:52,094 root DEBUG Editing interfaces: []\n2019-11-28 07:06:52,096 root DEBUG Executing NM action: func=add_connection_async\n2019-11-28 07:06:52,197 root DEBUG Connection adding succeeded: dev=brext\n2019-11-28 07:06:52,197 root DEBUG Executing NM action: func=safe_activate_async\n2019-11-28 07:06:52,401 root DEBUG Connection activation initiated: dev=brext, con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>\n2019-11-28 07:07:27,132 root WARNING NM main-loop timed out.\n2019-11-28 07:07:27,141 root DEBUG Checkpoint /org/freedesktop/NetworkManager/Checkpoint/4 rollback executed: dbus.Dictionary({dbus.String('/org/freedesktop/NetworkManager/Devices/84'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/3'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/10'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/28'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/4'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/19'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/31'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/29'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/77'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/6'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/13'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/14'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/17'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/79'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/21'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/53'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/54'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/30'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/9'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/51'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/24'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/5'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/81'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/82'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/50'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/75'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/7'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/56'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/1'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/15'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/27'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/83'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/25'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/52'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/57'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/12'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/55'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/76'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/2'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/18'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/80'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/26'): dbus.UInt32(0), dbus.String('/org/freedesktop/NetworkManager/Devices/58'): dbus.UInt32(0)}, signature=dbus.Signature('su'))\nTraceback (most recent call last):\n File \"/usr/bin/nmstatectl\", line 11, in <module>\n load_entry_point('nmstate==0.1.1', 'console_scripts', 'nmstatectl')()\n File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 62, in main\n return args.func(args)\n File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 224, in apply\n return apply_state(statedata, args.verify, args.commit, args.timeout)\n File \"/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py\", line 240, in apply_state\n checkpoint = libnmstate.apply(state, verify_change, commit, timeout)\n File \"/usr/lib/python3.6/site-packages/libnmstate/netapplier.py\", line 67, in apply\n state.State(desired_state), verify_change, commit, rollback_timeout\n File \"/usr/lib/python3.6/site-packages/libnmstate/netapplier.py\", line 147, in _apply_ifaces_state\n con_profiles=ifaces_add_configs + ifaces_edit_configs,\n File \"/usr/lib64/python3.6/contextlib.py\", line 88, in __exit__\n next(self.gen)\n File \"/usr/lib/python3.6/site-packages/libnmstate/netapplier.py\", line 213, in _setup_providers\n mainloop.error\nlibnmstate.error.NmstateLibnmError: Unexpected failure of libnm when running the mainloop: run timeout\n'"}
The bridge created at first and then disappeared and got some error in nmstate-handler.
Quique, it looks like we still get the mainloop timeout. Do we have the timeout value overwrite in place on D/S?
We have the old sed # TODO: Delete this line after we update nmstate to include the change # https://github.com/nmstate/nmstate/commit/a4baaff50fef84d6d326977ff647bece1e917a26 RUN sed -i "s/run(timeout=20/run(timeout=60/g" /usr/lib/python3.6/site-packages/libnmstate/netapplier.py But this does not apply anymore, also we have to really investigate why its timing out and propose proper solution to nmstate team.0
Test on OCP4.3 CNV2.2 nmstate image: container-native-virtualization-kubernetes-nmstate-handler-rhel8:v2.2.0-11 $ cat /sys/class/net/br3test/bridge/vlan_filtering 1 Issue have been fixed. Move to verify.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:0307