+++ This bug was initially created as a clone of Bug #1855078 +++

Description of problem:

I'm adding a new host, installed with CentOS 8.2 and oVirt 4.4, to a new cluster on my recently upgraded 4.4 engine. The networks are also configured in other clusters that are still version 4.3.

I was able to add the ovirtmgmt network and save the changes. When I now try to add another network (or change the bonding mode), it always fails with a Python traceback:

2020-07-08 16:18:47,861+0200 INFO  (jsonrpc/3) [api.network] START setupNetworks(networks={}, bondings={'bond0': {'nics': ['eno2', 'eno1'], 'options': 'mode=4 miimon=100 xmit_hash_policy=2', 'switch': 'legacy'}}, options={'connectivityTimeout': 120, 'commitOnSuccess': True, 'connectivityCheck': 'true'}) from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:48)
2020-07-08 16:18:47,863+0200 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2020-07-08 16:18:48,143+0200 INFO  (jsonrpc/3) [api.network] FINISH setupNetworks error='mtu' from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:52)
2020-07-08 16:18:48,143+0200 ERROR (jsonrpc/3) [jsonrpc.JsonRpcServer] Internal server error (__init__:350)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
    res = method(**params)
  File "/usr/lib/python3.6/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
    result = fn(*methodArgs)
  File "<decorator-gen-480>", line 2, in setupNetworks
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 1548, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
KeyError: 'mtu'
2020-07-08 16:18:48,144+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.setupNetworks failed (error -32603) in 0.28 seconds (__init__:312)

Version-Release number of selected component (if applicable):
vdsm-4.40.22-1.el8.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a new host.
2. Set up the host networking, add the ovirtmgmt network to an (existing) bond, and save the changes.
3. Add another logical network to the bond and try to save.

Actual results:
The add fails and a Python traceback is logged.

Expected results:
The network is added.
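For reference, the failing Host.setupNetworks call from the log above can also be re-issued directly on the affected host, which makes checking a fix easier than going through the webadmin UI each time. Below is a minimal sketch, assuming the vdsm Python client library (vdsm.client) is importable on the host, that it is run as root so vdsmd's TLS certificates can be read, and that vdsmd listens on its default port 54321; only the payload is taken verbatim from the "START setupNetworks" log line, everything else is an assumption.

# Sketch only -- re-issue the Host.setupNetworks call that fails in the log above.
# Assumptions: vdsm.client is available on the host, run as root, default vdsm
# JSON-RPC port 54321. On an affected host this should reproduce the KeyError.
from vdsm import client

# Payload copied from the "START setupNetworks" log line above.
bondings = {
    'bond0': {
        'nics': ['eno2', 'eno1'],
        'options': 'mode=4 miimon=100 xmit_hash_policy=2',
        'switch': 'legacy',
    }
}
options = {
    'connectivityTimeout': 120,
    'commitOnSuccess': True,
    'connectivityCheck': 'true',
}

cli = client.connect('localhost', 54321)
cli.Host.setupNetworks(networks={}, bondings=bondings, options=options)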
Additional info:

The ifcfg files on the host are:

ifcfg-bond0:BONDING_OPTS=mode=802.3ad
ifcfg-bond0:TYPE=Bond
ifcfg-bond0:BONDING_MASTER=yes
ifcfg-bond0:HWADDR=
ifcfg-bond0:PROXY_METHOD=none
ifcfg-bond0:BROWSER_ONLY=no
ifcfg-bond0:DEFROUTE=yes
ifcfg-bond0:IPV4_FAILURE_FATAL=no
ifcfg-bond0:IPV6INIT=no
ifcfg-bond0:IPV6_DEFROUTE=yes
ifcfg-bond0:IPV6_FAILURE_FATAL=no
ifcfg-bond0:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0:NAME=bond0
ifcfg-bond0:UUID=6ddfb1b5-8f64-4de2-872f-22da490d5946
ifcfg-bond0:DEVICE=bond0
ifcfg-bond0:ONBOOT=yes
ifcfg-bond0:IPV6_DISABLED=yes
ifcfg-bond0:MACADDR=D0:94:66:33:E8:06
ifcfg-bond0.19:VLAN=yes
ifcfg-bond0.19:TYPE=Vlan
ifcfg-bond0.19:PHYSDEV=bond0
ifcfg-bond0.19:VLAN_ID=19
ifcfg-bond0.19:REORDER_HDR=yes
ifcfg-bond0.19:GVRP=no
ifcfg-bond0.19:MVRP=no
ifcfg-bond0.19:HWADDR=
ifcfg-bond0.19:PROXY_METHOD=none
ifcfg-bond0.19:BROWSER_ONLY=no
ifcfg-bond0.19:BOOTPROTO=none
ifcfg-bond0.19:IPADDR=10.86.19.4
ifcfg-bond0.19:PREFIX=24
ifcfg-bond0.19:GATEWAY=10.86.19.254
ifcfg-bond0.19:DNS1=10.86.18.20
ifcfg-bond0.19:DNS2=10.86.18.21
ifcfg-bond0.19:DNS3=10.86.18.22
ifcfg-bond0.19:DEFROUTE=yes
ifcfg-bond0.19:IPV4_FAILURE_FATAL=yes
ifcfg-bond0.19:IPV6INIT=no
ifcfg-bond0.19:IPV6_DEFROUTE=yes
ifcfg-bond0.19:IPV6_FAILURE_FATAL=no
ifcfg-bond0.19:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0.19:NAME=bond0.19
ifcfg-bond0.19:UUID=48608de2-dac3-410d-a05e-80184a95325d
ifcfg-bond0.19:DEVICE=bond0.19
ifcfg-bond0.19:ONBOOT=yes
ifcfg-bond0.19:DHCP_CLIENT_ID=mac
ifcfg-bond0.19:IPV4_DNS_PRIORITY=40
ifcfg-bond0.19:IPV6_DISABLED=yes
ifcfg-bond0.19:MTU=1500
ifcfg-eno1:TYPE=Ethernet
ifcfg-eno1:NAME=eno1
ifcfg-eno1:UUID=86035cd6-8b42-450f-abd2-f25b8c93b445
ifcfg-eno1:DEVICE=eno1
ifcfg-eno1:ONBOOT=yes
ifcfg-eno1:MASTER=bond0
ifcfg-eno1:SLAVE=yes
ifcfg-eno1:MTU=1500
ifcfg-eno2:TYPE=Ethernet
ifcfg-eno2:NAME=eno2
ifcfg-eno2:UUID=dc640175-0a0e-4ea3-a909-164a026fc495
ifcfg-eno2:DEVICE=eno2
ifcfg-eno2:ONBOOT=yes
ifcfg-eno2:MASTER=bond0
ifcfg-eno2:SLAVE=yes
ifcfg-eno2:MTU=1500

--- Additional comment from Ales Musil on 2020-07-09 07:29:50 UTC ---

Hello Rik,

can you please share the relevant part of supervdsm.log?

--- Additional comment from Rik Theys on 2020-07-09 07:41:37 UTC ---

Hi,

This is the relevant part of the supervdsm log. According to the debug message, the gateway is on more than one interface? I only see it on a single interface in the ifcfg-* files.

Regards,
Rik

--- Additional comment from Ales Musil on 2020-07-09 08:45:37 UTC ---

Hi,

thank you for the log. I was able to find the root cause of this bug. It is caused by the fact that your management network (the default-route network) is bridgeless (non-VM) and has a VLAN. Until we have a proper fix, can you please try to switch ovirtmgmt to a bridged (VM) network? This should prevent any further issues on that host.

Also, just FYI, the ifcfg files are not relevant anymore. We have been using NetworkManager since 4.4, and the output of 'nmstatectl show' is more important to us.

--- Additional comment from Rik Theys on 2020-07-09 09:00:10 UTC ---

Hi Ales,

Thanks for the hint. Marking the ovirtmgmt network as a VM network indeed works around the problem. Hopefully this will be fixed in the near future.

Regards,
Rik

--- Additional comment from Dominik Holler on 2020-07-09 09:58:57 UTC ---

Steps to reproduce:
1. Have a non-VM (bridgeless) VLAN network attached to the host as the default route.
2. Change another network attachment.

The issue in vdsm is in the workaround for bug 1748389.
We plan to remove this workaround after the release of ovirt-4.4.2 and check whether this scenario works on EL 8.3, which will include the fix for bug 1748389.

--- Additional comment from Dominik Holler on 2020-08-11 13:36:29 UTC ---

Moving to 4.4.4, because the fix will require nmstate 0.3 from RHEL 8.3.
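A side note on Ales's remark above that 'nmstatectl show' is what matters on 4.4 hosts: the same reported state can be captured from Python as well, which can be handy when attaching it to a bug like this one. A minimal sketch, assuming the libnmstate Python package that ships with nmstate and PyYAML are installed on the host:

# Sketch: dump the current reported network state, roughly what
# 'nmstatectl show' prints. Run as root on the host.
import yaml          # PyYAML, assumed to be available
import libnmstate    # Python API shipped with nmstate

state = libnmstate.show()   # reported network state as a plain dict
print(yaml.safe_dump(state, default_flow_style=False))

Either form of that output, together with the relevant part of supervdsm.log, covers what the comments above ask for when reporting this kind of setupNetworks failure.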
Set the target milestone according to the upstream bug.
Verified on vdsm-4.40.26-1.el8ev.x86_64 and 4.4.2.3-0.6.el8ev.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) 4.4.z [ovirt-4.4.2]), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3822