Description of problem:
I'm adding a new host, installed with CentOS 8.2 and oVirt 4.4, to a new cluster on my recently upgraded 4.4 engine. The networks are also configured in other clusters that are still on version 4.3. I was able to add the ovirtmgmt network and save the changes. When I now try to add another network (or change the bonding mode), it always fails with a Python traceback:

2020-07-08 16:18:47,861+0200 INFO  (jsonrpc/3) [api.network] START setupNetworks(networks={}, bondings={'bond0': {'nics': ['eno2', 'eno1'], 'options': 'mode=4 miimon=100 xmit_hash_policy=2', 'switch': 'legacy'}}, options={'connectivityTimeout': 120, 'commitOnSuccess': True, 'connectivityCheck': 'true'}) from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:48)
2020-07-08 16:18:47,863+0200 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2020-07-08 16:18:48,143+0200 INFO  (jsonrpc/3) [api.network] FINISH setupNetworks error='mtu' from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:52)
2020-07-08 16:18:48,143+0200 ERROR (jsonrpc/3) [jsonrpc.JsonRpcServer] Internal server error (__init__:350)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
    res = method(**params)
  File "/usr/lib/python3.6/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
    result = fn(*methodArgs)
  File "<decorator-gen-480>", line 2, in setupNetworks
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 1548, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
KeyError: 'mtu'
2020-07-08 16:18:48,144+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.setupNetworks failed (error -32603) in 0.28 seconds (__init__:312)

Version-Release number of selected component (if applicable):
vdsm-4.40.22-1.el8.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a new host
2. Set up the host networking, add the ovirtmgmt network to an (existing) bond, and save the changes
3. Add another logical network to the bond and try to save

Actual results:
The add fails and a Python traceback is logged.

Expected results:
The network is added.
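For context, the KeyError at the bottom of the traceback is an ordinary Python dict lookup on a key that is missing from some interface-state mapping. A minimal sketch of the failure pattern and the usual defensive variant (the names are hypothetical, not vdsm's actual code):

DEFAULT_MTU = 1500  # hypothetical fallback, not a vdsm constant

def mtu_of(iface_state):
    # Raises KeyError when the reported state lacks an 'mtu' key --
    # the same error surfacing from supervdsm in the log above.
    return iface_state['mtu']

def mtu_of_defensive(iface_state):
    # Defensive variant: fall back to a default when the key is missing.
    return iface_state.get('mtu', DEFAULT_MTU)

print(mtu_of_defensive({'name': 'bond0.19'}))  # -> 1500
print(mtu_of({'name': 'bond0.19'}))            # -> KeyError: 'mtu'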
Additional info:
The ifcfg files on the host are:

ifcfg-bond0:BONDING_OPTS=mode=802.3ad
ifcfg-bond0:TYPE=Bond
ifcfg-bond0:BONDING_MASTER=yes
ifcfg-bond0:HWADDR=
ifcfg-bond0:PROXY_METHOD=none
ifcfg-bond0:BROWSER_ONLY=no
ifcfg-bond0:DEFROUTE=yes
ifcfg-bond0:IPV4_FAILURE_FATAL=no
ifcfg-bond0:IPV6INIT=no
ifcfg-bond0:IPV6_DEFROUTE=yes
ifcfg-bond0:IPV6_FAILURE_FATAL=no
ifcfg-bond0:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0:NAME=bond0
ifcfg-bond0:UUID=6ddfb1b5-8f64-4de2-872f-22da490d5946
ifcfg-bond0:DEVICE=bond0
ifcfg-bond0:ONBOOT=yes
ifcfg-bond0:IPV6_DISABLED=yes
ifcfg-bond0:MACADDR=D0:94:66:33:E8:06
ifcfg-bond0.19:VLAN=yes
ifcfg-bond0.19:TYPE=Vlan
ifcfg-bond0.19:PHYSDEV=bond0
ifcfg-bond0.19:VLAN_ID=19
ifcfg-bond0.19:REORDER_HDR=yes
ifcfg-bond0.19:GVRP=no
ifcfg-bond0.19:MVRP=no
ifcfg-bond0.19:HWADDR=
ifcfg-bond0.19:PROXY_METHOD=none
ifcfg-bond0.19:BROWSER_ONLY=no
ifcfg-bond0.19:BOOTPROTO=none
ifcfg-bond0.19:IPADDR=10.86.19.4
ifcfg-bond0.19:PREFIX=24
ifcfg-bond0.19:GATEWAY=10.86.19.254
ifcfg-bond0.19:DNS1=10.86.18.20
ifcfg-bond0.19:DNS2=10.86.18.21
ifcfg-bond0.19:DNS3=10.86.18.22
ifcfg-bond0.19:DEFROUTE=yes
ifcfg-bond0.19:IPV4_FAILURE_FATAL=yes
ifcfg-bond0.19:IPV6INIT=no
ifcfg-bond0.19:IPV6_DEFROUTE=yes
ifcfg-bond0.19:IPV6_FAILURE_FATAL=no
ifcfg-bond0.19:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0.19:NAME=bond0.19
ifcfg-bond0.19:UUID=48608de2-dac3-410d-a05e-80184a95325d
ifcfg-bond0.19:DEVICE=bond0.19
ifcfg-bond0.19:ONBOOT=yes
ifcfg-bond0.19:DHCP_CLIENT_ID=mac
ifcfg-bond0.19:IPV4_DNS_PRIORITY=40
ifcfg-bond0.19:IPV6_DISABLED=yes
ifcfg-bond0.19:MTU=1500
ifcfg-eno1:TYPE=Ethernet
ifcfg-eno1:NAME=eno1
ifcfg-eno1:UUID=86035cd6-8b42-450f-abd2-f25b8c93b445
ifcfg-eno1:DEVICE=eno1
ifcfg-eno1:ONBOOT=yes
ifcfg-eno1:MASTER=bond0
ifcfg-eno1:SLAVE=yes
ifcfg-eno1:MTU=1500
ifcfg-eno2:TYPE=Ethernet
ifcfg-eno2:NAME=eno2
ifcfg-eno2:UUID=dc640175-0a0e-4ea3-a909-164a026fc495
ifcfg-eno2:DEVICE=eno2
ifcfg-eno2:ONBOOT=yes
ifcfg-eno2:MASTER=bond0
ifcfg-eno2:SLAVE=yes
ifcfg-eno2:MTU=1500
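The dump above is in grep's file:KEY=VALUE format. A small sketch (a hypothetical helper, only for reading pasted dumps like this, not part of vdsm or oVirt) that folds it back into one dict per file, e.g. to compare MTU settings across devices:

from collections import defaultdict

def parse_ifcfg_dump(lines):
    # Split on the first ':' (filename) and the first '=' (key), so
    # values that themselves contain ':' (e.g. MACADDR) stay intact.
    configs = defaultdict(dict)
    for line in lines:
        fname, _, kv = line.partition(':')
        key, _, value = kv.partition('=')
        configs[fname][key] = value
    return configs

dump = [
    'ifcfg-bond0:TYPE=Bond',
    'ifcfg-bond0.19:MTU=1500',
    'ifcfg-eno1:MTU=1500',
]
for fname, cfg in parse_ifcfg_dump(dump).items():
    print(fname, cfg.get('MTU', '<no MTU>'))

Note that in the dump ifcfg-bond0 itself carries no MTU= line, while the VLAN and both slaves do.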
Hello Rik, can you please share the relevant part of supervdsm.log?
Created attachment 1700409 [details]
supervdsm log

Hi,

This is the relevant part of the supervdsm log. According to the debug message, the gateway is on more than one interface? I only see it on a single interface in the ifcfg-* files.

Regards,
Rik
Hi, thank you for the log. I was able to find the root cause of this bug: your management network (the default route network) is a bridgeless (non-VM) network with a VLAN. Until we have a proper fix, can you please switch ovirtmgmt to a bridged (VM) network? This should prevent any further issues on that host.

Also, just FYI, the ifcfg files are not relevant anymore: since 4.4 we use NetworkManager, so the output of 'nmstatectl show' is more important to us.
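For anyone who wants to pull the same state from a script: the data behind 'nmstatectl show' is also exposed through the libnmstate Python API. A minimal sketch (assuming libnmstate is installed alongside nmstate; run as root on the host) that lists each interface's MTU, the very key the traceback above fails on:

import libnmstate

state = libnmstate.show()
for iface in state.get('interfaces', []):
    # 'mtu' can be absent for some interface types; guard the lookup
    # instead of the plain iface['mtu'] access implied by the traceback.
    print(iface.get('name'), iface.get('type'), iface.get('mtu', '<no mtu>'))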
Hi Ales,

Thanks for the hint. Marking the ovirtmgmt network as a VM network indeed works around the problem. Hopefully this will be fixed in the near future.

Regards,
Rik
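For reference, the same workaround can also be applied through the oVirt Python SDK instead of the UI. A hedged sketch (the engine URL and credentials are placeholders; insecure=True is for lab use only) that adds the VM usage to the ovirtmgmt network:

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',  # placeholder
    username='admin@internal',                          # placeholder
    password='secret',                                  # placeholder
    insecure=True,  # lab only; verify the engine certificate in production
)
networks_service = connection.system_service().networks_service()
# Find the ovirtmgmt network; names can repeat across data centers,
# so in a multi-DC setup filter by data center as well.
net = next(n for n in networks_service.list() if n.name == 'ovirtmgmt')
# Add the VM usage while preserving any usages already set.
usages = list(net.usages or [])
if types.NetworkUsage.VM not in usages:
    usages.append(types.NetworkUsage.VM)
networks_service.network_service(net.id).update(types.Network(usages=usages))
connection.close()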
Steps to reproduce:
1. Have a non-VM (bridgeless) VLAN-tagged network attached to the host as the default route
2. Change another network attachment

The issue in vdsm is in the workaround for bug 1748389. We plan to remove this workaround after the release of ovirt-4.4.2 and check whether this scenario works on EL 8.3, which will include the fix for bug 1748389.
Moving to 4.4.4, because the fix will require nmstate 0.3 from RHEL 8.3.
(In reply to Dominik Holler from comment #5)
> Steps to reproduce:
> 1. Have a non-VM (bridgeless) VLAN-tagged network attached to the host as
> the default route
> 2. Change another network attachment
>
> The issue in vdsm is in the workaround for bug 1748389. We plan to remove
> this workaround after the release of ovirt-4.4.2 and check whether this
> scenario works on EL 8.3, which will include the fix for bug 1748389.

The correct steps to reproduce are:
1. Make ovirtmgmt a non-VM, VLAN-tagged network
2. Try to edit any other network attached to the host

At the moment this reproduces on 8.2.1 and 8.3 with:
nmstate-0.2.10-1.el8.noarch
nmstate-0.3.4-10.el8.noarch
The fix already made it into 4.4.2.
Verified on vdsm-4.40.26-1.el8ev and 4.4.2.3-0.6.el8ev.
This bugzilla is included in the oVirt 4.4.2 release, published on September 17th, 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.