--- Description from Kumar Mashalkar on 2020-08-12 10:07:12 UTC ---
+++ This bug was initially created as a clone of Bug #1855078 +++
Description of problem:
I'm adding a new host installed with CentOS 8.2 and oVirt 4.4 to a new cluster on my recently upgraded 4.4 engine. The networks are also configured in other clusters that are still version 4.3.
I was able to add the ovirtmgmt network and save the changes. When I now try to add another network (or change the bonding mode), it always fails with a Python traceback:
2020-07-08 16:18:47,861+0200 INFO (jsonrpc/3) [api.network] START setupNetworks(networks={}, bondings={'bond0': {'nics': ['eno2', 'eno1'], 'options': 'mode=4 miimon=100 xmit_hash_policy=2', 'switch': 'legacy'}}, options={'connectivityTimeout': 120, 'commitOnSuccess': True, 'connectivityCheck': 'true'}) from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:48)
2020-07-08 16:18:47,863+0200 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2020-07-08 16:18:48,143+0200 INFO (jsonrpc/3) [api.network] FINISH setupNetworks error='mtu' from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:52)
2020-07-08 16:18:48,143+0200 ERROR (jsonrpc/3) [jsonrpc.JsonRpcServer] Internal server error (__init__:350)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
    res = method(**params)
  File "/usr/lib/python3.6/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
    result = fn(*methodArgs)
  File "<decorator-gen-480>", line 2, in setupNetworks
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 1548, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
KeyError: 'mtu'
2020-07-08 16:18:48,144+0200 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.setupNetworks failed (error -32603) in 0.28 seconds (__init__:312)
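For illustration, a minimal hypothetical Python sketch of the failure pattern (not actual vdsm source): the interface info reported for the bridgeless VLAN network lacks an 'mtu' key, so a plain dict lookup raises the KeyError seen above.

    # Hypothetical interface info for the VLAN'd default-route device;
    # note the missing 'mtu' key.
    iface_info = {"name": "bond0.19", "ipv4addrs": ["10.86.19.4/24"]}

    try:
        mtu = iface_info["mtu"]  # plain lookup: raises KeyError: 'mtu'
    except KeyError as err:
        print("lookup failed:", err)

    mtu = iface_info.get("mtu", 1500)  # defensive variant: default Ethernet MTU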
Version-Release number of selected component (if applicable):
vdsm-4.40.22-1.el8.x86_64
How reproducible:
Always (per the description, the failure occurs on every attempt).
Steps to Reproduce:
1. Set up a new host
2. Set up the host networking and add ovirtmgmt network to an (existing) bond, and save the changes
3. Add another logical network to the bond and try to save
Actual results:
The add fails and a Python traceback is logged.
Expected results:
Network is added.
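For reference, the failing RPC can also be triggered directly on the host. A hypothetical reproduction script using vdsm's Python JSON-RPC client, mirroring the parameters from the log above (the vdsm.client module, its connect() helper, and the port are assumptions based on vdsm 4.40; verify against your installed version):

    from vdsm import client  # assumes vdsm is installed on the host

    cli = client.connect("localhost", 54321)
    cli.Host.setupNetworks(
        networks={},
        bondings={"bond0": {"nics": ["eno2", "eno1"],
                            "options": "mode=4 miimon=100 xmit_hash_policy=2",
                            "switch": "legacy"}},
        options={"connectivityTimeout": 120,
                 "commitOnSuccess": True,
                 "connectivityCheck": "true"},
    )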
Additional info:
The ifcfg files on the host are:
ifcfg-bond0:BONDING_OPTS=mode=802.3ad
ifcfg-bond0:TYPE=Bond
ifcfg-bond0:BONDING_MASTER=yes
ifcfg-bond0:HWADDR=
ifcfg-bond0:PROXY_METHOD=none
ifcfg-bond0:BROWSER_ONLY=no
ifcfg-bond0:DEFROUTE=yes
ifcfg-bond0:IPV4_FAILURE_FATAL=no
ifcfg-bond0:IPV6INIT=no
ifcfg-bond0:IPV6_DEFROUTE=yes
ifcfg-bond0:IPV6_FAILURE_FATAL=no
ifcfg-bond0:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0:NAME=bond0
ifcfg-bond0:UUID=6ddfb1b5-8f64-4de2-872f-22da490d5946
ifcfg-bond0:DEVICE=bond0
ifcfg-bond0:ONBOOT=yes
ifcfg-bond0:IPV6_DISABLED=yes
ifcfg-bond0:MACADDR=D0:94:66:33:E8:06
ifcfg-bond0.19:VLAN=yes
ifcfg-bond0.19:TYPE=Vlan
ifcfg-bond0.19:PHYSDEV=bond0
ifcfg-bond0.19:VLAN_ID=19
ifcfg-bond0.19:REORDER_HDR=yes
ifcfg-bond0.19:GVRP=no
ifcfg-bond0.19:MVRP=no
ifcfg-bond0.19:HWADDR=
ifcfg-bond0.19:PROXY_METHOD=none
ifcfg-bond0.19:BROWSER_ONLY=no
ifcfg-bond0.19:BOOTPROTO=none
ifcfg-bond0.19:IPADDR=10.86.19.4
ifcfg-bond0.19:PREFIX=24
ifcfg-bond0.19:GATEWAY=10.86.19.254
ifcfg-bond0.19:DNS1=10.86.18.20
ifcfg-bond0.19:DNS2=10.86.18.21
ifcfg-bond0.19:DNS3=10.86.18.22
ifcfg-bond0.19:DEFROUTE=yes
ifcfg-bond0.19:IPV4_FAILURE_FATAL=yes
ifcfg-bond0.19:IPV6INIT=no
ifcfg-bond0.19:IPV6_DEFROUTE=yes
ifcfg-bond0.19:IPV6_FAILURE_FATAL=no
ifcfg-bond0.19:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0.19:NAME=bond0.19
ifcfg-bond0.19:UUID=48608de2-dac3-410d-a05e-80184a95325d
ifcfg-bond0.19:DEVICE=bond0.19
ifcfg-bond0.19:ONBOOT=yes
ifcfg-bond0.19:DHCP_CLIENT_ID=mac
ifcfg-bond0.19:IPV4_DNS_PRIORITY=40
ifcfg-bond0.19:IPV6_DISABLED=yes
ifcfg-bond0.19:MTU=1500
ifcfg-eno1:TYPE=Ethernet
ifcfg-eno1:NAME=eno1
ifcfg-eno1:UUID=86035cd6-8b42-450f-abd2-f25b8c93b445
ifcfg-eno1:DEVICE=eno1
ifcfg-eno1:ONBOOT=yes
ifcfg-eno1:MASTER=bond0
ifcfg-eno1:SLAVE=yes
ifcfg-eno1:MTU=1500
ifcfg-eno2:TYPE=Ethernet
ifcfg-eno2:NAME=eno2
ifcfg-eno2:UUID=dc640175-0a0e-4ea3-a909-164a026fc495
ifcfg-eno2:DEVICE=eno2
ifcfg-eno2:ONBOOT=yes
ifcfg-eno2:MASTER=bond0
ifcfg-eno2:SLAVE=yes
ifcfg-eno2:MTU=1500
--- Additional comment from Ales Musil on 2020-07-09 07:29:50 UTC ---
Hello Rik,
can you please share the relevant part of supervdsm.log?
--- Additional comment from Rik Theys on 2020-07-09 07:41:37 UTC ---
Hi,
This is the relevant part of the supervdsm log. According to the debug message, the gateway is on more than one interface? I only see it on a single interface in the ifcfg-* files.
Regards,
Rik
--- Additional comment from Ales Musil on 2020-07-09 08:45:37 UTC ---
Hi,
thank you for the log. I was able to find the root cause of this bug. It is caused by the fact that your management network (the default route network) is a bridgeless (non-VM) network with a VLAN.
Until we have a proper fix, can you please try switching ovirtmgmt to a bridged (VM) network? This should prevent any further issues on that host.
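(The usual way to do this is to tick the "VM network" checkbox on the network in the Admin Portal. For completeness, a hypothetical sketch with the oVirt Python SDK; the engine URL, credentials, and the exact handling of 'usages' here are assumptions to verify against your SDK version:)

    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    connection = sdk.Connection(
        url="https://engine.example.com/ovirt-engine/api",  # assumed engine URL
        username="admin@internal",
        password="...",
        ca_file="ca.pem",
    )
    networks_service = connection.system_service().networks_service()
    net = networks_service.list(search="name=ovirtmgmt")[0]

    # Add the VM usage without dropping any existing usages.
    usages = list(net.usages or [])
    if types.NetworkUsage.VM not in usages:
        usages.append(types.NetworkUsage.VM)
    networks_service.network_service(net.id).update(types.Network(usages=usages))
    connection.close()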
Also, just FYI: the ifcfg files are not relevant anymore. We have been using NetworkManager since 4.4, and the output of 'nmstatectl show' is more important to us.
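The Python equivalent of 'nmstatectl show' is a one-liner (a sketch assuming the libnmstate package, whose top-level show() returns the current network state as a dict):

    import json

    import libnmstate  # the nmstate Python library

    state = libnmstate.show()  # same data as 'nmstatectl show'
    print(json.dumps(state.get("interfaces", []), indent=2))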
--- Additional comment from Rik Theys on 2020-07-09 09:00:10 UTC ---
Hi Ales,
Thanks for the hint. Marking the ovirtmgmt network as a VM network indeed works around the problem.
Hopefully this will be fixed in the near future.
Regards,
Rik
--- Additional comment from Dominik Holler on 2020-07-09 09:58:57 UTC ---
Steps to reproduce:
1. Have a non-VM (bridgeless) VLAN network attached to the host as the default route
2. Change another network attachment
The issue in vdsm is in the workaround for bug 1748389.
We plan to remove this workaround after the release of ovirt-4.4.2 and
check whether this scenario works on EL 8.3, which will include the fix for bug 1748389.
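For reference, the failing topology expressed as an nmstate-style desired state (Python dict form, following the nmstate 0.x schema; an illustration, not a tested configuration): a VLAN over the bond carrying the host's default route, with no bridge on top.

    desired_state = {
        "interfaces": [
            {
                "name": "bond0.19",
                "type": "vlan",
                "state": "up",
                "vlan": {"base-iface": "bond0", "id": 19},
                "mtu": 1500,
                "ipv4": {
                    "enabled": True,
                    "address": [{"ip": "10.86.19.4", "prefix-length": 24}],
                },
            }
        ],
        "routes": {
            "config": [
                {"destination": "0.0.0.0/0",
                 "next-hop-address": "10.86.19.254",
                 "next-hop-interface": "bond0.19"}
            ]
        },
    }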
--- Additional comment from Dominik Holler on 2020-08-11 13:36:29 UTC ---
Moving to 4.4.4, because the fix will require nmstate 0.3 from RHEL 8.3.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (RHV RHEL Host (ovirt-host) 4.4.z [ovirt-4.4.2]) and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:3822