Bug 1855078 - KeyError with vlanned bridgeless default route networks
Summary: KeyError with vlanned bridgeless default route networks
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.40.22
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified (1 vote)
Target Milestone: ovirt-4.4.2
Target Release: 4.40.26
Assignee: Ales Musil
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: 1868321
 
Reported: 2020-07-08 21:47 UTC by Rik Theys
Modified: 2020-09-18 07:11 UTC
CC: 5 users

Fixed In Version: 4.40.26
Clone Of:
Cloned To: 1868321
Environment:
Last Closed: 2020-09-18 07:11:52 UTC
oVirt Team: Network
Embargoed:
mperina: ovirt-4.4?
dholler: planning_ack?
dholler: devel_ack+
mburman: testing_ack+


Attachments
supervdsm log (4.25 KB, text/plain)
2020-07-09 07:41 UTC, Rik Theys


Links
oVirt gerrit 110776 (master, MERGED): net, nmstate: Fix KeyError for bridgeless management networks (last updated 2020-12-08 07:22:43 UTC)

Description Rik Theys 2020-07-08 21:47:57 UTC
Description of problem:
I'm adding a new host, installed with CentOS 8.2 and oVirt 4.4, to a new cluster on my recently upgraded 4.4 engine. The networks are also configured in other clusters that are still at version 4.3.

I was able to add the ovirtmgmt network and save the changes. When I now try to add another network (or change the bonding mode), it always fails with a Python traceback:

2020-07-08 16:18:47,861+0200 INFO  (jsonrpc/3) [api.network] START setupNetworks(networks={}, bondings={'bond0': {'nics': ['eno2', 'eno1'], 'options': 'mode=4 miimon=100 xmit_hash_policy=2', 'switch': 'legacy'}}, options={'connectivityTimeout': 120, 'commitOnSuccess': True, 'connectivityCheck': 'true'}) from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:48)
2020-07-08 16:18:47,863+0200 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)

2020-07-08 16:18:48,143+0200 INFO  (jsonrpc/3) [api.network] FINISH setupNetworks error='mtu' from=::ffff:10.86.18.3,39088, flow_id=bbd75b4a-ed6d-431a-8fed-27b3f888efca (api:52)
2020-07-08 16:18:48,143+0200 ERROR (jsonrpc/3) [jsonrpc.JsonRpcServer] Internal server error (__init__:350)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
    res = method(**params)
  File "/usr/lib/python3.6/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
    result = fn(*methodArgs)
  File "<decorator-gen-480>", line 2, in setupNetworks
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 1548, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
KeyError: 'mtu'
2020-07-08 16:18:48,144+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.setupNetworks failed (error -32603) in 0.28 seconds (__init__:312)
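
The 'mtu' KeyError above is consistent with a plain dictionary lookup on per-interface state that assumes every interface reports an MTU. A minimal sketch of the failure pattern (hypothetical names, not vdsm's actual code):

# Hypothetical illustration of how KeyError: 'mtu' can arise when the
# state collected for an interface has no 'mtu' entry.
def read_mtu_strict(iface_state):
    return iface_state['mtu']  # raises KeyError: 'mtu' if the key is absent

def read_mtu_defensive(iface_state, default=1500):
    return iface_state.get('mtu', default)  # falls back when 'mtu' is missing

vlan_state = {'name': 'bond0.19', 'type': 'vlan'}  # no 'mtu' key
print(read_mtu_defensive(vlan_state))  # 1500
read_mtu_strict(vlan_state)            # KeyError: 'mtu'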



Version-Release number of selected component (if applicable):
vdsm-4.40.22-1.el8.x86_64

How reproducible:
Always (per the description, the failure occurs on every attempt)

Steps to Reproduce:
1. Set up a new host
2. Set up the host networking and add ovirtmgmt network to an (existing) bond, and save the changes
3. Add another logical network to the bond and try to save

Actual results:
The add fails and a Python traceback is logged.

Expected results:
Network is added.


Additional info:

The ifcfg files on the host are:

ifcfg-bond0:BONDING_OPTS=mode=802.3ad
ifcfg-bond0:TYPE=Bond
ifcfg-bond0:BONDING_MASTER=yes                      
ifcfg-bond0:HWADDR=   
ifcfg-bond0:PROXY_METHOD=none
ifcfg-bond0:BROWSER_ONLY=no
ifcfg-bond0:DEFROUTE=yes
ifcfg-bond0:IPV4_FAILURE_FATAL=no
ifcfg-bond0:IPV6INIT=no
ifcfg-bond0:IPV6_DEFROUTE=yes
ifcfg-bond0:IPV6_FAILURE_FATAL=no
ifcfg-bond0:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0:NAME=bond0                                                                                                                              
ifcfg-bond0:UUID=6ddfb1b5-8f64-4de2-872f-22da490d5946                                                                                               
ifcfg-bond0:DEVICE=bond0          
ifcfg-bond0:ONBOOT=yes
ifcfg-bond0:IPV6_DISABLED=yes
ifcfg-bond0:MACADDR=D0:94:66:33:E8:06
ifcfg-bond0.19:VLAN=yes
ifcfg-bond0.19:TYPE=Vlan
ifcfg-bond0.19:PHYSDEV=bond0
ifcfg-bond0.19:VLAN_ID=19
ifcfg-bond0.19:REORDER_HDR=yes
ifcfg-bond0.19:GVRP=no
ifcfg-bond0.19:MVRP=no
ifcfg-bond0.19:HWADDR=
ifcfg-bond0.19:PROXY_METHOD=none
ifcfg-bond0.19:BROWSER_ONLY=no
ifcfg-bond0.19:BOOTPROTO=none
ifcfg-bond0.19:IPADDR=10.86.19.4
ifcfg-bond0.19:PREFIX=24
ifcfg-bond0.19:GATEWAY=10.86.19.254
ifcfg-bond0.19:DNS1=10.86.18.20
ifcfg-bond0.19:DNS2=10.86.18.21
ifcfg-bond0.19:DNS3=10.86.18.22
ifcfg-bond0.19:DEFROUTE=yes
ifcfg-bond0.19:IPV4_FAILURE_FATAL=yes
ifcfg-bond0.19:IPV6INIT=no
ifcfg-bond0.19:IPV6_DEFROUTE=yes
ifcfg-bond0.19:IPV6_FAILURE_FATAL=no
ifcfg-bond0.19:IPV6_ADDR_GEN_MODE=stable-privacy
ifcfg-bond0.19:NAME=bond0.19
ifcfg-bond0.19:UUID=48608de2-dac3-410d-a05e-80184a95325d
ifcfg-bond0.19:DEVICE=bond0.19
ifcfg-bond0.19:ONBOOT=yes
ifcfg-bond0.19:DHCP_CLIENT_ID=mac
ifcfg-bond0.19:IPV4_DNS_PRIORITY=40
ifcfg-bond0.19:IPV6_DISABLED=yes
ifcfg-bond0.19:MTU=1500
ifcfg-eno1:TYPE=Ethernet
ifcfg-eno1:NAME=eno1
ifcfg-eno1:UUID=86035cd6-8b42-450f-abd2-f25b8c93b445
ifcfg-eno1:DEVICE=eno1
ifcfg-eno1:ONBOOT=yes
ifcfg-eno1:MASTER=bond0
ifcfg-eno1:SLAVE=yes
ifcfg-eno1:MTU=1500
ifcfg-eno2:TYPE=Ethernet
ifcfg-eno2:NAME=eno2
ifcfg-eno2:UUID=dc640175-0a0e-4ea3-a909-164a026fc495
ifcfg-eno2:DEVICE=eno2
ifcfg-eno2:ONBOOT=yes
ifcfg-eno2:MASTER=bond0
ifcfg-eno2:SLAVE=yes
ifcfg-eno2:MTU=1500

Comment 1 Ales Musil 2020-07-09 07:29:50 UTC
Hello Rik,

Can you please share the relevant part of supervdsm.log?

Comment 2 Rik Theys 2020-07-09 07:41:37 UTC
Created attachment 1700409 [details]
supervdsm log

Hi,

This is the relevant part of the supervdsm log. According to the debug message, the gateway is on more than one interface? I only see it on a single interface in the ifcfg-* files.

Regards,
Rik
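
For context, one way to check which interfaces nmstate believes carry the default route is to scan its running route table for 0.0.0.0/0 entries. A sketch using the nmstate Python API (libnmstate.show() is the documented entry point; the filtering below is only illustrative, not vdsm's detection logic):

# Requires the libnmstate package (python3-libnmstate on EL8).
import libnmstate

state = libnmstate.show()
default_route_ifaces = {
    route['next-hop-interface']
    for route in state.get('routes', {}).get('running', [])
    if route.get('destination') == '0.0.0.0/0'
}
# More than one name here would match the "gateway on more than one
# interface" debug message in the attached log.
print(default_route_ifaces)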

Comment 3 Ales Musil 2020-07-09 08:45:37 UTC
Hi,

Thank you for the log. I was able to find the root cause of this bug: your management network (the default route network) is a bridgeless (non-VM) network with a VLAN.
Until we have a proper fix, can you please switch ovirtmgmt to a bridged (VM) network? This should prevent any further issues on that host.


Also, just FYI, the ifcfg files are not relevant anymore. We have been using NetworkManager since 4.4, so the output of 'nmstatectl show' is more useful to us.
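
For reference, the same state that 'nmstatectl show' prints can also be pulled programmatically. A minimal sketch with the nmstate Python API (the bond0 name filter is just an example):

import json
import libnmstate

# Dump the nmstate view of the bond and its VLAN; on 4.4 hosts this is
# the state that matters, rather than the legacy ifcfg files.
state = libnmstate.show()
for iface in state.get('interfaces', []):
    if iface.get('name', '').startswith('bond0'):
        print(json.dumps(iface, indent=2))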

Comment 4 Rik Theys 2020-07-09 09:00:10 UTC
Hi Ales,

Thanks for the hint. Marking the ovirtmgmt network as a VM network indeed works around the problem.

Hopefully this will be fixed in the near future.

Regards,
Rik

Comment 5 Dominik Holler 2020-07-09 09:58:57 UTC
Steps to reproduce:
1. Have a non-VM (bridgeless) VLAN-tagged network attached to the host as the default route network
2. Change another network attachment

The issue in vdsm is in the workaround for bug 1748389.
We plan to remove this workaround after the release of ovirt-4.4.2 and
check whether this scenario works on EL 8.3, which will include the fix for bug 1748389.

Comment 6 Dominik Holler 2020-08-11 13:36:29 UTC
Moving to 4.4.4, because the fix will require nmstate 0.3 from RHEL 8.3.

Comment 7 Michael Burman 2020-08-12 10:57:17 UTC
(In reply to Dominik Holler from comment #5)
> Steps to reproduce:
> 1. Have a non-VM (bridgeless) VLAN-tagged network attached to the host as
> the default route network
> 2. Change another network attachment
> 
> The issue in vdsm is in the workaround for bug 1748389.
> We plan to remove this workaround after the release of ovirt-4.4.2 and
> check whether this scenario works on EL 8.3, which will include the fix
> for bug 1748389.

The correct steps to reproduce are:
1. Make ovirtmgmt a non-VM, VLAN-tagged network
2. Try to edit any other network attached to the host

At the moment this is reproduced on 8.2.1 and 8.3 with:
nmstate-0.2.10-1.el8.noarch
nmstate-0.3.4-10.el8.noarch

Comment 8 Dominik Holler 2020-08-13 12:19:13 UTC
The fix already made it into 4.4.2.

Comment 9 Michael Burman 2020-08-24 08:00:08 UTC
Verified on vdsm-4.40.26-1.el8ev and 4.4.2.3-0.6.el8ev.

Comment 10 Sandro Bonazzola 2020-09-18 07:11:52 UTC
This bug is included in the oVirt 4.4.2 release, published on September 17th 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

