Description of problem:

I see this part of the code is quite different in 4.20, but I couldn't find any BZ related to it. Maybe it is already fixed, but I thought it important to report: if a single setupNetworks command removes one default route network and adds another, the validation fails with "Only a singe default route network is allowed."

New network: OCE02-NP-MGMT01 (being added)
Old network: on3462ee86c69c4 (being removed)

It looks like _validate_default_route (which was replaced in 4.2) does not take networks being removed into account.

MainProcess|jsonrpc/3::DEBUG::2018-05-08 18:15:40,903::api::204::root::(setupNetworks) Setting up network according to configuration: networks:{u'OCE02-NP-MGMT01': {u'ipv6autoconf': True, u'vlan': u'52', u'ipaddr': u'10.182.52.184', u'switch': u'legacy', u'mtu': 1500, u'bonding': u'bond0', u'dhcpv6': False, u'STP': u'no', u'bridged': u'true', u'netmask': u'255.255.252.0', u'gateway': u'10.182.52.1', u'defaultRoute': True}, u'on3462ee86c69c4': {u'remove': u'true'}}, bondings:{}, options:{u'connectivityCheck': u'true', u'connectivityTimeout': 120}
MainProcess|jsonrpc/3::ERROR::2018-05-08 18:15:40,909::supervdsmServer::94::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 92, in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 210, in setupNetworks
    ipvalidator.validate(networks)
  File "/usr/lib/python2.7/site-packages/vdsm/network/ip/validator.py", line 43, in validate
    _validate_default_route(default_route_nets, no_default_route_nets)
  File "/usr/lib/python2.7/site-packages/vdsm/network/ip/validator.py", line 53, in _validate_default_route
    'Only a singe default route network is allowed.')
ConfigNetworkError: (21, 'Only a singe default route network is allowed.')

/var/lib/vdsm/persistence/netconf/nets/on3462ee86c69c4: "defaultRoute": true

Version-Release number of selected component (if applicable):
vdsm-4.19.50-1.el7ev.x86_64

This is what the customer did:
1. Move the host from cluster A to cluster B (A and B have different default route networks).
2. Attempt to set up networks on the host, swapping the default route networks.
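The traceback points at _validate_default_route in vdsm/network/ip/validator.py. Below is a minimal sketch of the logic the report implies - it is my reconstruction, not the actual vdsm source, and the function signature is simplified. The assumption is that the validator counts default route networks from both the request and the persisted config, without first dropping networks that the same request removes:

    # Reconstruction of the suspected 4.19 behaviour; ERR_BAD_PARAMS (21) and
    # the error string come from the traceback above.
    ERR_BAD_PARAMS = 21


    class ConfigNetworkError(Exception):
        pass


    def validate(requested_nets, persisted_nets):
        requested_defroute = {name for name, attrs in requested_nets.items()
                              if attrs.get('defaultRoute')}
        persisted_defroute = {name for name, attrs in persisted_nets.items()
                              if attrs.get('defaultRoute')}
        # Suspected bug: a network flagged {'remove': 'true'} in the request
        # is still counted via the persisted config, so swapping the default
        # route network in one setupNetworks call yields two entries here.
        default_route_nets = requested_defroute | persisted_defroute
        _validate_default_route(default_route_nets)


    def _validate_default_route(default_route_nets):
        if len(default_route_nets) > 1:
            raise ConfigNetworkError(
                ERR_BAD_PARAMS,
                'Only a singe default route network is allowed.')

Under that reading, the fix would be to subtract the networks being removed before counting, e.g. default_route_nets -= {name for name, attrs in requested_nets.items() if attrs.get('remove') == 'true'} - presumably what the rewritten 4.20 code already does.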
on3462ee86c69c4 was unmanaged at that point. That's why I added BZ1515880.
Edy, danken thinks that in 4.2 Vdsm would remove defaultRoute=True from the unmanaged network when the default route is assigned to another network, right?
Michael, could you try to reproduce this in 4.2? I think we have fixed it in Vdsm already.
I think this is the scenario raised here: https://bugzilla.redhat.com/show_bug.cgi?id=1522971
(In reply to Dan Kenigsberg from comment #3)
> Michael, could you try to reproduce this in 4.2? I think we have fixed it in
> Vdsm already.

The error no longer happens on 4.2, BUT it only partially works.

This is what the customer did, and what I did as well:
1. Move the host from cluster A to cluster B (A and B have different default route networks).
2. Attempt to set up networks on the host, swapping the default route networks.

Result - swapping the networks seems fine on the engine side (after setup networks and a refresh of capabilities the network is in sync), there are no duplicate default route errors on the vdsm side (as Germano reported in the bug), and the icon moved to the correct network. BUT the default route wasn't actually updated properly on the vdsm side (both networks now report true), and the route still goes via the unmanaged network.

Example - df1 is the default route network in cluster A, df11 is the default route network in cluster B.

This is how it looks in caps after the host moved from cluster A to B:

"df1": {
    "ipv6autoconf": false,
    "addr": "10.35.x.x",
    "ipv4defaultroute": true,
},
"df11": {
    "ipv6autoconf": false,
    "addr": "10.35.x.x",
    "ipv4defaultroute": true,

Both networks have "ipv4defaultroute": true. The actual route is still via df1 (unmanaged) and not df11:

[root@orchid-vds1 ~]# ping -I df1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 10.35.x.x df1: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=57 time=58.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=57 time=58.1 ms
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 58.152/58.177/58.202/0.025 ms

[root@orchid-vds1 ~]# ping -I df11 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 10.35.x.x df11: 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 5999ms

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.35.x.x       0.0.0.0         UG    0      0        0 df1

* So the bottom line is that it doesn't work, even though the error no longer reproduces. It becomes much worse when trying to remove the unmanaged network: after removing it from this state, df11 is reported as out-of-sync again and only becomes synced after a few minutes, by syncing it manually. The unmanaged scenario is not handled properly at all.
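For what it's worth, the mismatch above (two networks reporting "ipv4defaultroute": true while the kernel routes via only one of them) can be detected mechanically. A diagnostic sketch; the caps shape follows the excerpt above, and using vdsm-client to fetch getCapabilities is an assumption about what is available on the host:

    # Compare the default-route flags vdsm reports against the device the
    # kernel actually uses for the default route.
    import json
    import subprocess

    caps = json.loads(subprocess.check_output(
        ['vdsm-client', 'Host', 'getCapabilities'], universal_newlines=True))

    flagged = [name for name, attrs in caps.get('networks', {}).items()
               if attrs.get('ipv4defaultroute')]

    tokens = subprocess.check_output(
        ['ip', '-4', 'route', 'show', 'default'],
        universal_newlines=True).split()
    kernel_dev = tokens[tokens.index('dev') + 1] if 'dev' in tokens else None

    print('networks reporting ipv4defaultroute=true: %s' % flagged)
    print('kernel default route device: %s' % kernel_dev)
    if len(flagged) != 1 or kernel_dev not in flagged:
        print('MISMATCH: caps and the kernel disagree on the default route')

In the state described above this would print df1 and df11 as flagged but df1 as the kernel device, i.e. a mismatch.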
When the networks are on different subnets, as you requested, it works only when manually syncing the network on cluster B:

df1, vlan 162, in cluster A
df2, vlan 163, in cluster B

After the host moved to cluster B, df1 is unmanaged and df2 is out-of-sync (default route true/false). The route is still via df1:

"df1": {
    "ipv6autoconf": false,
    "addr": "10.35.129.161",
    "ipv4defaultroute": true,
},
"df2": {
    "ipv6autoconf": false,
    "addr": "10.35.130.61",
    "ipv4defaultroute": false,

[root@orchid-vds1 ~]# ping 8.8.8.8 -I df1
PING 8.8.8.8 (8.8.8.8) from 10.35.129.161 df1: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=55 time=66.8 ms

- After manually syncing all networks, df1 is unmanaged, and df2 remains out-of-sync for a few minutes and then becomes synced.

[root@orchid-vds1 ~]# ping 8.8.8.8 -I df1
PING 8.8.8.8 (8.8.8.8) from 10.35.129.161 df1: 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

[root@orchid-vds1 ~]# ping 8.8.8.8 -I df2
PING 8.8.8.8 (8.8.8.8) from 10.35.130.61 df2: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=55 time=66.8 ms
^C
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 66.875/66.875/66.875/0.000 ms

"df1": {
    "ipv6autoconf": false,
    "addr": "10.35.129.161",
    "ipv4defaultroute": false,
},
"df2": {
    "ipv6autoconf": false,
    "addr": "10.35.130.61",
    "ipv4defaultroute": true,

BUT this only works when syncing manually; the engine doesn't take care of it and doesn't sync the network's default route property.
Summary:

- The original report no longer reproduces on 4.2.
- I did find an issue in the described scenario with the handling of an 'unmanaged' network that used to be the default route network in cluster A: we keep reporting default route=true for this network until it is removed manually.

Cluster A - net1 is the default route network and doesn't exist in cluster B.
Cluster B - net2 is the default route network and was attached to the host (with a boot protocol) prior to the host move.

The correct way to work around this issue is:
1) Once the host has moved from cluster A (where net1 is the default route network, which doesn't exist in cluster B) to cluster B, remove the 'unmanaged' network net1 from the host (a sketch for locating the stale persisted flag follows below).
2) net2 is out-of-sync; sync all networks only after the 'unmanaged' network has been removed from the host.

- I don't think we are going to handle this any time soon, as I understand from our dev guys. As Edy requested, I believe this report can be closed. If a user/admin performs such an operation (moving a host to a cluster with a different default route network), they should be aware that manual intervention will be required.
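To make step 1 of the workaround less guesswork-prone, the persisted config can be checked for the stale flag directly. A minimal sketch, assuming the layout shown in the original report (one JSON document per network under /var/lib/vdsm/persistence/netconf/nets/):

    # List persisted networks that still claim the default route.
    import json
    import os

    NETS_DIR = '/var/lib/vdsm/persistence/netconf/nets'

    for name in sorted(os.listdir(NETS_DIR)):
        with open(os.path.join(NETS_DIR, name)) as f:
            attrs = json.load(f)
        if attrs.get('defaultRoute'):
            print('%s still persists "defaultRoute": true' % name)

Any network listed here that no longer exists in the current cluster is the 'unmanaged' leftover to remove before syncing.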
To avoid the reported issue, please upgrade your hypervisor to 4.2.