Bug 1683967 - Attaching network with dhcpv4 ends up as out-of-sync becasue of a IPv6 property
Summary: Attaching network with dhcpv4 ends up as out-of-sync becasue of a IPv6 property
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.30.8
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ovirt-4.3.3
: ---
Assignee: Edward Haas
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-02-28 08:25 UTC by Michael Burman
Modified: 2019-04-16 13:58 UTC (History)
6 users (show)

Fixed In Version: vdsm-4.30.12
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-16 13:58:21 UTC
oVirt Team: Network
Embargoed:
edwardh: needinfo-
dfediuck: ovirt-4.3+


Attachments (Terms of Use)
vdsm logs of this flow in ovirt-4.2 (37.44 KB, application/gzip)
2019-02-28 16:41 UTC, Dominik Holler
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 98407 0 master MERGED net: Disable IPv6 before and after ifup 2019-03-10 18:46:08 UTC
oVirt gerrit 98418 0 ovirt-4.3 ABANDONED net: Disable IPv6 before and after ifup 2019-03-21 06:52:20 UTC
oVirt gerrit 98518 0 master ABANDONED net: Apply autoconf state before ifup 2019-03-20 14:17:19 UTC
oVirt gerrit 98522 0 master MERGED Revert "net: Disable IPv6 before and after ifup" 2019-03-14 06:40:18 UTC
oVirt gerrit 98523 0 master MERGED net: Delay the notification sent to Engine on DHCPv4 response 2019-03-21 07:26:15 UTC
oVirt gerrit 98743 0 ovirt-4.3 MERGED net: Delay the notification sent to Engine on DHCPv4 response 2019-03-21 12:14:46 UTC

Description Michael Burman 2019-02-28 08:25:13 UTC
Description of problem:
Attaching network with dhcpv4 ends up as out-of-sync becasue of a IPv6 property

New regression, possible caused by some latest changes done on this area.
Now, when setting a network with dhcpv4 or attaching a network with dhcpv4, engine always coplains that the network is out-of-sync because of an unrelated ipv6 property - 'IPv6 bootprotocol' host - autoconf, dc- none
This is happens in 100% of the times.
A manual refresh caps fix the out-of-sync issue and network become in sync.

Version-Release number of selected component (if applicable):
4.3.1.1-0.1.el7

How reproducible:
100%

Steps to Reproduce:
1. Attach network with dhcpv4 bootproto or set network with dhcpv4 bootproto via setup networks

Actual results:
Network is out-of-sync because of an unrelated ipv6 property - 'IPv6 bootprotocol' host - autoconf, dc- none

Expected results:
Network should be in sync

Comment 1 Michael Burman 2019-02-28 08:31:10 UTC
[org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand]
HostNetwork:{defaultRoute='false', bonding='false', networkName='net1_vlan', vdsmName='net1_vlan', nicName='enp6s0', vlan='169', vmNetwork='true', stp='false', properties='[]', ipv4BootProtocol='DHCP', ipv4Address='null', ipv4Netmask='null', ipv4Gateway='null', ipv6BootProtocol='NONE', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null', nameServers='null'}]

Comment 3 Dominik Holler 2019-02-28 16:41:57 UTC
Created attachment 1539604 [details]
vdsm logs of this flow in ovirt-4.2

The recent changes just uncovered this behavior.
IPv6 seems to be disabled by VDSM after dhcp sets the IPv4 address.
Edy, do you think IPv6 can be disabled, before setupNetworks returns?

Comment 4 Edward Haas 2019-02-28 19:36:26 UTC
This is a super cool bug.
0. Initial state: ipv6/autoconf enabled (by default).
1. Request arrives to VDSM to change the network/interface to DHCPv4 enabled and disable IPv6.
2. VDSM edits the ifcfg file, and runs `ifup` in an async manner (deamonized). It will block until the dhcp server responds or timeouts.
   The interesting part here is: It will block on the dhcp request and will not apply the rest of the config (like disabling ipv6).
3. In parallel, the setup finishes.
4. Caps is requested by Engine, DHCP is on, no address yet and IPv6 is still enabled.
5. DHCP response arrives, address added and a signal is sent to Engine.
6. Engine in turn requests a caps, which returns the address now, but the IPv6 is still enabled.
   At this time, the `ifup` has not yet finished/returned.
7. `ifup` finishes, completed all the other settings, including the ipv6.
   But no caps is triggered now, which is unfortunate.

This is a race, where Engine caps (which got triggered by a signal from VDSM) arrives before `ifup` finished.
I can think of these options to try and fix it:
1. Wait a few seconds before sending the signal to Engine (following the dhcp response).
2. Engine should wait a few seconds before sending the caps request.
3. VDSM should send the signal only when the forked `ifup` process finishes.

Comment 5 eraviv 2019-03-03 11:35:43 UTC
1. Until now this bug was not discovered bc engine did not report out-of-sync on ipv6
2. This behaviour only happens when first attaching a network to a nic and configuring it with ipv4=dhc ipv6=none. On a second setup networks with 'sync networks' from engine, the nic is set with ipv6_autoconf=false as expected.

3. On the initial attachment of the network this bug reproduces with the following in supervdsm.log, which leads to the suspicion that this is a vdsm-side issue, bc setup networks request ipv6_autoconf=false for network 'n23' but after the setup vdsm reports it as true for 'n23' in the getCaps:

MainProcess|jsonrpc/2::DEBUG::2019-03-03 13:06:03,226::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) call setupNetworks with ({u'n23': {u'ipv6autoconf': False, u'nic': u'eth2', u'dhcpv6': False, u'mtu': 1500, u'switch': u'legacy', u'bootproto': u'dhcp', u'STP': u'no', u'bridged': u'true'}}, {}, {u'connectivityCheck': u'true', u'connectivityTimeout': 120, u'commitOnSuccess': True}) {}
MainProcess|jsonrpc/2::DEBUG::2019-03-03 13:06:03,227::logutils::319::root::(_report_stats) ThreadedHandler is ok in the last 266 seconds (max pending: 1)
MainProcess|jsonrpc/2::INFO::2019-03-03 13:06:03,227::api::211::root::(setupNetworks) Setting up network according to configuration: networks:{u'n23': {u'ipv6autoconf': False, u'nic': u'eth2', u'dhcpv6': False, u'mtu': 1500, u'switch': u'legacy', u'bootproto': u'dhcp', u'STP': u'no', u'bridged': u'true'}}, bondings:{}, options:{u'connectivityCheck': u'true', u'connectivityTimeout': 120, u'commitOnSuccess': True}

MainProcess|jsonrpc/2::INFO::2019-03-03 13:06:03,322::netconfpersistence::58::root::(setNetwork) Adding network n23({u'ipv6autoconf': False, 'nameservers': [], u'nic': u'eth
2', u'mtu': 1500, u'switch': u'legacy', u'dhcpv6': False, 'stp': False, u'bridged': True, 'defaultRoute': False, u'bootproto': u'dhcp'})


MainProcess|jsonrpc/2::DEBUG::2019-03-03 13:06:03,826::ifcfg::578::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-n23 configuration:
# Generated by VDSM version 4.30.9.25.git76d15e1
DEVICE=n23
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

MainProcess|jsonrpc/2::DEBUG::2019-03-03 13:06:03,841::cmdutils::133::root::(exec_cmd) /sbin/ip addr flush dev eth2 scope global (cwd None)
MainProcess|jsonrpc/2::DEBUG::2019-03-03 13:06:03,844::cmdutils::141::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
MainProcess|jsonrpc/2::DEBUG::2019-03-03 13:06:03,844::ifcfg::488::root::(_atomicBackup) Backed up /etc/sysconfig/network-scripts/ifcfg-eth2
MainProcess|jsonrpc/2::DEBUG::2019-03-03 13:06:03,845::ifcfg::521::root::(_persistentBackup) backing up ifcfg-eth2: # Generated by VDSM version 4.30.9.25.git76d15e1
DEVICE=eth2
ONBOOT=yes
MTU=1500
NM_CONTROLLED=no


Saved new config RunningConfig({'ovirtmgmt': {u'ipv6autoconf': False, u'nameservers': [u'192.168.122.1', u'2001:db8:ca3:2::1'], u'nic': u'eth0', u'ipaddr': u'192.168.122.251', u'netmask': u'255.255.255.0', u'mtu': 1500, u'switch': u'legacy', u'dhcpv6': False, u'stp': False, u'bridged': True, u'ipv6addr': u'2001:db8:ca3:2:1::21/64', u'ipv6gateway': u'2001:db8:ca3:2::1', u'gateway': u'192.168.122.1', u'defaultRoute': True, u'bootproto': u'none'}, u'n23': {u'ipv6autoconf': False, u'nameservers': [], u'nic': u'eth2', u'mtu': 1500, u'switch': u'legacy', u'dhcpv6': False, u'stp': False, u'bridged': True, u'defaultRoute': False, u'bootproto': u'dhcp'}}, {}, {}) to [/var/lib/vdsm/staging/netconf/nets,/var/lib/vdsm/staging/netconf/bonds,/var/lib/vdsm/staging/netconf/devices]



MainProcess|jsonrpc/1::DEBUG::2019-03-03 13:06:05,918::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper) return network_caps with {'bridges': {'ovirtmgmt': {'ipv6autoconf': False, 'addr': '192.168.122.251', 'dhcpv6': False, 'ipv6addrs': ['2001:db8:ca3:2:1::21/64'], 'gateway': '192.168.122.1', 'dhcpv4': False, 'netmask': '255.255.255.0', 'ipv4defaultroute': True, 'stp': 'off', 'ipv4addrs': ['192.168.122.251/24'], 'mtu': '1500', 'ipv6gateway': '2001:db8:ca3:2::1', 'ports': ['eth0'], 'opts': {'multicast_last_member_count': '2', 'vlan_protocol': '0x8100', 'hash_elasticity': '4', 'multicast_query_response_interval': '1000', 'group_fwd_mask': '0x0', 'multicast_snooping': '1', 'multicast_startup_query_interval': '3125', 'hello_timer': '0', 'multicast_querier_interval': '25500', 'max_age': '2000', 'hash_max': '512', 'stp_state': '0', 'topology_change_detected': '0', 'priority': '32768', 'multicast_igmp_version': '2', 'multicast_membership_interval': '26000', 'root_path_cost': '0', 'root_port': '0', 'multicast_stats_enabled': '0', 'multicast_startup_query_count': '2', 'nf_call_iptables': '0', 'vlan_stats_enabled': '0', 'topology_change': '0', 'hello_time': '200', 'root_id': '8000.525400d7a4ab', 'bridge_id': '8000.525400d7a4ab', 'topology_change_timer': '0', 'ageing_time': '30000', 'nf_call_ip6tables': '0', 'multicast_mld_version': '1', 'gc_timer': '14990', 'nf_call_arptables': '0', 'group_addr': '1:80:c2:0:0:0', 'multicast_last_member_interval': '100', 'default_pvid': '1', 'multicast_query_interval': '12500', 'multicast_query_use_ifaddr': '0', 'tcn_timer': '0', 'multicast_router': '1', 'vlan_filtering': '0', 'multicast_querier': '0', 'forward_delay': '0'}}, 'n23': {'ipv6autoconf': True, 'addr': '', 'dhcpv6': False, 'ipv6addrs': [], 'gateway': '', 'dhcpv4': True, 'netmask': '', 'ipv4defaultroute': False, 'stp': 'off', 'ipv4addrs': [], 'mtu': '1500', 'ipv6gateway': '::', 'ports': ['eth2'], 'opts': {'multicast_last_member_count': '2', 'vlan_protocol': '0x8100', 'hash_elasticity': '4', 'multicast_query_response_interval': '1000', 'group_fwd_mask': '0x0', 'multicast_snooping': '1', 'multicast_startup_query_interval': '3125', 'hello_timer': '0', 'multicast_querier_interval': '25500', 'max_age': '2000', 'hash_max': '512', 'stp_state': '0', 'topology_change_detected': '0', 'priority': '32768', 'multicast_igmp_version': '2', 'multicast_membership_interval': '26000', 'root_path_cost': '0', 'root_port': '0', 'multicast_stats_enabled': '0', 'multicast_startup_query_count': '2', 'nf_call_iptables': '0', 'vlan_stats_enabled': '0', 'topology_change': '0', 'hello_time': '200', 'root_id': '8000.525400335084', 'bridge_id': '8000.525400335084', 'topology_change_timer': '0', 'ageing_time': '30000', 'nf_call_ip6tables': '0', 'multicast_mld_version': '1', 'gc_timer': '29940', 'nf_call_arptables': '0', 'group_addr': '1:80:c2:0:0:0', 'multicast_last_member_interval': '100', 'default_pvid': '1', 'multicast_query_interval': '12500', 'multicast_query_use_ifaddr': '0', 'tcn_timer': '0', 'multicast_router': '1', 'vlan_filtering': '0', 'multicast_querier': '0', 'forward_delay': '0'}}}, 'bondings': {}, 'nameservers': ['192.168.122.1', '2001:db8:ca3:2::1'], 'nics': {'eth2': {'ipv6autoconf': False, 'addr': '', 'speed': 0, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '52:54:00:33:50:84', 'ipv6gateway': '::', 'gateway': ''}, 'eth1': {'ipv6autoconf': True, 'addr': '', 'speed': 0, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '52:54:00:a7:a6:6c', 'ipv6gateway': '::', 'gateway': ''}, 'eth0': {'ipv6autoconf': False, 'addr': '', 'speed': 0, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '52:54:00:d7:a4:ab', 'ipv6gateway': '::', 'gateway': ''}}, 'supportsIPv6': True, 'vlans': {}, 'networks': {'ovirtmgmt': {'dhcpv6': False, 'southbound': 'eth0', 'iface': 'ovirtmgmt', 'ipv6autoconf': False, 'addr': '192.168.122.251', 'bridged': True, 'ipv6addrs': ['2001:db8:ca3:2:1::21/64'], 'switch': 'legacy', 'gateway': '192.168.122.1', 'dhcpv4': False, 'netmask': '255.255.255.0', 'ipv4defaultroute': True, 'stp': 'off', 'ipv4addrs': ['192.168.122.251/24'], 'mtu': '1500', 'ipv6gateway': '2001:db8:ca3:2::1', 'ports': ['eth0']}, 'n23': {'dhcpv6': False, 'southbound': 'eth2', 'iface': 'n23', 'ipv6autoconf': True, 'addr': '', 'bridged': True, 'ipv6addrs': [], 'switch': 'legacy', 'gateway': '', 'dhcpv4': True, 'netmask': '', 'ipv4defaultroute': False, 'stp': 'off', 'ipv4addrs': [], 'mtu': '1500', 'ipv6gateway': '::', 'ports': ['eth2']}}}

Comment 6 Dan Kenigsberg 2019-03-07 21:07:05 UTC
Eddy, if this hurts 100% of dhcpv4 assignments, we have to fix or hide it quickly.

> 1. Wait a few seconds before sending the signal to Engine (following the dhcp response).

keeps the hack closer to the source of the race (ifcfg._exec_ifup)

Comment 9 Edward Haas 2019-03-14 10:18:43 UTC
Current proposed solution keeps the out-of-sync state between the setup request and the DHCP reply.
IMO such a behaviour is acceptable, as the system has not yet stabilized.

Comment 10 Michael Burman 2019-03-14 10:27:37 UTC
(In reply to Edward Haas from comment #9)
> Current proposed solution keeps the out-of-sync state between the setup
> request and the DHCP reply.
> IMO such a behaviour is acceptable, as the system has not yet stabilized.

QE's concern here is that this out-of-sync state in between not related to the origin request in any way. 
If user setting a dhcpv4 only and get out-of-sync reason about ipv6 property, it is confusing. 
Another concern, is that another operation on this network may fail becasue it is yet out-of-sync state.

Comment 11 Michael Burman 2019-03-31 06:11:44 UTC
Verified on - vdsm-4.30.12-1.el7ev.x86_64 and rhvm-4.3.3.1-0.1.el7.noarch

Note, that the current solution keeps the out-of-sync state between the setup request and the DHCPv4 reply.

Comment 12 Sandro Bonazzola 2019-04-16 13:58:21 UTC
This bugzilla is included in oVirt 4.3.3 release, published on April 16th 2019.

Since the problem described in this bug report should be
resolved in oVirt 4.3.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.