Created attachment 1359306 [details]
from static to dhcp - out of sync

Description of problem:
Networks are often reported as out of sync when switching the bootproto of a non-mgmt network.

Version-Release number of selected component (if applicable):
4.2.0-0.0.master.20171124141652.gita5b4f9b.el7.centos
vdsm-4.20.8-30.gitfc72657.el7.centos.x86_64

How reproducible:
From static to dhcp - around 95%
Attach network with dhcp - around 95%

Steps to Reproduce:
1. Attach a non-mgmt network to the host and set static ip, approve operation
2. Switch the network's bootproto to dhcp
3. Attach a non-mgmt network to the host and set dhcp, approve operation

Actual results:
2 - network is out of sync - need to sync manually - tooltip shows host with static and DC with dhcp
3 - network is out of sync - need to sync manually - tooltip shows host with static and DC with dhcp

Expected results:
Must work as expected, networks must be in sync

Attaching logs from both scenarios and screenshots to describe the mismatch: the tooltip shows one thing and the edit network bootproto shows the opposite.
Created attachment 1359307 [details] attach with dhcp - out of sync
Created attachment 1359308 [details] screenshots
Does Vdsm notify Engine of its dhcp-acquired address? Does Engine emit getCaps afterwards?
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
In the following caps, it looks like for net-2 network VDSM reports two contradicting values: dhcpv4 is true under networks but false under bridges. MainProcess|jsonrpc/0::DEBUG::2017-11-27 08:55:09,315::supervdsm_server::97::SuperVdsm.ServerCallback::(wrapper) return network_caps with {'bridges': {'ovirtmgmt': {'ipv6autoconf': True, 'addr': '10.35.128.21', 'dhcpv6': False, 'ipv6addrs': [], 'gateway': '10.35.128.254', 'dhcpv4': False, 'netmask': '255.255.255.0', 'ipv4defaultroute': True, 'stp': 'off', 'ipv4addrs': ['10.35.128.21/24'], 'mtu': '1500', 'ipv6gateway': 'fe80:52:0:2380::fe', 'ports': ['enp4s0'], 'opts': {'multicast_last_member_count': '2', 'vlan_protocol': '0x8100', 'hash_elasticity': '4', 'multicast_query_response_interval': '1000', 'group_fwd_mask': '0x0', 'multicast_snooping': '1', 'multicast_startup_query_interval': '3125', 'hello_timer': '0', 'multicast_querier_interval': '25500', 'max_age': '2000', 'hash_max': '512', 'stp_state': '0', 'topology_change_detected': '0', 'priority': '32768', 'multicast_membership_interval': '26000', 'root_path_cost': '0', 'root_port': '0', 'multicast_stats_enabled': '0', 'multicast_startup_query_count': '2', 'nf_call_iptables': '0', 'vlan_stats_enabled': '0', 'topology_change': '0', 'hello_time': '200', 'root_id': '8000.00145e17d5b0', 'bridge_id': '8000.00145e17d5b0', 'topology_change_timer': '0', 'ageing_time': '30000', 'nf_call_ip6tables': '0', 'gc_timer': '18061', 'nf_call_arptables': '0', 'group_addr': '1:80:c2:0:0:0', 'multicast_last_member_interval': '100', 'default_pvid': '1', 'multicast_query_interval': '12500', 'multicast_query_use_ifaddr': '0', 'tcn_timer': '0', 'multicast_router': '1', 'vlan_filtering': '0', 'multicast_querier': '0', 'forward_delay': '0'}}, 'net-2': {'ipv6autoconf': True, 'addr': '10.35.129.157', 'dhcpv6': False, 'ipv6addrs': [], 'gateway': '10.35.129.254', 'dhcpv4': False, 'netmask': '255.255.255.0', 'ipv4defaultroute': False, 'stp': 'off', 'ipv4addrs': ['10.35.129.157/24'], 'mtu': '1500', 
'ipv6gateway': '::', 'ports': ['enp6s0.162'], 'opts': {'multicast_last_member_count': '2', 'vlan_protocol': '0x8100', 'hash_elasticity': '4', 'multicast_query_response_interval': '1000', 'group_fwd_mask': '0x0', 'multicast_snooping': '1', 'multicast_startup_query_interval': '3125', 'hello_timer': '0', 'multicast_querier_interval': '25500', 'max_age': '2000', 'hash_max': '512', 'stp_state': '0', 'topology_change_detected': '0', 'priority': '32768', 'multicast_membership_interval': '26000', 'root_path_cost': '0', 'root_port': '0', 'multicast_stats_enabled': '0', 'multicast_startup_query_count': '2', 'nf_call_iptables': '0', 'vlan_stats_enabled': '0', 'topology_change': '0', 'hello_time': '200', 'root_id': '8000.00145e17d5b2', 'bridge_id': '8000.00145e17d5b2', 'topology_change_timer': '0', 'ageing_time': '30000', 'nf_call_ip6tables': '0', 'gc_timer': '25740', 'nf_call_arptables': '0', 'group_addr': '1:80:c2:0:0:0', 'multicast_last_member_interval': '100', 'default_pvid': '1', 'multicast_query_interval': '12500', 'multicast_query_use_ifaddr': '0', 'tcn_timer': '0', 'multicast_router': '1', 'vlan_filtering': '0', 'multicast_querier': '0', 'forward_delay': '0'}}}, 'bondings': {}, 'nameservers': ['10.35.28.1'], 'nics': {'ens1f1': {'ipv6autoconf': False, 'addr': '', 'speed': 1000, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '00:15:17:3d:cd:ab', 'ipv6gateway': '::', 'gateway': ''}, 'ens1f0': {'ipv6autoconf': False, 'addr': '', 'speed': 1000, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '00:15:17:3d:cd:aa', 'ipv6gateway': '::', 'gateway': ''}, 'enp4s0': {'ipv6autoconf': False, 'addr': '', 'speed': 1000, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '00:14:5e:17:d5:b0', 'ipv6gateway': '::', 'gateway': ''}, 
'enp6s0': {'ipv6autoconf': False, 'addr': '', 'speed': 1000, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '00:14:5e:17:d5:b2', 'ipv6gateway': '::', 'gateway': ''}}, 'supportsIPv6': True, 'vlans': {'enp6s0.162': {'iface': 'enp6s0', 'ipv6autoconf': False, 'addr': '', 'dhcpv6': False, 'ipv6addrs': [], 'vlanid': 162, 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'ipv6gateway': '::', 'gateway': ''}}, 'networks': {'ovirtmgmt': {'dhcpv6': False, 'iface': 'ovirtmgmt', 'ipv6autoconf': True, 'addr': '10.35.128.21', 'bridged': True, 'ipv6addrs': [], 'switch': 'legacy', 'gateway': '10.35.128.254', 'dhcpv4': False, 'netmask': '255.255.255.0', 'ipv4defaultroute': True, 'stp': 'off', 'ipv4addrs': ['10.35.128.21/24'], 'mtu': '1500', 'ipv6gateway': 'fe80:52:0:2380::fe', 'ports': ['enp4s0']}, 'net-2': {'dhcpv6': False, 'iface': 'net-2', 'ipv6autoconf': True, 'addr': '10.35.129.157', 'bridged': True, 'ipv6addrs': [], 'switch': 'legacy', 'gateway': '10.35.129.254', 'dhcpv4': True, 'netmask': '255.255.255.0', 'ipv4defaultroute': False, 'stp': 'off', 'ipv4addrs': ['10.35.129.157/24'], 'mtu': '1500', 'ipv6gateway': '::', 'ports': ['enp6s0.162']}}}
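A minimal sketch of how such a contradiction could be spotted mechanically in a caps dictionary. The helper name `find_dhcpv4_mismatches` is hypothetical and not part of VDSM; the sample data is a trimmed copy of the caps above, keeping only the keys relevant to the comparison.

```python
def find_dhcpv4_mismatches(caps):
    """Return (name, network_value, bridge_value) for every network whose
    dhcpv4 flag differs from the same-named entry under 'bridges'."""
    bridges = caps.get('bridges', {})
    mismatches = []
    for name, net in sorted(caps.get('networks', {}).items()):
        bridge = bridges.get(name)
        if bridge is None:
            # No same-named bridge entry, so there is nothing to compare.
            continue
        if net.get('dhcpv4') != bridge.get('dhcpv4'):
            mismatches.append((name, net.get('dhcpv4'), bridge.get('dhcpv4')))
    return mismatches


# Trimmed copy of the reported caps (only the relevant keys are kept).
caps = {
    'bridges': {
        'ovirtmgmt': {'dhcpv4': False},
        'net-2': {'dhcpv4': False},
    },
    'networks': {
        'ovirtmgmt': {'iface': 'ovirtmgmt', 'bridged': True, 'dhcpv4': False},
        'net-2': {'iface': 'net-2', 'bridged': True, 'dhcpv4': True},
    },
}

print(find_dhcpv4_mismatches(caps))  # [('net-2', True, False)]
```

Only net-2 is flagged, matching the observation above; ovirtmgmt reports dhcpv4 as False in both places.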
I suspect that, due to the async nature of the dhcp network setup (spawning an ifup execution), a subsequent getCapabilities may not yet see dhclient on the device. To avoid the inconsistency between the network and the device dhcp state, let's try to query them at the same time, after all the other properties of all devices have been processed.
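The idea can be sketched roughly as follows. This is only an illustration under assumptions, not the actual VDSM change: `apply_dhcp_state` is a hypothetical helper, and the sets of interfaces running a DHCP client are assumed to have been collected once by the caller (how they are collected is outside this sketch). The point is that both the device entries and the network entries are filled from the same snapshot, so they cannot disagree.

```python
def apply_dhcp_state(caps, dhcpv4_ifaces, dhcpv6_ifaces):
    """Fill dhcpv4/dhcpv6 on all devices and networks from one snapshot.

    dhcpv4_ifaces and dhcpv6_ifaces are sets of interface names that were
    running a DHCP client when the single snapshot was taken (assumed input).
    """
    # Devices are keyed by their interface name in the caps report.
    for section in ('nics', 'vlans', 'bondings', 'bridges'):
        for devname, attrs in caps.get(section, {}).items():
            attrs['dhcpv4'] = devname in dhcpv4_ifaces
            attrs['dhcpv6'] = devname in dhcpv6_ifaces

    # Networks point at their top device through the 'iface' key.
    for attrs in caps.get('networks', {}).values():
        iface = attrs.get('iface')
        attrs['dhcpv4'] = iface in dhcpv4_ifaces
        attrs['dhcpv6'] = iface in dhcpv6_ifaces
    return caps


# All other properties are assumed to be processed already; dhcp comes last.
report = {
    'bridges': {'net-2': {'addr': '10.35.129.157'}},
    'vlans': {'enp6s0.162': {'iface': 'enp6s0', 'vlanid': 162}},
    'networks': {'net-2': {'iface': 'net-2', 'bridged': True}},
}
apply_dhcp_state(report, dhcpv4_ifaces={'net-2'}, dhcpv6_ifaces=set())

# The bridge and the network now carry the same dhcpv4 value.
print(report['bridges']['net-2']['dhcpv4'], report['networks']['net-2']['dhcpv4'])
```

Whether the snapshot itself is taken early or late relative to the ifup execution is a separate question; the sketch only removes the window in which the two queries could observe different states.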
I want to verify this, but I'm not sure. I didn't manage to reproduce the exact original scenario, but I did manage to reproduce a new flow that gives almost the same issue: engine reports out-of-sync, with NONE on host and dhcp in engine (the real state is dhcp on host). It's an edge case, but I would like you to see it before I verify this bug.
The scenario is:
- Attach 3 networks with static IP to the host
- Update the networks with dhcp and approve operation

Result: refresh caps events are not sent, no IP is displayed in the UI, and some networks are reported as out-of-sync on the host with a difference of: on host - NONE, on DC - dhcp, while in fact vdsm caps show dhcpv4=true.

I believe that we are now blocked on BZ 1590109 and further investigation by DEV is needed here.
Moving back to assigned, as I can't test it at the moment and will wait for the fix of BZ 1590109.
Now that BZ 1590109 is finally fixed and verified, this bug can be tested again.
Verified on - 4.2.6.3_SNAPSHOT-94.gbbcd5cb.0.scratch.master.el7ev and vdsm-4.20.37-1.el7ev.x86_64