Description of problem:
I configured the network from cockpit, creating bond0 over eth1 and eth2. Then, still from cockpit, I created bond0.123 for vlan 123 over bond0. I set a static IP address on bond0.123 and nothing on bond0 (untagged). The network was working correctly: bond0.123 was up and running.

I ran hosted-engine --deploy, which let me choose bond0.123 for the management interface but not bond0, since bond0 lacks an IPv4 address.

The engine sends to the host:

2018-06-05 18:07:26,081+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [34453adc] START, HostSetupNetworksVDSCommand(HostName = c75he20180605h1, HostSetupNetworksVdsCommandParameters:{hostId='e817fa00-432f-4cab-9589-6b3e97820edc', vds='Host[c75he20180605h1,e817fa00-432f-4cab-9589-6b3e97820edc]', rollbackOnFailure='true', connectivityTimeout='120', networks='[HostNetwork:{defaultRoute='true', bonding='true', networkName='ovirtmgmt', vdsmName='ovirtmgmt', nicName='bond0', vlan='123', mtu='0', vmNetwork='true', stp='false', properties='null', ipv4BootProtocol='STATIC_IP', ipv4Address='192.168.2.14', ipv4Netmask='255.255.255.0', ipv4Gateway='null', ipv6BootProtocol='AUTOCONF', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null', nameServers='null'}]', removedNetworks='[]', bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY'}), log id: 7bca7ce3

which looks correct, since it includes all the relevant info. And vdsm correctly received it:

MainProcess|jsonrpc/6::INFO::2018-06-05 18:07:25,270::netconfpersistence::68::root::(setBonding) Adding bond0({'nics': ['eth1', 'eth2'], 'switch': 'legacy', 'options': 'miimon=100 mode=4'})
MainProcess|jsonrpc/6::INFO::2018-06-05 18:07:25,271::netconfpersistence::57::root::(setNetwork) Adding network ovirtmgmt({u'ipv6autoconf': True, 'nameservers': ['192.168.1.1'], u'vlan': 123, u'ipaddr': u'192.168.2.14', u'bonding': u'bond0', u'mtu': 1500, u'switch': u'legacy', u'dhcpv6': False, 'stp': False, u'bridged': True, u'netmask': u'255.255.255.0', u'defaultRoute': True, 'bootproto': 'none'})

So supervdsm took a backup of my ifcfg-bond0:

MainProcess|jsonrpc/6::DEBUG::2018-06-05 18:07:28,488::ifcfg::512::root::(_persistentBackup) backing up ifcfg-bond0:
BONDING_OPTS="downdelay=0 miimon=100 mode=802.3ad updelay=0"
TYPE=Bond
BONDING_MASTER=yes
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=bond0
UUID=9041fd4a-c5b1-4d79-8a4d-91463d852ba1
DEVICE=bond0
ONBOOT=yes
AUTOCONNECT_SLAVES=yes

Please note the IPV4_FAILURE_FATAL=no. supervdsm then replaced it with:

MainProcess|jsonrpc/6::DEBUG::2018-06-05 18:07:28,488::ifcfg::569::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-bond0 configuration:
# Generated by VDSM version 4.20.29-1.git58011fd.el7
DEVICE=bond0
BONDING_OPTS='mode=4 miimon=100'
MACADDR=00:1a:4a:16:01:71
ONBOOT=yes
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=yes
NM_CONTROLLED=no
IPV6INIT=yes
IPV6_AUTOCONF=yes

which lacks IPV4_FAILURE_FATAL=no.

Then, before creating ifcfg-bond0.123, it tried to bring up bond0, but that failed since bond0 did not receive any IPv4 address from DHCP:
MainProcess|jsonrpc/6::DEBUG::2018-06-05 18:08:36,078::cmdutils::150::root::(exec_cmd) /usr/bin/systemd-run --scope --unit=bfd052cf-c487-427d-868e-e6b5fd63384a --slice=vdsm-dhclient /sbin/ifup bond0 (cwd None)
MainProcess|jsonrpc/6::DEBUG::2018-06-05 18:08:36,315::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = 'Running scope as unit bfd052cf-c487-427d-868e-e6b5fd63384a.scope.\n'; <rc> = 0
MainProcess|jsonrpc/6::ERROR::2018-06-05 18:08:36,316::supervdsm_server::100::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 217, in setupNetworks
    _setup_networks(networks, bondings, options)
  File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 238, in _setup_networks
    netswitch.configurator.setup(networks, bondings, options, in_rollback)
  File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch/configurator.py", line 140, in setup
    _setup_legacy(legacy_nets, legacy_bonds, options, in_rollback)
  File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch/configurator.py", line 157, in _setup_legacy
    in_rollback)
  File "/usr/lib/python2.7/site-packages/vdsm/network/legacy_switch.py", line 488, in bonds_setup
    _bonds_edit(edit, configurator, _netinfo)
  File "/usr/lib/python2.7/site-packages/vdsm/network/legacy_switch.py", line 560, in _bonds_edit
    configurator.editBonding(bond, _netinfo)
  File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 196, in editBonding
    _exec_ifup(bond)
  File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 942, in _exec_ifup
    _exec_ifup_by_name(iface.name, cgroup)
  File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 928, in _exec_ifup_by_name
    raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '')
ConfigNetworkError: (29, '\n')

And so vdsm started rolling back the network configuration.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Create a bond from cockpit without configuring any IPv4 address on it.
2. Make sure nothing provides an IPv4 address for that bond over untagged traffic.
3. Create a vlan over that bond and set a static address on it (I suppose we would hit the bug with DHCP here as well).
4. Try to deploy hosted-engine, selecting the vlan-over-bond interface.

Actual results:
The host receives the right configuration from the engine, but it tries to bring up the untagged bond before creating the vlan over it, so if the untagged bond fails to get an IPv4 address, everything gets rolled back.

Expected results:
The user should be able to get a management bridge over a vlan over a bond even if the untagged bond has no IPv4 address.

Additional info:
Workaround: set a static IPv4 address from an unused subnet on bond0; the management bridge is then correctly created over the vlan over the bond.
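For reference, the cockpit configuration from the description could be reproduced roughly like this with nmcli (interface names, the vlan id and the 192.168.2.14/24 address come from the logs above; the connection names and bond options are my assumptions based on the backed-up BONDING_OPTS):

# bond over eth1/eth2, LACP with link monitoring as in the backed-up BONDING_OPTS
nmcli con add type bond con-name bond0 ifname bond0 bond.options "mode=802.3ad,miimon=100"
nmcli con add type bond-slave con-name bond0-eth1 ifname eth1 master bond0
nmcli con add type bond-slave con-name bond0-eth2 ifname eth2 master bond0
# IPv4 on the untagged bond is left at the NetworkManager default (DHCP, allowed to fail),
# which appears to be what BOOTPROTO=dhcp plus IPV4_FAILURE_FATAL=no in the backup reflects
nmcli con add type vlan con-name bond0.123 ifname bond0.123 dev bond0 id 123 \
      ipv4.method manual ipv4.addresses 192.168.2.14/24
nmcli con up bond0
nmcli con up bond0.123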
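The workaround mentioned above, expressed the same way (192.0.2.0/24 is only a placeholder for "an unused subnet"):

# give the untagged bond a static IPv4 address so ifup no longer waits for a DHCP lease
nmcli con mod bond0 ipv4.method manual ipv4.addresses 192.0.2.2/24
nmcli con up bond0

After that, hosted-engine --deploy creates the management bridge on top of bond0.123 as expected.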
Created attachment 1447994: supervdsm logs
We are unlikely to support IPV4_FAILURE_FATAL=no in Vdsm. If DHCP is not active on the bond, I expect us to remove it when consuming the bond (but I'm not 100% sure). In the meantime, I suggest that users work around this bug by removing DHCP from the bond before adding the host to Engine.
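Assuming the bond is still managed by NetworkManager (as it is when created from cockpit), removing DHCP from it before adding the host could look like this; a sketch, not a verified procedure:

# disable IPv4 entirely on the untagged bond so that bringing it up cannot fail on DHCP
nmcli con mod bond0 ipv4.method disabled
nmcli con up bond0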
Seems like an edge case. Closing for now.