Created attachment 941994
logs

Description of problem:
[Hosted-engine --deploy] > Can't configure the management bridge over a VLAN; the default gateway is removed during deployment.
While running the hosted-engine deployment, the host loses connectivity and the deployment script breaks. It gets stuck at this stage:
[ INFO ] Configuring the management bridge....
For some reason the default gateway is removed at this stage.

Version-Release number of selected component (if applicable):
3.5.0-0.13.beta.el6ev
ovirt-hosted-engine-setup-1.2.0-1.el6ev.noarch

How reproducible:
always

Steps to Reproduce:
1. yum install ovirt-hosted-engine-setup
2. hosted-engine --deploy
3. configure the management bridge (rhevm) over a VLAN-tagged interface

Actual results:
The deployment script breaks and the host loses connectivity, because the default gateway was removed.

Expected results:
hosted-engine --deploy should work. It should be possible to configure the mgmt bridge over a VLAN-tagged interface.

Additional info:
I forgot to mention that I'm using a static IP configured over the VLAN.
Antoni, Dan, has something changed recently in VDSM that may lead to this, or have you already seen something like this outside of hosted engine? The code handling that part has not changed on the setup side.
In vdsm I see:

'vlans': {'em1.162': {'iface': 'em1', 'addr': '10.35.129.7', 'cfg': {'VLAN': 'yes', 'IPADDR': '10.35.129.7', 'ONBOOT': 'yes', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', 'BOOTPROTO': 'static', 'DEVICE': 'em1.162', 'GATEWAY': '10.35.129.254'}, 'ipv6addrs': ['fe80::d6ae:52ff:feb9:c0c5/64'], 'vlanid': 162, 'mtu': '1500', 'netmask': '255.255.255.0', 'ipv4addrs': ['10.35.129.7/24']}}, 'cpuCores': '4', 'kvmEnabled': 'true', 'guestOverhead': '65', 'supportedRHEVMs': ['3.0'], 'cpuThreads': '8', 'emulatedMachines': [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', u'rhel5.4.0'], 'operatingSystem': {'release': '6.5.0.1.el6', 'version': '6Server', 'name': 'RHEL'}, 'lastClient': '127.0.0.1'}}
Detector thread::DEBUG::2014-09-28 15:26:20,367::protocoldetector::166::vds.MultiProtocolAcceptor::(_add_connection) Adding connection from 127.0.0.1:52023
Detector thread::DEBUG::2014-09-28 15:26:20,368::protocoldetector::177::vds.MultiProtocolAcceptor::(_remove_connection) Connection removed from 127.0.0.1:52023
Detector thread::DEBUG::2014-09-28 15:26:20,368::protocoldetector::203::vds.MultiProtocolAcceptor::(_handle_connection_read) Detected protocol xml from 127.0.0.1:52023
Detector thread::DEBUG::2014-09-28 15:26:20,369::BindingXMLRPC::1172::XmlDetector::(handleSocket) xml over http detected from ('127.0.0.1', 52023)
Thread-15::DEBUG::2014-09-28 15:26:20,408::BindingXMLRPC::1132::vds::(wrapper) client [127.0.0.1]::call setupNetworks with ({'rhevm': {'nic': 'em1', 'netmask': '255.255.255.0', 'vlan': 162, 'ipaddr': '10.35.129.7', 'bootproto': 'static'}}, {}, {'connectivityCheck': False}) {}
Thread-15::DEBUG::2014-09-28 15:26:23,148::BindingXMLRPC::1139::vds::(wrapper) return setupNetworks with {'status': {'message': 'Done', 'code': 0}}
Detector thread::DEBUG::2014-09-28 15:26:23,171::protocoldetector::166::vds.MultiProtocolAcceptor::(_add_connection) Adding connection from 127.0.0.1:52024
Detector thread::DEBUG::2014-09-28 15:26:23,172::protocoldetector::177::vds.MultiProtocolAcceptor::(_remove_connection) Connection removed from 127.0.0.1:52024
Detector thread::DEBUG::2014-09-28 15:26:23,172::protocoldetector::203::vds.MultiProtocolAcceptor::(_handle_connection_read) Detected protocol xml from 127.0.0.1:52024
Detector thread::DEBUG::2014-09-28 15:26:23,172::BindingXMLRPC::1172::XmlDetector::(handleSocket) xml over http detected from ('127.0.0.1', 52024)
Thread-16::DEBUG::2014-09-28 15:26:23,211::BindingXMLRPC::1132::vds::(wrapper) client [127.0.0.1]::call setSafeNetworkConfig with () {}
Thread-16::DEBUG::2014-09-28 15:26:23,230::BindingXMLRPC::1139::vds::(wrapper) return setSafeNetworkConfig with {'status': {'message': 'Done', 'code': 0}}

After that it tries to connect to the storage over NFS and fails (probably due to the missing gateway).
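One way to confirm the "missing gateway" suspicion on the host is to look for a default route in the kernel routing table. The following is only a rough sketch: the helper name `default_gateway` and the `SAMPLE` table are made up for illustration (the sample was built by hand from the addresses in this report, it is not taken from the attached logs), and it assumes a little-endian host such as x86.

```python
import socket
import struct


def default_gateway(route_table_text):
    """Return the IPv4 default gateway found in /proc/net/route content, or None."""
    for line in route_table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        # Destination 00000000 with RTF_GATEWAY (0x2) set marks the default route.
        if destination == '00000000' and flags & 0x2:
            # Assumption: little-endian host, so the hex value is byte-reversed.
            return socket.inet_ntoa(struct.pack('<L', int(gateway, 16)))
    return None


# Hand-built sample in /proc/net/route layout (illustrative only).
HEADER = "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"
SUBNET_ROUTE = "rhevm\t0081230A\t00000000\t0001\t0\t0\t0\t00FFFFFF\t0\t0\t0\n"
DEFAULT_ROUTE = "rhevm\t00000000\tFE81230A\t0003\t0\t0\t0\t00000000\t0\t0\t0\n"
SAMPLE = HEADER + DEFAULT_ROUTE + SUBNET_ROUTE

print(default_gateway(SAMPLE))                 # table that still has a default route
print(default_gateway(HEADER + SUBNET_ROUTE))  # table after the gateway was dropped
```

On an affected host the same function could be fed the real table, for example `default_gateway(open('/proc/net/route').read())`, before and after the "Configuring the management bridge" step.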
Thanks for the paste, Sandro. It fails to get the GATEWAY into the setupNetworks command. It is most likely a bug in my code in the hosted engine deploy plugin (regrettably, I only tested it with DHCP).
Thanks Antoni, will look into that.
I think it may be something related to the case of the configuration key:

    if 'BOOTPROTO' in port_info['cfg']:
        attrs['bootproto'] = port_info['cfg']['BOOTPROTO']
    if attrs.get('bootproto') == 'dhcp':
        attrs['blockingdhcp'] = True
    else:
        attrs['ipaddr'] = port_info['addr']
        attrs['netmask'] = port_info['netmask']
        gateway = port_info.get('gateway')
        if gateway is not None:
            attrs['gateway'] = gateway

It looks like we had something similar with BOOTPROTO in the past; trying to verify.
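To illustrate the suspicion with the data from the vdsm paste: the 'em1.162' entry carries the gateway only as upper-case 'GATEWAY' inside 'cfg', so looking up a top-level lower-case 'gateway' key finds nothing and the attribute never reaches setupNetworks. The sketch below wraps the quoted snippet in a hypothetical helper, `build_attrs`; the `cfg_fallback` switch is an assumption showing one possible way to handle the case mismatch, not necessarily the fix that was merged.

```python
# Trimmed copy of the 'em1.162' entry from the vdsm paste in this report.
port_info = {
    'iface': 'em1',
    'addr': '10.35.129.7',
    'netmask': '255.255.255.0',
    'cfg': {
        'BOOTPROTO': 'static',
        'DEVICE': 'em1.162',
        'IPADDR': '10.35.129.7',
        'NETMASK': '255.255.255.0',
        'GATEWAY': '10.35.129.254',
    },
}


def build_attrs(port_info, cfg_fallback=False):
    """Hypothetical helper mirroring the snippet quoted in this comment."""
    attrs = {}
    if 'BOOTPROTO' in port_info['cfg']:
        attrs['bootproto'] = port_info['cfg']['BOOTPROTO']
    if attrs.get('bootproto') == 'dhcp':
        attrs['blockingdhcp'] = True
    else:
        attrs['ipaddr'] = port_info['addr']
        attrs['netmask'] = port_info['netmask']
        gateway = port_info.get('gateway')
        if gateway is None and cfg_fallback:
            # Assumption: fall back to the upper-case ifcfg-style key.
            gateway = port_info['cfg'].get('GATEWAY')
        if gateway is not None:
            attrs['gateway'] = gateway
    return attrs


without_fallback = build_attrs(port_info)
with_fallback = build_attrs(port_info, cfg_fallback=True)
print('gateway' in without_fallback)
print(with_fallback.get('gateway'))
```

Without the fallback, the result holds only 'bootproto', 'ipaddr' and 'netmask', which matches the network attributes seen in the logged setupNetworks call (apart from 'nic' and 'vlan').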
Verified on - 3.5.0-0.14.beta.el6ev and ovirt-hosted-engine-setup-1.2.1-1.el6ev.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0161.html